炸裂函数的使用
合并函数
concat---合并--ABB concat_ws---带分隔符 合并{ A,B,B 只要string字符串才能用到 } concat_list---不去重合并---A,B,B concat_set--去重合并
explode
select class,student_name,student_score from test lateral view explode(split(student,',')) sn as student_name lateral view explode(split(score,',')) sc as student_score
select class,student_name,student_score from test lateral view posexplode(split(student,',')) sn as student_index_sn,student_name lateral view posexplode(split(score,',')) sc as student_index_sc,student_score where student_index_sn=student_index_sc
炸裂恢复
collect_list 收集并形成list集合,结果不去重 语法:collect_list(col) 返回值:array 说明:将某分组内该字段的所有值收集成为一个数组,结果不去重 hive> select sex, collect_list(job) from employee group by sex 结果: 女 ["行政","研发","行政","前台"] 男 ["销售","研发","销售","前台"] 2)collect_set 收集并形成set集合,结果去重 语法:collect_set(col) 返回值:array 说明:将某分组内该字段的所有值收集成为一个数组,结果去重 hive> select sex, collect_set(job) from employee group by sex 结果: 女 ["行政","研发","前台"] 男 ["销售","研发","前台"] 类型转换函数: cast 语法: cast(expr as <type>) 返回值: Expected "=" to follow "type" 说明: 返回转换后的数据类型 hive> select cast(1 as bigint) from iteblog; 1
select student,collect_list(concat_ws(',',class)) as class, collect_list(concat_ws(',',cast(score as string))) as score from ( select 'a' as class,'yuxing' as student,100 as score union all select 'c' as class,'yuxing1' as student,60 as score union all select 'b' as class,'yuxing2' as student,80 as score union all select 'b' as class,'yuxing1' as student,80 as score union all select 'c' as class,'yuxing2' as student,80 as score)t group by student
知识
行转列(split+explode+laterview) 列转行(concat_ws+collect_list/set) 多行换成一行 列转行 行转列 炸开 列转行 合并 cast转化为数据格式 判定函数decode(判定字段,值1,结果1,值2,结果2.默认值) 对判定字段的值进行判定,如果值为1,函数的结果为1,与值2相等 coalesce(a,b,c,d)合并 Union all 是两个表格上下两段相加 带all不去重 union是将结果集合并 【上下合并】 join是将表合并 【左右字段合并】
每一门课大于60分的学生的所有科目成绩
select t0.student_name,t2.class_name,t1.score from join sc t1 on t0.id=t1.id join class t2 on t1.cid=t2.id (select sid,min(score) as min_score from sc group by sid having min(score)>60)t3
某平台七日存留计算
select a.date as date, sum((case when datediff(b.date,a.date)=1 then 1 else 0 end))/count(distinct a.uid) as lst1date_rate, --'次日留存率' sum((case when datediff(b.date,a.date)=3 then 1 else 0 end))/count(distinct a.uid) as lst3date_rate, --'三日留存率' sum((case when datediff(b.date,a.date)=7 then 1 else 0 end))/count(distinct a.uid) as lst7date_rate --'七日留存率' from (select uid,substr(datetime,1,10) as date from table group by uid,substr(datatime,1,10)) a) left join ( select uid,substr(datatime,1,10) as date from table group by uid,substr(datetime,1,10) )b on a.uid=b.uid and a.date<b.date group by a.date
查询首次下单后第二天连续下单的用户比率
从订单信息表(order_info)中查询首次下单后第二天仍然下单的用户占所有下单用户的比例,结果保留一位小数,使用百分数显示,期望结果如下:
期望结果如下:
percentage <string> |
70.0% |
需要用到的表:
订单信息表:order_info
order_id (订单id) | user_id (用户id) | create_date (下单日期) | total_amount (订单金额) |
1 | 101 | 2021-09-30 | 29000.00 |
10 | 103 | 2020-10-02 | 28000.00 |
select concat(sum(if(datediff(second_create_date,first_create_date)=1,1,0))/count(*)*100,'%') percentage from (select user_id, min(create_date) first_create_date, max(create_date) second_create_date from (select user_id, create_date from(select user_id, create_date, rank() over ( partition by user_id order by create_date ) rk from ( select distinct create_date, user_id from order_info ) t1 )t2 where rk<=2 )t3 group by user_id)t1;
从订单明细表(order_detail)统计每个商品销售首年的年份,销售数量和销售总额
select sku_id, year(create_date) year, sum(sku_num) order_num, sum(price*sku_num) order_amount from ( select order_id, sku_id, price, sku_num, create_date, rank() over (partition by sku_id order by year(create_date)) rk from order_detail ) t1 where rk = 1 group by sku_id,year(create_date);