少年游

欲买桂花同载酒,终不似,少年游。

0%

hive 保序concat

hive保序concat

品类里面取TOP2并按score分值连接成字符串

  1. row_number() 首先要排序。
  2. where rank < N 其次要进行过滤。
    以上两步缺一不可
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
select
uid,
prefer_type,
sum(watch_proba) as watch_proba,
concat_ws(";", collect_list(item)) as type_item
from
(
select
uid,
gt,
prefer_type,
watch_proba,
watch_proba_normalize,
concat(gt, ":", substr(cast(watch_proba_normalize as decimal(32,30)), 0, 6)) as item, -- 排除科学计数法,小数分值截断
row_number() over ( partition by uid, prefer_type order by watch_proba desc ) as rank
from
(
select
uid,
gt,
prefer_type,
watch_proba,
watch_proba / sum(watch_proba) over ( partition by uid, prefer_type ) as watch_proba_normalize -- 归一化
from recom_app.prefer_type
) t11
where watch_proba_normalize >= 0.0001
) t12
where rank <= 2
group by uid, prefer_type