hive创建空map
- 众所周知,hive默认的空类型的map对应的类型是
map<string, string>
1
2
3
4
5
6
7beeline> select map() as empty_map;
+------------+
| empty_map |
+------------+
| {} |
+------------+
1 row selected (0.583 seconds)
如何才能创建各种类型的map呢? 比如map<string, bigint>
, map<string, boolean>
, map<string, array<string>>
创建任意类型的空map, 要啥有啥
- 先创建一个带类型map,然后删除key。
1 | add jar hdfs:///lib/brickhouse-0.7.1-SNAPSHOT.jar; |
通过spark程序创建
- val spark_udf_function = spark.udf.register()
1
2
3
4
5
6spark.udf.register("boolean_empty_map", () => Map[String, Boolean]())
spark.udf.register("bigint_empty_map", () => Map[String, Long]())
spark.udf.register("double_empty_map", () => Map[String, Double]())
spark.udf.register("string_empty_map", () => Map[String, String]())
spark.udf.register("list_empty_map", () => Map[String, Array[String]]())
spark.sql("SELECT boolean_empty_map(), bigint_empty_map(), double_empty_map(), string_empty_map(), list_empty_map()").printSchema()
map合并union_map
union_map能够相同类型的map
1
2
3
4add jar hdfs:///lib/brickhouse-0.7.1-SNAPSHOT.jar;
CREATE TEMPORARY FUNCTION union_map AS 'brickhouse.udf.collect.UnionUDAF';
select id, union_map(m) from (select 1 as id, map('a','1', 'b','2') as m union all select 1 as id, map('a', 0, 'c','3', 'd','4') as m) t group by id;小 bug, union_map 在合并 key 带有冲突的情况,由于并行原因,不能实现新加入的 key 覆盖。
1
2
3
4
5
6
7
8select id,union_map(m) from
(
select 1 as id, map('a','1', 'b','2') as m
union all
select 1 as id, map('a', '1', 'c','3', 'd','4') as m
union all
select 1 as id, map('a', '0', 'c','4', 'd','5') as m
) t group by id;
参考
[1]. brickhouse udf