Published January 24, 2024 (author: matlab2021a中文包)
union
"""
Return the union of this RDD and another one.

>>> rdd = sc.parallelize([1, 1, 2, 3])
>>> rdd.union(rdd).collect()
[1, 1, 2, 3, 1, 1, 2, 3]
"""
distinct
"""
Return a new RDD containing the distinct elements in this RDD.

>>> sorted(sc.parallelize([1, 1, 2, 3]).distinct().collect())
[1, 2, 3]
"""
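union keeps every duplicate, so a set-style union needs distinct chained after it (in PySpark, rdd.union(other).distinct()). The following is a minimal plain-Python sketch of those semantics, with lists standing in for RDDs so that it runs without a Spark installation. The helper names union_model and distinct_model are hypothetical and are not part of the Spark API. Spark does not promise any particular order for the result of distinct, which is why the result is sorted before printing.

```python
def union_model(left, right):
    """Model of RDD.union: concatenate both inputs, keeping every duplicate."""
    return list(left) + list(right)


def distinct_model(items):
    """Model of RDD.distinct: keep one copy of each element (order not guaranteed)."""
    return list(set(items))


data = [1, 1, 2, 3]
combined = union_model(data, data)
print(combined)                          # duplicates are preserved
print(sorted(distinct_model(combined)))  # duplicates are removed
```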
join
>>> a = sc.parallelize([("A", "a1"), ("C", "c1"), ("D", "d1"), ("F", "f1"), ("F", "f2")])
>>> b = sc.parallelize([("A", "a2"), ("C", "c2"), ("C", "c3"), ("E", "e1")])
>>> a.join(b).collect()
[('C', ('c1', 'c2')), ('C', ('c1', 'c3')), ('A', ('a1', 'a2'))]
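The join output above contains only keys present on both sides, and a key that occurs several times on one side yields one row per matching pair. A rough plain-Python model of that behaviour is sketched below. It assumes lists of (key, value) tuples in place of RDDs, the name join_model is hypothetical, and it says nothing about how Spark actually executes a join across partitions. The order of rows returned by collect() is not guaranteed, so the result is sorted here.

```python
from collections import defaultdict


def join_model(left, right):
    """Model of RDD.join: emit (key, (left_value, right_value)) for shared keys."""
    # Group the right-hand values by key so each left row can look them up.
    right_values = defaultdict(list)
    for key, value in right:
        right_values[key].append(value)

    # One output row per (left value, right value) pair with the same key.
    return [
        (key, (left_value, right_value))
        for key, left_value in left
        for right_value in right_values.get(key, [])
    ]


a = [("A", "a1"), ("C", "c1"), ("D", "d1"), ("F", "f1"), ("F", "f2")]
b = [("A", "a2"), ("C", "c2"), ("C", "c3"), ("E", "e1")]
print(sorted(join_model(a, b)))
```

Keys D, F, and E are dropped because each appears on only one side.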
leftOuterJoin
>>> a.leftOuterJoin(b).collect()
[('F', ('f1', None)), ('F', ('f2', None)), ('D', ('d1', None)), ('C', ('c1', 'c2')), ('C', ('c1', 'c3')), ('A', ('a1', 'a2'))]
rightOuterJoin
>>> a.rightOuterJoin(b).collect()
[('E', (None, 'e1')), ('C', ('c1', 'c2')), ('C', ('c1', 'c3')), ('A', ('a1', 'a2'))]
fullOuterJoin
>>> a.fullOuterJoin(b).collect()
[('F', ('f1', None)), ('F', ('f2', None)), ('D', ('d1', None)), ('E', (None, 'e1')), ('C', ('c1', 'c2')), ('C', ('c1', 'c3')), ('A', ('a1', 'a2'))]
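The three outer joins differ only in which unmatched keys they keep: leftOuterJoin keeps keys found only on the left, rightOuterJoin keeps keys found only on the right, and fullOuterJoin keeps both, with None filling the missing side. The sketch below models that rule in plain Python under the assumption that lists of (key, value) tuples stand in for RDDs. The function outer_join_model and its keep_left and keep_right flags are hypothetical names chosen for illustration and do not exist in Spark.

```python
def outer_join_model(left, right, keep_left=True, keep_right=True):
    """Model of the outer joins: matched pairs plus optional unmatched rows."""
    left_keys = {key for key, _ in left}
    right_keys = {key for key, _ in right}

    # Rows for keys present on both sides (same as an inner join).
    rows = [(lk, (lv, rv)) for lk, lv in left for rk, rv in right if lk == rk]
    if keep_left:
        # Left-only keys get None in the right-hand slot.
        rows += [(lk, (lv, None)) for lk, lv in left if lk not in right_keys]
    if keep_right:
        # Right-only keys get None in the left-hand slot.
        rows += [(rk, (None, rv)) for rk, rv in right if rk not in left_keys]
    return rows


a = [("A", "a1"), ("C", "c1"), ("D", "d1"), ("F", "f1"), ("F", "f2")]
b = [("A", "a2"), ("C", "c2"), ("C", "c3"), ("E", "e1")]

left_outer = outer_join_model(a, b, keep_right=False)   # like a.leftOuterJoin(b)
right_outer = outer_join_model(a, b, keep_left=False)   # like a.rightOuterJoin(b)
full_outer = outer_join_model(a, b)                      # like a.fullOuterJoin(b)
print(len(left_outer), len(right_outer), len(full_outer))
```

With the sample data, the row counts match the 6, 4, and 7 rows shown in the three outputs above.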
Copyright notice: the article "Spark Series: Spark Programming Guide for Python" was contributed by a user and reflects only the author's views. To repost it, contact the author and cite the source: http://roclinux.cn/b/1706032244a498939.html