Column Merge

group Merging

Functions

std::unique_ptr<cudf::table> merge(std::vector<table_view> const &tables_to_merge, std::vector<cudf::size_type> const &key_cols, std::vector<cudf::order> const &column_order, std::vector<cudf::null_order> const &null_precedence = {}, rmm::cuda_stream_view stream = cudf::get_default_stream(), rmm::device_async_resource_ref mr = cudf::get_current_device_resource_ref())

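Merge a set of sorted tables.

Merges sorted tables into a single sorted table containing the data from all of them. In every input table, each key column must already be sorted according to the parameters given for that column (column_order and null_precedence).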

Example 1:
input:
table 1 => col 1 {0, 1, 2, 3}
           col 2 {4, 5, 6, 7}
table 2 => col 1 {1, 2}
           col 2 {8, 9}
table 3 => col 1 {2, 4}
           col 2 {8, 9}
output:
table => col 1 {0, 1, 1, 2, 2, 2, 3, 4}
         col 2 {4, 5, 8, 6, 8, 9, 7, 9}
Example 2:
input:
table 1 => col 0 {0, 1}
           col 1 {'b', 'c'}
           col 2 {GREEN, RED}


table 2 => col 0 {1}
           col 1 {'a'}
           col 2 {NULL}

 with key_cols[] = {0,1}
 and column_order[] = {ASCENDING, ASCENDING};

 Lex-sorting is on columns {0,1}; hence, lex-sorting of ((L0 x L1) V (R0 x R1)),
 where L is table 1 and R is table 2, is:
 (0,'b', GREEN), (1,'a', NULL), (1,'c', RED)

 (third column, the "color", just "goes along for the ride";
  meaning it is permuted according to the data movements dictated
  by lexicographic ordering of columns 0 and 1)

  with result columns:

  Res0 = {0,1,1}
  Res1 = {'b', 'a', 'c'}
  Res2 = {GREEN, NULL, RED}
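
The merge semantics of Example 2 can be sketched on the host in plain C++. This is only a reference model for reasoning about the expected row order, not libcudf's GPU implementation and not part of its API: the `Row` type and the `key_less` and `merge_sorted` helpers are hypothetical names introduced here, `std::nullopt` stands in for NULL, and each input is assumed to be listed in ascending key order.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <optional>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical host-side row: two key columns (col 0, col 1) and one
// payload column (col 2) that only "goes along for the ride".
struct Row {
  int key0;
  char key1;
  std::optional<std::string> color;  // std::nullopt models a NULL payload
};

// Lexicographic "less than" over the key columns only (ascending order).
bool key_less(Row const& a, Row const& b) {
  return std::tie(a.key0, a.key1) < std::tie(b.key0, b.key1);
}

// Merge any number of tables that are each already sorted by (key0, key1).
std::vector<Row> merge_sorted(std::vector<std::vector<Row>> const& tables) {
  std::vector<Row> result;
  for (auto const& table : tables) {
    // Precondition from the description above: every input is sorted.
    assert(std::is_sorted(table.begin(), table.end(), key_less));
    std::vector<Row> next;
    next.reserve(result.size() + table.size());
    // Two-way merge of the running result with the next input table.
    std::merge(result.begin(), result.end(), table.begin(), table.end(),
               std::back_inserter(next), key_less);
    result = std::move(next);
  }
  return result;
}
```

Calling `merge_sorted` with `{(0,'b',GREEN), (1,'c',RED)}` and `{(1,'a',NULL)}` yields the rows in the order (0,'b',GREEN), (1,'a',NULL), (1,'c',RED), matching Res0, Res1 and Res2 above. The payload column never takes part in the comparison; it is simply carried along with its row.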

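Throws:
Parameters:
  • tables_to_merge – [in] Non-empty list of tables to be merged

  • key_cols – [in] Indices of the key columns used as the comparison criteria

  • column_order – [in] Sort order of each column indexed by key_cols

  • null_precedence – [in] Array indicating the order of nulls relative to non-nulls in the columns indexed by key_cols

  • stream – CUDA stream used for device memory operations and kernel launches

  • mr – Device memory resource used to allocate the returned table's device memory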

Returns:

A table containing the sorted data from all input tables