Uses of Class ai.rapids.cudf.Table (cudfjni 25.04.0 API)

Packages that use Table
Package	Description
ai.rapids.cudf
ai.rapids.cudf.ast

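Uses of Table in ai.rapids.cudf

Methods in ai.rapids.cudf that return Table
Modifier and Type	Method and Description
`Table`	Table.GroupByOperation.`aggregate(GroupByAggregationOnColumn... aggregates)` Aggregates the group of columns represented by indices. Usage: aggregate(count(), max(2),...); Example: input rows: (1, 1, 1), (1, 2, 1), (2, 4, 5); table.groupBy(0, 2).count(); output rows (col0, col1): (1, 1), (1, 2), (2, 1) ==> aggregated count
`Table`	Table.GroupByOperation.`aggregateWindows(AggregationOverWindow... windowAggregates)` Computes row-based window aggregation functions on the Table/projection, based on the windows specified in the argument.
`Table`	Table.GroupByOperation.`aggregateWindowsOverRanges(AggregationOverWindow... windowAggregates)` Computes range-based window aggregation functions on the Table/projection, based on the windows specified in the argument.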
`Table`	Table.TestBuilder.`build()`
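`static Table`	Aggregation128Utils.`combineInt64SumChunks(Table chunks, DType type)` Reassembles a column of 128-bit values from a table of four 64-bit integer columns and checks for overflow.
`static Table`	Table.`concatenate(Table... tables)` Concatenates multiple tables to form a single table.
`Table`	Table.`crossJoin(Table right)` Joins two tables, pairing every row of the left table with every row of the right table.
`Table`	Table.`dropDuplicates(int[] keyColumns, Table.DuplicateKeepOption keep, boolean nullsEqual)` Copies the rows of the current table to an output table such that duplicate rows in the key columns are ignored (i.e., only one row out of each set of duplicates is copied).
`Table`	Table.`explode(int index)` Explodes the elements of a list column.
`Table`	Table.`explodeOuter(int index)` Explodes the elements of a list column.
`Table`	Table.`explodeOuterPosition(int index)` Explodes the elements of a list column, retaining any null entries or empty lists, and includes a position column.
`Table`	Table.`explodePosition(int index)` Explodes the elements of a list column and includes a position column.
`Table`	ColumnView.`extractRe(RegexProgram regexProg)` For each capture group specified in the given regex program, returns a column in the table.
`Table`	ColumnView.`extractRe(String pattern)` Deprecated.
`Table`	Table.`filter(ColumnView mask)` Filters this table using a column of boolean values as a mask, returning a new table.
`static Table`	Table.`fromPackedTable(ByteBuffer metadata, DeviceMemoryBuffer data)` Constructs a table from a packed representation.
`Table`	Table.`gather(ColumnView gatherMap)` Gathers the rows of this table according to `gatherMap` such that row "i" in the resulting table's columns will contain row "gatherMap[i]" from this table.
`Table`	Table.`gather(ColumnView gatherMap, OutOfBoundsPolicy outOfBoundsPolicy)` Gathers the rows of this table according to `gatherMap` such that row "i" in the resulting table's columns will contain row "gatherMap[i]" from this table.
`Table`	StreamedTableReader.`getNextIfAvailable()` Gets the next table, if one is available.
`Table`	StreamedTableReader.`getNextIfAvailable(int rowTarget)` Gets the next table, if one is available.
`Table`	JCudfSerialization.TableAndRowCountPair.`getTable()` Gets the deserialized Table, or null if there is no data (e.g., rows without columns).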
`Table`	PartitionedTable.`getTable()`
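`Table`	ContiguousTable.`getTable()` Gets the table instance, reconstructing it from the metadata if necessary.
`Table`	ContigSplitGroupByResult.`getUniqKeyTable()` Gets the key table; each row in the key table corresponds to a group.
`static Table`	Table.`merge(List<Table> tables, OrderByArg... args)` Merges multiple already-sorted tables, keeping the same sort order.
`static Table`	Table.`merge(Table[] tables, OrderByArg... args)` Merges multiple already-sorted tables, keeping the same sort order.
`Table`	Table.`orderBy(OrderByArg... args)` Sorts the table using the sort keys, returning a newly allocated table.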
`static Table`	JCudfSerialization.`readAndConcat(JCudfSerialization.SerializedTableHeader[] headers, HostMemoryBuffer[] dataBuffers)`
`static Table`	Table.`readAvro(AvroOptions opts, byte[] buffer)` Reads Avro-formatted data.
`static Table`	Table.`readAvro(AvroOptions opts, byte[] buffer, long offset, long len)`
`static Table`	Table.`readAvro(AvroOptions opts, byte[] buffer, long offset, long len, HostMemoryAllocator hostMemoryAllocator)` Reads Avro-formatted data.
`static Table`	Table.`readAvro(AvroOptions opts, DataSource ds)`
`static Table`	Table.`readAvro(AvroOptions opts, File path)` Reads an Avro file.
`static Table`	Table.`readAvro(AvroOptions opts, HostMemoryBuffer buffer, long offset, long len)` Reads Avro-formatted data.
`static Table`	Table.`readAvro(byte[] buffer)` Reads Avro-formatted data.
`static Table`	Table.`readAvro(File path)` Reads an Avro file using the default AvroOptions.
`Table`	ParquetChunkedReader.`readChunk()` Reads a chunk of rows from the given Parquet file such that the total size of the returned data does not exceed the given read limit.
`Table`	ORCChunkedReader.`readChunk()` Reads a chunk of rows from the given ORC file such that the total size of the returned data does not exceed the given read limit.
`static Table`	Table.`readCSV(Schema schema, byte[] buffer)` Reads CSV-formatted data using the default CSVOptions.
`static Table`	Table.`readCSV(Schema schema, CSVOptions opts, byte[] buffer)` Reads CSV-formatted data.
`static Table`	Table.`readCSV(Schema schema, CSVOptions opts, byte[] buffer, long offset, long len)`
`static Table`	Table.`readCSV(Schema schema, CSVOptions opts, byte[] buffer, long offset, long len, HostMemoryAllocator hostMemoryAllocator)` Reads CSV-formatted data.
`static Table`	Table.`readCSV(Schema schema, CSVOptions opts, DataSource ds)`
`static Table`	Table.`readCSV(Schema schema, CSVOptions opts, File path)` Reads a CSV file.
`static Table`	Table.`readCSV(Schema schema, CSVOptions opts, HostMemoryBuffer buffer, long offset, long len)` Reads CSV-formatted data.
`static Table`	Table.`readCSV(Schema schema, File path)` Reads a CSV file using the default CSVOptions.
`static Table`	Table.`readJSON(Schema schema, byte[] buffer)` Reads JSON-formatted data using the default JSONOptions.
`static Table`	Table.`readJSON(Schema schema, File path)` Reads a JSON file using the default JSONOptions.
`static Table`	Table.`readJSON(Schema schema, JSONOptions opts, byte[] buffer)` Reads JSON-formatted data.
`static Table`	Table.`readJSON(Schema schema, JSONOptions opts, byte[] buffer, long offset, long len)`
`static Table`	Table.`readJSON(Schema schema, JSONOptions opts, byte[] buffer, long offset, long len, HostMemoryAllocator hostMemoryAllocator)` Reads JSON-formatted data.
`static Table`	Table.`readJSON(Schema schema, JSONOptions opts, byte[] buffer, long offset, long len, HostMemoryAllocator hostMemoryAllocator, int emptyRowCount)` Deprecated. This method is deprecated because emptyRowCount is not used. Use the method without emptyRowCount instead.
`static Table`	Table.`readJSON(Schema schema, JSONOptions opts, byte[] buffer, long offset, long len, int emptyRowCount)`
`static Table`	Table.`readJSON(Schema schema, JSONOptions opts, DataSource ds)` Reads JSON-formatted data.
`static Table`	Table.`readJSON(Schema schema, JSONOptions opts, DataSource ds, int emptyRowCount)` Deprecated. This method is deprecated because emptyRowCount is not used. Use the method without emptyRowCount instead.
`static Table`	Table.`readJSON(Schema schema, JSONOptions opts, File path)` Reads a JSON file.
`static Table`	Table.`readJSON(Schema schema, JSONOptions opts, HostMemoryBuffer buffer, long offset, long len)` Reads JSON-formatted data.
`static Table`	Table.`readJSON(Schema schema, JSONOptions opts, HostMemoryBuffer buffer, long offset, long len, int emptyRowCount)` Deprecated. This method is deprecated because emptyRowCount is not used. Use the method without emptyRowCount instead.
`static Table`	Table.`readORC(byte[] buffer)` Reads ORC-formatted data.
`static Table`	Table.`readORC(File path)` Reads an ORC file using the default ORCOptions.
`static Table`	Table.`readORC(ORCOptions opts, byte[] buffer)` Reads ORC-formatted data.
`static Table`	Table.`readORC(ORCOptions opts, byte[] buffer, long offset, long len)`
`static Table`	Table.`readORC(ORCOptions opts, byte[] buffer, long offset, long len, HostMemoryAllocator hostMemoryAllocator)` Reads ORC-formatted data.
`static Table`	Table.`readORC(ORCOptions opts, DataSource ds)`
`static Table`	Table.`readORC(ORCOptions opts, File path)` Reads an ORC file.
`static Table`	Table.`readORC(ORCOptions opts, HostMemoryBuffer buffer, long offset, long len)` Reads ORC-formatted data.
`static Table`	Table.`readParquet(byte[] buffer)` Reads Parquet-formatted data.
`static Table`	Table.`readParquet(File path)` Reads a Parquet file using the default ParquetOptions.
`static Table`	Table.`readParquet(ParquetOptions opts, byte[] buffer)` Reads Parquet-formatted data.
`static Table`	Table.`readParquet(ParquetOptions opts, byte[] buffer, long offset, long len)` Reads Parquet-formatted data.
`static Table`	Table.`readParquet(ParquetOptions opts, byte[] buffer, long offset, long len, HostMemoryAllocator hostMemoryAllocator)` Reads Parquet-formatted data.
`static Table`	Table.`readParquet(ParquetOptions opts, DataSource ds)` Reads Parquet-formatted data.
`static Table`	Table.`readParquet(ParquetOptions opts, File path)` Reads a Parquet file.
`static Table`	Table.`readParquet(ParquetOptions opts, HostMemoryBuffer... buffers)` Reads Parquet-formatted data.
`static Table`	Table.`readParquet(ParquetOptions opts, HostMemoryBuffer buffer, long offset, long len)` Reads Parquet-formatted data.
`Table`	TableWithMeta.`releaseTable()` Gets the table out of this metadata.
`Table`	Table.`repeat(ColumnView counts)` Creates a new table by repeating each row of this table.
`Table`	Table.`repeat(int count)` Repeats each row of this table count times.
`Table`	Table.GroupByOperation.`replaceNulls(ReplacePolicyWithColumn... replacements)`
`Table`	Table.`sample(long n, boolean replacement, long seed)` Draws `n` random samples from the table. Note: ordering is not preserved. Example: input: {col1: {1, 2, 3, 4, 5}, col2: {6, 7, 8, 9, 10}}, n: 3; with replacement: false, output: {col1: {3, 1, 4}, col2: {8, 6, 9}}; with replacement: true, output: {col1: {3, 1, 1}, col2: {8, 6, 6}}. Throws "logic_error" if `n` > table rows and `replacement` == FALSE.
`Table`	Table.GroupByOperation.`scan(GroupByScanAggregationOnColumn... aggregates)`
`Table`	Table.`scatter(ColumnView scatterMap, Table target)` Scatters values from the source table into the target table out of place, returning a new result table.
`static Table`	Table.`scatter(Scalar[] source, ColumnView scatterMap, Table target)` Scatters values from the source rows into the target table out of place, returning a new result table.
`Table`	ColumnView.`stringSplit(RegexProgram regexProg)` Returns a list of columns by splitting each string using the specified regex program pattern.
`Table`	ColumnView.`stringSplit(RegexProgram regexProg, int limit)` Returns a list of columns by splitting each string using the specified regex program pattern.
`Table`	ColumnView.`stringSplit(String delimiter)` Returns a list of columns by splitting each string using the specified string literal delimiter.
`Table`	ColumnView.`stringSplit(String pattern, boolean splitByRegex)` Deprecated.
`Table`	ColumnView.`stringSplit(String delimiter, int limit)` Returns a list of columns by splitting each string using the specified string literal delimiter.
`Table`	ColumnView.`stringSplit(String pattern, int limit, boolean splitByRegex)` Deprecated.
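
The gather rule quoted in the `gather` rows above (output row "i" holds input row "gatherMap[i]") can be illustrated without a GPU. The following is a minimal plain-Java sketch of that indexing rule only; it does not call cudf, and `GatherSketch` and its array-of-columns layout are hypothetical names chosen for illustration. Handling of out-of-range indices (the `OutOfBoundsPolicy` overload) is not modeled.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of ai.rapids.cudf.
final class GatherSketch {
    /**
     * columns[c][r] is the value of column c at row r.
     * Returns a new table whose row i is a copy of input row gatherMap[i].
     */
    static int[][] gather(int[][] columns, int[] gatherMap) {
        int[][] out = new int[columns.length][gatherMap.length];
        for (int c = 0; c < columns.length; c++) {
            for (int i = 0; i < gatherMap.length; i++) {
                out[c][i] = columns[c][gatherMap[i]];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] table = { {10, 20, 30}, {1, 2, 3} };
        // Rows may be reordered and repeated by the map.
        int[][] result = gather(table, new int[] {2, 0, 0});
        System.out.println(Arrays.deepToString(result)); // prints [[30, 10, 10], [3, 1, 1]]
    }
}
```

The same indexing rule is what makes the join methods listed on this page useful: they return gather maps, which are then applied to the left and right tables to produce the joined rows.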

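Methods in ai.rapids.cudf with parameters of type Table
Modifier and Type	Method and Description
`static Table`	Aggregation128Utils.`combineInt64SumChunks(Table chunks, DType type)` Reassembles a column of 128-bit values from a table of four 64-bit integer columns and checks for overflow.
`static Table`	Table.`concatenate(Table... tables)` Concatenates multiple tables to form a single table.
`GatherMap[]`	Table.`conditionalFullJoinGatherMaps(Table rightTable, CompiledExpression condition)` Computes the gather maps that can be used to manifest the result of a full join between two tables when a conditional expression is true.
`GatherMap[]`	Table.`conditionalInnerJoinGatherMaps(Table rightTable, CompiledExpression condition)` Computes the gather maps that can be used to manifest the result of an inner join between two tables when a conditional expression is true.
`GatherMap[]`	Table.`conditionalInnerJoinGatherMaps(Table rightTable, CompiledExpression condition, long outputRowCount)` Computes the gather maps that can be used to manifest the result of an inner join between two tables when a conditional expression is true.
`long`	Table.`conditionalInnerJoinRowCount(Table rightTable, CompiledExpression condition)` Computes the number of rows resulting from an inner join between two tables when a conditional expression is true.
`GatherMap`	Table.`conditionalLeftAntiJoinGatherMap(Table rightTable, CompiledExpression condition)` Computes the gather map that can be used to manifest the result of a left anti join between two tables when a conditional expression is true.
`GatherMap`	Table.`conditionalLeftAntiJoinGatherMap(Table rightTable, CompiledExpression condition, long outputRowCount)` Computes the gather map that can be used to manifest the result of a left anti join between two tables when a conditional expression is true.
`long`	Table.`conditionalLeftAntiJoinRowCount(Table rightTable, CompiledExpression condition)` Computes the number of rows resulting from a left anti join between two tables when a conditional expression is true.
`GatherMap[]`	Table.`conditionalLeftJoinGatherMaps(Table rightTable, CompiledExpression condition)` Computes the gather maps that can be used to manifest the result of a left join between two tables when a conditional expression is true.
`GatherMap[]`	Table.`conditionalLeftJoinGatherMaps(Table rightTable, CompiledExpression condition, long outputRowCount)` Computes the gather maps that can be used to manifest the result of a left join between two tables when a conditional expression is true.
`long`	Table.`conditionalLeftJoinRowCount(Table rightTable, CompiledExpression condition)` Computes the number of rows resulting from a left join between two tables when a conditional expression is true.
`GatherMap`	Table.`conditionalLeftSemiJoinGatherMap(Table rightTable, CompiledExpression condition)` Computes the gather map that can be used to manifest the result of a left semi join between two tables when a conditional expression is true.
`GatherMap`	Table.`conditionalLeftSemiJoinGatherMap(Table rightTable, CompiledExpression condition, long outputRowCount)` Computes the gather map that can be used to manifest the result of a left semi join between two tables when a conditional expression is true.
`long`	Table.`conditionalLeftSemiJoinRowCount(Table rightTable, CompiledExpression condition)` Computes the number of rows resulting from a left semi join between two tables when a conditional expression is true.
`Table`	Table.`crossJoin(Table right)` Joins two tables, pairing every row of the left table with every row of the right table.
`void`	TableDebug.`debug(String name, Table table)` Prints the contents of a table.
`void`	ArrowIPCWriterOptions.DoneOnGpu.`doneWithTheGpu(Table table)` A callback indicating that the table is off the GPU and may be closed, even if not all of the data has been written yet.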
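`GatherMap[]`	Table.`fullJoinGatherMaps(Table rightKeys, boolean compareNullsEqual)` Computes the gather maps that can be used to manifest the result of a full equi-join between two tables.
`GatherMap[]`	Table.`innerDistinctJoinGatherMaps(Table rightKeys, boolean compareNullsEqual)` Computes the gather maps that can be used to manifest the result of an inner equi-join between two tables where the right table is guaranteed not to contain any duplicated join keys.
`GatherMap[]`	Table.`innerJoinGatherMaps(Table rightKeys, boolean compareNullsEqual)` Computes the gather maps that can be used to manifest the result of an inner equi-join between two tables.
`GatherMap`	Table.`leftAntiJoinGatherMap(Table rightKeys, boolean compareNullsEqual)` Computes the gather map that can be used to manifest the result of a left anti-join between two tables.
`GatherMap`	Table.`leftDistinctJoinGatherMap(Table rightKeys, boolean compareNullsEqual)` Computes a gather map that can be used to manifest the result of a left equi-join between two tables where the right table is guaranteed not to contain any duplicated join keys.
`GatherMap[]`	Table.`leftJoinGatherMaps(Table rightKeys, boolean compareNullsEqual)` Computes the gather maps that can be used to manifest the result of a left equi-join between two tables.
`GatherMap`	Table.`leftSemiJoinGatherMap(Table rightKeys, boolean compareNullsEqual)` Computes the gather map that can be used to manifest the result of a left semi-join between two tables.
`ColumnVector`	Table.`lowerBound(boolean[] areNullsSmallest, Table valueTable, boolean[] descFlags)` Finds the smallest indices in a sorted table where values should be inserted to maintain order.
`ColumnVector`	Table.`lowerBound(Table valueTable, OrderByArg... args)` Finds the smallest indices in a sorted table where values should be inserted to maintain order.
`static Table`	Table.`merge(Table[] tables, OrderByArg... args)` Merges multiple already-sorted tables, keeping the same sort order.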
`static GatherMap[]`	Table.`mixedFullJoinGatherMaps(Table leftKeys, Table rightKeys, Table leftConditional, Table rightConditional, CompiledExpression condition, NullEquality nullEquality)` Computes the gather maps that can be used to manifest the result of a full join between two tables using a mix of equality and inequality conditions.
`static GatherMap[]`	Table.`mixedInnerJoinGatherMaps(Table leftKeys, Table rightKeys, Table leftConditional, Table rightConditional, CompiledExpression condition, NullEquality nullEquality)` Computes the gather maps that can be used to manifest the result of an inner join between two tables using a mix of equality and inequality conditions.
`static GatherMap[]`	Table.`mixedInnerJoinGatherMaps(Table leftKeys, Table rightKeys, Table leftConditional, Table rightConditional, CompiledExpression condition, NullEquality nullEquality, MixedJoinSize joinSize)` Computes the gather maps that can be used to manifest the result of an inner join between two tables using a mix of equality and inequality conditions.
`static MixedJoinSize`	Table.`mixedInnerJoinSize(Table leftKeys, Table rightKeys, Table leftConditional, Table rightConditional, CompiledExpression condition, NullEquality nullEquality)` Computes output size information for an inner join between two tables using a mix of equality and inequality conditions.
`static GatherMap`	Table.`mixedLeftAntiJoinGatherMap(Table leftKeys, Table rightKeys, Table leftConditional, Table rightConditional, CompiledExpression condition, NullEquality nullEquality)` Computes the gather map that can be used to manifest the result of a left anti join between two tables using a mix of equality and inequality conditions.
`static GatherMap[]`	Table.`mixedLeftJoinGatherMaps(Table leftKeys, Table rightKeys, Table leftConditional, Table rightConditional, CompiledExpression condition, NullEquality nullEquality)` Computes the gather maps that can be used to manifest the result of a left join between two tables using a mix of equality and inequality conditions.
`static GatherMap[]`	Table.`mixedLeftJoinGatherMaps(Table leftKeys, Table rightKeys, Table leftConditional, Table rightConditional, CompiledExpression condition, NullEquality nullEquality, MixedJoinSize joinSize)` Computes the gather maps that can be used to manifest the result of a left join between two tables using a mix of equality and inequality conditions.
`static MixedJoinSize`	Table.`mixedLeftJoinSize(Table leftKeys, Table rightKeys, Table leftConditional, Table rightConditional, CompiledExpression condition, NullEquality nullEquality)` Computes output size information for a left join between two tables using a mix of equality and inequality conditions.
`static GatherMap`	Table.`mixedLeftSemiJoinGatherMap(Table leftKeys, Table rightKeys, Table leftConditional, Table rightConditional, CompiledExpression condition, NullEquality nullEquality)` Computes the gather map that can be used to manifest the result of a left semi join between two tables using a mix of equality and inequality conditions.
`Table`	Table.`scatter(ColumnView scatterMap, Table target)` Scatters values from the source table into the target table out of place, returning a new result table.
`static Table`	Table.`scatter(Scalar[] source, ColumnView scatterMap, Table target)` Scatters values from the source rows into the target table out of place, returning a new result table.
`ColumnVector`	Table.`upperBound(boolean[] areNullsSmallest, Table valueTable, boolean[] descFlags)` Finds the largest indices in a sorted table where values should be inserted to maintain order.
`ColumnVector`	Table.`upperBound(Table valueTable, OrderByArg... args)` Finds the largest indices in a sorted table where values should be inserted to maintain order.
`abstract void`	TableWriter.`write(Table table)` Writes out a table.
`static void`	JCudfSerialization.`writeToStream(Table t, OutputStream out, long rowOffset, long numRows)` Writes all or part of a table out in an internal format.
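
The difference between the `lowerBound` and `upperBound` rows above (smallest versus largest insertion index that keeps the table sorted) is easiest to see on a single column. The sketch below is plain Java and does not call cudf; it assumes one ascending `int` column with no nulls, and `BoundsSketch` is a hypothetical name. The real methods operate on whole tables with per-column ordering and null placement, which this sketch does not model.

```java
// Hypothetical illustration class; not part of ai.rapids.cudf.
final class BoundsSketch {
    /** Smallest index at which value can be inserted while keeping sorted ascending. */
    static int lowerBound(int[] sorted, int value) {
        int lo = 0, hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] < value) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    /** Largest index at which value can be inserted while keeping sorted ascending. */
    static int upperBound(int[] sorted, int value) {
        int lo = 0, hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] <= value) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    public static void main(String[] args) {
        int[] sorted = {10, 20, 20, 30};
        System.out.println(lowerBound(sorted, 20)); // prints 1
        System.out.println(upperBound(sorted, 20)); // prints 3
    }
}
```

The two results differ only when the searched value already occurs in the sorted data; for a value that is absent, both return the same index.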

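Method parameters in ai.rapids.cudf with type arguments of type Table
Modifier and Type	Method and Description
`static Table`	Table.`merge(List<Table> tables, OrderByArg... args)` Merges multiple already-sorted tables, keeping the same sort order.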
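
As an illustration of what "merging already-sorted inputs while keeping the sort order" means, here is a minimal plain-Java sketch of a k-way merge. It does not call cudf and says nothing about how `Table.merge` is implemented; it assumes each input is a single ascending `int` column, and `MergeSketch` is a hypothetical name.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Hypothetical illustration class; not part of ai.rapids.cudf.
final class MergeSketch {
    /** Merges ascending inputs into one ascending output. */
    static int[] merge(int[]... sortedInputs) {
        int total = 0;
        for (int[] in : sortedInputs) total += in.length;

        // Heap entries are {value, input index, position within that input}.
        PriorityQueue<int[]> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int i = 0; i < sortedInputs.length; i++) {
            if (sortedInputs[i].length > 0) {
                heap.add(new int[] {sortedInputs[i][0], i, 0});
            }
        }

        int[] out = new int[total];
        int n = 0;
        while (!heap.isEmpty()) {
            int[] head = heap.poll();          // smallest remaining value
            out[n++] = head[0];
            int[] src = sortedInputs[head[1]];
            int next = head[2] + 1;
            if (next < src.length) {           // advance within the same input
                heap.add(new int[] {src[next], head[1], next});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] merged = merge(new int[] {1, 4, 9}, new int[] {2, 3, 10}, new int[] {});
        System.out.println(Arrays.toString(merged)); // prints [1, 2, 3, 4, 9, 10]
    }
}
```

The inputs must already be sorted in the order requested; merging does not sort them.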

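Constructors in ai.rapids.cudf with parameters of type Table
Constructor and Description
`HashJoin(Table buildKeys, boolean compareNulls)` Constructs a hash table for a join from a table representing the join key columns of the right-side table in the join.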
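
The build-then-probe idea behind this constructor can be sketched in plain Java: index the right-side keys once, then probe with left-side keys to produce a pair of gather maps. This is a conceptual illustration only, not cudf's implementation or API. `HashJoinSketch` and its method are hypothetical names, keys are a single non-null `int` column, and null comparison (`compareNulls`) is not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of ai.rapids.cudf.
final class HashJoinSketch {
    // Maps each build-side (right) key to the row indices where it occurs.
    private final Map<Integer, List<Integer>> buildIndex = new HashMap<>();

    HashJoinSketch(int[] buildKeys) {
        for (int row = 0; row < buildKeys.length; row++) {
            buildIndex.computeIfAbsent(buildKeys[row], k -> new ArrayList<>()).add(row);
        }
    }

    /** Returns {leftMap, rightMap}; position i pairs left row leftMap[i] with right row rightMap[i]. */
    int[][] innerJoinGatherMaps(int[] probeKeys) {
        List<Integer> left = new ArrayList<>();
        List<Integer> right = new ArrayList<>();
        for (int l = 0; l < probeKeys.length; l++) {
            List<Integer> matches =
                buildIndex.getOrDefault(probeKeys[l], Collections.<Integer>emptyList());
            for (int r : matches) {
                left.add(l);
                right.add(r);
            }
        }
        return new int[][] {toArray(left), toArray(right)};
    }

    private static int[] toArray(List<Integer> xs) {
        return xs.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        HashJoinSketch join = new HashJoinSketch(new int[] {1, 2, 2}); // right-side keys
        int[][] maps = join.innerJoinGatherMaps(new int[] {2, 3, 1});  // left-side keys
        System.out.println(Arrays.toString(maps[0])); // prints [0, 0, 2]
        System.out.println(Arrays.toString(maps[1])); // prints [1, 2, 0]
    }
}
```

Building the index once is what makes it worthwhile to reuse the same build side across several probes.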

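Uses of Table in ai.rapids.cudf.ast

Methods in ai.rapids.cudf.ast with parameters of type Table
Modifier and Type	Method and Description
`ColumnVector`	CompiledExpression.`computeColumn(Table table)` Computes a new column by applying this AST expression to the specified table.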
