Files
file	reduction.hpp

Enumerations
enum class	cudf::scan_type : bool { INCLUSIVE, EXCLUSIVE }
	Enum to describe the scan operation type.

Functions
std::unique_ptr< scalar >	cudf::reduce (column_view const &col, reduce_aggregation const &agg, data_type output_dtype, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
	Computes the reduction of the values in all rows of a column.

std::unique_ptr< scalar >	cudf::reduce (column_view const &col, reduce_aggregation const &agg, data_type output_dtype, std::optional< std::reference_wrapper< scalar const >> init, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
	Computes the reduction of the values in all rows of a column with an initial value.

std::unique_ptr< column >	cudf::segmented_reduce (column_view const &segmented_values, device_span< size_type const > offsets, segmented_reduce_aggregation const &agg, data_type output_dtype, null_policy null_handling, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
	Computes the reduction of each segment in the input column.

std::unique_ptr< column >	cudf::segmented_reduce (column_view const &segmented_values, device_span< size_type const > offsets, segmented_reduce_aggregation const &agg, data_type output_dtype, null_policy null_handling, std::optional< std::reference_wrapper< scalar const >> init, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
	Computes the reduction of each segment in the input column with an initial value. Only SUM, PRODUCT, MIN, MAX, ANY, and ALL aggregations are supported.

std::unique_ptr< column >	cudf::scan (column_view const &input, scan_aggregation const &agg, scan_type inclusive, null_policy null_handling=null_policy::EXCLUDE, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
	Computes the scan of a column.

std::pair< std::unique_ptr< scalar >, std::unique_ptr< scalar > >	cudf::minmax (column_view const &col, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
	Determines the minimum and maximum values of a column.

Detailed Description

Function Documentation

◆ minmax()

std::pair<std::unique_ptr<scalar>, std::unique_ptr<scalar> > cudf::minmax	(	column_view const &	col,
		rmm::cuda_stream_view	stream = `cudf::get_default_stream()`,
		rmm::device_async_resource_ref	mr = `cudf::get_current_device_resource_ref()`
	)

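Determines the minimum and maximum values of a column.

Parameters

col	Column to compute the minimum and maximum of
stream	CUDA stream used for device memory operations and kernel launches
mr	Device memory resource used to allocate the device memory of the returned scalars

Returns: A std::pair of scalars, the first holding the minimum value of the input column and the second holding the maximum value.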
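
To make the shape of the result concrete, here is a minimal host-side sketch in standard C++ that finds both extremes in one pass. It is not cudf's implementation and does not call cudf. The name `minmax_skip_nulls` is hypothetical, `std::optional` stands in for a nullable row, and both skipping nulls and returning an empty result when no valid row exists are assumptions of the sketch, since the description above does not state how nulls are handled.

```cpp
#include <optional>
#include <utility>
#include <vector>

// Hypothetical helper: returns {min, max} over the valid rows in one pass.
// Assumption: null rows (std::nullopt) are ignored, and a column without any
// valid row yields an empty optional.
std::optional<std::pair<int, int>> minmax_skip_nulls(
  std::vector<std::optional<int>> const& col)
{
  std::optional<std::pair<int, int>> result;
  for (auto const& v : col) {
    if (!v.has_value()) { continue; }  // skip null rows
    if (!result.has_value()) {
      result = std::make_pair(*v, *v);  // first valid row seeds both extremes
    } else {
      if (*v < result->first) { result->first = *v; }
      if (*v > result->second) { result->second = *v; }
    }
  }
  return result;
}
```

Computing both values in a single traversal is the usual reason to prefer a combined min/max over two separate reductions.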

◆ reduce() [1/2]

std::unique_ptr<scalar> cudf::reduce	(	column_view const &	col,
		reduce_aggregation const &	agg,
		data_type	output_dtype,
		rmm::cuda_stream_view	stream = `cudf::get_default_stream()`,
		rmm::device_async_resource_ref	mr = `cudf::get_current_device_resource_ref()`
	)

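Computes the reduction of the values in all rows of a column.

This function does not detect overflows in reductions. When output_dtype does not match col.type(), the values may be promoted to int64_t or double to compute the aggregation and are then cast to output_dtype before returning.

Only min and max operations are supported for reductions of non-arithmetic types (e.g. timestamp or string).

Any null values are skipped for the operation.

If the column is empty or contains only null entries (col.size()==col.null_count()), the output scalar value is false for reduction type any and true for reduction type all. For all other reductions, the output scalar has is_valid()==false.

If the input column is an arithmetic type, output_dtype can be any arithmetic type. If the input column is a non-arithmetic type (e.g. timestamp or string), output_dtype must match col.type(). If the reduction type is any or all, output_dtype must be type BOOL8.

If the reduction fails, the output scalar has is_valid()==false.

Exceptions

cudf::logic_error	if the reduction is called for a non-arithmetic output type and an operator other than `min` and `max`.
cudf::logic_error	if the input column data type is not convertible to `output_dtype`.
cudf::logic_error	if a `min` or `max` reduction is called and the output type does not match the input column data type.
cudf::logic_error	if an `any` or `all` reduction is called and the output type is not BOOL8.
cudf::logic_error	if a `mean`, `var`, or `std` reduction is called and `output_dtype` is not a floating-point type.

Parameters

col	Input column view
agg	Aggregation operator applied by the reduction
output_dtype	The output scalar type
stream	CUDA stream used for device memory operations and kernel launches
mr	Device memory resource used to allocate the device memory of the returned scalar

Returns: Output scalar with the reduction result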
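
The null and empty-column rules above can be illustrated with a small host-side sketch in standard C++. It only models the documented semantics and is not cudf's implementation. The helper names (`sum_skip_nulls`, `any_skip_nulls`, `all_skip_nulls`) are hypothetical, `std::optional` stands in for a nullable row, an empty optional result stands in for a scalar with is_valid()==false, and the `int64_t` accumulator is one possible reading of the promotion rule.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

using nullable_column = std::vector<std::optional<std::int32_t>>;

// Sum of the valid rows. Accumulates in int64_t, then casts to Out,
// mirroring "promote, aggregate, cast to output_dtype". Overflow is not
// detected. An empty or all-null column yields an empty optional.
template <typename Out>
std::optional<Out> sum_skip_nulls(nullable_column const& col)
{
  std::int64_t acc = 0;
  bool any_valid   = false;
  for (auto const& v : col) {
    if (v.has_value()) {
      acc += *v;
      any_valid = true;
    }
  }
  if (!any_valid) { return std::nullopt; }
  return static_cast<Out>(acc);
}

// "any" over the valid rows: false when there is no valid row.
bool any_skip_nulls(nullable_column const& col)
{
  for (auto const& v : col) {
    if (v.has_value() && *v != 0) { return true; }
  }
  return false;
}

// "all" over the valid rows: true when there is no valid row.
bool all_skip_nulls(nullable_column const& col)
{
  for (auto const& v : col) {
    if (v.has_value() && *v == 0) { return false; }
  }
  return true;
}
```

The `any` and `all` defaults fall out naturally from the loops: with no valid row to inspect, `any` finds nothing true and `all` finds nothing false.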

◆ reduce() [2/2]

std::unique_ptr<scalar> cudf::reduce	(	column_view const &	col,
		reduce_aggregation const &	agg,
		data_type	output_dtype,
		std::optional< std::reference_wrapper< scalar const >>	init,
		rmm::cuda_stream_view	stream = `cudf::get_default_stream()`,
		rmm::device_async_resource_ref	mr = `cudf::get_current_device_resource_ref()`
	)

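Computes the reduction of the values in all rows of a column with an initial value.

Only sum, product, min, max, any, and all reductions are supported.

Exceptions

cudf::logic_error	if the reduction is not sum, product, min, max, any, or all and `init` is specified.

Parameters

col	Input column view
agg	Aggregation operator applied by the reduction
output_dtype	The output scalar type
init	The initial value of the reduction
stream	CUDA stream used for device memory operations and kernel launches
mr	Device memory resource used to allocate the device memory of the returned scalar

Returns: Output scalar with the reduction result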
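
As a rough picture of what an initial value does, the following host-side sketch in standard C++ folds the valid rows into an accumulator that starts at `init`. It is an illustration only, not cudf's implementation. The name `reduce_with_init` is hypothetical, `init` is modeled as a plain value rather than an optional scalar reference, and returning `init` unchanged for an empty or all-null column is an assumption of the sketch, since the text above does not specify that case.

```cpp
#include <optional>
#include <vector>

// Hypothetical helper: left fold of the valid rows, seeded with init.
// Null rows (std::nullopt) are skipped.
template <typename T, typename BinaryOp>
T reduce_with_init(std::vector<std::optional<T>> const& col, BinaryOp op, T init)
{
  T acc = init;
  for (auto const& v : col) {
    if (v.has_value()) { acc = op(acc, *v); }
  }
  return acc;
}
```

For example, passing `std::plus<int>{}` (from `<functional>`) with `init = 10` over the rows `1, null, 3` gives 14. Restricting `init` to sum, product, min, max, any, and all matches this picture, since each of those is a plain binary fold.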

◆ scan()

std::unique_ptr<column> cudf::scan	(	column_view const &	input,
		scan_aggregation const &	agg,
		scan_type	inclusive,
		null_policy	null_handling = `null_policy::EXCLUDE`,
		rmm::cuda_stream_view	stream = `cudf::get_default_stream()`,
		rmm::device_async_resource_ref	mr = `cudf::get_current_device_resource_ref()`
	)

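Computes the scan of a column.

The null values are skipped for the operation, and if the input element at index i is null, the output element at index i is also null.

Exceptions

cudf::logic_error	if the column data type is not a numeric type.

Parameters

[in]	input	The input column view for the scan
[in]	agg	Aggregation operator applied by the scan
[in]	inclusive	The flag for applying an inclusive scan if scan_type::INCLUSIVE, an exclusive scan if scan_type::EXCLUSIVE.
[in]	null_handling	Exclude null values when computing the result if null_policy::EXCLUDE. Include nulls if null_policy::INCLUDE. Any operation with a null results in a null.
[in]	stream	CUDA stream used for device memory operations and kernel launches
[in]	mr	Device memory resource used to allocate the device memory of the returned column

Returns: Scanned output column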
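
The difference between the two scan types and the two null policies is easier to see on a small input. The sketch below is plain host-side C++17, not cudf's implementation. The enums are local stand-ins that mirror the names above, `sum_scan` and `inclusive_sum_scan` are hypothetical helpers, and `std::optional` stands in for a nullable row. The null-aware helper covers the inclusive case only.

```cpp
#include <numeric>
#include <optional>
#include <vector>

// Local stand-ins for the cudf enums, for illustration only.
enum class scan_type : bool { INCLUSIVE, EXCLUSIVE };
enum class null_policy { EXCLUDE, INCLUDE };

// Prefix sum over a column without nulls.
// INCLUSIVE: out[i] = in[0] + ... + in[i]
// EXCLUSIVE: out[i] = in[0] + ... + in[i-1], starting from the identity 0
std::vector<int> sum_scan(std::vector<int> const& in, scan_type type)
{
  std::vector<int> out(in.size());
  if (type == scan_type::INCLUSIVE) {
    std::inclusive_scan(in.begin(), in.end(), out.begin());
  } else {
    std::exclusive_scan(in.begin(), in.end(), out.begin(), 0);
  }
  return out;
}

// Inclusive prefix sum over a nullable column.
// EXCLUDE: a null row is skipped and only its own output row is null.
// INCLUDE: a null takes part in the operation, so every output row from
//          the first null onward is null.
std::vector<std::optional<int>> inclusive_sum_scan(
  std::vector<std::optional<int>> const& in, null_policy policy)
{
  std::vector<std::optional<int>> out;
  out.reserve(in.size());
  int running   = 0;
  bool saw_null = false;
  for (auto const& v : in) {
    if (!v.has_value()) {
      saw_null = true;
      out.push_back(std::nullopt);
    } else if (policy == null_policy::INCLUDE && saw_null) {
      out.push_back(std::nullopt);
    } else {
      running += *v;
      out.push_back(running);
    }
  }
  return out;
}
```

With the rows `1, 2, 3, 4`, the inclusive sum scan is `1, 3, 6, 10` and the exclusive one is `0, 1, 3, 6`. With the rows `1, null, 3, 4`, the EXCLUDE policy gives `1, null, 4, 8` and the INCLUDE policy gives `1, null, null, null`.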

◆ segmented_reduce() [1/2]

std::unique_ptr<column> cudf::segmented_reduce	(	column_view const &	segmented_values,
		device_span< size_type const >	offsets,
		segmented_reduce_aggregation const &	agg,
		data_type	output_dtype,
		null_policy	null_handling,
		rmm::cuda_stream_view	stream = `cudf::get_default_stream()`,
		rmm::device_async_resource_ref	mr = `cudf::get_current_device_resource_ref()`
	)

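Computes the reduction of each segment in the input column.

This function does not detect overflows in reductions. When output_dtype does not match segmented_values.type(), the values may be promoted to int64_t or double to compute the aggregation and are then cast to output_dtype before returning.

Null values are treated as identities during the reduction.

If a segment is empty, the row corresponding to the result of that segment is null.

If any index in offsets is out of bounds of segmented_values, the behavior is undefined.

If the input column is an arithmetic type, output_dtype can be any arithmetic type. If the input column is a non-arithmetic type (e.g. timestamp), the same output type must be specified.

If the input is not empty, the result is always nullable.

Exceptions

cudf::logic_error	if the reduction is called for a non-arithmetic output type and an operator other than `min` and `max`.
cudf::logic_error	if the input column data type is not convertible to the `output_dtype` type.
cudf::logic_error	if a `min` or `max` reduction is called and `output_dtype` does not match the input column data type.
cudf::logic_error	if an `any` or `all` reduction is called and `output_dtype` is not BOOL8.

Parameters

segmented_values	Column view of the segmented inputs
offsets	Offsets of each segment in `segmented_values`. A list of offsets of size `num_segments + 1`. The size of the `i`th segment is `offsets[i+1] - offsets[i]`.
agg	Aggregation operator applied by the reduction
output_dtype	The output column type
null_handling	If `INCLUDE`, the reduction is valid only if all elements in the segment are valid, otherwise null. If `EXCLUDE`, the reduction is valid if any element in the segment is valid, otherwise null.
stream	CUDA stream used for device memory operations and kernel launches
mr	Device memory resource used to allocate the device memory of the returned column

Returns: Output column with the results of the segmented reduction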
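
The way `offsets` and `null_handling` interact can be sketched on the host in standard C++. This models the documented semantics for a sum only and is not cudf's implementation. The name `segmented_sum` is hypothetical, `null_policy` is a local stand-in for the cudf enum, `std::optional` stands in for a nullable row, and `std::size_t` offsets replace the `size_type` device span.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Local stand-in for the cudf enum, for illustration only.
enum class null_policy { EXCLUDE, INCLUDE };

// Sums each segment [offsets[s], offsets[s+1]) of values.
// Empty segment         -> null
// INCLUDE               -> valid only if every row in the segment is valid
// EXCLUDE               -> valid if at least one row in the segment is valid
std::vector<std::optional<int>> segmented_sum(
  std::vector<std::optional<int>> const& values,
  std::vector<std::size_t> const& offsets,
  null_policy policy)
{
  // offsets holds num_segments + 1 entries and must stay within values.
  assert(!offsets.empty());
  assert(offsets.back() <= values.size());

  std::vector<std::optional<int>> out;
  out.reserve(offsets.size() - 1);
  for (std::size_t s = 0; s + 1 < offsets.size(); ++s) {
    int sum        = 0;  // 0 is the identity for a sum, so nulls add nothing
    bool any_valid = false;
    bool all_valid = true;
    for (std::size_t i = offsets[s]; i < offsets[s + 1]; ++i) {
      if (values[i].has_value()) {
        sum += *values[i];
        any_valid = true;
      } else {
        all_valid = false;
      }
    }
    bool const is_empty = offsets[s] == offsets[s + 1];
    bool const is_valid =
      !is_empty && (policy == null_policy::INCLUDE ? all_valid : any_valid);
    if (is_valid) {
      out.push_back(sum);
    } else {
      out.push_back(std::nullopt);
    }
  }
  return out;
}
```

With the rows `1, 2, null, 4, 5` and the offsets `0, 2, 2, 4, 5`, there are four segments: `[1, 2]`, an empty segment, `[null, 4]`, and `[5]`. The EXCLUDE policy gives `3, null, 4, 5` and the INCLUDE policy gives `3, null, null, 5`.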

◆ segmented_reduce() [2/2]

std::unique_ptr<column> cudf::segmented_reduce	(	column_view const &	segmented_values,
		device_span< size_type const >	offsets,
		segmented_reduce_aggregation const &	agg,
		data_type	output_dtype,
		null_policy	null_handling,
		std::optional< std::reference_wrapper< scalar const >>	init,
		rmm::cuda_stream_view	stream = `cudf::get_default_stream()`,
		rmm::device_async_resource_ref	mr = `cudf::get_current_device_resource_ref()`
	)

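Computes the reduction of each segment in the input column with an initial value. Only SUM, PRODUCT, MIN, MAX, ANY, and ALL aggregations are supported.

Parameters

segmented_values	Column view of the segmented inputs
offsets	Offsets of each segment in `segmented_values`. A list of offsets of size `num_segments + 1`. The size of the `i`th segment is `offsets[i+1] - offsets[i]`.
agg	Aggregation operator applied by the reduction
output_dtype	The output column type
null_handling	If `INCLUDE`, the reduction is valid only if all elements in the segment are valid, otherwise null. If `EXCLUDE`, the reduction is valid if any element in the segment is valid, otherwise null.
init	The initial value of the reduction
stream	CUDA stream used for device memory operations and kernel launches
mr	Device memory resource used to allocate the device memory of the returned column

Returns: Output column with the results of the segmented reduction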
