IVF-PQ#

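The IVF-PQ method is an approximate nearest neighbor (ANN) algorithm. Like IVF-Flat, IVF-PQ splits the points into a number of clusters (specified by the n_lists parameter) and searches only the closest clusters to compute the nearest neighbors (specified by the n_probes parameter). Unlike IVF-Flat, it shrinks the stored vectors using a technique called product quantization.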

#include <cuvs/neighbors/ivf_pq.h>
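To make the product quantization step concrete, the following is a minimal CPU sketch in plain C of how one vector could be encoded against per-subspace codebooks. It is illustrative only and is not the cuVS implementation: the function name `pq_encode`, the codebook memory layout, and the one-byte-per-code output are assumptions made for this example, and a real IVF-PQ index typically encodes residuals relative to the cluster centers rather than the raw vectors.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one vector with product quantization (hypothetical helper).
 *
 * vec:       `dim` floats
 * codebooks: pq_dim * book_size * sub_dim floats, where the center `c` of
 *            subspace `s` starts at codebooks[(s * book_size + c) * sub_dim]
 *            (layout assumed for this sketch)
 * codes:     `pq_dim` output codes, one per subspace
 */
static void pq_encode(const float *vec, size_t dim, size_t pq_dim,
                      const float *codebooks, size_t book_size, uint8_t *codes)
{
  assert(pq_dim > 0 && dim % pq_dim == 0);
  assert(book_size > 0 && book_size <= 256);
  const size_t sub_dim = dim / pq_dim;

  for (size_t s = 0; s < pq_dim; ++s) {
    const float *sub = vec + s * sub_dim; /* s-th slice of the input vector */
    float best_dist  = FLT_MAX;
    uint8_t best     = 0;

    /* Pick the codebook entry closest to this slice (squared L2 distance). */
    for (size_t c = 0; c < book_size; ++c) {
      const float *center = codebooks + (s * book_size + c) * sub_dim;
      float dist = 0.0f;
      for (size_t j = 0; j < sub_dim; ++j) {
        const float diff = sub[j] - center[j];
        dist += diff * diff;
      }
      if (dist < best_dist) {
        best_dist = dist;
        best      = (uint8_t)c;
      }
    }
    codes[s] = best;
  }
}
```

Each `dim`-element float vector is reduced to `pq_dim` small integer codes, which is where the memory savings relative to IVF-Flat come from.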

Index build parameters#

enum codebook_gen#

A type for specifying how PQ codebooks are created.

Values

enumerator PER_SUBSPACE#

enumerator PER_CLUSTER#

typedef struct cuvsIvfPqIndexParams *cuvsIvfPqIndexParams_t#

cuvsError_t cuvsIvfPqIndexParamsCreate( cuvsIvfPqIndexParams_t *index_params )#

Allocate IVF-PQ index params and populate them with default values.

Parameters: index_params – [in] cuvsIvfPqIndexParams_t to allocate
Returns: cuvsError_t

cuvsError_t cuvsIvfPqIndexParamsDestroy( cuvsIvfPqIndexParams_t index_params )#

De-allocate IVF-PQ index params.

Parameters: index_params – [in] cuvsIvfPqIndexParams_t to de-allocate
Returns: cuvsError_t

struct cuvsIvfPqIndexParams#

#include <ivf_pq.h>

Supplemental parameters to build an IVF-PQ index.

Public Members

cuvsDistanceType metric#: Distance type.

float metric_arg#: The argument used by some distance metrics.

bool add_data_on_build#

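Whether to add the dataset content to the index, i.e.:

true means the index is filled with the dataset vectors and is ready to search after calling build.
false means build only trains the underlying model (e.g. the quantizer or clustering), but the index is left empty; you need to call extend on the index afterwards to populate it.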

uint32_t n_lists#

The number of inverted lists (clusters).

Hint: the number of vectors per cluster (n_rows/n_lists) should be approximately 1,000 to 10,000.
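The hint above translates directly into a range of n_lists values for a given dataset size. The sketch below shows that arithmetic; `n_lists_range` and `suggest_n_lists` are hypothetical names introduced for this example and are not part of the cuVS API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: a range of n_lists values worth trying. */
typedef struct {
  uint32_t min_lists; /* n_rows / 10000: about 10,000 vectors per cluster */
  uint32_t max_lists; /* n_rows / 1000:  about 1,000 vectors per cluster  */
} n_lists_range;

/* Clamp a 64-bit count into the valid range of a uint32_t n_lists (>= 1). */
static uint32_t clamp_to_u32(uint64_t v)
{
  if (v < 1) return 1;
  if (v > UINT32_MAX) return UINT32_MAX;
  return (uint32_t)v;
}

/* Apply the "1,000 to 10,000 vectors per cluster" rule of thumb. */
static n_lists_range suggest_n_lists(uint64_t n_rows)
{
  assert(n_rows > 0);
  n_lists_range r;
  r.min_lists = clamp_to_u32(n_rows / 10000);
  r.max_lists = clamp_to_u32(n_rows / 1000);
  return r;
}
```

For a dataset of one million rows this gives a range of 100 to 1,000 lists; very small datasets collapse to a single list.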

uint32_t kmeans_n_iters#: The number of iterations used to search for k-means centers (index building).

double kmeans_trainset_fraction#: The fraction of the data to use during iterative k-means building.

uint32_t pq_bits#

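The bit length of a vector element after compression by PQ.

Possible values: [4, 5, 6, 7, 8].

Hint: the smaller ‘pq_bits’ is, the smaller the index and the better the search performance, but the lower the recall.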

uint32_t pq_dim#

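The dimensionality of a vector after compression by PQ. When zero, an optimal value is selected using a heuristic.

Note: pq_dim * pq_bits must be a multiple of 8.

Hint: a smaller ‘pq_dim’ results in a smaller index and better search performance, but lower recall. If ‘pq_bits’ is 8, ‘pq_dim’ can be set to any number, but multiples of 8 are preferable for good performance. If ‘pq_bits’ is not 8, ‘pq_dim’ should be a multiple of 8. For good performance, ‘pq_dim’ should preferably be a multiple of 32. Ideally, ‘pq_dim’ should also be a divisor of the dataset dimensionality.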
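The constraints on pq_bits and pq_dim can be checked before building, and the same product gives the size of one encoded vector. The following sketch assumes the encoded vector occupies exactly pq_dim * pq_bits / 8 bytes (ignoring any per-list padding or metadata); `pq_params_valid` and `pq_code_bytes` are hypothetical helpers, not cuVS functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Check an explicit (pq_dim, pq_bits) pair against the documented rules.
 * pq_dim == 0 means "let the library choose", so there is nothing to
 * validate and this helper reports false for it. */
static bool pq_params_valid(uint32_t pq_dim, uint32_t pq_bits)
{
  if (pq_bits < 4 || pq_bits > 8) return false;
  if (pq_dim == 0) return false;
  return ((uint64_t)pq_dim * pq_bits) % 8 == 0;
}

/* Bytes used by the PQ codes of one vector (assumed: no padding). */
static uint64_t pq_code_bytes(uint32_t pq_dim, uint32_t pq_bits)
{
  assert(pq_params_valid(pq_dim, pq_bits));
  return ((uint64_t)pq_dim * pq_bits) / 8;
}
```

For example, pq_dim = 64 with pq_bits = 8 stores 64 bytes per vector, while pq_dim = 64 with pq_bits = 5 stores 40 bytes; a pair such as pq_dim = 3, pq_bits = 5 is rejected because 15 bits is not a whole number of bytes.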

enum codebook_gen codebook_kind#: How PQ codebooks are created.

bool force_random_rotation#

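Apply a random rotation matrix to the input data and queries even if dim % pq_dim == 0.

Note: if dim is not a multiple of pq_dim, a random rotation is always applied to the input data and queries to transform the working space from dim to rot_dim, which may be slightly larger than the original space and is a multiple of pq_dim (rot_dim % pq_dim == 0). However, this transform is not necessary when dim is a multiple of pq_dim (dim == rot_dim, so there is no need to add “extra” data columns / features).

By default, if dim == rot_dim, the rotation transform is initialized with the identity matrix. When force_random_rotation == true, a random orthogonal transform matrix is generated regardless of the values of dim and pq_dim.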
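The relationship between dim, pq_dim, and rot_dim can be sketched as follows. The sketch assumes rot_dim is the smallest multiple of pq_dim that is not less than dim, which matches the description above but is not stated explicitly; both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed rule: round dim up to the next multiple of pq_dim. */
static uint32_t compute_rot_dim(uint32_t dim, uint32_t pq_dim)
{
  assert(dim > 0 && pq_dim > 0);
  const uint32_t pq_len = (dim + pq_dim - 1) / pq_dim; /* ceil(dim / pq_dim) */
  return pq_len * pq_dim;
}

/* A random rotation is used when it is forced, or when the working space
 * has to grow from dim to rot_dim. Otherwise the identity is used. */
static bool uses_random_rotation(uint32_t dim, uint32_t pq_dim, bool force_random_rotation)
{
  return force_random_rotation || compute_rot_dim(dim, pq_dim) != dim;
}
```

With dim = 100 and pq_dim = 32 the working space becomes 128 columns and a random rotation is always applied; with dim = 128 and pq_dim = 32 the rotation is the identity unless force_random_rotation is set.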

bool conservative_memory_allocation#

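By default, the algorithm allocates more space than necessary for individual clusters (list_data). This amortizes the cost of memory allocation and reduces the number of data copies during repeated calls to extend (extending the database).

The alternative is the conservative allocation behavior; when enabled, the algorithm always allocates the minimum amount of memory required to store the given number of records. Set this flag to true if you prefer to use as little GPU memory for the database as possible.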

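uint32_t max_train_points_per_pq_code#: The maximum number of data points to use per PQ code during PQ codebook training. Using more data points per PQ code may increase the quality of the PQ codebook, but may also increase the build time. The parameter applies to both PQ codebook generation methods, i.e. PER_SUBSPACE and PER_CLUSTER. In both cases, pq_book_size * max_train_points_per_pq_code training points are used to train each codebook.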
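As a rough guide to how this parameter scales the training set, the sketch below evaluates the product pq_book_size * max_train_points_per_pq_code. It assumes pq_book_size is 2^pq_bits (the number of distinct codes representable in pq_bits bits), which this page does not state explicitly; `pq_trainset_size` is a hypothetical helper, and the result should be read as an upper bound since a dataset may contain fewer points.

```c
#include <assert.h>
#include <stdint.h>

/* Training points per codebook, assuming pq_book_size == 2^pq_bits. */
static uint64_t pq_trainset_size(uint32_t pq_bits, uint32_t max_train_points_per_pq_code)
{
  assert(pq_bits >= 4 && pq_bits <= 8);
  const uint64_t pq_book_size = (uint64_t)1 << pq_bits;
  return pq_book_size * max_train_points_per_pq_code;
}
```

Under that assumption, pq_bits = 8 with 256 points per code corresponds to 65,536 training points per codebook.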

Index search parameters#

typedef struct cuvsIvfPqSearchParams *cuvsIvfPqSearchParams_t#

cuvsError_t cuvsIvfPqSearchParamsCreate( cuvsIvfPqSearchParams_t *params )#

Allocate IVF-PQ search params and populate them with default values.

Parameters: params – [in] cuvsIvfPqSearchParams_t to allocate
Returns: cuvsError_t

cuvsError_t cuvsIvfPqSearchParamsDestroy( cuvsIvfPqSearchParams_t params )#

De-allocate IVF-PQ search params.

Parameters: params – [in] cuvsIvfPqSearchParams_t to de-allocate
Returns: cuvsError_t

struct cuvsIvfPqSearchParams#

#include <ivf_pq.h>

Supplemental parameters to search an IVF-PQ index.

Public Members

uint32_t n_probes#: The number of clusters to search.

cudaDataType_t lut_dtype#

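Data type of the look-up table (LUT) created dynamically at search time.

Possible values: [CUDA_R_32F, CUDA_R_16F, CUDA_R_8U]

Using a low-precision type reduces the amount of shared memory required at search time, so fast shared-memory kernels can be used even for datasets with large dimensionality. Note that recall is slightly degraded when a low-precision type is selected.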
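The effect of lut_dtype on shared memory can be estimated with simple arithmetic. The sketch below assumes the look-up table holds one entry per (subspace, code) pair, i.e. pq_dim * 2^pq_bits entries; that layout is an assumption made for illustration and the actual kernels may use additional shared memory. The `lut_kind` enum is a local stand-in for CUDA_R_32F, CUDA_R_16F, and CUDA_R_8U so that the example compiles without CUDA headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for CUDA_R_32F, CUDA_R_16F and CUDA_R_8U. */
typedef enum { LUT_F32, LUT_F16, LUT_U8 } lut_kind;

/* Size in bytes of one look-up table element. */
static size_t lut_elem_bytes(lut_kind kind)
{
  switch (kind) {
    case LUT_F32: return 4;
    case LUT_F16: return 2;
    case LUT_U8:  return 1;
  }
  return 0; /* unreachable for valid enum values */
}

/* Estimated LUT size, assuming pq_dim * 2^pq_bits entries. */
static size_t lut_bytes(uint32_t pq_dim, uint32_t pq_bits, lut_kind kind)
{
  assert(pq_bits >= 4 && pq_bits <= 8);
  return ((size_t)pq_dim << pq_bits) * lut_elem_bytes(kind);
}
```

Under these assumptions, pq_dim = 64 and pq_bits = 8 need 64 KiB in FP32, 32 KiB in FP16, and 16 KiB in 8-bit form, which shows why a lower-precision LUT can make the difference between fitting in shared memory or not.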

cudaDataType_t internal_distance_dtype#

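Storage data type for the distances/similarities computed at search time.

Possible values: [CUDA_R_16F, CUDA_R_32F]

If the performance limiter at search time is device memory access, selecting FP16 improves performance slightly.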

double preferred_shmem_carveout#

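Preferred fraction of the SM's unified memory / L1 cache to be used as shared memory.

Possible values: [0.0 - 1.0], as a fraction of sharedMemPerMultiprocessor.

You want the carveout to be large enough to ensure good GPU occupancy for the main search kernel, but not so large that no memory is left to serve as L1 cache. Note that this value is interpreted only as a hint. Moreover, a GPU usually allows only a fixed set of cache configurations, so the provided value is rounded up to the nearest configuration. Refer to the NVIDIA tuning guide for the target GPU architecture.

Note that this is a low-level tuning parameter that can have a drastic negative effect on search performance if set incorrectly.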
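The "rounded up to the nearest configuration" behavior can be pictured with the following sketch. The list of supported carveout fractions is entirely hypothetical (real values depend on the GPU architecture), and `round_up_carveout` is an illustrative helper rather than anything provided by cuVS or CUDA.

```c
#include <assert.h>
#include <stddef.h>

/* Return the smallest supported fraction that is >= requested.
 * `supported` must be sorted in ascending order; if the request exceeds
 * every entry, the largest supported value is returned. */
static double round_up_carveout(double requested, const double *supported, size_t n)
{
  assert(supported != NULL && n > 0);
  for (size_t i = 0; i < n; ++i) {
    if (supported[i] >= requested) return supported[i];
  }
  return supported[n - 1];
}
```

With a hypothetical configuration set of {0.0, 0.25, 0.5, 1.0}, a request of 0.3 would be served as 0.5, so small changes to the parameter may have no effect until they cross a configuration boundary.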

Index#

typedef cuvsIvfPqIndex *cuvsIvfPqIndex_t#

cuvsError_t cuvsIvfPqIndexCreate(cuvsIvfPqIndex_t *index)#

Allocate an IVF-PQ index.

Parameters: index – [in] cuvsIvfPqIndex_t to allocate
Returns: cuvsError_t

cuvsError_t cuvsIvfPqIndexDestroy(cuvsIvfPqIndex_t index)#

De-allocate an IVF-PQ index.

Parameters: index – [in] cuvsIvfPqIndex_t to de-allocate

struct cuvsIvfPqIndex#: #include <ivf_pq.h>

Struct that holds the address of a cuvs::neighbors::ivf_pq::index and its active trained dtype.

Index build#

cuvsError_t cuvsIvfPqBuild( cuvsResources_t res, cuvsIvfPqIndexParams_t params, DLManagedTensor *dataset, cuvsIvfPqIndex_t index )#

Build an IVF-PQ index from a DLManagedTensor whose underlying DLDeviceType is kDLCUDA, kDLCUDAHost, kDLCUDAManaged, or kDLCPU. The acceptable underlying data types are:

kDLDataType.code == kDLFloat and kDLDataType.bits == 32
kDLDataType.code == kDLFloat and kDLDataType.bits == 16
kDLDataType.code == kDLInt and kDLDataType.bits == 8
kDLDataType.code == kDLUInt and kDLDataType.bits == 8
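The accepted type list can be checked up front before calling build. The sketch below is self-contained so it compiles without DLPack: `dtype_desc` is a hypothetical stand-in for the code and bits fields of DLDataType, and the DL_INT / DL_UINT / DL_FLOAT constants mirror the values of kDLInt, kDLUInt, and kDLFloat from dlpack.h. Real code should include the DLPack header and use its types directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the DLPack type codes kDLInt, kDLUInt and kDLFloat. */
enum { DL_INT = 0, DL_UINT = 1, DL_FLOAT = 2 };

/* Hypothetical stand-in for the `code` and `bits` fields of DLDataType. */
typedef struct {
  uint8_t code;
  uint8_t bits;
} dtype_desc;

/* True if the dtype is one of the four listed for IVF-PQ index build. */
static bool build_dtype_supported(const dtype_desc *dt)
{
  assert(dt != NULL);
  switch (dt->code) {
    case DL_FLOAT: return dt->bits == 32 || dt->bits == 16;
    case DL_INT:   /* fall through: int8 and uint8 are both accepted */
    case DL_UINT:  return dt->bits == 8;
    default:       return false;
  }
}
```

A check like this gives an early, readable failure for unsupported inputs such as 64-bit floats or 32-bit integers.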

#include <cuvs/core/c_api.h>
#include <cuvs/neighbors/ivf_pq.h>

// Create cuvsResources_t
cuvsResources_t res;
cuvsError_t res_create_status = cuvsResourcesCreate(&res);

// Assume a populated `DLManagedTensor` type here
DLManagedTensor dataset;

// Create default index params
cuvsIvfPqIndexParams_t index_params;
cuvsError_t params_create_status = cuvsIvfPqIndexParamsCreate(&index_params);

// Create IVF-PQ index
cuvsIvfPqIndex_t index;
cuvsError_t index_create_status = cuvsIvfPqIndexCreate(&index);

// Build the IVF-PQ Index
cuvsError_t build_status = cuvsIvfPqBuild(res, index_params, &dataset, index);

// de-allocate `index_params`, `index` and `res`
cuvsError_t params_destroy_status = cuvsIvfPqIndexParamsDestroy(index_params);
cuvsError_t index_destroy_status = cuvsIvfPqIndexDestroy(index);
cuvsError_t res_destroy_status = cuvsResourcesDestroy(res);

Parameters:

res – [in] cuvsResources_t opaque C handle
params – [in] cuvsIvfPqIndexParams_t used to build the IVF-PQ index
dataset – [in] DLManagedTensor* training dataset
index – [out] cuvsIvfPqIndex_t newly built IVF-PQ index

Returns:

cuvsError_t

Index search#

cuvsError_t cuvsIvfPqSearch( cuvsResources_t res, cuvsIvfPqSearchParams_t search_params, cuvsIvfPqIndex_t index, DLManagedTensor *queries, DLManagedTensor *neighbors, DLManagedTensor *distances )#

Search an IVF-PQ index with a DLManagedTensor whose underlying DLDeviceType is kDLCUDA, kDLCUDAHost, or kDLCUDAManaged. Note that the IVF-PQ index must have been built with the same data type as the queries, such that index.dtype.code == queries.dl_tensor.dtype.code. The input types are:

queries: kDLDataType.code == kDLFloat and kDLDataType.bits == 32 or kDLDataType.bits == 16
neighbors: kDLDataType.code == kDLUInt and kDLDataType.bits == 32
distances: kDLDataType.code == kDLFloat and kDLDataType.bits == 32

#include <cuvs/core/c_api.h>
#include <cuvs/neighbors/ivf_pq.h>

// Create cuvsResources_t
cuvsResources_t res;
cuvsError_t res_create_status = cuvsResourcesCreate(&res);

// Assume populated `DLManagedTensor` objects here
DLManagedTensor dataset;
DLManagedTensor queries;
DLManagedTensor neighbors;
DLManagedTensor distances;

// Assume `index` is a cuvsIvfPqIndex_t that was already created with
// `cuvsIvfPqIndexCreate` and built from `dataset` with `cuvsIvfPqBuild`

// Create default search params
cuvsIvfPqSearchParams_t search_params;
cuvsError_t params_create_status = cuvsIvfPqSearchParamsCreate(&search_params);

// Search the `index` built using `cuvsIvfPqBuild`
cuvsError_t search_status = cuvsIvfPqSearch(res, search_params, index, &queries, &neighbors,
                                            &distances);

// de-allocate `search_params` and `res`
cuvsError_t params_destroy_status = cuvsIvfPqSearchParamsDestroy(search_params);
cuvsError_t res_destroy_status = cuvsResourcesDestroy(res);

Parameters:

res – [in] cuvsResources_t opaque C handle
search_params – [in] cuvsIvfPqSearchParams_t used to search the IVF-PQ index
index – [in] cuvsIvfPqIndex returned by cuvsIvfPqBuild
queries – [in] DLManagedTensor* query dataset to search
neighbors – [out] DLManagedTensor* output k nearest neighbors for the queries
distances – [out] DLManagedTensor* output k distances for the queries

Index serialization#
