All neighbor implementations can be invoked via the main NearestNeighbors class, which exhibits a similar structure as the corresponding class from the scikit-learn package:

class bufferkdtree.neighbors.NearestNeighbors(n_neighbors=5, algorithm='buffer_kd_tree', float_type='float', tree_depth=None, leaf_size=30, splitting_type='cyclic', n_train_chunks=1, plat_dev_ids={0: [0]}, allowed_train_mem_percent_chunk=0.15, allowed_test_mem_percent=0.55, n_jobs=1, verbose=0, **kwargs)

The ‘NearestNeighbors’ provides access to all nearest neighbor implementations. It has simmilar parameters as the corresponding implementation of the scikit-learn package.

The main method is the “buffer_kd_tree”, which can be seen as mix between the “brute” and the “kd_tree” implementations.


n_neighbors : int (default 5)

Number of neighbors used

algorithm : {“brute”, “kd_tree”, “buffer_kd_tree”}, optional (default=”buffer_kd_tree”)

The algorithm that shall be used to compute the nearest neighbors. One of - ‘brute’: brute-force search - ‘kd_tree’: k-d tree based search - ‘buffer_kd_tree’: buffer k-d tree based search (with GPUs)

tree_depth : int or None, optional (default=None)

Passed to the ‘kd_tree’ and ‘buffer_kd_tree’ implementation. In case ‘tree_depth’ is specified, a tree of such a depth is built (‘tree_depth’ has priority over ‘leaf_size’).

leaf_size : int, optional (default=30)

Passed to the ‘kd_tree’ and ‘buffer_kd_tree’ implementation. In case ‘leaf_size’ is set, the corresponding tree depth is computed (is ignored in case tree_depth is not None).

splitting_type : {‘cyclic’}, optional (default=’cyclic’)

Passed to the ‘kd_tree’ and ‘buffer_kd_tree’ implementation. The splitting rule that shall be used to construct the kd tree. Currently, only “cyclic” is supported.

n_train_chunks : int, optional (default=1)

Passed to the ‘buffer_kd_tree’ implementation. The number of chunks the training patterns shall be processed in; only needed in case the training patterns do not fit on the GPU (in case n_train_chunks is too small, it is increased automatically).

plat_dev_ids : dict, optional (default={0:[0]})

Passed to the ‘brute’ and the ‘buffer_kd_tree’ implementation. The platforms and devices that shall be used. E.g., plat_dev_ids={0:[0,1]} makes use of platform 0 and the first two devices.

allowed_train_mem_percent_chunk : float, optional (default=0.15)

Passed to the ‘buffer_kd_tree’ implementation. The amount of memory (OpenCL) used for the training patterns (in percent).

allowed_test_mem_percent : float, optional (default=0.55)

Passed to the ‘buffer_kd_tree’ implementation. The amount of memory (OpenCL) used for the test/query patterns (in percent).

n_jobs : int, optional (default=1)

Passed to the ‘kd_tree’ implementation. The number of threads used for the querying phase.

verbose : int, optional (default=0)

The verbosity level (0=no output, 1=output)


The brute-force implementatation is only used for comparison in relatively low-dimensional spaces; the performance is suboptimal for higher dimensional feature spaces (but even superior over other matrix based implementations making use e.g., CUBLAS).

The performance of the GPU implementations depends on the corresponding architecture. An important ingredient is the support of automatic hardware caches.


>>> import numpy
>>> from bufferkdtree.neighbors import NearestNeighbors
>>> X = numpy.random.uniform(low=-1,high=1,size=(10000,10))
>>> nbrs = NearestNeighbors(n_neighbors=10, algorithm="buffer_kd_tree", tree_depth=9, plat_dev_ids={0:[0]})    
>>> dists, inds = nbrs.kneighbors(X)   

Fit the model to the given data.


X : array-like, shape (n_samples, n_features)

The set of training/reference points, where ‘n_samples’ is the number points and ‘n_features’ the number of features.


self : instance of NearestNeighbors

The object itself

kneighbors(X=None, n_neighbors=None, return_distance=True)

Finds the nearest neighbors for a given set of points.


X : array-like, shape (n_samples, n_features)

The set of query points. If not provided, the neighbors of each point in the training data are returned (in this case, the query point itself is not considered its own neighbor.

n_neighbors : int or None, optional (default=None)

The number of nearest neighbors that shall be returned for each query points. If ‘None’, then the default values provided to the constructor is used.

return_distance : bool, optional (default=True)

If False, then the distances associated with each query will not be returned. Otherwise, they will be returned.


dist : array

The array containing the distances

idx : array

The array containing the indices

Adapting Buffer K-D Trees

If you wish to adapt the buffer k-d tree implementation, then you might want have a look at the C and OpenCL code that is available in the bufferkdtree/src directory.

Developer C API

A documentation of the underlying C code can be found here.

A good starting point for diving into the details of the underlying implementation is the base.c file in bufferkdtree/src/neighbors/buffer_kdtree directory.