The author of this code is Matthew Kennel. According to the publication where it was introduced (back in 2004!) (https://arxiv.org/pdf/physics/0408067.pdf) it is provided under terms of the Academic Free License. I have done some refactoring of it before in a private fork.
The Scipy library has a useful kd-tree which is described briefly in their Nature Methods paper (see Fig. 3 in https://www.nature.com/articles/s41592-019-0686-2). The latest version is actually written in C++:
The benchmark is to construct a tree of n = 10000 uniformly distributed points, and query the nearest neighbors for r = 1000 random vectors in m = 3, 8, and 16 dimensions. The exact benchmark setup can be found here: https://github.com/scipy/scipy/blob/10c8c8de491a607378d6dca7602b0a3ed440a4b4/benchmarks/benchmarks/spatial.py#L110-L139
I have repeated this test with the kdtree2
code from Matthew Kennel:
program kdtree_benchmark
use kdtree2_module
implicit none
type(kdtree2), pointer :: tree
real(kdkind), dimension(:,:), allocatable :: data, queries
type(kdtree2_result) :: results(1)
real :: t0, t1, t2
integer, parameter :: m(3) = [3, 8, 16]
integer, parameter :: n = 10000, r = 1000
integer :: idxs(r), idx
integer :: i
do i = 1, 3
write(*,*) "Dimension = ", m(i)
allocate(data(m(i),n))
call random_number(data)
allocate(queries(m(i),r))
call random_number(queries)
! Populate tree
call cpu_time(t0)
tree => kdtree2_create(data,sort=.true.,rearrange=.true.)
call cpu_time(t1)
! Query random vectors
do idx = 1, r
call kdtree2_n_nearest(tp=tree,qv=queries(:,idx),nn=1,results=results)
idxs(idx) = results(1)%idx
end do
call cpu_time(t2)
write(*,*) "Tree build (s): ", t1 - t0
write(*,*) "Tree query (s): ", t2 - t1
call kdtree2_destroy(tree)
deallocate(data)
deallocate(queries)
end do
end program
Running the program on a >5 year old ThinkPad T530 with an Intel® Core™ i5-3320M CPU:
$ gfortran -O3 src/kdtree2_precision_module.f90 src/kdtree2_priority_queue_module.f90 src/kdtree2_module.f90 tests/time.f90 -o time
$ ./time
Dimension = 3
Tree build (s): 3.41600017E-03
Tree query (s): 1.05800014E-03
Dimension = 8
Tree build (s): 4.14900016E-03
Tree query (s): 1.39079997E-02
Dimension = 16
Tree build (s): 5.63300028E-03
Tree query (s): 0.218334988
The 16 year old Fortran code delivers similar performance to the latest Scipy version!
Had Bob from the blog post known about it, he could have told those Python youngsters…