1 class Node:
2 pass
3
4 def kdtree(pointList, depth=0):
5 if not pointList:
6 return
7
8 # Select axis based on depth so that axis cycles through all valid values
9 k = len(pointList[0]) # Assumes all points have the same dimension
10 axis = depth % k
11
12 # Sort point list to select median
13 pointList.sort(cmp=lambda x, y: cmp(x[axis], y[axis]))
14 median = len(pointList)//2 # Choose median
15
16 # Create node and construct subtrees
17 node = Node()
18 node.location = pointList[median]
19 node.leftChild = kdtree(pointList[:median], depth+1)
20 node.rightChild = kdtree(pointList[median+1:], depth+1)
21 #self.root = node
22 return node
We start with a simple list of R3 tuples. This list is passed into the function "kdtree" as the parameter "pointList".
The function is generic enough to handle any length of tuple (as long as all tuples in the list are of the same length) and this length is determined on line 9 of the above snippet.
The list in our example is:
[(1, 2, 3), (1, 3, 4), (-3, 2, 8), (6, 8, 9), (-5, 2, 8), (3, 2, 9), (6, 4, 7), (8, 2, 1)]
We have eight (8) elements/tuples, each three (3) in length. So our kd-tree will be a 3d-tree. Or k=3.
On line 13, we sort the list based upon the "axis" element. Meaning, if we are at depth of 2, and k is 3, then the axis is depth mod k, or 2 mod 3 which is 2. Should depth be 4, then the axis would be 4 mod 3 = 1.
The sorting method can take an anonymous function (the lambda function implemented inline) which takes two (2) parameters (two tuples) and compares the two based upon the values of their x'th element, where x is the axis. This means that when axis is zero (0), the first element of the tuples will be used to determine order.
I've added an additional debug line to the above code to output the values of "pointList" at various stages of the programme's execution:
0: [(-5, 2, 8), (-3, 2, 8), (1, 2, 3), (1, 3, 4), (3, 2, 9), (6, 8, 9), (6, 4, 7), (8, 2, 1)] 1: [(-5, 2, 8), (-3, 2, 8), (1, 2, 3), (1, 3, 4)] 2: [(-5, 2, 8), (-3, 2, 8)] 3: [(-5, 2, 8)] 2: [(1, 3, 4)] 1: [(8, 2, 1), (6, 4, 7), (6, 8, 9)] 2: [(8, 2, 1)] 2: [(6, 8, 9)]
The left column indicates the rank of the tree, or depth of the recursive execution. The recursion first executes of the left branch of each node, and then again for the right. This is why there are two sets of depth two (2) and three (3).
All of the lists show the pointList after sorting has been done.
On the first line, the length is eight (8), so the median is 8 / 2 = 4. And the 4th element (the list is indexed starting from zero (0)) happens to be (3,2,9).
The list is subsected into the left half, from the start to the median (but not including the median) and that list is passed back into the function with the depth value incremented.
The sublist is now sorted on the 1 mod 3 element (one (1), but as we index from zero (0), it's the second element in the tuple) giving us the list on the second line ( [(-5, 2, 8), (-3, 2, 8), (1, 2, 3), (1, 3, 4)] ).