The quadtree data structure is a tree structure that retains data in an efficient way and optimizes lookups for values that are stored according to a specific configuration. This configuration is often spacial and is well suited to GPS coordinate systems because the individual x-y coordinates are placed in the tree based on their location. It is also used frequently in two-dimensional collision detection, because the constant lookups are only performed in specific portions of the quadtree to reduce lookup time.

The quadtree's strength in storing and retrieving data lies in its ability to subdivide itself into quadrants continually, depending on a threshold of coordinates contained within a quadrant. For example, if one section of a map has many coordinate points while another has few, the quadtree will optimize for storing the data such that the lookup on a loosely populated quadrant will not suffer from the traversal and lookup of the other branches of the tree structure. This allows fast lookup and offers 0(n log n) time complexity, greatly optimizing the process for real-time updating of dynamic data.

My team used a quadtree to store all of the GPS coordinate points for our app, Ripple, because the data was updated every one minute and needed constant accessing. For this reason we decided to store the data in memory, and clear it out every few minutes if an update was not received for a given coordinate/user id pair. The coordinates would also need to be accessed and compared against other coordinate points to find the distance between them, as our app filtered certain actions on a given coordinate if it was “outside” of the boundaries of a circle.

For the fast lookups and efficient storage, the most important operation our quadtree would need to perform was subdivision of itself, creating new quadtree nodes each composed of 1/4th the area of the parent quadtree node. It would need to do this when certain criteria was met, such as exceeding the maximum number of coordinate points for a given quadrant. This check was performed whenever a coordinate point was inserted into the quadtree, and the code to subdivide involved calculation the total coverage of each quadrant:

Quadtree.prototype.subDivide = function() {  
  var root = this;
  var depth = this.depth + 1;
  var width = this.boundaries.width / 2;
  var height = this.boundaries.height / 2;
  var x = this.boundaries.x;
  var y = this.boundaries.y;

  // top left quadrant
  this.quadrants[0] = new Quadtree({
    x: x,
    y: y + height,
    width: width,
    height: height
  }, this.maxChildren, root, depth);

with the quadrant setting process repeated three more times for each quadrant. Additionally, we coded an unfolding function that will collapse the quadrants into the parent node if the coordinates are too few to justify the subdivisions, and redistribute the coordinates within the parent node:

Quadtree.prototype.unfold = function(quad) {

  // check if root 
  if (quad.children.length && !quad.quadrants.length) {
    quad.root.unfold(quad.root);
  }

  var count = 0;

  quad.quadrants.forEach(function(quadrant) {
    count += quadrant.children.length;
  });

  if (count < quad.maxChildren / 2) {
    var nodes = quad.quadrants.reduce(function(acc, quadrant) {
      if (quadrant.children.length) {
        acc = acc.concat(quadrant.children);
      }
      return acc;
    }, []);

    quad.quadrants = [];

    nodes.forEach(function(node) {
      quad.put(node);
    });
  }
 };

Our app, Ripple, also allowed users to broadcast images to nearby users, depending on the GPS coordinates. These images would not be sent to users outside of a specific vicinity, which required that the system lookup a coordinate to find nearby nodes without needing check distant node. To increase lookup times and eliminate unnecessary comparisons, our quadtree only traverses through quadrants within the radius of the broadcast event, and performs this check/recurses only when it is required:

Quadtree.prototype.get = function(item, callback) {  
  if (this.quadrants.length) {

    if (this.checkRange(item)) {
      var index = this.findIndex(item);
      return this.quadrants[index].get(item);
    }

    else if (callback) {
      callback(this.quadrants[index].get(item));
    }

    else {
      return this;
    }
  }

  else if (!this.quadrants.length && this.children.length) {

    if (callback) {
      callback(this);
    }

    else {
      return this;
    }
  }

  else {
    console.log('returning root');
    return this.root || this;
  }
};

The checkDistance function will simply add the supplied vector length to the coordinate in 4 directions to find whether the radius is “in-bounds” or “out-of-bounds” of the quadrant before recursing further. If the distance is not “in-bounds” of the subquadrant, our function will not recurse further, but will simply traverse through all the quadrants and subquadrants of the node on top of the call stack, and perform the check on all of those coordinates.

Our distance checking function used the Haversine formula to calculate the distance between any two points, accounting for the great-circle arc of the earth's surface:

calculateDist: function(item1, nodes) {

// var R is the mean radius of the earth in kilometers
    var R = 6371;
    var lat1 = +item1.x;
    var lon1 = +item1.y;
    var lat2, lon2, dlat, dlon;
    var result = [];

    nodes.forEach(function(item2) {
      lat2 = +item2.x;
      lon2 = +item2.y;
      dLat = (lat2 - lat1).toRad();
      dLon = (lon2 - lon1).toRad();
      var a = Math.sin(dLat/2) * Math.sin(dLat/2) +
              Math.cos(lat1.toRad()) * Math.cos(lat2.toRad()) *
              Math.sin(dLon/2) * Math.sin(dLon/2);
      var c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a));
      var d = R * c;
      d = d * 0.621371;

      if (d < item1.radius) {
        result.push({userId: item2.userId, y: item2.y, x: item2.x});
      }
    });
    return result;
  }

Using this code, we can only find the coordinates that match our distance, and we will only look up the nodes that we absolutely have to. If we only include the amount of users currently using the app, the lookups are not terrible. However, once we begin testing the quadtree with one million, two million or 10 million users, we begin to see the value of the optimization, and can rest more easily knowing that our system will be able to handle millions of users demanding instant feedback and response from Ripple.