## Archive for **May 2010**

## Tree Drawing: Force-based Algorithm

I’ve written a couple times on different data visualization techniques for viewing hierarchical information. One method that I haven’t discussed is the force-based algorithm approach. After digging around a bit, it looks like this method was originally developed in Birbil’s and Fang’s 2003 publication [pdf], “An Electromagnetism-like Mechanism for Global Optimization”. In terms of tree drawing, the idea is fairly simple: treat the tree as a dynamical system composed of springs and point charges, apply the appropriate physical laws and over time, you will get a nicely laid out tree. My implementation in the clip below demonstrates this approach.

Let’s think about why we want a dynamical system. An idealized tree drawing is aesthetically pleasing, compact, nodes are uniformly distributed and edges between nodes do not cross. Under this proposed system, we can’t guarantee these criteria, but we can get “close enough” to produce something that is both nice on paper and sensible in practice. Treating the nodes as point charges will make the nodes repel from one another. The magnitude of the repulsion is determined by Column’s Law which states that the force applied to a point charge by another point charge follows an inverse square law. If all we had was Coulomb’s Law, then all the nodes will fly away from one another and finally come to rest, but the nodes would be too distantly spread across the screen. To alleviate this, we can treat the edges between nodes as springs. Following Hooke’s Law we can tether the nodes together to preserve the compactness criteria. The magnitude of this force is directly linear with respect to the distance between the nodes.

Let’s jump into some quick physics. To determine the net force applied on a given node, we need to calculate the spring force applied to the node, and then we need to calculate all the point charge forces applied to the node. The key to getting this right is making sure that we get the direction and magnitude of the forces correct. To make sure that we have the direction correct, let be the vector representing the distance between the i’th and j’th nodes ( representing the i’th node’s location). Let be the normalized vector representing the direction that the force that needs to be applied. For the spring force we are looking at where is the spring constant, is the length of the spring at rest and is the force between the i’th and j’th node. For the point charge we are interested in . is the charge of the i’th node and is the coulomb constant. (I default all constants to 1.0 in my implementation).

We’ve got the net force on each node figured out, so now it is necessary to figure out the velocity of the nodes as well as their location. To do this, we recall that by Newton’s Laws, a force is the product of a mass and acceleration . If we know an acceleration, we can derive (integrate mathematically) the velocity and location each iteration. Thus, if we know a node’s present velocity , then we know that . If the node has current location , then its next location will be . Once we’ve determined these values between time steps , we will need to remember to zero out the net force on each node in our implementation. In my implementation I chose a value of 0.1 for my time step) One important aspect to note for this algorithm, is the need to apply dampening to the velocity over time, otherwise the structure continue to move in space and we run into the possibility of running out of numbers on the machine resulting in an overflow error.

To go about implementing all of this you’ll find the time complexity of the algorithm should be on the order . The square order comes from apply Coulomb’s law. By Newton’s third law (equal and opposite forces), you can reduce this down to , but for significantly sized you are still going to be looking at . You could improve this by applying a binary space partitioning (in particular the R-tree and its variants) data structure to the tree and only use those nodes within a certain radius of the node when calculating the Coulomb forces. Testing my implementation (using Lists), I found things get a little sluggish for a real-time render around 100 nodes on my 2.4GHz laptop. Depending on your application (domain and implementation), you millage may vary. Following are some test cases I used in writing my solution.

Fan | Flat |
---|---|

Each node a number of nodes equivalent to its number of siblings minus one | Tree containing one level of nodes |

Complete Binary | Random |

Every node has exactly two children | Randomly generated tree of depth four |

I chose to implement my solution in C# using the Microsoft .net 3.5 Framework (as a side note, 4.0 was recently released on the 12th of April). I went with my custom mathematics library and the WinForms library of the Framework (in the future I would like to migrate this over to WPF) for displaying and end-user manipulation of the algorithm’s variables. I went with a vanilla UserControl that contains a setter for the tree data structure, the algorithm to apply to the structure and the settings for that algorithm. Upon setting the tree, a Timer instance kicks off (non-reentrant) to invoke the algorithm below every 50ms. To get the tree to display correctly on the control, the algorithm calculates the boundary of the nodes and then the control performs a linear map from the model space to the view space. The first implementation used the DoubleBuffered property of the UserControl, but I found that it was ineffective in reducing flickering so I implemented custom double buffering using the BufferedGraphicsContext class. It’s worth noting that most implementations track the kinetic energy of the system to determine when to terminate the algorithm. I chose not to do this, as I didn’t find value in adding in the additional calculation.

using System; using System.Collections.Generic; using System.Drawing; using Library.Mathematics; namespace ForceBasedTreeLayout { public class ForceBasedTreeLayout { private LayoutSettings settings; public ForceBasedTreeLayout(LayoutSettings settings) { this.settings = settings; } public RectangleF Evaluate(Node root) { List<Node> nodes = new List<Node>(); foreachNode(root, (x) => nodes.Add(x)); // Apply Hooke's law // F = -k x foreachNode(root, (parent) => { parent.Children.ForEach((child) => { Vector dist = parent.Location - child.Location; Vector restingDist = settings.MinimumSpringLength * dist.Normalize(); Vector force = -settings.SpringConstant * (dist - restingDist); parent.Acceleration += force; child.Acceleration -= force; }); }); // Apply Coulomb's Law // F = Q1 Q1 / d^2 for (int n = 0; n < nodes.Count; n++) { for (int m = n + 1; m < nodes.Count; m++) { Vector dist = nodes[n].Location - nodes[m].Location; Vector norm = dist.Normalize(); Vector force = new Vector(2, (i) => norm[i] / Math.Pow(dist.Norm() + 1.0, 2.0)); nodes[n].Acceleration += force; nodes[m].Acceleration -= force; } } Vector xExtrema = new Vector(2); xExtrema[0] = double.MaxValue; xExtrema[1] = double.MinValue; Vector yExtrema = new Vector(2); yExtrema[0] = double.MaxValue; yExtrema[1] = double.MinValue; // update the locations & velocity && figure out new bounding region foreach (Node node in nodes) { // p = a0t^2/2 + v0t + p0 node.Location = (settings.TimeStep * settings.TimeStep * 0.5) * node.Acceleration + (settings.TimeStep) * node.Velocity + node.Location; // v = at + v0 node.Velocity = (settings.TimeStep) * node.Acceleration + node.Velocity; node.Velocity = (settings.VelocityDampening) * node.Velocity; node.Acceleration = new Vector(2, (i) => 0.0); xExtrema[0] = Math.Min(xExtrema[0], node.Location[0]); xExtrema[1] = Math.Max(xExtrema[1], node.Location[0]); yExtrema[0] = Math.Min(yExtrema[0], node.Location[1]); yExtrema[1] = Math.Max(yExtrema[1], node.Location[1]); } RectangleF R = new RectangleF(); R.X = (float)xExtrema[0]; R.Y = (float)yExtrema[0]; R.Width = (float)(xExtrema[1] - xExtrema[0]); R.Height = (float)(yExtrema[1] - yExtrema[0]); R.X -= R.Width / 2; R.Y -= R.Height / 2; R.Width *= 2; R.Height *= 2; return R; } private void foreachNode(Node root, Action<Node> act) { Stack<Node> stack = new Stack<Node>(); stack.Push(root); while (stack.Count > 0) { Node node = stack.Pop(); act(node); node.Children.ForEach((x) => stack.Push(x)); } } } }

**Edit: 2010-10-21**

By popular demand, here is the vector class:

public class Vector { private double[] V; public int Dimension { get { return V.Length; } } public double this[int n] { get { if (n < 0 || n >= Dimension) throw new Exception(string.Format("{0} must be between 0 and {1}", n, Dimension)); return V[n]; } set { if (n < 0 || n >= Dimension) throw new Exception(string.Format("{0} must be between 0 and {1}", n, Dimension)); V[n] = value; } } public Vector() : this(0) { } public Vector(int n) { V = new double[n]; } public Vector(int n, VectorInitializer initializer) : this(n) { for (int i = 0; i < Dimension; i++) V[i] = initializer(i); } public double dot(Vector y) { if (Dimension != y.Dimension) throw new Exception(); double d = 0.0; for (int n = 0; n < Dimension; n++) d += this[n] * y[n]; return d; } public override bool Equals(object obj) { Vector x = obj as Vector; if (x == null || x.Dimension != Dimension) return false; for (int n = 0; n < Dimension; n++) if (this[n] != x[n]) return false; return true; } static public Vector operator +(Vector x, Vector y) { if (x.Dimension != y.Dimension) throw new Exception(); Vector z = new Vector(x.Dimension); for (int n = 0; n < z.Dimension; n++) z[n] = x[n] + y[n]; return z; } static public Vector operator -(Vector x, Vector y) { if (x.Dimension != y.Dimension) throw new Exception(); Vector z = new Vector(x.Dimension); for (int n = 0; n < z.Dimension; n++) z[n] = x[n] - y[n]; return z; } static public double operator *(Vector x, Vector y) { return x.dot(y); } static public Vector operator *(double c, Vector x) { Vector y = new Vector(x.Dimension); for (int n = 0; n < y.Dimension; n++) y[n] = c*x[n]; return y; } }