## Tree Drawing: Force-based Algorithm

I’ve written a couple times on different data visualization techniques for viewing hierarchical information. One method that I haven’t discussed is the force-based algorithm approach. After digging around a bit, it looks like this method was originally developed in Birbil’s and Fang’s 2003 publication [pdf], “An Electromagnetism-like Mechanism for Global Optimization”. In terms of tree drawing, the idea is fairly simple: treat the tree as a dynamical system composed of springs and point charges, apply the appropriate physical laws and over time, you will get a nicely laid out tree. My implementation in the clip below demonstrates this approach.

Let’s think about why we want a dynamical system. An idealized tree drawing is aesthetically pleasing, compact, nodes are uniformly distributed and edges between nodes do not cross. Under this proposed system, we can’t guarantee these criteria, but we can get “close enough” to produce something that is both nice on paper and sensible in practice. Treating the nodes as point charges will make the nodes repel from one another. The magnitude of the repulsion is determined by Column’s Law which states that the force applied to a point charge by another point charge follows an inverse square law. If all we had was Coulomb’s Law, then all the nodes will fly away from one another and finally come to rest, but the nodes would be too distantly spread across the screen. To alleviate this, we can treat the edges between nodes as springs. Following Hooke’s Law we can tether the nodes together to preserve the compactness criteria. The magnitude of this force is directly linear with respect to the distance between the nodes.

Let’s jump into some quick physics. To determine the net force applied on a given node, we need to calculate the spring force applied to the node, and then we need to calculate all the point charge forces applied to the node. The key to getting this right is making sure that we get the direction and magnitude of the forces correct. To make sure that we have the direction correct, let be the vector representing the distance between the i’th and j’th nodes ( representing the i’th node’s location). Let be the normalized vector representing the direction that the force that needs to be applied. For the spring force we are looking at where is the spring constant, is the length of the spring at rest and is the force between the i’th and j’th node. For the point charge we are interested in . is the charge of the i’th node and is the coulomb constant. (I default all constants to 1.0 in my implementation).

We’ve got the net force on each node figured out, so now it is necessary to figure out the velocity of the nodes as well as their location. To do this, we recall that by Newton’s Laws, a force is the product of a mass and acceleration . If we know an acceleration, we can derive (integrate mathematically) the velocity and location each iteration. Thus, if we know a node’s present velocity , then we know that . If the node has current location , then its next location will be . Once we’ve determined these values between time steps , we will need to remember to zero out the net force on each node in our implementation. In my implementation I chose a value of 0.1 for my time step) One important aspect to note for this algorithm, is the need to apply dampening to the velocity over time, otherwise the structure continue to move in space and we run into the possibility of running out of numbers on the machine resulting in an overflow error.

To go about implementing all of this you’ll find the time complexity of the algorithm should be on the order . The square order comes from apply Coulomb’s law. By Newton’s third law (equal and opposite forces), you can reduce this down to , but for significantly sized you are still going to be looking at . You could improve this by applying a binary space partitioning (in particular the R-tree and its variants) data structure to the tree and only use those nodes within a certain radius of the node when calculating the Coulomb forces. Testing my implementation (using Lists), I found things get a little sluggish for a real-time render around 100 nodes on my 2.4GHz laptop. Depending on your application (domain and implementation), you millage may vary. Following are some test cases I used in writing my solution.

Fan | Flat |
---|---|

Each node a number of nodes equivalent to its number of siblings minus one | Tree containing one level of nodes |

Complete Binary | Random |

Every node has exactly two children | Randomly generated tree of depth four |

I chose to implement my solution in C# using the Microsoft .net 3.5 Framework (as a side note, 4.0 was recently released on the 12th of April). I went with my custom mathematics library and the WinForms library of the Framework (in the future I would like to migrate this over to WPF) for displaying and end-user manipulation of the algorithm’s variables. I went with a vanilla UserControl that contains a setter for the tree data structure, the algorithm to apply to the structure and the settings for that algorithm. Upon setting the tree, a Timer instance kicks off (non-reentrant) to invoke the algorithm below every 50ms. To get the tree to display correctly on the control, the algorithm calculates the boundary of the nodes and then the control performs a linear map from the model space to the view space. The first implementation used the DoubleBuffered property of the UserControl, but I found that it was ineffective in reducing flickering so I implemented custom double buffering using the BufferedGraphicsContext class. It’s worth noting that most implementations track the kinetic energy of the system to determine when to terminate the algorithm. I chose not to do this, as I didn’t find value in adding in the additional calculation.

using System; using System.Collections.Generic; using System.Drawing; using Library.Mathematics; namespace ForceBasedTreeLayout { public class ForceBasedTreeLayout { private LayoutSettings settings; public ForceBasedTreeLayout(LayoutSettings settings) { this.settings = settings; } public RectangleF Evaluate(Node root) { List<Node> nodes = new List<Node>(); foreachNode(root, (x) => nodes.Add(x)); // Apply Hooke's law // F = -k x foreachNode(root, (parent) => { parent.Children.ForEach((child) => { Vector dist = parent.Location - child.Location; Vector restingDist = settings.MinimumSpringLength * dist.Normalize(); Vector force = -settings.SpringConstant * (dist - restingDist); parent.Acceleration += force; child.Acceleration -= force; }); }); // Apply Coulomb's Law // F = Q1 Q1 / d^2 for (int n = 0; n < nodes.Count; n++) { for (int m = n + 1; m < nodes.Count; m++) { Vector dist = nodes[n].Location - nodes[m].Location; Vector norm = dist.Normalize(); Vector force = new Vector(2, (i) => norm[i] / Math.Pow(dist.Norm() + 1.0, 2.0)); nodes[n].Acceleration += force; nodes[m].Acceleration -= force; } } Vector xExtrema = new Vector(2); xExtrema[0] = double.MaxValue; xExtrema[1] = double.MinValue; Vector yExtrema = new Vector(2); yExtrema[0] = double.MaxValue; yExtrema[1] = double.MinValue; // update the locations & velocity && figure out new bounding region foreach (Node node in nodes) { // p = a0t^2/2 + v0t + p0 node.Location = (settings.TimeStep * settings.TimeStep * 0.5) * node.Acceleration + (settings.TimeStep) * node.Velocity + node.Location; // v = at + v0 node.Velocity = (settings.TimeStep) * node.Acceleration + node.Velocity; node.Velocity = (settings.VelocityDampening) * node.Velocity; node.Acceleration = new Vector(2, (i) => 0.0); xExtrema[0] = Math.Min(xExtrema[0], node.Location[0]); xExtrema[1] = Math.Max(xExtrema[1], node.Location[0]); yExtrema[0] = Math.Min(yExtrema[0], node.Location[1]); yExtrema[1] = Math.Max(yExtrema[1], node.Location[1]); } RectangleF R = new RectangleF(); R.X = (float)xExtrema[0]; R.Y = (float)yExtrema[0]; R.Width = (float)(xExtrema[1] - xExtrema[0]); R.Height = (float)(yExtrema[1] - yExtrema[0]); R.X -= R.Width / 2; R.Y -= R.Height / 2; R.Width *= 2; R.Height *= 2; return R; } private void foreachNode(Node root, Action<Node> act) { Stack<Node> stack = new Stack<Node>(); stack.Push(root); while (stack.Count > 0) { Node node = stack.Pop(); act(node); node.Children.ForEach((x) => stack.Push(x)); } } } }

**Edit: 2010-10-21**

By popular demand, here is the vector class:

public class Vector { private double[] V; public int Dimension { get { return V.Length; } } public double this[int n] { get { if (n < 0 || n >= Dimension) throw new Exception(string.Format("{0} must be between 0 and {1}", n, Dimension)); return V[n]; } set { if (n < 0 || n >= Dimension) throw new Exception(string.Format("{0} must be between 0 and {1}", n, Dimension)); V[n] = value; } } public Vector() : this(0) { } public Vector(int n) { V = new double[n]; } public Vector(int n, VectorInitializer initializer) : this(n) { for (int i = 0; i < Dimension; i++) V[i] = initializer(i); } public double dot(Vector y) { if (Dimension != y.Dimension) throw new Exception(); double d = 0.0; for (int n = 0; n < Dimension; n++) d += this[n] * y[n]; return d; } public override bool Equals(object obj) { Vector x = obj as Vector; if (x == null || x.Dimension != Dimension) return false; for (int n = 0; n < Dimension; n++) if (this[n] != x[n]) return false; return true; } static public Vector operator +(Vector x, Vector y) { if (x.Dimension != y.Dimension) throw new Exception(); Vector z = new Vector(x.Dimension); for (int n = 0; n < z.Dimension; n++) z[n] = x[n] + y[n]; return z; } static public Vector operator -(Vector x, Vector y) { if (x.Dimension != y.Dimension) throw new Exception(); Vector z = new Vector(x.Dimension); for (int n = 0; n < z.Dimension; n++) z[n] = x[n] - y[n]; return z; } static public double operator *(Vector x, Vector y) { return x.dot(y); } static public Vector operator *(double c, Vector x) { Vector y = new Vector(x.Dimension); for (int n = 0; n < y.Dimension; n++) y[n] = c*x[n]; return y; } }

very cool article. Can you post a complete solution or a little bit more code? Thanks.

koseto2010-06-03 at 12:14 pm

Koseto, thanks for the interest. I usually only post what I consider to be the most relevant aspects of getting an implementation working. Is there a particular aspect that you’d like to see?

lewellen2010-06-05 at 1:13 pm

Great stuff! I look forward to trying to duplicate your efforts. This is fun stuff. I did have a question about your Vector class. I’m trying to use your code as a guide in a Silverlight project I am working on. I have a Vector class I’ve used in the past, but it does not define all the necessary methods yours does. If you don’t mind, could you point me to your Vector class. Specifically the constructor and the Norm() function. (if you have time)

Thank you very much for this great work!

Adam2010-07-07 at 7:17 am

Hi Adam, here’s the code for the Vector class that you’re after.

lewellen2010-07-07 at 10:44 pm

Is there any chance you could help me port your code to actionscript?

Kenny2010-07-26 at 5:45 pm

For the snippets of the vector class code that you provided to compile. I also need definitions of Vector.dot() function, and the operator overload for multiplying a Double with a Vector.

Giles Bathgate2010-08-11 at 3:24 am

lewellen2010-08-11 at 10:59 pm

I am still struggling to make your example work or even compile, could you please provide the source code in full.

Giles Bathgate2010-08-25 at 2:59 am

Hello,

Thanks for article. I have few questions.

Let’s say that vector is 2 dimensional vector.

1. Little question about dist.

Vector dist = nodes[n].Location – nodes[m].Location

let’s say we have node[n]->(x1,y1) and node[m]->(x2,y2) and as a result should be: (x1-x2, y1-y2)? Or dist is length between nodes? (formula:

sqrt((x1-x2)^2 + (y1-y2)^2)). As I understand dist is length between nodes. Am I right?

2. If we are trying to find out norm vector the Formula is:

norm_vector = 1/|magnitude| * [x,y]. There [x,y] is vector end coordinate. As a result we get vector (x0,y0). This norm vector length should be 1. If I clearly understand your solution, the Normalize() function is main formula as I wrote. So, Norm() function returns vector length?

I understand that norm_vector says the direction and as a result we have to get minus or plus length in Hook’s law (x multiplicand) F = -kx.

3. Finally, I see that Hook’s force is a vector, but this force dimension = 1?

P.S. sorry for my english. I hope you understand my questions. Besides, it would be very useful to see full Vector Class for more clear.

Thanks.

Don2010-10-20 at 4:40 am

Don-

1) Given two, two dimensional vectors , the difference between the two will be . The magnitude of the resultant vector will be the distance between the two points.

2) Correct, norm is synonymous with length in this situation.

3) I first calculate the vector that represents the distance between the two nodes (dist), then calculate restingDist as a vector parallel to dist (which is why I call dist.Normalize()). The difference between dist and restingDist gives the force to apply.

Good luck.

lewellen2010-10-21 at 6:56 pm

[...] http://antimatroid.wordpress.com/2010/05/01/tree-drawing-force-based-algorithm/ [...]

Force-based Algorithms | iamharshsinha.com2010-12-23 at 10:53 am

Hello.

Thanks for the article it was a basis for us in out project. i would like to ask if its feasible to use it in graphing project networks i assume you are familiar with the skeletal diagram. we are implementing Force base Algorithm in our thesis, i find it difficult to code force base algorithm in our project. i hope u can answer my question. your responses is a great help!

stevenking2011-02-06 at 11:51 am

For project networks, you may want to try a suite of different graphing algorithms to best determine the most suitable one. Force-based algorithms will work, but if you have deterministic, visual requirements (e.g., always display from left to right), then you may want to try an alternative.

lewellen2011-02-06 at 1:25 pm

thank you for replying i appreciate it very much. i had a another question how the system notice the edges crossing and how it process to eliminate it. thanks. sorry but i just can’t figure it out by myself! thank you very much!

stevenking2011-02-07 at 9:30 am

The algorithm is looking for a configuration that minimizes the energy in the system- which may or may not be aesthetically appealing. One way you can trick the the algorithm is by increasing the spring constant forcing the nodes closer together, which can minimize the chance of lines crossing.

lewellen2011-02-08 at 11:21 pm

Hi,

Thanks for taking the time to upload your program for users. I presume you are using your custom Node class? Would you mind uploading it?.

P.s. this isn’t me being lazy, I am very new to programming and find most programs difficult to decipher.

Cheers

Eaxrl12011-02-16 at 10:10 am

Also, have you missed out the definition for the VectorInitializer class?

I am still struggling to figure out how you have constructed your node class?! :-/

Eaxrl12011-03-01 at 12:03 pm

Eaxlr1- the Node is a basic class that contains a list of other Nodes giving way to a Rose Tree.

The VectorInitializer is a delegate method that is invoked for each index of the vector. It is equivalent to the following code:

I won't be uploading the code as it's one of those things you'll be able to deduce yourself. Good luck!

lewellen2011-03-17 at 7:45 pm