# Antimatroid, The

thoughts on computer science, electronics, mathematics

## Tree Drawing: Force-based Algorithm

I’ve written a couple times on different data visualization techniques for viewing hierarchical information. One method that I haven’t discussed is the force-based algorithm approach. After digging around a bit, it looks like this method was originally developed in Birbil’s and Fang’s 2003 publication [pdf], “An Electromagnetism-like Mechanism for Global Optimization”. In terms of tree drawing, the idea is fairly simple: treat the tree as a dynamical system composed of springs and point charges, apply the appropriate physical laws and over time, you will get a nicely laid out tree. My implementation in the clip below demonstrates this approach.

Let’s think about why we want a dynamical system. An idealized tree drawing is aesthetically pleasing, compact, nodes are uniformly distributed and edges between nodes do not cross. Under this proposed system, we can’t guarantee these criteria, but we can get “close enough” to produce something that is both nice on paper and sensible in practice. Treating the nodes as point charges will make the nodes repel from one another. The magnitude of the repulsion is determined by Column’s Law which states that the force applied to a point charge by another point charge follows an inverse square law. If all we had was Coulomb’s Law, then all the nodes will fly away from one another and finally come to rest, but the nodes would be too distantly spread across the screen. To alleviate this, we can treat the edges between nodes as springs. Following Hooke’s Law we can tether the nodes together to preserve the compactness criteria. The magnitude of this force is directly linear with respect to the distance between the nodes.

Let’s jump into some quick physics. To determine the net force applied on a given node, we need to calculate the spring force applied to the node, and then we need to calculate all the point charge forces applied to the node. The key to getting this right is making sure that we get the direction and magnitude of the forces correct. To make sure that we have the direction correct, let $\bold{d}_{i,j} = \bold{x}_{i} - \bold{x}_{j}$ be the vector representing the distance between the i’th and j’th nodes ($\bold{x}_{i}$ representing the i’th node’s location). Let $\hat{\bold{d}}_{i,j} = \frac{\bold{d}_{i,j}}{\lVert \bold{d}_{i,j} \rVert}$ be the normalized vector representing the direction that the force that needs to be applied. For the spring force we are looking at $\bold{F}_{i, j} = -k (\lVert \bold{d}_{i,j} \rVert - r) \hat{\bold{d}}_{i,j}$ where $k$ is the spring constant, $r$ is the length of the spring at rest and $\bold{F}_{i,j}$ is the force between the i’th and j’th node. For the point charge we are interested in $\bold{F}_{i,j} = k \frac{ Q_{i} Q_{j} }{ \lVert \bold{d}_{i,j} \rVert ^{2} } \hat{\bold{d}}_{i,j}$. $Q_{i}$ is the charge of the i’th node and $k$ is the coulomb constant. (I default all constants to 1.0 in my implementation).

We’ve got the net force on each node figured out, so now it is necessary to figure out the velocity of the nodes as well as their location. To do this, we recall that by Newton’s Laws, a force is the product of a mass and acceleration $F = ma$. If we know an acceleration, we can derive (integrate mathematically) the velocity and location each iteration. Thus, if we know a node’s present velocity $v_{n}$, then we know that $v_{n+1} = a h + v_{n}$. If the node has current location $p_{n}$, then its next location will be $p_{n+1} = \frac{h^{2}}{2} a + h v_{n} + p_n$. Once we’ve determined these values between time steps $h$, we will need to remember to zero out the net force on each node in our implementation. In my implementation I chose a value of 0.1 for my time step) One important aspect to note for this algorithm, is the need to apply dampening to the velocity over time, otherwise the structure continue to move in space and we run into the possibility of running out of numbers on the machine resulting in an overflow error.

To go about implementing all of this you’ll find the time complexity of the algorithm should be on the order $O(\lVert{V}\rVert^2)$. The square order comes from apply Coulomb’s law. By Newton’s third law (equal and opposite forces), you can reduce this down to $\frac{ \lVert V \lVert^2 - \lVert V \lVert }{2}$, but for significantly sized $V$ you are still going to be looking at $O(\lVert{V}\rVert^{2})$. You could improve this by applying a binary space partitioning (in particular the R-tree and its variants) data structure to the tree and only use those nodes within a certain radius of the node when calculating the Coulomb forces. Testing my implementation (using Lists), I found things get a little sluggish for a real-time render around 100 nodes on my 2.4GHz laptop. Depending on your application (domain and implementation), you millage may vary. Following are some test cases I used in writing my solution.

Fan Flat
Each node a number of nodes equivalent to its number of siblings minus one Tree containing one level of nodes
Complete Binary Random
Every node has exactly two children Randomly generated tree of depth four

I chose to implement my solution in C# using the Microsoft .net 3.5 Framework (as a side note, 4.0 was recently released on the 12th of April). I went with my custom mathematics library and the WinForms library of the Framework (in the future I would like to migrate this over to WPF) for displaying and end-user manipulation of the algorithm’s variables. I went with a vanilla UserControl that contains a setter for the tree data structure, the algorithm to apply to the structure and the settings for that algorithm. Upon setting the tree, a Timer instance kicks off (non-reentrant) to invoke the algorithm below every 50ms. To get the tree to display correctly on the control, the algorithm calculates the boundary of the nodes and then the control performs a linear map from the model space to the view space. The first implementation used the DoubleBuffered property of the UserControl, but I found that it was ineffective in reducing flickering so I implemented custom double buffering using the BufferedGraphicsContext class. It’s worth noting that most implementations track the kinetic energy of the system to determine when to terminate the algorithm. I chose not to do this, as I didn’t find value in adding in the additional calculation.

using System;
using System.Collections.Generic;
using System.Drawing;
using Library.Mathematics;

namespace ForceBasedTreeLayout {
public class ForceBasedTreeLayout {
private LayoutSettings settings;

public ForceBasedTreeLayout(LayoutSettings settings) {
this.settings = settings;
}

public RectangleF Evaluate(Node root) {
List<Node> nodes = new List<Node>();

// Apply Hooke's law
// F = -k x
foreachNode(root, (parent) => {
parent.Children.ForEach((child) => {
Vector dist = parent.Location - child.Location;
Vector restingDist = settings.MinimumSpringLength * dist.Normalize();
Vector force = -settings.SpringConstant * (dist - restingDist);

parent.Acceleration += force;
child.Acceleration -= force;
});
});

// Apply Coulomb's Law
// F = Q1 Q1 / d^2
for (int n = 0; n < nodes.Count; n++) {
for (int m = n + 1; m < nodes.Count; m++) {
Vector dist = nodes[n].Location - nodes[m].Location;
Vector norm = dist.Normalize();
Vector force = new Vector(2, (i) => norm[i] / Math.Pow(dist.Norm() + 1.0, 2.0));

nodes[n].Acceleration += force;
nodes[m].Acceleration -= force;
}
}

Vector xExtrema = new Vector(2);
xExtrema[0] = double.MaxValue;
xExtrema[1] = double.MinValue;

Vector yExtrema = new Vector(2);
yExtrema[0] = double.MaxValue;
yExtrema[1] = double.MinValue;

// update the locations & velocity && figure out new bounding region
foreach (Node node in nodes) {
// p = a0t^2/2 + v0t + p0
node.Location = (settings.TimeStep * settings.TimeStep * 0.5) * node.Acceleration + (settings.TimeStep) * node.Velocity + node.Location;

// v = at + v0
node.Velocity = (settings.TimeStep) * node.Acceleration + node.Velocity;
node.Velocity = (settings.VelocityDampening) * node.Velocity;

node.Acceleration = new Vector(2, (i) => 0.0);

xExtrema[0] = Math.Min(xExtrema[0], node.Location[0]);
xExtrema[1] = Math.Max(xExtrema[1], node.Location[0]);

yExtrema[0] = Math.Min(yExtrema[0], node.Location[1]);
yExtrema[1] = Math.Max(yExtrema[1], node.Location[1]);
}

RectangleF R = new RectangleF();
R.X = (float)xExtrema[0];
R.Y = (float)yExtrema[0];
R.Width = (float)(xExtrema[1] - xExtrema[0]);
R.Height = (float)(yExtrema[1] - yExtrema[0]);

R.X -= R.Width / 2;
R.Y -= R.Height / 2;
R.Width *= 2;
R.Height *= 2;

return R;
}

private void foreachNode(Node root, Action<Node> act) {
Stack<Node> stack = new Stack<Node>();
stack.Push(root);
while (stack.Count > 0) {
Node node = stack.Pop();
act(node);
node.Children.ForEach((x) => stack.Push(x));
}
}
}
}


Edit: 2010-10-21

By popular demand, here is the vector class:

public class Vector {
private double[] V;

public int Dimension { get { return V.Length; } }

public double this[int n] {
get {
if (n < 0 || n >= Dimension)
throw new Exception(string.Format("{0} must be between 0 and {1}", n, Dimension));
return V[n];
}
set {
if (n < 0 || n >= Dimension)
throw new Exception(string.Format("{0} must be between 0 and {1}", n, Dimension));
V[n] = value;
}
}

public Vector() : this(0) { }

public Vector(int n) { V = new double[n]; }

public Vector(int n, VectorInitializer initializer) : this(n) {
for (int i = 0; i < Dimension; i++)
V[i] = initializer(i);
}

public double dot(Vector y) {
if (Dimension != y.Dimension)
throw new Exception();
double d = 0.0;
for (int n = 0; n < Dimension; n++)
d += this[n] * y[n];
return d;
}

public override bool Equals(object obj) {
Vector x = obj as Vector;
if (x == null || x.Dimension != Dimension)
return false;

for (int n = 0; n < Dimension; n++)
if (this[n] != x[n])
return false;
return true;
}

static public Vector operator +(Vector x, Vector y) {
if (x.Dimension != y.Dimension)
throw new Exception();

Vector z = new Vector(x.Dimension);
for (int n = 0; n < z.Dimension; n++)
z[n] = x[n] + y[n];
return z;
}

static public Vector operator -(Vector x, Vector y) {
if (x.Dimension != y.Dimension)
throw new Exception();

Vector z = new Vector(x.Dimension);
for (int n = 0; n < z.Dimension; n++)
z[n] = x[n] - y[n];
return z;
}

static public double operator *(Vector x, Vector y) {
return x.dot(y);
}

static public Vector operator *(double c, Vector x) {
Vector y = new Vector(x.Dimension);
for (int n = 0; n < y.Dimension; n++)
y[n] = c*x[n];
return y;
}
}


Written by lewellen

2010-05-01 at 8:00 am

## Tree Drawing: The radial tree

Although not as interesting as a sunburst diagram, the radial tree view can hold its color against a number of more primitive information visualizations. A radial tree view places the root at the center of the screen then fans out each child node. Each child node then fans out its children within a restricted span and continues on until each leaf is reached. The strengths of the technique allow for any easy to digest depiction of the structure behind the data in a compact space. A common application is visualizing computer networks. It is worthwhile to examine the algorithm behind the technique because it is an exercise in identifying simplicity.

While in college, I would have approached this problem by trying to identify the location of nodes in terms of $(r, \theta)$ after all, I want a radial tree view- makes sense to sprint out the gate with a polar system right? While possible, this is a bad path to head down, as you end up drowning in a sea of extraneous details. Rather, it is better to think in terms of $(x,y)$ and then map to $(r, \theta)$. To clarify that position, let’s think about how we’d go about drawing the run of the mill tree view as in the figure below:

First some observations:

• Every node at a given depth lies on the same line.
• Every child at a given depth is given an equal share of horizontal space independent of necessity relative to the space owned by its parent.

We can construct a simple recursive definition for drawing the tree if we think about these two facts. Given a node, we want to center the node at the top of a region then carve up a region into the number of child nodes where each sub region is equally wide and the same height as the parent minus a layering distance, then draw a line from the parent node to the child node. Continue doing so until all of the nodes have been drawn. All that remains is mapping this tree to the radial tree view below:

To achieve this last step, we want to map each node at $(x,y)$ to a point $\hat{C} + r \cdot(cos(\theta), sin(\theta))$ Where $\hat{C}$ is the center of the display area. The radius is simply the node’s present $y$ coordinate. $\theta$ can be determined as the ratio between the node’s present $x$ coordinate and the display width times $2 \pi$. And thus, the mapping is complete.

Written by lewellen

2008-08-31 at 6:05 pm

Posted in Computer Graphics

Tagged with

## Tree Drawing: Sunburst diagram

with one comment

Information Visualization is an interesting subject for me because it is the aesthetics of displaying large amounts of information into an easy to digest vision that depicts metrics of interest. One branch of this subject deals with methods for visualizing hierarchical information such techniques as tree maps, hyperbolic tree views, et al are typically deployed. Of these methods, one less common is the sunburst diagram.

A sunburst diagram is essentially the polar form of a Tree icicle visualization. At the core is the root of the tree, each concentric ring represents the child nodes and is partitioned by the metric of our choosing to represent the percentage a node consumes relative to its siblings. Sunburst diagrams are ideal for displaying any kind of tree data where nodes have weights and the totality of the nodes represents a whole.

While there are certainly several existing software solutions out on the market that utilize this information visualization technique, I feel that the majority of which are too focused on solving one specific problem rather than stepping back to identify the general problem that sunburst diagrams solve. As a result, I find it appropriate to develop my own software solution.

I want to try and incorporate some of the following features that I feel are lacking in other applications:

• Ability to visualize any XML document by way of import and option to export
• Choose to visualize metrics based on the attributes defined on each node of an XML document
• Capacity to search and filter information in the document
• Freedom to navigate the tree in an intuitive manner
• A clean and sharp looking user interface that is easy on the eyes

This project really boils down to a Swiss Army knife of simple data analysis tools. Depending on my schedule this project may or may not become a reality but could turn out to be useful for a variety of problems. As the project grows and matures I’ll try to post updates as appropriate.

Written by lewellen

2008-08-03 at 6:00 pm

Posted in Computer Graphics

Tagged with