## Archive for **August 2008**

## Tree Drawing: The radial tree

Although not as interesting as a sunburst diagram, the radial tree view can hold its own against a number of more primitive information visualizations. A radial tree view places the root at the center of the screen, then fans out each child node. Each child node then fans out its children within a restricted span, and so on until every leaf is reached. The strength of the technique is an easy-to-digest depiction of the structure behind the data in a compact space. A common application is visualizing computer networks. It is worthwhile to examine the algorithm behind the technique because it is an exercise in identifying simplicity.

While in college, I would have approached this problem by trying to identify the location of nodes in terms of polar coordinates $(r, \theta)$. After all, I want a radial tree view, so it makes sense to sprint out of the gate with a polar system, right? While possible, this is a bad path to head down, as you end up drowning in a sea of extraneous details. Rather, it is better to think in terms of Cartesian coordinates $(x, y)$ and then map to $(r, \theta)$. To clarify that position, let’s think about how we’d go about drawing the run-of-the-mill tree view as in the figure below:

First some observations:

- Every node at a given depth lies on the same line.
- Every child at a given depth is given an equal share of its parent's horizontal space, whether or not it needs that much.

We can construct a simple recursive definition for drawing the tree if we think about these two facts. Given a node, we want to center the node at the top of a region, then carve the region into one sub-region per child node, where each sub-region is equally wide and as tall as the parent's region minus a layering distance, then draw a line from the parent node to each child node. Continue doing so until all of the nodes have been drawn. All that remains is mapping this tree to the radial tree view below:

To achieve this last step, we want to map each node at $(x, y)$ to a point $(x', y') = (c_x + r\cos\theta,\; c_y + r\sin\theta)$, where $(c_x, c_y)$ is the center of the display area. The radius $r$ is simply the node’s present $y$ coordinate. $\theta$ can be determined as the ratio between the node’s present $x$ coordinate and the display width, times $2\pi$. And thus, the mapping is complete.
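The two stages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `(name, children)` tuple representation and the names `layout`, `to_radial`, `layer` and `display_width` are made up for this sketch, and edge drawing is left out.

```python
import math

def layout(node, x0, x1, depth, layer, out):
    """Center `node` in the strip [x0, x1] and recurse into equal-width sub-strips.

    `node` is a hypothetical (name, children) tuple; `layer` is the layering distance.
    """
    name, children = node
    out[name] = ((x0 + x1) / 2, depth * layer)
    if children:
        width = (x1 - x0) / len(children)
        for i, child in enumerate(children):
            layout(child, x0 + i * width, x0 + (i + 1) * width, depth + 1, layer, out)
    return out

def to_radial(x, y, display_width, center):
    """Map a Cartesian tree position to the radial view: r = y, theta = 2*pi*x/width."""
    theta = 2 * math.pi * x / display_width
    cx, cy = center
    return (cx + y * math.cos(theta), cy + y * math.sin(theta))

tree = ("root", [("a", []), ("b", [("c", [])])])
positions = layout(tree, 0, 100, 0, 50, {})
radial = {name: to_radial(x, y, 100, (0, 0)) for name, (x, y) in positions.items()}
```

The root sits at depth zero, so its radius is zero and it lands on the center of the display no matter what its $x$ coordinate was.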

## Solutions to some Microsoft interview questions

A couple of weeks ago, someone on reddit posted a link to a collection of Microsoft Interview Questions. As someone who interviewed with them while in college, I was curious to see what kind of questions they were asking folks. After reviewing the list, I thought I’d work out a few that looked interesting:

Imagine an analog clock set to 12 o’clock. Note that the hour and minute hands overlap. How many times each day do both the hour and minute hands overlap? How would you determine the exact times of the day that this occurs?

Thus^{*}: 12:00:00 AM, 01:05:27 AM, 02:10:54 AM, 03:16:21 AM, 04:21:49 AM, 05:27:16 AM, 06:32:43 AM, 07:38:10 AM, 08:43:38 AM, 09:49:05 AM, 10:54:32 AM, 12:00:00 PM, 01:05:27 PM, 02:10:54 PM, 03:16:21 PM, 04:21:49 PM, 05:27:16 PM, 06:32:43 PM, 07:38:10 PM, 08:43:38 PM, 09:49:05 PM and 10:54:32 PM.

^{*} Each time is floored when converted to hour, minute and second of day.
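The list above can be reproduced with a short sketch. The minute hand gains one full lap on the hour hand every $12/11$ hours, so the hands overlap 22 times in 24 hours, at $t_k = k \cdot \tfrac{12}{11}$ hours for $k = 0, \dots, 21$. The function name `overlap_times` is an assumption for this example.

```python
from fractions import Fraction

def overlap_times():
    """Return the 22 daily overlap times as floored 12-hour clock strings."""
    period = Fraction(12 * 3600, 11)  # seconds between consecutive overlaps
    times = []
    for k in range(22):
        total = int(k * period)  # floor to whole seconds
        hours, rem = divmod(total, 3600)
        minutes, seconds = divmod(rem, 60)
        suffix = "AM" if hours < 12 else "PM"
        hour12 = hours % 12 or 12
        times.append(f"{hour12:02d}:{minutes:02d}:{seconds:02d} {suffix}")
    return times

print(", ".join(overlap_times()))
```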

Pairs of primes separated by a single number are called prime pairs. Examples are 17 and 19. Prove that the number between a prime pair is always divisible by 6 (assuming both numbers in the pair are greater than 6). Now prove that there are no ‘prime triples.’

Assuming:

Twin Primes:

Prime Triples:
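One standard way to argue both parts, sketched here for a pair $(p,\ p+2)$ with $p > 3$ (the symbol $p$ is introduced for this sketch):

$$
\begin{aligned}
&\text{Twin primes: } p \text{ and } p+2 \text{ are odd, so } 2 \mid (p+1). \\
&\text{Exactly one of } p,\ p+1,\ p+2 \text{ is divisible by } 3, \text{ and it cannot be } p \text{ or } p+2, \text{ so } 3 \mid (p+1). \\
&\text{Hence } 6 \mid (p+1). \\
&\text{Prime triples: } p+4 \equiv p+1 \pmod{3}, \text{ so one of } p,\ p+2,\ p+4 \text{ is divisible by } 3 \\
&\text{and therefore is not prime when } p > 3.
\end{aligned}
$$

The only triple that slips through is $(3, 5, 7)$, which the size restriction in the question excludes.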

There are 4 women who want to cross a bridge. They all begin on the same side. You have 17 minutes to get all of them across to the other side. It is night. There is one flashlight. A maximum of two people can cross at one time. Any party who crosses, either 1 or 2 people, must have the flashlight with them. The flashlight must be walked back and forth, it cannot be thrown, etc. Each woman walks at a different speed: Woman 1 takes 1 minute to cross, Woman 2 takes 2 minutes, Woman 3 takes 5 minutes and Woman 4 takes 10 minutes. A pair must walk together at the rate of the slower woman’s pace.

For example if Woman 1 and Woman 4 walk across first, 10 minutes have elapsed when they get to the other side of the bridge. If Woman 4 then returns with the flashlight, a total of 20 minutes have passed and you have failed the mission. What is the order required to get all women across in 17 minutes? Now, what’s the other way?

One way:

- Woman 1 and 2 cross, woman 1 returns. 3 minutes total.
- Woman 3 and 4 cross, woman 2 returns. 15 minutes total.
- Woman 1 and 2 cross. 17 minutes total.

Other way:

- Woman 1 and 2 cross, woman 2 returns. 4 minutes total.
- Woman 3 and 4 cross, woman 1 returns. 15 minutes total.
- Woman 1 and 2 cross. 17 minutes total.
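A small exhaustive search can confirm that these are the only two schedules. The sketch below assumes the usual crossing times for this puzzle (1, 2, 5 and 10 minutes for women 1 through 4, which match the running totals above) and only considers plans where two women cross and one walks the flashlight back. `CROSSING_TIME` and `schedules` are names made up for the example.

```python
from itertools import combinations

# Assumed crossing times in minutes, consistent with the totals listed above.
CROSSING_TIME = {1: 1, 2: 2, 3: 5, 4: 10}

def schedules(limit=17):
    """Return every pair-over, one-back schedule that finishes within `limit` minutes."""
    found = []

    def search(left, right, elapsed, steps):
        for pair in combinations(sorted(left), 2):
            t = elapsed + max(CROSSING_TIME[w] for w in pair)
            if t > limit:
                continue
            new_left = left - set(pair)
            new_right = right | set(pair)
            crossed = steps + [("cross", pair)]
            if not new_left:
                found.append(crossed)
                continue
            for back in sorted(new_right):
                t_back = t + CROSSING_TIME[back]
                if t_back <= limit:
                    search(new_left | {back}, new_right - {back}, t_back,
                           crossed + [("return", (back,))])

    search(frozenset(CROSSING_TIME), frozenset(), 0, [])
    return found

for plan in schedules():
    print(plan)
```

Both schedules send women 3 and 4 across together in the middle trip; they differ only in which of women 1 and 2 carries the flashlight back first.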

If you had an infinite supply of water and a 5 quart and 3 quart pail, how would you measure exactly 4 quarts?

- Fill 5 quart pail with 5 quarts of water from source.
- Fill 3 quart pail with 3 quarts of water from 5 quart pail.
- Empty 3 quart pail.
- Empty 5 quart pail containing 2 quarts of water into 3 quart pail.
- Fill 5 quart pail with 5 quarts of water from source.
- Fill 3 quart pail with water from 5 quart pail.
- 5 quart pail now contains 4 quarts of water.
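The steps above can be checked by simulating the two pails. This is a minimal sketch; `pour` and `measure_four` are hypothetical helper names, and each state is recorded as `(five_quart, three_quart)`.

```python
def pour(src, dst, dst_capacity):
    """Pour from src into dst until src is empty or dst is full."""
    moved = min(src, dst_capacity - dst)
    return src - moved, dst + moved

def measure_four():
    """Replay the listed steps and return the pail contents after each one."""
    states = []
    five, three = 5, 0                   # fill the 5 quart pail
    states.append((five, three))
    five, three = pour(five, three, 3)   # fill the 3 quart pail from the 5
    states.append((five, three))
    three = 0                            # empty the 3 quart pail
    states.append((five, three))
    five, three = pour(five, three, 3)   # move the remaining 2 quarts over
    states.append((five, three))
    five = 5                             # refill the 5 quart pail
    states.append((five, three))
    five, three = pour(five, three, 3)   # top off the 3 quart pail
    states.append((five, three))
    return states

print(measure_four())
```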

Suppose you have an array of 1001 integers. The integers are in random order, but you know each of the integers is between 1 and 1000 (inclusive). In addition, each number appears only once in the array, except for one number, which occurs twice. Assume that you can access each element of the array only once. Describe an algorithm to find the repeated number. If you used auxiliary storage in your algorithm, can you find an algorithm that does not require it?

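    // Sums the array and subtracts 1 + 2 + ... + 1000, leaving the duplicate.
    public int duplicateNumber(int[] A) {
        int count = 0;
        for (int k = 0; k < A.Length; k++)
            count += A[k];
        // A.Length is 1001, so A.Length * (A.Length - 1) / 2 is the sum 1..1000.
        return count - (A.Length * (A.Length - 1) >> 1);
    }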

Count the number of set bits in a number. Now optimize for speed. Now optimize for size.

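    // Counts set bits by testing the lowest bit and shifting right.
    public int bitsUsed(int n) {
        // Use the unsigned bit pattern so the shift also terminates for negative n.
        uint bits = unchecked((uint)n);
        int count = 0;
        while (bits != 0) {
            count += (int)(bits & 1);
            bits >>= 1;
        }
        return count;
    }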

## Reimplementing arcade classics

Arcade games are a fun exercise in trying out different techniques that ultimately yield the same result: a responsive graphical interface where an agent is controlled by the user’s keyboard. I’ve had a chance to design a few applications in a couple of languages and thought I’d go over the design decisions of each. Plus a little variety never hurts in tuning your skill set.

Tetris was a simple game pushed by Nintendo to fuel sales of the Game Boy in the early 90s. I wrote a variant a while ago that uses single blocks rather than tetrominoes to cut out unnecessary complexity. This C# implementation revolves around .NET WinForms and falls under the umbrella of Object Oriented and Event Driven designs. A one-dimensional array keeps tabs on how deep a block can fall in each column, along with a list of fallen blocks. Any time a block lands on top of another block or the user hits a key, an event handler processes the event and causes the screen to be repainted. This design felt forced but otherwise worked as needed. I hope to refine this approach in future applications.
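As a rough illustration of the one-dimensional array idea (not the original C# code), the sketch below tracks how many blocks are stacked in each column and uses that to decide where a falling block lands. The names `drop_block`, `heights` and `rows` are assumptions for this example, and row 0 is taken to be the top of the board.

```python
def drop_block(heights, column, rows):
    """Land a single block in `column`; return its row, or None if the column is full."""
    if heights[column] >= rows:
        return None
    row = rows - 1 - heights[column]  # deepest free row in this column
    heights[column] += 1
    return row

heights = [0, 0, 0]
landed = [drop_block(heights, 1, rows=4) for _ in range(3)]
```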

Pacman was the iconic flagship of the 80s arcade armada. During college I wrote a simple version of Pacman in C that relied on the prototypical input, logic and draw loop. In this implementation, an array is kept that represents the inanimate actors: nothing, reward and walls. In addition, separate cursors are kept to keep track of Pacman and each of the ghosts. Design-wise, this worked out really well. Input is parsed and applied to Pacman, each of the ghosts moves towards Pacman, termination conditions are checked and finally the array and animated actors are painted on the screen. Out of these designs, this one felt the most versatile.

Snake, also known as Nibbler or Nibbles, was a classic game that used to get put onto cell phones (when phones used to come with games for free…) where you have a snake that grows as it consumes rewards. The snake moves around a torus (represented as a 2D surface) and the game is over when the snake covers the surface or when part of the snake crosses over itself. This implementation went with C#, relying purely on the Console. The snake is represented as a linked list where every node holds a direction and location. As each loop passes, the direction and location of the head are passed on to the next segment and so on until the tail is reached. If the snake is on top of a reward, a new segment is appended to the tail. The design is the same as my Pacman implementation; however, there is no underlying array to maintain.
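The movement rule can be sketched compactly. Note that this is a simplified stand-in rather than the original C# design: instead of a linked list whose nodes each carry a direction, it keeps the body in a `deque` and grows by skipping the tail removal, which has the same visible effect. The class and parameter names are made up for the example.

```python
from collections import deque

class Snake:
    def __init__(self, width, height, start=(0, 0), direction=(1, 0)):
        self.width, self.height = width, height
        self.body = deque([start])  # head is at index 0
        self.direction = direction

    def step(self, reward=None):
        """Advance one cell on the torus; return False if the snake hits itself."""
        hx, hy = self.body[0]
        dx, dy = self.direction
        head = ((hx + dx) % self.width, (hy + dy) % self.height)  # wrap around
        if head != reward:
            self.body.pop()  # no reward eaten, so the tail moves forward
        if head in self.body:
            return False
        self.body.appendleft(head)
        return True

snake = Snake(4, 3, start=(3, 0))
snake.step()                # wraps from x = 3 to x = 0
snake.step(reward=(1, 0))   # eats the reward and grows to two segments
```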

If the comments call for it, I’ll post more implementation details on any of the above.

## The ring of Gaussian Integers

I recently started taking a foray into Complex Analysis as a means of filling in the gaps of my undergraduate mathematics knowledge. After reading about holomorphic functions, the Cauchy-Riemann equations and how to model ideal fluid flows, I decided to take a break to digest it all. During this period I was reflecting on what I had done with Number Theory the previous summer and the question popped in my head: what is the complex equivalent of $a \equiv b \pmod{n}$? Or for that matter, are integer primes also primes in the complex domain (and vice versa)? And, are all complex factorizations unique?

After looking around the internet for a while I came across a rather well written paper [pdf] by Assistant Professor Keith Conrad of the University of Connecticut that answered all of the questions brewing in my head along with the ones that weren’t. I’m going to summarize the basic ideas behind the solutions Conrad wrote about. If you have the free time, it’s worthwhile to read the original text in its entirety.

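If we consider the set of complex numbers of the form $a + bi$ and restrict $a, b \in \mathbb{Z}$, we have the set of the Gaussian Integers $\mathbb{Z}[i]$. If we take two values $\alpha, \beta \in \mathbb{Z}[i]$, we can say that $\alpha \equiv \beta \pmod{\gamma}$ only if $\gamma \mid (\alpha - \beta)$ where $\gamma \in \mathbb{Z}[i]$. Thus, writing $\alpha - \beta = a + bi$ and $\gamma = c + di$, we need $\frac{a + bi}{c + di} = \frac{(ac + bd) + (bc - ad)i}{c^2 + d^2} \in \mathbb{Z}[i]$. With this last step we’ve reduced the problem to satisfying two conditions: $(c^2 + d^2) \mid (ac + bd)$ and $(c^2 + d^2) \mid (bc - ad)$. For example, let $\alpha - \beta = 5$ and $\gamma = 2 + i$, which satisfies the conditions since $5 \mid 10$ and $5 \mid -5$. For the congruence $\alpha \equiv \beta \pmod{2 + i}$ to hold true, it must be the case that $\alpha - \beta$ is a Gaussian multiple of $2 + i$, as $5 = (2 + i)(2 - i)$ is.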
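The congruence test can be written directly from the divisibility requirement: $\gamma = c + di$ divides $\delta = a + bi$ exactly when $c^2 + d^2$ divides both $ac + bd$ and $bc - ad$, which are the real and imaginary parts of $\delta\bar{\gamma}$. The sketch below stores Gaussian integers as `(real, imag)` tuples; `divides` and `congruent` are names chosen for this example.

```python
def divides(gamma, delta):
    """True if the Gaussian integer gamma divides delta (both are (real, imag) int pairs)."""
    c, d = gamma
    a, b = delta
    n = c * c + d * d
    if n == 0:
        return (a, b) == (0, 0)
    # delta / gamma = ((ac + bd) + (bc - ad)i) / (c^2 + d^2)
    return (a * c + b * d) % n == 0 and (b * c - a * d) % n == 0

def congruent(alpha, beta, gamma):
    """True if alpha is congruent to beta modulo gamma in Z[i]."""
    diff = (alpha[0] - beta[0], alpha[1] - beta[1])
    return divides(gamma, diff)

# 7 + 3i and 2 + 3i differ by 5 = (2 + i)(2 - i), so they agree modulo 2 + i.
print(congruent((7, 3), (2, 3), (2, 1)))
```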

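Under $\mathbb{Z}$, any integer $n$ that is divisible by an integer other than a unit ($\pm 1$) or a unit multiple of itself ($\pm n$) is said to be composite; otherwise $n$ is said to be prime. The same definition holds in $\mathbb{Z}[i]$, but our units are now $\pm 1, \pm i$ and the unit multiples of $\alpha$ are now $\pm\alpha, \pm i\alpha$. As one might have guessed, every integer that is prime in $\mathbb{Z}[i]$ is also prime in $\mathbb{Z}$; however, the converse does not hold. For example, $5$ is prime in $\mathbb{Z}$ but can be written as $(2 + i)(2 - i)$ and is thus a composite number under $\mathbb{Z}[i]$. It is also useful to mention that if the norm $N(\alpha) = a^2 + b^2$ of $\alpha = a + bi$ is prime in $\mathbb{Z}$, then $\alpha$ is prime in $\mathbb{Z}[i]$. By way of the Fundamental Theorem of Arithmetic, every composite integer can be written as the product of prime integers; this is also the case under $\mathbb{Z}[i]$, and each factorization is unique up to ordering and unit multiples.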

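The prime factorization of $n \in \mathbb{Z}$ can be naïvely found by trial division, testing each candidate divisor $d$ from $2$ to $\sqrt{n}$. Once $d \mid n$, record $d$, reset $n \leftarrow n / d$ and the bound $\sqrt{n}$, and continue this procedure until $n = 1$. That is all well and fine for $\mathbb{Z}$ but it isn’t immediately useful for doing the same under $\mathbb{Z}[i]$.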
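For reference, a minimal sketch of the integer trial division just described might look like this (the function name `prime_factors` is an assumption):

```python
def prime_factors(n):
    """Return the prime factors of n >= 2 in ascending order, with multiplicity."""
    factors = []
    d = 2
    while d * d <= n:          # only test divisors up to sqrt(n)
        while n % d == 0:
            factors.append(d)
            n //= d            # shrink n, which also shrinks the bound
        d += 1
    if n > 1:                  # whatever is left over is itself prime
        factors.append(n)
    return factors

print(prime_factors(65))
```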

For example, let’s consider a Gaussian integer $\alpha = a + bi$. Since we are uncertain about how to factor $\alpha$, let’s think about how to factor its norm $N(\alpha) = a^2 + b^2$, as we already know how to factor under $\mathbb{Z}$, and see where that takes us. We went from $\alpha$ to $N(\alpha)$, so we’d hope that we could go the other way around by finding $x, y \in \mathbb{Z}$ with $x^2 + y^2 = p$ for each of the prime factors $p$ of $N(\alpha)$. In other words, we want to know which $x, y$ will produce the four Gaussian primes $\pm x \pm yi$. Thus, we find one such pair for each prime factor. Now, according to Conrad, we should be able to pick one of the four possible Gaussian primes for each of the prime factors and multiply the chosen Gaussian primes across; for the right choices we get $\alpha$. Now that’s pretty damn cool.

The general procedure for factoring a Gaussian Integer $\alpha$ into its prime components is to find the integer factorization of the norm $N(\alpha)$ and, for each prime factor $p$, find $x, y$ such that $x^2 + y^2 = p$. Next, explore the Cartesian space formed by each factor’s four Gaussian primes until you come across a product coordinate that equals $\alpha$, and output said coordinate. As one can imagine, this is a woefully poor algorithm for finding the prime factorization of $\alpha$. If anyone has ideas or knows of more efficient methods, post them in the comments.
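As one possible answer to that request, here is a sketch of a variant that avoids the Cartesian search. It is a swapped-in technique, not the procedure above: for each prime $p$ dividing the norm it builds the candidate Gaussian primes and simply tries exact division. It also relies on two standard facts not spelled out above, namely that primes $p \equiv 3 \pmod 4$ stay prime in $\mathbb{Z}[i]$ while $2$ and primes $p \equiv 1 \pmod 4$ are sums of two squares. Gaussian integers are `(real, imag)` tuples and every function name is made up for the example.

```python
from math import isqrt

def prime_factors(n):
    """Trial division over the integers; returns primes with multiplicity."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def norm(z):
    return z[0] * z[0] + z[1] * z[1]

def mul(z, w):
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def divide_exact(z, w):
    """Return z / w as a Gaussian integer, or None if w does not divide z."""
    (a, b), (c, d) = z, w
    n = c * c + d * d
    re, im = a * c + b * d, b * c - a * d
    if re % n or im % n:
        return None
    return (re // n, im // n)

def two_squares(p):
    """Find (x, y) with x^2 + y^2 = p by brute force, or None if there is none."""
    x = 1
    while x * x < p:
        y = isqrt(p - x * x)
        if x * x + y * y == p:
            return (x, y)
        x += 1
    return None

def factor_gaussian(z):
    """Return (unit, primes) such that z == unit * product(primes)."""
    primes = []
    for p in sorted(set(prime_factors(norm(z)))):
        if p % 4 == 3:
            candidates = [(p, 0)]              # p stays prime in Z[i]
        else:
            x, y = two_squares(p)
            candidates = [(x, y), (x, -y)]     # the two conjugate choices
        for pi in candidates:
            q = divide_exact(z, pi)
            while q is not None:               # divide out pi as often as possible
                primes.append(pi)
                z = q
                q = divide_exact(z, pi)
    return z, primes                            # what remains is a unit

print(factor_gaussian((4, 7)))
```

For $4 + 7i$ (norm $65 = 5 \cdot 13$) this yields the unit $-1$ and the primes $1 - 2i$ and $2 - 3i$, and indeed $-(1 - 2i)(2 - 3i) = 4 + 7i$.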

There is clearly much, much more to be said about Gaussian Integers, but this feels like a good stopping point. If you want to find out more about how some of the traditional Number Theory constructs are defined under $\mathbb{Z}[i]$, you’ll want to read the entirety of Conrad’s paper or jump on Google and see what’s out there.

## Tree Drawing: Sunburst diagram

Information Visualization is an interesting subject for me because it is the art of presenting large amounts of information as an easy-to-digest picture that depicts metrics of interest. One branch of this subject deals with methods for visualizing hierarchical information, where techniques such as tree maps and hyperbolic tree views are typically deployed. Of these methods, one of the less common is the sunburst diagram.

A sunburst diagram is essentially the polar form of a tree icicle visualization. At the core is the root of the tree; each concentric ring represents one level of child nodes and is partitioned by the metric of our choosing to represent the percentage a node consumes relative to its siblings. Sunburst diagrams are ideal for displaying any kind of tree data where nodes have weights and the totality of the nodes represents a whole.
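The partitioning rule can be sketched as a recursive walk that hands each child a slice of its parent's angular span in proportion to its weight. The `(name, weight, children)` tuple layout and the function name `sunburst` are assumptions for this illustration; drawing the ring segments is left out.

```python
import math

def sunburst(node, start=0.0, end=2 * math.pi, depth=0, out=None):
    """Return a list of (name, depth, start_angle, end_angle) ring segments."""
    if out is None:
        out = []
    name, weight, children = node
    out.append((name, depth, start, end))
    total = sum(child[1] for child in children)
    angle = start
    for child in children:
        span = (end - start) * child[1] / total  # share relative to siblings
        sunburst(child, angle, angle + span, depth + 1, out)
        angle += span
    return out

tree = ("root", 4, [("a", 1, []), ("b", 3, [])])
segments = sunburst(tree)
```

Here `depth` selects the ring, so `a` takes the first quarter of ring 1 and `b` the remaining three quarters.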

While there are certainly several existing software solutions on the market that utilize this information visualization technique, I feel that the majority are too focused on solving one specific problem rather than stepping back to identify the general problem that sunburst diagrams solve. As a result, I find it appropriate to develop my own software solution.

I want to try and incorporate some of the following features that I feel are lacking in other applications:

- Ability to visualize any XML document by way of import and option to export
- Choose to visualize metrics based on the attributes defined on each node of an XML document
- Capacity to search and filter information in the document
- Freedom to navigate the tree in an intuitive manner
- A clean and sharp looking user interface that is easy on the eyes

This project really boils down to a Swiss Army knife of simple data analysis tools. Depending on my schedule this project may or may not become a reality but could turn out to be useful for a variety of problems. As the project grows and matures I’ll try to post updates as appropriate.