Antimatroid, The

thoughts on computer science, electronics, mathematics

Archive for June 2008

One tree, many possibilities

with 2 comments

Introduction

Given the finite set S and the surjective function \sigma, Let T_{n}\left(S, \sigma\right) be a complete |S|-ary tree of depth n such that T_{0}\left(S, \sigma\right) = \emptyset, T_n(S, \sigma) = T_{n - 1}(S, \sigma) \{ \left( T_{n - 1}(S, \sigma) \oplus s \right) | s \in \sigma(S) \}. For example, Let A = \{0, 1\} and \\f = S \to S, T_{2}\left(A, \\f\right) = \emptyset \{ \left(\emptyset, 0\right) \{ \left(\emptyset, 0, 0\right), \left(\emptyset, 0, 1\right) \}, \left(\emptyset, 1\right) \{ \left(\emptyset, 1, 0\right), \left(\emptyset, 1, 1\right) \} \}. Pictured bellow is T_{3}\left(A, \\f \right) with omitting \emptyset for brevity.

It should be evident from the example that the set of nodes at depth n is the n-ary Cartesian product of S, S^{n} = S \times \ldots \times S.

Now, if we change \\f to now be \\f : S \to \{ s | s \in S \wedge \nu(s) \neq \nu\left(parent\left(s\right)\right)\}, where \nu : S \leftrightarrow \mathbb{N}_{0}, the set of all leaf nodes at depth |S| is the set of permutations of S, S!. Extending the example above, T_{2} \left( A, \\f \right) = \{ \{ \left( \emptyset, 0, 1 \right) \}, \{ \left( \emptyset, 1, 0 \right) \} \}.

Without changing the definition of \\f, it is possible to produce all possible k-permutations from T, P(S, k) by selecting all nodes at depth k.

At this point, if we make a small change to \\f : S \to \{ s | s \in S \wedge \nu\left(s\right) > \nu\left( parent\left(s\right)\right)\}, every node in T now represents the power set of S, 2^{S}. Again, using the example from above, T_{2} \left( A, \\f \right) = \emptyset \{ \left( \emptyset, 0 \right), \left( \emptyset, 1 \right) \{ \left( \emptyset, 1, 1 \right) \} \}.

Finally, without any further changes to the definition of \\f, it is possible to produce all possible n-combinations from T, C(S, n), by selecting all nodes at depth n.

T is a very elegant way of representing and enumerating some of the most rudimentary combinatorial operations of introductory mathematics. In this series of post we’ll explore each operation and look at implementations, time complexities and possible applications.

Written by lewellen

2008-06-29 at 8:25 pm

Problem with an interesting little problem

with 2 comments

A Simple Problem

\displaystyle{B_{i} = \prod_{j \neq i} A_{j}}

Rules:

  1. Solutions must be \\O(n) in time complexity
  2. Solutions must not use the division operator

fsharp.it posted the above simple Google interview question on their site a few weeks ago and subsequently the problem was referenced by OJ’s rants and later posted to the programming subreddit on reddit.com. Like everyone else, I sat down and wrote some quick code to solve the problem. After looking a the solution, it seemed to me that this was a rather poorly conceived problem- if not contrived at heart. It doesn’t ring to the tune of scalability, performance and quality that one thinks of when the word Google is thrown into the mix.

In the following, we’ll explore why this this problem isn’t a well designed interview question for a software engineering position. For software engineering, a problem should test a candidates ability to deliver simple and optimal solutions under non-ideal situations.

A Complex Solution

Rule (1) is simple enough to deal with, Rule (2) is a little more unexpected, and one can devise any number of reasons why it would be advantageous:

  • Most CPUs require 4x the number of cycles to perform a division vs a multiplication on a set of 32-bit registers
  • The rare event that a CPU doesn’t offer a division instruction
  • Google want to see the candidate solve a simple problem with only a subset of tools
  • And so on…

After some thinking, one eventually devises the following imperative solution:

public int[] exteriorProduct(int[] A) {
	int[] L = new int[A.Length], R = new int[A.Length];
	for (int n = 0, m = A.Length - 1; n < A.Length && m >= 0; n++, m-– ) {
		L[n] = n == 0 ? 1 : A[n - 1] * L[n - 1];
		R[m] = m == A.Length - 1 ? 1 : R[m + 1] * A[m + 1];
	}

	int[] B = new int[A.Length];
	for (int n = 0; n < A.Length; n++)
		B[n] = L[n] * R[n];
}

Given the input A, we instruct the machine to memoize the partial product from 0 to i – 1 and store said value into L[i]; likewise, from |A| – 1 to i + 1 we store the partial product into R[i]. The desired product is the product of L[i] and R[i].

Some quick time complexity analysis reveals the following:

U(N) = \alpha (2 N)

V(0) = 0
V(N) = 2 (\tau + \max(0, \mu)) + V(N - 1)
V(N) = 2 (\tau + \mu) N

W(0) = 0
W(N) = \mu + W(N-1)
W(N) = \mu N

T(N) = U(N) + V(N) + W(N)
T(N) = \alpha (2 N) + 2 (\tau + \mu) N + \mu N
T(N) = (\alpha + \tau + \mu) (2 N) + \mu N

Where \alpha is the memory allocation time, \tau is the time required to perform a boolean test and \mu is the time required to perform a multiplication. Lookup is assumed to be constant- however this is unrealistic as N increases (more on this later). Nonetheless, we’ve satisfied Rules (1 & 2).

The solution is elegant, but it is not obvious to the passerby what it is doing. One must dissect the solution to fully grok the simplicity of the problem. This is not a desirable software trait.

A Simple Solution

For the sake of argument, let’s take a look at the straight forward solution:

public int[] exteriorProduct(int[] A) {
	int[] B = new int[A.Length];
	int p = 1;

	for(int n = 0; n < A.Length; n++)
		p *= A[n];

	for(int n = 0; n < A.Length; n++)
		B[n] = p / A[n];

	return B;
}

Given the input A, we instruct the machine to compute the product over every element and store the result into p. To get the desired answer, the product is divided by A[i]. Simple, clean and easy to understand. But alas, we used that devious division operator…

Again some quick time complexity analysis reveals the following:

U(N) = \alpha \left(N + 1 \right)

V(0) = 0
V(N) = \mu + V(N - 1)
V(N) = \mu N

W(0) = 0
W(N) = \delta + W(N - 1)
W(N) = \delta N

T(N) = U(N) + V(N) + W(N)
T(N) = (\mu + \delta) N + \alpha (N + 1)

Here \delta is the cost of division and \mu and \alpha are the cost of performing multiplication and allocation, respectively.

Performance Showdown

Before we take a look at some hard numbers, lets see what the time complexity analysis says what we will should see:

S(N) = \left( \alpha + \mu + \delta \right) N + \alpha
C(N) = \left( 2 \alpha + 2 \tau + 2 \mu + 1 \right) N

\displaystyle\lim_{N \to \infty } \frac{S(N)}{C(N)} = \frac {\left( \alpha + \mu + \delta \right) N + \alpha}{\left( 2 \alpha + 2 \tau + 2 \mu + 1 \right) N}
\displaystyle\lim_{N \to \infty } \frac{S(N)}{C(N)} \approx \frac {\left( \alpha + \mu + \delta \right) }{2 \left(\alpha + \tau + \mu \right) }
\displaystyle\lim_{N \to \infty } \frac{S(N)}{C(N)} \approx \frac {\\O(1)}{2 \cdot \\O(1)}
\displaystyle\lim_{N \to \infty } \frac{S(N)}{C(N)} \approx \frac {1}{2}

So in short, the simple solution should execute twice as fast as the complex solution.

Algorithm Execution Time (ms) As a Function of N*

101 102 103 104 105 106 107 108 109
Simple 0.0 0.0 0.0 0.0 0.0 15.625 125.0 1265.625 OutOfMem
Complex 0.0 0.0 0.0 0.0 0.0 31.25 312.5 2250.0 OutOfMem

* Tests conducted on Intel T2400 1.83GHz, 987 Mhz bus, 2.00 GB RAM

Just as the time complexity analysis indicated, the simple solution is approximately twice as fast as the complex solution. In addition, both solutions will become worse as N increases as lookup times no longer execute in constant time– certainly more pronounced in the complex solution.

So what…

Granted, Google is in all likelihood aiming to identify candidates who can easily produce solutions that require a tad of lateral thinking- however, this isn’t a good question to ask for a software engineering position for a number of reasons:

  • There is no performance gain to be had from excluding the division operator
  • Unnecessary complexity is introduced into a code base which will ultimately need to be refactored out, thus wasting time
  • And many more…

Interview questions need to focus on problems that require ingenuity, not ones that require candidates to go against common sense software engineering practices. This problem, and many like it, ignore the software engineering aspect of a job which is just as important as being clever at devising algorithms.

Written by lewellen

2008-06-22 at 7:29 pm