Antimatroid, The

thoughts on computer science, electronics, mathematics

Problem with an interesting little problem

with 2 comments

A Simple Problem

\displaystyle{B_{i} = \prod_{j \neq i} A_{j}}

Rules:

  1. Solutions must be \\O(n) in time complexity
  2. Solutions must not use the division operator

fsharp.it posted the above simple Google interview question on their site a few weeks ago and subsequently the problem was referenced by OJ’s rants and later posted to the programming subreddit on reddit.com. Like everyone else, I sat down and wrote some quick code to solve the problem. After looking a the solution, it seemed to me that this was a rather poorly conceived problem- if not contrived at heart. It doesn’t ring to the tune of scalability, performance and quality that one thinks of when the word Google is thrown into the mix.

In the following, we’ll explore why this this problem isn’t a well designed interview question for a software engineering position. For software engineering, a problem should test a candidates ability to deliver simple and optimal solutions under non-ideal situations.

A Complex Solution

Rule (1) is simple enough to deal with, Rule (2) is a little more unexpected, and one can devise any number of reasons why it would be advantageous:

  • Most CPUs require 4x the number of cycles to perform a division vs a multiplication on a set of 32-bit registers
  • The rare event that a CPU doesn’t offer a division instruction
  • Google want to see the candidate solve a simple problem with only a subset of tools
  • And so on…

After some thinking, one eventually devises the following imperative solution:

public int[] exteriorProduct(int[] A) {
	int[] L = new int[A.Length], R = new int[A.Length];
	for (int n = 0, m = A.Length - 1; n < A.Length && m >= 0; n++, m-– ) {
		L[n] = n == 0 ? 1 : A[n - 1] * L[n - 1];
		R[m] = m == A.Length - 1 ? 1 : R[m + 1] * A[m + 1];
	}

	int[] B = new int[A.Length];
	for (int n = 0; n < A.Length; n++)
		B[n] = L[n] * R[n];
}

Given the input A, we instruct the machine to memoize the partial product from 0 to i – 1 and store said value into L[i]; likewise, from |A| – 1 to i + 1 we store the partial product into R[i]. The desired product is the product of L[i] and R[i].

Some quick time complexity analysis reveals the following:

U(N) = \alpha (2 N)

V(0) = 0
V(N) = 2 (\tau + \max(0, \mu)) + V(N - 1)
V(N) = 2 (\tau + \mu) N

W(0) = 0
W(N) = \mu + W(N-1)
W(N) = \mu N

T(N) = U(N) + V(N) + W(N)
T(N) = \alpha (2 N) + 2 (\tau + \mu) N + \mu N
T(N) = (\alpha + \tau + \mu) (2 N) + \mu N

Where \alpha is the memory allocation time, \tau is the time required to perform a boolean test and \mu is the time required to perform a multiplication. Lookup is assumed to be constant- however this is unrealistic as N increases (more on this later). Nonetheless, we’ve satisfied Rules (1 & 2).

The solution is elegant, but it is not obvious to the passerby what it is doing. One must dissect the solution to fully grok the simplicity of the problem. This is not a desirable software trait.

A Simple Solution

For the sake of argument, let’s take a look at the straight forward solution:

public int[] exteriorProduct(int[] A) {
	int[] B = new int[A.Length];
	int p = 1;

	for(int n = 0; n < A.Length; n++)
		p *= A[n];

	for(int n = 0; n < A.Length; n++)
		B[n] = p / A[n];

	return B;
}

Given the input A, we instruct the machine to compute the product over every element and store the result into p. To get the desired answer, the product is divided by A[i]. Simple, clean and easy to understand. But alas, we used that devious division operator…

Again some quick time complexity analysis reveals the following:

U(N) = \alpha \left(N + 1 \right)

V(0) = 0
V(N) = \mu + V(N - 1)
V(N) = \mu N

W(0) = 0
W(N) = \delta + W(N - 1)
W(N) = \delta N

T(N) = U(N) + V(N) + W(N)
T(N) = (\mu + \delta) N + \alpha (N + 1)

Here \delta is the cost of division and \mu and \alpha are the cost of performing multiplication and allocation, respectively.

Performance Showdown

Before we take a look at some hard numbers, lets see what the time complexity analysis says what we will should see:

S(N) = \left( \alpha + \mu + \delta \right) N + \alpha
C(N) = \left( 2 \alpha + 2 \tau + 2 \mu + 1 \right) N

\displaystyle\lim_{N \to \infty } \frac{S(N)}{C(N)} = \frac {\left( \alpha + \mu + \delta \right) N + \alpha}{\left( 2 \alpha + 2 \tau + 2 \mu + 1 \right) N}
\displaystyle\lim_{N \to \infty } \frac{S(N)}{C(N)} \approx \frac {\left( \alpha + \mu + \delta \right) }{2 \left(\alpha + \tau + \mu \right) }
\displaystyle\lim_{N \to \infty } \frac{S(N)}{C(N)} \approx \frac {\\O(1)}{2 \cdot \\O(1)}
\displaystyle\lim_{N \to \infty } \frac{S(N)}{C(N)} \approx \frac {1}{2}

So in short, the simple solution should execute twice as fast as the complex solution.

Algorithm Execution Time (ms) As a Function of N*

101 102 103 104 105 106 107 108 109
Simple 0.0 0.0 0.0 0.0 0.0 15.625 125.0 1265.625 OutOfMem
Complex 0.0 0.0 0.0 0.0 0.0 31.25 312.5 2250.0 OutOfMem

* Tests conducted on Intel T2400 1.83GHz, 987 Mhz bus, 2.00 GB RAM

Just as the time complexity analysis indicated, the simple solution is approximately twice as fast as the complex solution. In addition, both solutions will become worse as N increases as lookup times no longer execute in constant time– certainly more pronounced in the complex solution.

So what…

Granted, Google is in all likelihood aiming to identify candidates who can easily produce solutions that require a tad of lateral thinking- however, this isn’t a good question to ask for a software engineering position for a number of reasons:

  • There is no performance gain to be had from excluding the division operator
  • Unnecessary complexity is introduced into a code base which will ultimately need to be refactored out, thus wasting time
  • And many more…

Interview questions need to focus on problems that require ingenuity, not ones that require candidates to go against common sense software engineering practices. This problem, and many like it, ignore the software engineering aspect of a job which is just as important as being clever at devising algorithms.

Advertisements

Written by lewellen

2008-06-22 at 7:29 pm

2 Responses

Subscribe to comments with RSS.

  1. One thing to note is that your “simple” approach is incorrect. In the case where there is a zero somewhere in the input array, your function will crash with division by zero.

    It turns out when you deal with this problem, the code becomes much less elegant. You have to deal with the special case of an array having exactly one zero vs more than one zero vs no zeros.

    I think that’s what makes the ‘no division’ part interesting. It’s because the most natural way of thinking about the problem leads to a solution with many special cases. VS the ‘no divide’ solution has no special cases.

    greg

    2011-03-17 at 12:18 pm

  2. greg- you have some valid points, but by your criteria, so is the “complex” solution described above since it doesn’t check for integer overflows.

    Nonetheless, the above example code above is not intended to be all inclusive.

    I will still champion the simple method over the complex, on the principle that I don’t want an implementation to hide the fact that we are covering edge cases.

    lewellen

    2011-03-17 at 7:40 pm


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: