Archive for the ‘Software Engineering’ Category
Android ecosystem on Windows 7
Introduction
This past spring I decided to take a dive into mobile platforms and get my feet wet with a little Android development. I’d done some Java development in the past and, with reports of increased market share, especially among tablets, I figured Android was the right platform to get started with. Since it’s always good to document things, here’s a rundown of what was needed to get a basic development environment working on Windows 7. As I continue to explore and learn more, I’ll continue to update this list with more information.
JDK SE 1.6
First thing that I needed to download was the Java Development Kit (JDK) from Oracle. The JDK provides all of the necessary components to get started and run Java based applications.
Android SDK
Next core development kit to download is the Android SDK from Google. The SDK has all of the Android specific tools and libraries to develop and test applications that are meant to run on the Android platform.
SDK Manager
It is a little deceptive, but installing the Android SDK above only installs a set of tools to manage different versions of the SDK. To download the actual libraries you’ll need to launch the SDK Manager. From there I decided to download the API for Gingerbread 2.3.3 (API10) and for Honeycomb 3.x (API11-13). I had to run the SDK Manager as an Administrator in order for the application to properly download all of the assets to my machine. Running the application as a user will result in a number of “Access Denied” errors in the log console.
AVD Manager
Next, I set up two virtual devices using the AVD Manager. My main interest is in developing applications for tablets, so I created a Honeycomb virtual device with a gig of memory and a reduced screen size. I also created a Gingerbread virtual device with half a gig of memory and no screen size restrictions. Both come in handy to make sure that any app I write will work well on both handhelds and tablets.
IntelliJ Community Edition
To do my development I decided to go with IntelliJ as my editor. A lot of people out there use Eclipse; I used to use it a long time ago, but decided that I wanted to try something new. Installation went smoothly and the only custom configuration dealt with specifying the location of the JDK and the Android SDK.
Acer A500 USB Driver
Developing an application against a virtual device is fun and all, but nothing beats testing and using an application on an actual device. I decided to go with an Acer A500 since it was the right mix of features, cost and responsiveness. To deploy an application to the device, I did have to go and download a USB driver from Acer’s website so that the Android Debug Bridge (ADB) from the Android SDK would recognize the device when it was plugged in to my machine. Past that, it was a seamless experience of getting IntelliJ to deploy an application to the device and for me to begin hands-on testing.
Wrap-up
Overall, getting started with the Android Platform has gone very smoothly. I dove into all of this without reading any documentation and the process was fairly self explanatory. I’m looking forward to learning more about the platform and ultimately trying to get a product out on the App Market.
Thoughts on delivering software
Preface
A lot can happen in four years. We get used to its passing as a milestone marking a transition from one stage of life to another. I was thinking about the four-odd years since I graduated from college and came to realize that while I’ve learned a lot since then, I still have a ways to go. One recurring question that has been floating around my head all this time is how to deliver a complete product. At the end of college I had the theoretical side of the story, but only limited practical exposure from internships and consulting.
My first job after college was working for a small company focusing on healthcare. Our development group was just a couple months old and I was the second developer brought on board. We started out as a group of cowboys doing what was necessary to retain customers and bring on new ones. Despite the chaos, lack of requirements and ever shifting priorities, we managed to make the company successful and a leader in its industry. During that time, I feel that I have learned a lot about how to deliver software in the face of uncertainty.
More importantly, I’ve learned how to be more than just a developer and grow into someone trusted to autonomously improve the business with my vision and execution. I’m going to cover a few choice topics that I feel have had the biggest impact on how I deliver software- things that answer that burning question of how to deliver a complete product. This isn’t intended to be comprehensive, but it should give someone who is where I was four years ago a little insight into how to get a quality product out the door even when it feels like it’ll never happen.
Communication
When you are sitting in front of a computer, it’s easy to get excited and start writing whatever happens to pop into mind and race its way down your fingers. Few things are more satisfying as a developer than falling into flow and building something to completion- going through the motions of refactoring, testing and becoming convinced that you’ve written something solid. You eagerly present your product to the stakeholders. They look back at you and say, “this isn’t at all what we asked for”.
It is easy to make this mistake: not taking the time to fully understand the stakeholders’ needs and, more importantly, to make sure they understand them too. After all, the biggest reason software fails to meet expectations is that expectations weren’t established and requirements weren’t well defined. The source of these failures is poor communication. It should seem obvious, but any software endeavor that involves more than one person relies on thorough communication to execute a shared goal- to bring a solution to market in a timely manner.
To make sure that you avoid this pitfall, take the time to work with your stakeholders. An hour of talking and planning can prevent you from wasting a day of misdirected development time. Focus on the domain problem and its solution. How the solution is implemented is largely irrelevant compared to how completely it satisfies the requirements. To ensure that you and your stakeholder have the same view of the product, produce a dossier similar to a user’s guide that outlines how the solution behaves. Once that is defined, produce a quick prototype and present it to the stakeholder. Upon their approval, implement the product.
Simplicity
Now, simplicity doesn’t imply mediocrity any more than complexity implies excellence. As Antoine de Saint-Exupéry said in Wind, Sand and Stars, “Il semble que la perfection soit atteinte non quand il n’y a plus rien à ajouter, mais quand il n’y a plus rien à retrancher” (“It seems that perfection is attained not when there is nothing more to add, but when there is nothing more to take away”). Simplicity is the path to perfection: removing that which is unnecessary. Much of what can be said about delivering software can be said about managing complexity. Markets change, and so do the demands on products. As a consequence, so do the demands on code. How much that code has to change depends largely on how well it was written.
Complexity encroaches on software in one of two ways: first, when a product grows organically without any direction, and second, through poor coding practices and design. The lack of direction comes from stakeholders committing to one-off functionality without thinking about how it contributes to a uniform product. Poor practice shows up when an unorthodox solution results in a product that can never be changed, or when a bad set of abstractions makes it difficult to change anything.
What can be done to avoid complexity and enable simplicity? Focus on established principles of the field. Design components that are modular, each doing one thing and doing it well. Interfaces between systems should be flexible in what they accept and resistant to change. Code should assert its assumptions and avoid ad hoc solutions. Above all, code should be written with meaningful abstractions that represent the problem and take into consideration how that problem can mutate or be extended to other problems. Doing this intelligently results in maintainable, reliable and predictable code- but more importantly- code that ships.
Introspection
Being a software professional requires a certain degree of introspection. Solving problems requires us to build complex models in our minds and walk around them to discover insights that lead to solutions. The ability to maintain these mental models is an important factor to our ability to deliver a product. Despite the amount of time spent in one’s mind, there is often a failure to think about how one carries out his or her job. Lucid requirements, minimal designs and all the theory in the world are useless if you don’t know how to keep up with the demands placed upon you.
What happens when you fail to keep up? You might think to spend an extra hour a day trying to catch up. This works for a bit, but not for long. You end up giving something up for that extra hour and usually it means sacrificing sleep, relationships, interests or your health. Doing this for prolonged periods leads to falling further and further behind until deadlines are racing past you, pounds piling up on your body and work invading every aspect of your life. Ultimately, it all leads to burnout.
To keep up with demands, and to get that product out the door, it is important to know yourself and set limitations. Figure out if you are most effective in the morning or night. Understand if you like switching between projects or like focusing on one project at a time. At the end of each day, write down what you’ve accomplished and what you hope to accomplish the next day. A list makes it easy for you to see that things are getting done and what you need to be working on. If you find yourself behind due to reasons beyond your control, negotiate with the stakeholders to extend the timeline. Nothing is ever set in stone.
Epilogue
I doubt there is an all-encompassing answer to how to deliver a complete product. Each year I have a different take on the question, and no doubt I will continue to in the years to come. I am certain, however, that building a product takes more than a few clever software professionals; it takes a broad set of skills and abilities from people of different backgrounds. Sales, management, marketing, accounting, information technology and many other disciplines contribute to the delivery of a complete product. Building a product with these groups in mind makes it easier to deliver not just a complete product but a successful one.
Do each of these groups have the tools they need to provide a quality service to customers? Is there a well defined process for supporting the product and escalating issues to development? Do sales and marketing have access to usage data to know what customers are doing with the product? Do accounting and payroll have the information they need to send out bills and payments for services rendered? A product can really only be called complete if there is a set of processes, services and tools supporting it.
Take the time to really learn your company and industry. It’s easy enough to just be someone who spends all day writing code, but it’s just as easy to spend time learning what the business is really about and what problems it aims to solve in its industry. Spend the time to learn what other groups in the company are up to and how their efforts are contributing to the success of your product. The technical mind is a great platform for spotting opportunities that can result in better products and ultimately a better company. When you have the big picture, there really isn’t anything holding you back from delivering a complete product.
Haskell ecosystem on Windows XP
It’s been fun watching the Haskell ecosystem evolve into a mature system over the years. Seems that about every three months it takes a leap forward and I usually find myself uninstalling what I previously had and putting the latest and greatest on my laptop. To save myself some time in the future, I’ve compiled this post as a reference of basic “stuff” that is good to have for doing Haskell development on a Windows XP machine.
Haskell Platform
It used to be that you had to go find a number of different applications to get packages, compile source code and generate documents (among a handful of other things). Then a group of folks identified an opportunity and put together a platform to give new users a place to start. The result is the Haskell Platform.
- Glasgow Haskell Compiler (GHC) – The Haskell compiler that also comes with a command line interpreter (GHCi). Alternatives are the York Haskell Compiler (yhc) and Hugs
- Cabal – Basic package management system
- Haddock – Used for generating documentation based on comments in source code
After installing, you’ll want to go to the command line and run the following commands to make sure that you’ve got the latest version of Cabal and to make sure that it has the latest package list:
C:\> cabal install cabal-install
C:\> cabal update
Leksah
Many developers are probably used to having a quality Integrated Development Environment (IDE) to work with and the Haskell Community’s answer is Leksah. Leksah is still fairly green and has a ways to go before being comparable to Eclipse or Visual Studio, but nonetheless, Leksah is packed with plenty of goodies that will make for easy development of packages and modules for distribution on Cabal.
It is best to use the installer from the Leksah website. Once you’ve installed the latest, you’ll need to run the following from the command-line:
C:\> ghc-pkg recache
This ensures that the packages on the system (those provided by GHC) show up when you have the browse pane open.
Gtk2hs
If you plan on doing any Graphical User Interfaces (GUIs), then you’ll want to get the Haskell binding to the GTK+ library. On the page there should be an “All-in-one bundle”- for the purpose of the following, I went with version 2.20.
After extracting the bundle on the machine, make sure that the path you extracted the bundle to, along with its child bin directory, is added to the PATH environment variable.
Then from the command-line run the following and you should be able to include GTK in your project:
C:\> cabal install gtk
HOpenGL
I’ve been working on some basic game programming and I’ve done some stuff in the past with OpenGL so I decided to give the Haskell bindings a try. Windows doesn’t natively ship with the OpenGL library, so you’ll need to get it from here.
Then get the following packages off of Cabal:
C:\> cabal install glut
C:\> cabal install opengl
Wrap-up
I haven’t done a dry run to test all of the above, so if you follow these steps and come across a problem, post the solution in the comments. I’ll continue to update this post as I identify any problems or come across additional “stuff” that falls into the must-have category.
Getting a Development Network Started
I’ve been in the process of revamping my home network and have been spending time thinking about how I’d like to set up my development environment for my personal projects that I might try and sell one day. Most of the work I do is C#, ASP.NET, PHP, Java, Haskell,… the list goes on, so I’ve been thinking about what kind of solution will allow me to build against different OSes and platforms. The following is a rundown of this thought process and the considerations and decisions made in bringing up my network.
Anytime I do any planning there are a handful of main points that I try to keep focused on:
- Cost – How much money am I interested in putting into a project?
- Time – Total time investment to bring up the hardware and software.
- Quality – Am I after a quick solution or one that will have long lasting use?
- Portability – How easy would it be to move the system to another platform?
- Extensibility – How easy is it to add on new entities?
I know I want to keep the project under 1000 USD- this cost includes hardware, operating system licenses, software licenses, utility costs over the lifetime of the solution, opportunity costs, etc. Time wise, I want something that takes about an hour a day to get set up, working and tweaked to perfection over the course of a week. I want something that is going to be flexible enough to be useful five years down the road but is also capable of doing what I want today- a solution that doesn’t look like it was thrown together with duct tape, but also doesn’t look like I spent years planning it out. It is important for me to be able to port my solution to new hardware quickly and effortlessly, as well as add on new elements as needed. This is especially important if my environment crashes as a result of hardware failure or software malice.
There are a variety of ecosystems that I’m used to working with; the following table summarizes the residents of each:
Ecosystem | Type of Work | Runtimes | IDE | Databases | Web Server
---|---|---|---|---|---
.net | Websites, Web Services, Clients, Services | .net Framework 4.0, Mono 2.6 | Visual Studio 2008 Express | SQL Server 2008 Express | Internet Information Services 7.0
Java | Clients | JVE 1.6 | Eclipse Galileo | MySQL Community Server 5.1 | Apache 2.2
Haskell | Clients | N/A | vim, yi, Leksah | MySQL Community Server 5.1 | Apache 2.2
These ecosystems also have corresponding environments for dealing with source control, build automation, bug tracking and project tracking. Given the ecosystems I’m interested in, I’ve decided on Subversion, Hudson, Mantis and twiki to manage all of my projects’ artifacts.
Having reviewed what a lot of other shops have done, there are a couple common elements that most development networks incorporate:
- repo – Source repository and OS specific database(s).
- dev – IDEs and frameworks for the development of OS specific applications. (per developer machine, often dual booting)
- web – OS specific web servers hosting platform specific websites.
- build – Dedicated build machines for producing assemblies for specific platforms and OSes.
Given these elements, the languages, platforms and operating systems that I’m interested in, I’ve settled on an ideal network that looks like the following:
It is important to notice the use of virtualization here. Being able to store a series of ISOs for each of the element groups on a NAS makes it easy to bring up new instances and make backups, satisfying my time and portability criteria. It also satisfies my cost criterion, since it is cheaper to purchase one beefy box running several virtual machines than several physical machines. At the time of writing (2009-12), most quad core machines run for about 1000 USD, and most (consumer) NASes cost about 100-300 USD. I could also load the server up with 1TB of storage for an additional 100 USD. Of course, this means I have a single point of failure- which in a home environment may not be a huge deal. In terms of money, these two are the main consumers at the hardware level. Everything else already exists on the network.
Let’s take a look at the estimated costs:
Item | Quantity | Amount (USD) | Extended Amount (USD)
---|---|---|---
Dell Studio XPS 8000 | 1 | 1100.00 | 1100.00
1TB Western Digital MyBook | 1 | 120.00 | 120.00
Windows XP Professional Licenses | 4 | 100.00 | 400.00
Total | | | 1620.00
That total amount is a little more than initially desired. It is possible to collapse vm-windows-dev and time-thief down to one machine, and then collapse vm-windows-repo, vm-windows-web and vm-windows-build down to a single machine, resulting in a savings of 300.00 USD from reduced operating system license costs. If the NAS is removed from the picture, that brings us down to 1200.00 USD, which is about as close as I’m going to get to my initial target of 1000 USD. I’m not sure if this is the final setup that I will end up going with; per usual, I’ll update this post with any new developments.
Solutions to some Microsoft interview questions
A couple of weeks ago, someone on reddit posted a link to a collection of Microsoft Interview Questions. As someone who interviewed with them while in college, I was curious to see what kind of questions they were asking folks. After reviewing the list, I thought I’d work out a few that looked interesting:
Imagine an analog clock set to 12 o’clock. Note that the hour and minute hands overlap. How many times each day do both the hour and minute hands overlap? How would you determine the exact times of the day that this occurs?
The minute hand makes 24 revolutions a day while the hour hand makes 2, so the minute hand overtakes the hour hand 24 - 2 = 22 times a day, with the overlaps evenly spaced 12/11 hours apart. Thus*: 12:00:00 AM, 01:05:27 AM, 02:10:54 AM, 03:16:21 AM, 04:21:49 AM, 05:27:16 AM, 06:32:43 AM, 07:38:10 AM, 08:43:38 AM, 09:49:05 AM, 10:54:32 AM, 12:00:00 PM, 01:05:27 PM, 02:10:54 PM, 03:16:21 PM, 04:21:49 PM, 05:27:16 PM, 06:32:43 PM, 07:38:10 PM, 08:43:38 PM, 09:49:05 PM and 10:54:32 PM.
* Floor the conversion from fractional hours to the hour, minute and second of day.
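To make the arithmetic concrete, here is a small Java sketch (class name mine) that generates the 22 overlap times by stepping in increments of 12/11 hours, i.e. 43200/11 seconds, and flooring to whole seconds. It prints 24-hour times, so the PM entries appear as 13:05:27 and so on:

```java
public class ClockOverlaps {
    public static void main(String[] args) {
        // The minute hand laps the hour hand 11 times every 12 hours,
        // so consecutive overlaps are 12/11 hours = 43200/11 seconds apart.
        for (int k = 0; k < 22; k++) {
            int t = (int) (k * 43200.0 / 11.0); // floor to whole seconds
            int h = t / 3600, m = (t % 3600) / 60, s = t % 60;
            System.out.printf("%02d:%02d:%02d%n", h, m, s);
        }
    }
}
```

The second entry comes out as 01:05:27, matching the list above.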
Pairs of primes separated by a single number are called prime pairs. Examples are 17 and 19. Prove that the number between a prime pair is always divisible by 6 (assuming both numbers in the pair are greater than 6). Now prove that there are no ‘prime triples.’
Assuming both primes in the pair are greater than 6:

Twin Primes: Given a pair p and p + 2, both are odd, so the number between them, p + 1, is even. Of any three consecutive integers, exactly one is divisible by 3; it cannot be p or p + 2, since both are prime and greater than 3, so it must be p + 1. Being divisible by both 2 and 3, p + 1 is divisible by 6.

Prime Triples: Given candidates p, p + 2 and p + 4, their residues modulo 3 cover all three residue classes (they are congruent to p, p + 2 and p + 1 modulo 3), so one of the three is divisible by 3 and therefore composite whenever p > 3. The only prime triple is 3, 5, 7.
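The divisibility claim is easy to spot-check by brute force. This Java sketch (names mine) uses a naive trial-division primality test and sweeps all twin prime pairs below a bound:

```java
public class TwinPrimeCheck {
    // Naive trial-division primality test; fine for small bounds.
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; d * d <= n; d++)
            if (n % d == 0) return false;
        return true;
    }

    public static void main(String[] args) {
        // Every twin prime pair (p, p + 2) with p > 6 should have
        // its middle number p + 1 divisible by 6.
        for (int p = 7; p < 100000; p++)
            if (isPrime(p) && isPrime(p + 2) && (p + 1) % 6 != 0)
                throw new AssertionError("counterexample at p = " + p);
        System.out.println("no counterexamples below 100000");
    }
}
```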
There are 4 women who want to cross a bridge. They all begin on the same side. You have 17 minutes to get all of them across to the other side. It is night. There is one flashlight. A maximum of two people can cross at one time. Any party who crosses, either 1 or 2 people, must have the flashlight with them. The flashlight must be walked back and forth, it cannot be thrown, etc. Each woman walks at a different speed. A pair must walk together at the rate of the slower woman’s pace.
For example if Woman 1 and Woman 4 walk across first, 10 minutes have elapsed when they get to the other side of the bridge. If Woman 4 then returns with the flashlight, a total of 20 minutes have passed and you have failed the mission. What is the order required to get all women across in 17 minutes? Now, what’s the other way?
One way:
- Woman 1 and 2 Cross, woman 1 returns. 3 minutes total.
- Woman 3 and 4 Cross, woman 2 returns. 15 minutes total.
- Woman 1 and 2 Cross. 17 Minutes total.
Other way:
- Woman 1 and 2 Cross, woman 2 returns. 4 minutes total.
- Woman 3 and 4 Cross, woman 1 returns. 11 minutes total.
- Woman 1 and 2 Cross. 17 minutes total.
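Both schedules can be totalled programmatically. The individual crossing times of 1, 2, 5 and 10 minutes are an assumption here- the classic form of the puzzle, consistent with the example's claim that Woman 1 and Woman 4 together take 10 minutes (Woman 3's exact time does not affect the totals as long as it is at most 10):

```java
public class BridgeCrossing {
    public static void main(String[] args) {
        // Assumed crossing times in minutes, indexed 1..4.
        int[] t = {0, 1, 2, 5, 10};

        // One way: (1,2) cross, 1 returns, (3,4) cross, 2 returns, (1,2) cross.
        int oneWay = Math.max(t[1], t[2]) + t[1]
                   + Math.max(t[3], t[4]) + t[2]
                   + Math.max(t[1], t[2]);

        // Other way: same pairs, but 2 returns first and 1 returns second.
        int otherWay = Math.max(t[1], t[2]) + t[2]
                     + Math.max(t[3], t[4]) + t[1]
                     + Math.max(t[1], t[2]);

        System.out.println(oneWay + " and " + otherWay + " minutes");
    }
}
```

The key design insight in both schedules is that the two slowest women cross together, so the 10-minute walk is paid only once.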
If you had an infinite supply of water and a 5 quart and 3 quart pail, how would you measure exactly 4 quarts?
- Fill 5 quart pail with 5 quarts of water from source.
- Fill 3 quart pail with 3 quarts of water from 5 quart pail.
- Empty 3 quart pail.
- Empty 5 quart pail containing 2 quarts of water into 3 quart pail.
- Fill 5 quart pail with 5 quarts of water from source.
- Fill 3 quart pail with water from 5 quart pail.
- 5 quart pail now contains 4 quarts of water.
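The pour sequence can be traced in a few lines of Java; each statement below mirrors one step in the list above (class and variable names are mine):

```java
public class PailPuzzle {
    public static void main(String[] args) {
        int five = 0, three = 0;

        five = 5;                              // fill the 5 quart pail
        int pour = Math.min(five, 3 - three);  // pour into the 3 quart pail
        five -= pour; three += pour;           // five = 2, three = 3
        three = 0;                             // empty the 3 quart pail
        three = five; five = 0;                // move the remaining 2 quarts over
        five = 5;                              // refill the 5 quart pail
        pour = Math.min(five, 3 - three);      // top off the 3 quart pail (1 quart)
        five -= pour; three += pour;           // five = 4

        System.out.println(five); // prints 4
    }
}
```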
Suppose you have an array of 1001 integers. The integers are in random order, but you know each of the integers is between 1 and 1000 (inclusive). In addition, each number appears only once in the array, except for one number, which occurs twice. Assume that you can access each element of the array only once. Describe an algorithm to find the repeated number. If you used auxiliary storage in your algorithm, can you find an algorithm that does not require it?
public int duplicateNumber(int[] A) {
    int count = 0;
    for (int k = 0; k < A.Length; k++)
        count += A[k];
    // Sum of 1..1000 is A.Length * (A.Length - 1) / 2 = 500500 for 1001 elements;
    // the surplus is the duplicated number.
    return count - (A.Length * (A.Length - 1) >> 1);
}
Count the number of set bits in a number. Now optimize for speed. Now optimize for size.
public int bitsUsed(int n) {
    int count = 0;
    while (n != 0) {
        count += n & 1;
        n >>= 1;
    }
    return count;
}
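The loop above examines every bit position. For the “optimize for speed” follow-up, one common trick (often attributed to Kernighan) iterates only once per set bit by clearing the lowest set bit each pass; it also terminates for negative inputs, which the arithmetic-shift version does not. A Java sketch:

```java
public class PopCount {
    // n & (n - 1) clears the lowest set bit, so the loop runs once
    // per set bit instead of once per bit position.
    public static int bitsUsed(int n) {
        int count = 0;
        while (n != 0) {
            n &= n - 1; // drop the lowest set bit
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(bitsUsed(0b1011)); // prints 3
        System.out.println(bitsUsed(-1));     // prints 32
    }
}
```

In Java the standard library also provides Integer.bitCount, which uses a branch-free bit-twiddling computation.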
Automated inheritance refactoring
Over time, we all gain experience and insight in deciding how to generalize the principal actors of our problem domains into different layers of abstraction to minimize the amount of code we write. After a while, this practice becomes tedious, and one begins to wonder whether it is feasible to have an integrated refactoring (in our IDE of choice) that will review our class inheritance hierarchy and suggest an idealized one. After all, an optimal solution would be able to identify relationships that our minds would otherwise not be able to fathom- increasingly the case as our code bases and development costs grow.
What we’re really interested in is an automated version of Extract Superclass that takes in a collection of classes and returns some number of superclasses. Thing is, we are back to our original problem where we have a collection of classes that we want to generalize. If we continue this process as a feedback loop we end up eventually with an irreducible collection of classes. The original collection along with the rest of the generated collections constitute the idealized class inheritance hierarchy.
During the mid-nineties this topic was the subject of a handful of master’s and doctoral programs, but since then there seems to be a significant gap in the flow of academic material on the subject. Furthermore, there is little recent evidence from industry of work being done on a solution. Is such a solution simply not needed? Are the outputs of such solutions impractical? What is the blocking element to adoption? The only evidence of this idea getting any momentum is Guru- a solution targeted at the Self programming language. Where are the solutions for C++, Java, C# et al.?
Problem with an interesting little problem
A Simple Problem
Given an array of integers A, return an array B of the same length where each B[i] is the product of every element of A except A[i].
Rules:
- Solutions must be O(n) in time complexity
- Solutions must not use the division operator
fsharp.it posted the above simple Google interview question on their site a few weeks ago and subsequently the problem was referenced by OJ’s rants and later posted to the programming subreddit on reddit.com. Like everyone else, I sat down and wrote some quick code to solve the problem. After looking at the solution, it seemed to me that this was a rather poorly conceived problem- if not contrived at heart. It doesn’t ring to the tune of scalability, performance and quality that one thinks of when the word Google is thrown into the mix.
In the following, we’ll explore why this problem isn’t a well designed interview question for a software engineering position. A good problem should test a candidate’s ability to deliver simple and optimal solutions under non-ideal conditions.
A Complex Solution
Rule (1) is simple enough to deal with; Rule (2) is a little more unexpected, but one can devise any number of reasons why it might be imposed:
- Most CPUs require 4x the number of cycles to perform a division vs a multiplication on a set of 32-bit registers
- The rare event that a CPU doesn’t offer a division instruction
- Google wants to see the candidate solve a simple problem with only a subset of tools
- And so on…
After some thinking, one eventually devises the following imperative solution:
public int[] exteriorProduct(int[] A) {
    int[] L = new int[A.Length], R = new int[A.Length];
    for (int n = 0, m = A.Length - 1; n < A.Length && m >= 0; n++, m--) {
        L[n] = n == 0 ? 1 : A[n - 1] * L[n - 1];
        R[m] = m == A.Length - 1 ? 1 : R[m + 1] * A[m + 1];
    }
    int[] B = new int[A.Length];
    for (int n = 0; n < A.Length; n++)
        B[n] = L[n] * R[n];
    return B;
}
Given the input A, we instruct the machine to memoize the partial product from 0 to i - 1 and store it in L[i]; likewise, from |A| - 1 down to i + 1 we store the partial product in R[i]. The desired result is the product of L[i] and R[i].
Some quick time complexity analysis reveals the following:

T_complex(N) ≈ 3·T_alloc(N) + 2N·T_test + 3N·T_mult

where T_alloc is the memory allocation time, T_test is the time required to perform a boolean test and T_mult is the time required to perform a multiplication. Lookup is assumed to be constant- however this is unrealistic as N increases (more on this later). Nonetheless, we’ve satisfied Rules (1 & 2).
The solution is elegant, but it is not obvious to the passerby what it is doing. One must dissect the solution to fully grok the simplicity of the problem. This is not a desirable software trait.
A Simple Solution
For the sake of argument, let’s take a look at the straight forward solution:
public int[] exteriorProduct(int[] A) {
    int[] B = new int[A.Length];
    int p = 1;
    for (int n = 0; n < A.Length; n++)
        p *= A[n];
    for (int n = 0; n < A.Length; n++)
        B[n] = p / A[n];
    return B;
}
Given the input A, we instruct the machine to compute the product over every element and store the result into p. To get the desired answer, the product is divided by A[i]. Simple, clean and easy to understand. But alas, we used that devious division operator…
Again some quick time complexity analysis reveals the following:

T_simple(N) ≈ T_alloc(N) + N·T_mult + N·T_div

where T_div is the cost of division and T_mult and T_alloc are the cost of performing multiplication and allocation, respectively.
Performance Showdown
Before we take a look at some hard numbers, let’s see what the time complexity analysis predicts. The complex solution allocates three arrays and performs roughly three times as many multiplications and array accesses, while the simple solution makes two lean passes over a single auxiliary array. So in short, the simple solution should execute about twice as fast as the complex solution.
Algorithm Execution Time (ms) As a Function of N*

N | 10^1 | 10^2 | 10^3 | 10^4 | 10^5 | 10^6 | 10^7 | 10^8 | 10^9
---|---|---|---|---|---|---|---|---|---
Simple | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 15.625 | 125.0 | 1265.625 | OutOfMem
Complex | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 31.25 | 312.5 | 2250.0 | OutOfMem
* Tests conducted on Intel T2400 1.83GHz, 987 Mhz bus, 2.00 GB RAM
Just as the time complexity analysis indicated, the simple solution is approximately twice as fast as the complex solution. In addition, both solutions degrade as N increases and lookups no longer execute in constant time- an effect certainly more pronounced in the complex solution.
So what…
Granted, Google is in all likelihood aiming to identify candidates who can easily produce solutions that require a bit of lateral thinking- however, this isn’t a good question to ask for a software engineering position, for a number of reasons:
- There is no performance gain to be had from excluding the division operator
- Unnecessary complexity is introduced into a code base which will ultimately need to be refactored out, thus wasting time
- And many more…
Interview questions need to focus on problems that require ingenuity, not ones that require candidates to go against common sense software engineering practices. This problem, and many like it, ignore the software engineering aspect of a job which is just as important as being clever at devising algorithms.