Archive for the ‘Computer Science’ Category
Minesweeper Agent
Introduction

Lately I’ve been brushing up on probability, statistics and machine learning and thought I’d play around with writing a Minesweeper agent based solely on these fields. The following is an overview of the game’s mechanics, verification of an implementation, some different approaches to writing the agent and some thoughts on the efficacy of each approach.
Minesweeper
Background
Minesweeper was created by Curt Johnson in the late eighties and later ported to Windows by Robert Donner while at Microsoft. With the release of Windows 3.1 in 1992, the game became a staple of the operating system and has since found its way onto multiple platforms and spawned several variants. The game has been shown to be NP-Complete, but in practice, algorithms can be developed to solve a board in a reasonable amount of time for the most common board sizes.
Specification
Gameplay |
|
|
An agent, |
|
Initialization |
|
|
The board consists of hidden and visible states. To represent the hidden, Characters ’0′-’8′ represent the number of neighboring mines, character ‘U’ to represent an unexposed grid location and character ‘*’ for a mine. Neighbors of a grid location |
|
Exposing Cells |
|
|
The expose behavior can be thought of as a flood fill on the grid, exposing any empty region bordered by grid locations containing mine counts and the boundaries of the grid.
A matrix, If a cell location can be exposed, then each of its neighbors will be added to the stack to be inspected. Those neighbors that have already been inspected will be skipped. Once all the reachable grid locations have been inspected, the process terminates. |
|
Verification
Methodology
Statistical tests are used to verify the random aspects of the game’s implementation. I will skip the verification of the game’s logic as it requires use of a number of different methods that are better suited for their own post.
There are two random aspects worth thinking about: the distribution of mines and the distribution of success (i.e., not clicking a mine) for random trials. In both scenarios it made since to conduct Pearson’s chi-squared test. Under this approach there are two hypotheses:
: The distribution of experimental data follows the theoretical distribution
: The distribution experimental data does not follow the theoretical distribution
is accepted when the test statistic,
, is less than the critical value,
. The critical value is determined by deciding on a p-value (e.g., 0.05, 0.01, 0.001),
, that results in the tail area beneath the chi-squared distribution
equal to
.
is the degrees of freedom in the observation.
Mine distribution
The first aspect to verify was that mines were being uniformly placed on the board. For a standard board with
mines, the expectation is that each grid location should be assigned
times for
trials.
for this experiment.

In the above experiment, and
. Since
, this affirms
and that the implemented distribution of mines is indeed uniform with a statistical significance of
.
Distribution of successful clicks
The second aspect to verify is that the number of random clicks before exposing a mine follows a hypergeometric distribution. The hypergeometric distribution is appropriate since we are sampling (exposing) without replacement (the grid location remains exposed after clicking). This hypothesis relies on a non-flood-fill exposure.
The distribution has four parameters. The first is the number of samples drawn (number of exposures), the second the number of successes in the sample (number of empty exposures), the third the number of successes in the population (empty grid locations) and the last the size of the population (grid locations): .
The expected frequencies for the hypergeometric distribution is given by for
trials.
in this case.

In the above experiment and
. Since
, this affirms
and that the number of locations exposed prior to exposing a mine follows a hypergeometric distribution with a statistical significance of
.
Also included in the plot is the observed distribution for a flood based exposure. As one might expect, the observed frequency of more exposures decreases more rapidly than that of the non-flood based exposure.
Agents
Methodology
Much like how a human player would learn to play the game, I decided that each model would have knowledge of game’s mechanics and no prior experience with the game. An alternative class of agents would have prior experience with the game as the case would be in a human player who had studied other player’s strategies.
To evaluate the effectiveness of the models, each played against a series of randomly generated grids and their respective success rates were captured. Each game was played on a standard beginner’s grid containing between
mines.
For those models that refer to a probability measure, , it is assumed that the measure is determined empirically and treated as an estimate of the probability of an event and not as an a priori measure.
Marginal Model
DevelopmentThe first model to consider is the Marginal Model. It is designed to simulate the behavior of a naive player who believes that if he observes a mine at a grid location that the location should be avoid in future trials. The model treats the visible board,
|
|
Test Results
Since the mine distribution is uniform, the model should be equivalent to selecting locations at random. The expected result is that avoiding previously occupied grid locations is an ineffective strategy as the number of mines increases. This does however, provide an indication of what the success rate should look like for chance alone.

Conditional Model
DevelopmentOne improvement over the Marginal Model is to take into account the visual clues made visible when an empty grid location is exposed. Since an empty grid location represents the number of neighboring mines, the Conditional Model can look at these clues to determine whether or not an unexposed grid location contains a mine. This boils down to determining the probability of As in the case of the Marginal Model, the Conditional Model returns the grid location that it has determined has the greatest probability of being empty given its neighbors:
|
|
Test Results
The Naïve Bayes Classifier is regarded as being an effective approach to classifying situations for a number of different tasks. In this case, it doesn’t look like it is effective at classifying mines from non-mines. The results are only slightly better than the Marginal Model.

Graphical Model
DevelopmentOne shortfall of the Conditional Model is that it takes a greedy approach in determining which action to take. A more sophisticated approach is to not just consider the next action, but the possible sequence of actions that will minimize the possibility of exposing a mine. Each of the possible observable grids, It is possible to pick a path, Solving for
|
|
From the optimal walk, a sequence of optimal actions is determined by mapping over the path. Taking the first action gives the optimal grid location to expose given the current visible state of the board.
This description constitutes a Markov Decision Process. As is the case for most stochastic processes, it is assumed that the process holds the Markov Property; that future states only depend upon the current states and none of the prior states. In addition to being a Markov Decision Process, this is also an example of Reinforcement Learning.
First thing to observe is that the game state space is astronomical. For a standard beginner’s grid there is at most a sesvigintillion possible grids that a player can encounter. Which as an aside, is on the order of the number of atoms in the observable universe! The set of actions at each state is slightly more manageable with at most eighty-one actions.
To simplify the state space, I chose to only consider boards and when evaluating a full grid, consider the possible sub-grids and evaluate the optimal sequence of actions for each sub-grid and pick the maximum reward associated for each sub-grid that was evaluated as the action to take on the full grid.
Test Results
The Graphical Model produces results that are only a margin better than those of the Conditional Model.

Semi-deterministic Model
DevelopmentThe last model I’m going to talk about is a semi-deterministic model. It works by using the visible grid to infer the topology of the hidden grid and from the hidden grid, the topology that the visible grid can become. The grid can be viewed as a graph. Each grid location is a vertex and an edge is an unexposed grid location’s influence on another grid location’s neighbor mine count. For each of the exposed grid locations on the board, The model produces its inferred version, For each of the grid locations that are exposed and the inferred influence matches the visible count, then each of the neighbors about that location can be exposed provided they are not already exposed and not an inferred mine. From this set of possibilities, a mine location is chosen. When no mine locations can be determined, then an alternative model can be used. |
|
Test Results
Since the model is a more direct attempt at solving the board, its results are superior to the previously presented models. As the number of mines increases, it is more likely that it has to rely on a more probabilistic approach.

Summary
Each of the models evaluated offered incremental improvements over their predecessors. Randomly selecting locations to expose is on par with choosing a location based on previously observed mine locations. The Conditional Model and Graphical Model yield similar results since they both make decisions based on conditioned probabilities. The Semi-deterministic Model stands alone as the only one model that produced reliable results.

The success rate point improvement between the Condition and Marginal models is most notable for boards consisting of three mines and the improvement between Graphical and Semi-deterministic models for seven mines. Improvements between Random and Marginal models is negligible and between Conditional and Graphical is minor for all mine counts fewer than seven.

Given the mathematical complexity and nondeterministic nature of the machine learning approaches, (in addition the the complexity and time involved in implementing those approaches) they don’t seem justified when more deterministic and simpler approaches exist. In particular, it seems like most people have implemented their agents using heuristics and algorithms designed to solve constraint satisfaction problems. Nonetheless, this was a good refresher to some of the elementary aspects of probability, statistics and machine learning.
References
“Classification – Naïve Bayes.” Data Mining Algorithms in R. Wikibooks. 3 Nov. 2010. Web. 30 Oct. 2011.
“Windows Minesweeper.” MinesweeperWiki. 8 Sept. 2011. Web. 30 Oct. 2011.
Kaye, Richard. “Minesweeper Is NP-complete.” [pdf] Mathematical Intelligencer 22.2 (2000): 9-15. Web. 30 Oct. 2011.
Nakov, Preslav, and Zile Wei. “MINESWEEPER, #MINESWEEPER.” 14 May 2003. Web. 14 Apr. 2012.
Richard, Sutton, and Andrew G. Barto. “3.6 Markov Decision Processes.” Reinforcement Learning: An Introduction. Cambridge, Massachusetts: Bradford Book, 1998. 4 Jan. 2005. Web. 30 Oct. 2011.
Rish, Irene “An Empirical Study of the Naive Bayes Classifer.” [pdf] IJCAI-01 Workshop on Empirical Methods in AI (2001). Web. 30 Oct. 2011.
Russell, Stuart J., and Peter Norvig. Artificial Intelligence: A Modern Approach. Upper Saddle River, NJ: Prentice Hall/PearsonEducation., 2003. Print.
Sun, Yijun, and Jian Li. “Adaptive Learning Approach to Landmine Detection.” [pdf] IEEE Transactions of Aerospace and Electronic Systems 41.3 (2005): 1-9. 10 Jan. 2006. Web. 30 Oct. 2011.
Taylor, John R. An introduction to error analysis: the study of uncertainties in physical measurements. Sausalito, CA: University Science Books, 1997. Print.
Abstract Algebra in C#
Motivation
In C++ it is easy to define arbitrary template methods for computations involving primitive numeric types because all types inherently have arithmetic operations defined. Thus, a programmer need only implement one method for all numeric types. The compiler will infer the use and substitute the type at compile time and emit the appropriate machine instructions. This is C++’s approach to parametric polymorphism.
With the release of C# 2.0 in the fall of 2005, the language finally got a taste of parametric polymorphism in the form of generics. Unfortunately, types in C# do not inherently have arithmetic operations defined, so methods involving computations must use ad-hoc polymorphism to achieve the same result as in C++. The consequence is a greater bloat in code and an increased maintenance liability.
To get around this design limitation, I’ve decided to leverage C#’s approach to subtype polymorphism and to draw from Abstract Algebra to implement a collection of interfaces allowing for C++ like template functionality in C#. The following is an overview of the mathematical theory used to support and guide the design of my solution. In addition, I will present example problems from mathematics and computer science that can be represented in this solution along with examples how type agnostic computations that can be performed using this solution.
Abstract Algebra
Abstract Algebra is focused on how different algebraic structures behave in the presence of different axioms, operations and sets. In the following three sections, I will go over the fundamental sub-fields and how they are represented under the solution.
In all three sections, I will represent the distinction between algebraic structures using C# interfaces. The type parameters on these interfaces represent the sets being acted upon by each algebraic structure. This convention is consistent with intuitionistic (i.e., Chruch-style) type theory embraced by C#’s Common Type System (CTS). Use of parameter constraints will be used when type parameters are intended to be of a specific type. Functions on the set and elements of the set will be represented by methods and properties respectively.
Group Theory
Group Theory is the simplest of sub-fields of Abstract Algebra dealing with the study of a single binary operation, , acting on a set
. There are five axioms used to describe the structures studied under Group Theory:
- Closure:
- Associativity:
- Commutativity :
- Identity:
- Inverse:
The simplest of these structures is the Groupoid satisfying only axiom (1). Any Groupoid also satisfying axiom (2) is known as a Semi-group. Any Semi-group satisfying axiom (4) is a Monoid. Monoid’s also satisfying axiom (5) are known as Groups. Any Group satisfying axiom (3) is an Abelian Group.
public interface IGroupoid<T> {
T Operation(T a, T b);
}
public interface ISemigroup<T> : IGroupoid<T> {
}
public interface IMonoid<T> : ISemigroup<T> {
T Identity { get; }
}
public interface IGroup<T> : IMonoid<T> {
T Inverse(T t);
}
public interface IAbelianGroup<T> : IGroup<T> {
}
Ring Theory
The next logical sub-field of Abstract Algebra to study is Ring Theory which is the study of two operations, and
, on a single set. In addition to the axioms outlined above, there is an addition axiom for describing how one operations distributes over the other.
- Distributivity:
All of the following ring structures satisfy axiom (6). Rings are distinguished by the properties of their operands. The simplest of these structures is the Ringoid where both operands are given by Groupoids. Any Ringoid whose operands are Semi-groups is a Semi-ring. Any Semi-ring whose first operand is a Group is a Ring. Any Ring whose second operand is a Monoid is a Ring with Unity. Any Ring with Unity whose second operand is a Group is Division Ring. Any Division Ring whose operands are both Abelian Groups is a Field.
public interface IRingoid<T, A, M>
where A : IGroupoid<T>
where M : IGroupoid<T> {
A Addition { get; }
M Multiplication { get; }
T Distribute(T a, T b);
}
public interface ISemiring<T, A, M> : IRingoid<T, A, M>
where A : ISemigroup<T>
where M : ISemigroup<T> {
}
public interface IRing<T, A, M> : ISemiring<T, A, M>
where A : IGroup<T>
where M : ISemigroup<T> {
}
public interface IRingWithUnity<T, A, M> : IRing<T, A, M>
where A : IGroup<T>
where M : IMonoid<T> {
}
public interface IDivisionRing<T, A, M> : IRingWithUnity<T, A, M>
where A : IGroup<T>
where M : IGroup<T> {
}
public interface IField<T, A, M> : IDivisionRing<T, A, M>
where A : IAbelianGroup<T>
where M : IAbelianGroup<T> {
}
Module Theory
The last, and more familiar, sub-field of Abstract Algebra is Module Theory which deals with structures with an operation, , over two separate sets:
and
that satisfy the following axioms.
- Distributivity of
:
- Distributivity of
:
- Associativity of
:
All of the following module structures satisfy axioms (7)-(9). A Module consists of a scalar Ring and an vector Abelian Group. Any Module whose Ring is a Ring with Unity is a Unitary Module. Any Unitary Module whose Ring with Unity is a Abelian Group is a Vector Space.
public interface IModule<
TScalar,
TVector,
TScalarRing,
TScalarAddativeGroup,
TScalarMultiplicativeSemigroup,
TVectorAddativeAbelianGroup
>
where TScalarRing : IRing<TScalar, TScalarAddativeGroup, TScalarMultiplicativeSemigroup>
where TScalarAddativeGroup : IGroup<TScalar>
where TScalarMultiplicativeSemigroup : ISemigroup<TScalar>
where TVectorAddativeAbelianGroup : IAbelianGroup<TVector>
{
TScalarRing Scalar { get; }
TVectorAddativeAbelianGroup Vector { get; }
TVector Distribute(TScalar t, TVector r);
}
public interface IUnitaryModule<
TScalar,
TVector,
TScalarRingWithUnity,
TScalarAddativeGroup,
TScalarMultiplicativeMonoid,
TVectorAddativeAbelianGroup
>
: IModule<
TScalar,
TVector,
TScalarRingWithUnity,
TScalarAddativeGroup,
TScalarMultiplicativeMonoid,
TVectorAddativeAbelianGroup
>
where TScalarRingWithUnity : IRingWithUnity<TScalar, TScalarAddativeGroup, TScalarMultiplicativeMonoid>
where TScalarAddativeGroup : IGroup<TScalar>
where TScalarMultiplicativeMonoid : IMonoid<TScalar>
where TVectorAddativeAbelianGroup : IAbelianGroup<TVector>
{
}
public interface IVectorSpace<
TScalar,
TVector,
TScalarField,
TScalarAddativeAbelianGroup,
TScalarMultiplicativeAbelianGroup,
TVectorAddativeAbelianGroup
>
: IUnitaryModule<
TScalar,
TVector,
TScalarField,
TScalarAddativeAbelianGroup,
TScalarMultiplicativeAbelianGroup,
TVectorAddativeAbelianGroup
>
where TScalarField : IField<TScalar, TScalarAddativeAbelianGroup, TScalarMultiplicativeAbelianGroup>
where TScalarAddativeAbelianGroup : IAbelianGroup<TScalar>
where TScalarMultiplicativeAbelianGroup : IAbelianGroup<TScalar>
where TVectorAddativeAbelianGroup : IAbelianGroup<TVector>
{
}
Representation of Value Types
The CTS allows for both value and reference types on the .NET Common Language Infrastructure (CLI). The following are examples of how each theory presented above can leverage value types found in the C# language to represent concepts drawn from mathematics.
Enum Value Types and the Dihedral Group 
One of the simplest finite groups is the Dihedral Group of order eight, , representing the different orientations of a square,
, obtained by reflecting the square about the vertical axis,
, and rotating the square by ninety degrees,
. The generating set is given by
and gives rise to the set
These elements are assigned names as follows:
,
,
,
,
,
,
and
. The relationship between these elements is visualized below:

The easiest way to represent this group as a value type is with an enum.
enum Symmetry { Rot000, Rot090, Rot180, Rot270, RefVer, RefDes, RefHoz, RefAsc }
From this enum we can define the basic Group Theory algebraic structures to take us to .
public class SymmetryGroupoid : IGroupoid<Symmetry> {
public Symmetry Operation(Symmetry a, Symmetry b) {
// 64 cases
}
}
public class SymmetrySemigroup : SymmetryGroupoid, ISemigroup<Symmetry> {
}
public class SymmetryMonoid : SymmetrySemigroup, IMonoid<Symmetry> {
public Symmetry Identity {
get { return Symmetry.Rot000; }
}
}
public class SymmetryGroup : SymmetryMonoid, IGroup<Symmetry> {
public Symmetry Inverse(Symmetry a) {
switch (a) {
case Symmetry.Rot000:
return Symmetry.Rot000;
case Symmetry.Rot090:
return Symmetry.Rot270;
case Symmetry.Rot180:
return Symmetry.Rot270;
case Symmetry.Rot270:
return Symmetry.Rot090;
case Symmetry.RefVer:
return Symmetry.RefVer;
case Symmetry.RefDes:
return Symmetry.RefAsc;
case Symmetry.RefHoz:
return Symmetry.RefHoz;
case Symmetry.RefAsc:
return Symmetry.RefDes;
}
throw new NotImplementedException();
}
}
Integral Value Types and the Commutative Ring with Unity over 
C# exposes a number of fixed bit integral value types that allow a programmer to pick an integral value type suitable for the scenario at hand. Operations over these integral value types form a commutative ring with unity whose set is the congruence class where
is the number of bits used to represent the integer and
is the equivalance class
with
.
Addition is given by just as multiplication is given by
. Both statements are equivalent to the following congruence statements:
and
respectively.
Under the binary numeral system, modulo is equivalent to ignoring the bits exceeding
, or equivalently,
where
. As a result only the first (right most) bits need to be considered when computing the sum or product of two congruence classes, or in this case, integer values in C#. Thus, in the following implementation, it is not necessary to write any extra code to represent these operations other than writing them in their native form.
The reason why we are limited to a commutative ring with unity instead of a full field is that multiplicative inverses do not exist for all elements. A multiplicative inverse only exists when where
is the multiplicative inverse of
. For a solution to exist,
. Immediately, any even value of
will not have a multiplicative inverse in
. However, all odd numbers will.
public class AddativeIntegerGroupoid : IGroupoid<long> {
public long Operation(long a, long b) {
return a + b;
}
}
public class AddativeIntegerSemigroup : AddativeIntegerGroupoid, ISemigroup<long> {
}
public class AddativeIntegerMonoid : AddativeIntegerSemigroup, IMonoid<long> {
public long Identity {
get { return 0L; }
}
}
public class AddativeIntegerGroup : AddativeIntegerMonoid, IGroup<long> {
public long Inverse(long a) {
return -a;
}
}
public class AddativeIntegerAbelianGroup : AddativeIntegerGroup, IAbelianGroup<long> {
}
public class MultiplicativeIntegerGroupoid : IGroupoid<long> {
public long Operation(long a, long b) {
return a * b;
}
}
public class MultiplicativeIntegerSemigroup : MultiplicativeIntegerGroupoid, ISemigroup<long> {
}
public class MultiplicativeIntegerMonoid : MultiplicativeIntegerSemigroup, IMonoid<long> {
public long Identity {
get { return 1L; }
}
}
public class IntegerRingoid : IRingoid<long, AddativeIntegerGroupoid, MultiplicativeIntegerGroupoid> {
public AddativeIntegerGroupoid Addition { get; private set;}
public MultiplicativeIntegerGroupoid Multiplication { get; private set;}
public IntegerRingoid() {
Addition = new AddativeIntegerGroupoid();
Multiplication = new MultiplicativeIntegerGroupoid();
}
public long Distribute(long a, long b) {
return Multiplication.Operation(a, b);
}
}
public class IntegerSemiring : IntegerRingoid, ISemiring<long, AddativeIntegerSemiring, MultiplicativeIntegerSemiring> {
public AddativeIntegerSemiring Addition { get; private set;}
public MultiplicativeIntegerSemiring Multiplication { get; private set;}
public IntegerSemiring() : base() {
Addition = new AddativeIntegerSemiring();
Multiplication = new MultiplicativeIntegerSemiring();
}
}
public class IntegerRing : IntegerSemiring, IRing<long, AddativeIntegerGroup, MultiplicativeIntegerSemigroup>{
public new AddativeIntegerGroup Addition { get; private set; }
public IntegerRing() : base() {
Addition = new AddativeIntegerGroup();
}
}
public class IntegerRingWithUnity : IntegerRing, IRingWithUnity<long, AddativeIntegerGroup, MultiplicativeIntegerMonoid> {
public MultiplicativeIntegerMonoid Multiplication { get; private set; }
public IntegerRingWithUnity() : base() {
Multiplication = new MultiplicativeIntegerMonoid();
}
}
Floating-point Value Types and the Real Vector Space 
C# offers three types that approximate the set of Reals: floats, doubles and decimals. Floats are the least representative followed by doubles and decimals. These types are obviously not continuous, but the error involved in rounding calculations with respect to the calculations in question are negligible and for most intensive purposes can be treated as continuous.
As in the previous discussion on the integers, additive and multiplicative classes are defined over the algebraic structures defined in the Group and Ring Theory sections presented above. In addition to these implementations, an additional class is defined to describe a vector.
public class Vector<T> {
private T[] vector;
public int Dimension {
get { return vector.Length; }
}
public T this[int n] {
get { return vector[n]; }
set { vector[n] = value; }
}
public Vector() {
vector = new T[2];
}
}
With these classes, it is now possible to implement the algebraic structures presented in the Module Theory section from above.
public class RealVectorModule : IModule<double, Vector<double>, RealRing, AddativeRealGroup, MultiplicativeRealSemigroup, VectorAbelianGroup<double>> {
public RealRing Scalar {
get;
private set;
}
public VectorAbelianGroup<double> Vector {
get;
private set;
}
public RealVectorModule() {
Scalar = new RealRing();
Vector = new VectorAbelianGroup<double>(new AddativeRealAbelianGroup());
}
public Vector<double> Distribute(double t, Vector<double> r) {
Vector<double> c = new Vector<double>();
for (int i = 0; i < c.Dimension; i++)
c[i] = Scalar.Multiplication.Operation(t, r[i]);
return c;
}
}
public class RealVectorUnitaryModule : RealVectorModule, IUnitaryModule<double, Vector<double>, RealRingWithUnity, AddativeRealGroup, MultiplicativeRealMonoid, VectorAbelianGroup<double>> {
public new RealRingWithUnity Scalar {
get;
private set;
}
public RealVectorUnitaryModule()
: base() {
Scalar = new RealRingWithUnity();
}
}
public class RealVectorVectorSpace : RealVectorUnitaryModule, IVectorSpace<double, Vector<double>, RealField, AddativeRealAbelianGroup, MultiplicativeRealAbelianGroup, VectorAbelianGroup<double>> {
public new RealField Scalar {
get;
private set;
}
public RealVectorVectorSpace()
: base() {
Scalar = new RealField();
}
}
Representation of Reference Types
The following are examples of how each theory presented above can leverage reference types found in the C# language to represent concepts drawn from computer science.
Strings, Computability and Monoids
Strings are the simplest of reference types in C#. From an algebraic structure point of view, the set of possible strings, , generated by an alphabet,
, and paired with a concatenation operation,
, forms a monoid.
public class StringGroupoid : IGroupoid<string> {
public string Operation(String a, String b) {
return string.Format("{0}{1}", a, b);
}
}
public class StringSemigroup : StringGroupoid, ISemigroup<string> {
}
public class StringMonoid : StringSemigroup, IMonoid<string> {
public string Identity {
get { return string.Empty; }
}
}
Monoids over strings have a volley of applications in the theory of computation. Syntactic Monoids describe the smallest set that recognizes a formal language. Trace Monoids describe concurrent programming by allowing different characters of an alphabet to represent different types of locks and synchronization points, while the remaining characters represent processes.
Classes, Interfaces, Type Theory and Semi-rings
Consider the set of types , that are primitive and constructed in C#. The generating set of
is the set of primitive reference and value types,
, consisting of the types discussed thus far. New types can be defined by defining classes and interfaces.
A simple operation over
takes two types,
, and yields a third type,
, known as a sum type. In type theory, this means that an instance of
can be either an instance of
or
. A second operation
over
takes two types and yields a third type representing a tuple of the first two types. In other words,
.
Both operations form a semi-group and
and in conjunction the two form a semi-ring.
To implement this semi-ring is a little involved. The .NET library supports emitting dynamic type definitions at runtime. For sum types, this would lead to an inheritance view of the operation. Types and
would end up deriving from
which means that any sequence of sums would yield an inheritance tree. A product type would result in composition of types with projection operations,
, to access and assign the
element of the composite. Both type operation implementations are outside the scope of this write-up and I’ll likely revisit this topic in a future write-up.
Delegates and Process Algebras
The third type of reference type to mention is the delegate type which is C#’s approach to creating first-class functions. The simplest of delegates is the built-in Action delegate which represents a single procedure taking no inputs and returning no value.
Given actions , we can define a possible execution operator,
, where either
or
is executed denoted as
. The choice operation forms a commutative semigroup
since operations are associative
and the operation is commutative
.
A product operation, , representing the sequential execution of
and then
is given by
. The sequence operator forms a groupoid with unity since the operation is not associative
and there is an identity action
representing a void operation resulting in
.
Both operations together form a ringoid, since the sequence operation distributes over the choice operation
. Meaning that
takes place and then
or
takes places is equivalent to
and then
takes place or
and then
takes place.
public class SequenceGroupoidWithUnity<Action> : IGroupoid<Action> {
public Action Identity {
get { return () => {}; }
}
public Action Operation(Action a, Action b) {
return () => { a(); b(); }
}
}
public class ChoiceGroupoid<Action> : IGroupoid<Action> {
public Action Operation(Action a, Action b) {
if(DateTime.Now.Ticks % 2 == 0)
return a;
return b;
}
}
The process algebra an be extended further to describe parallel computations with an additional operation. The operations given thus far enable one to derive the possible execution paths in a process. This enables one to comprehensively test each execution path to achieve complete test coverage.
Examples
The motivation of this work was to achieve C++’s approach to parametric polymorphism by utilizing C# subtype polymorphism to define the algebraic structure required by a method (akin to the built-in operations on types in C++). To illustrate how these interfaces are to be used, the following example extension methods operate over a collection of a given type and accept the minimal algebraic structure to complete the computation. The result is a single implementation of the calculation that one would expect in C++.
static public class GroupExtensions {
static public T Sum<T>(this IEnumerable<T> E, IMonoid<T> m) {
return E
.FoldL(m.Identity, m.Operation);
}
}
static public class RingoidExtensions {
static public T Count<T, R, A, M>(this IEnumerable<R> E, IRingWithUnity<T, A, M> r)
where A : IGroup<T>
where M : IMonoid<T> {
return E
.Map((x) => r.Multiplication.Identity)
.Sum(r.Addition);
}
static public T Mean<T, A, M>(this IEnumerable<T> E, IDivisionRing<T, A, M> r)
where A : IGroup<T>
where M : IGroup<T> {
return r.Multiplication.Operation(
r.Multiplication.Inverse(
E.Count(r)
),
E.Sum(r.Addition)
);
}
static public T Variance<T, A, M>(this IEnumerable<T> E, IDivisionRing<T, A, M> r)
where A : IGroup<T>
where M : IGroup<T> {
T average = E.Mean(r);
return r.Multiplication.Operation(
r.Multiplication.Inverse(
E.Count(r)
),
E
.Map((x) => r.Addition.Operation(x, r.Addition.Inverse(average)))
.Map((x) => r.Multiplication.Operation(x, x) )
.Sum(r.Addition)
);
}
}
static public class ModuleExtensions {
static public TV Mean<TS, TV, TSR, TSRA, TSRM, TVA>(this IEnumerable<TV> E, IVectorField<TS, TV, TSR, TSRA, TSRM, TVA> m)
where TSR : IField<TS, TSRA, TSRM>
where TSRA : IAbelianGroup<TS>
where TSRM : IAbelianGroup<TS>
where TVA : IAbelianGroup<TV> {
return m.Distribute(
m.Scalar.Multiplication.Inverse(
E.Count(m.Scalar)
),
E.FoldL(
m.Vector.Identity,
m.Vector.Operation
)
);
}
}
Conclusion
Abstract Algebra comes with a rich history and theory for dealing with different algebraic structures that are easily represented and used in the C# language to perform type agnostic computations. Several examples drawn from mathematics and computer science illustrated how the solution can be used for both value and reference types in C# and be leveraged in the context of a few example type agnostic computations. The main benefit of this approach is that it minimizes the repetitious coding of computations required under the ad-hoc polymorphism approach adopted by the designers of C# language. The downside is that several structures must be defined for the types being computed over and working with C# parameter constraint system can be unwieldy. While an interesting study, this solution would not be practical in a production setting under the current capabilities of the C# language.
References
Baeten, J.C.M. A Brief History of Process Algebra [pdf]. Department of Computer Science, Technische Universiteit Eindhoven. 31 Mar. 2012.
ECMA International. Standard ECMA-335 Common Language Infrastructure [pdf]. 2006.
Fokkink, Wan. Introduction of Process Algebra [pdf]. 2nd ed. Berlin: Springer-Verlang, 2007. 10 Apr. 2007. 31 Mar. 2012.
Goodman, Joshua. Semiring Parsing [pdf]. Computational Linguistics 25 (1999): 573-605. Microsoft Research. 31 Mar. 2012.
Hungerford, Thomas. Algebra. New York: Holt, Rinehart and Winston, 1974.
Ireland, Kenneth. A classical introduction to modern number theory. New York: Springer-Verlag, 1990.
Litvinov, G. I., V. P. Maslov, and A. YA Rodionov. Universal Algorithms, Mathematics of Semirings and Parallel Computations [pdf]. Spring Lectures Notes in Computational Science and Engineering. 7 May 2010. 31 Mar. 2012 .
Mazurkiewicz, Antoni. Introduction to Trace Theory [pdf]. Rep. 19 Nov. 1996. Institute of Computer Science, Polish Academy of Sciences. 31 Mar. 2012.
Pierce, Benjamin. Types and programming languages. Cambridge, Mass: MIT Press, 2002.
Stroustrup, Bjarne. The C++ Programming Language. Reading, Mass: Addison-Wesley, 1986.
Android ecosystem on Windows 7
Introduction
This past spring I decided to take a dive into mobile platforms and decided I’d get my feet with a little Android development. I’d done some Java development in the past and and with reports of increased market share, especially among tablets, I figured Android was the right platform to get started with. Since it’s always good to document things, here’s a rundown of what was needed to get a basic development environment working on Windows 7. As I continue to explore and learn more, I’ll continue to update this list with more information.
JDK SE 1.6
First thing that I needed to download was the Java Development Kit (JDK) from Oracle. The JDK provides all of the necessary components to get started and run Java based applications.
Android SDK
Next core development kit to download is the Android SDK from Google. The SDK has all of the Android specific tools and libraries to develop and test applications that are meant to run on the Android platform.
SDK Manager
It is a little deceptive, but installing the Android SDK above only installs a set of tools to manage different versions of the SDK. To download the actual libraries you’ll need to launch the SDK Manager. From there I decided to download the API for Gingerbread 2.3.3 (API10) and for Honeycomb 3.x (API11-13). I had to run the SDK Manager as an Administrator in order for the application to properly download all of the assets to my machine. Running the application as a user will result in a number of “Access Denied” errors in the log console.
AVD Manager
The second thing I setup were two virtual devices using the AVD Manager. My main interest is in developing applications for tablets, so I created a Honeycomb virtual device with a gig of memory and a reduced screen size. Also decided to create a Gingerbread virtual device with half a gig of memory without any screen size restrictions. Both come in handy to make sure that any app I write will work well on handhelds and tablets.
IntelliJ Community Edition
To do my development I decided to go with IntelliJ as my editor. A lot of people out there use Eclipse; I used to use it a long time ago, but decided that I wanted to try something new. Installation went smoothly and the only custom configuration dealt with specifying the location of the JDK and the Android SDK.
Acer A500 USB Driver
Developing an application against a virtual device is fun and all, but nothing beats testing and using an application on an actual device. I decided to go with an Acer A500 since it was the right mix of features, cost and responsiveness. To deploy an application to the device, I did have to go and download a USB driver from Acer’s website so that the Android Debug Bridge (ADB) from the Android SDK would recognize the device when it was plugged in to my machine. Past that, it was a seamless experience of getting IntelliJ to deploy an application to the device and for me to begin hands-on testing.
Wrap-up
Overall, getting started with the Android Platform has gone very smoothly. I dove into all of this without reading any documentation and the process was fairly self explanatory. I’m looking forward to learning more about the platform and ultimately trying to get a product out on the App Market.
Menger Sponge in C++ using OpenGL
This past summer I was going through some old projects and came across a Menger Sponge visualizer that I wrote back in college. A Menger Sponge is simple fractal that has infinite surface area and encloses zero volume. The sponge is constructed in successive iterations and the first four iterations are rendered in the video below.
The sponge starts as a single cube that is segmented into twenty-seven equally sized cubes. The center cubes of each face and that of the parent cube are then discarded and the process is applied again to each of the remaining cubes. Visually, the process looks like the following:

The geometry of the process is straight forward. Starting with a cube’s origin, , and edge length,
, each of the children’s attributes can be calculated. Each child’s edge length is given by
. Each child’s origin given by
. The constant represents a child’s relative position (e.g.,
) to its parent.
The following implementation isn’t particularly well written, but it accomplishes the desired end result. The point and Cube classes achieve the logic that I’ve outlined above. Cube can be thought of as a tree structure that is generated upon instantiation. The visualize() method pumps out the desired OpenGL commands to produce the faces of the cubes.
#include <GL\glut.h>
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
//=================================================================================
//=================================================================================
class point
{
public:
point(GLfloat x, GLfloat y, GLfloat z, point* ref = NULL);
void visualize();
GLfloat x,y,z;
};
point::point(GLfloat x, GLfloat y, GLfloat z, point* ref)
{
this->x = x;
this->y = y;
this->z = z;
if(ref != NULL)
{
this->x += ref->x;
this->y += ref->y;
this->z += ref->z;
}
}
//=================================================================================
//=================================================================================
class Cube
{
public:
Cube(point* origin, GLfloat edgelength, GLfloat depth);
~Cube();
void visualize();
private:
void MakeFace(int i, int j, int k, int l);
void ActAsContainer(point* o, GLfloat e, GLfloat d);
void ActAsCube(point* o, GLfloat e);
point** PointCloud;
Cube** SubCubes;
};
Cube::Cube(point* origin, GLfloat edgelength, GLfloat depth)
{
if(depth <= 1.0)
{
ActAsCube(origin, edgelength);
} else {
ActAsContainer(origin, edgelength, depth);
}
}
Cube::~Cube()
{
int i;
if(PointCloud != NULL)
{
for(i = 0; i < 8; i++)
delete PointCloud[i];
delete[] PointCloud;
}
if(SubCubes != NULL)
{
for(i = 0; i < 20; i++)
delete SubCubes[i];
delete[] SubCubes;
}
}
void Cube::ActAsCube(point* o, GLfloat e)
{
GLfloat ne = e / 2.0;
PointCloud = new point*[8]; // This is the actual physical cube coordinates;
SubCubes = NULL;
PointCloud[0] = new point( ne, ne, ne, o); // net
PointCloud[1] = new point( ne, -ne, ne, o); // set
PointCloud[2] = new point(-ne, ne, ne, o); // nwt
PointCloud[3] = new point(-ne, -ne, ne, o); // swt
PointCloud[4] = new point( ne, ne, -ne, o); // neb
PointCloud[5] = new point( ne, -ne, -ne, o); // seb
PointCloud[6] = new point(-ne, ne, -ne, o); // nwb
PointCloud[7] = new point(-ne, -ne, -ne, o); // swb
}
void Cube::ActAsContainer(point* o, GLfloat e, GLfloat d)
{
GLfloat ne = e / 3.0;
SubCubes = new Cube*[20]; // These are the centers of each sub cube structure
PointCloud = NULL;
SubCubes[0] = new Cube(new point(-ne, ne, ne, o), ne, d-1.0);
SubCubes[1] = new Cube(new point(0.0, ne, ne, o), ne, d-1.0);
SubCubes[2] = new Cube(new point( ne, ne, ne, o), ne, d-1.0);
SubCubes[3] = new Cube(new point( ne, 0.0, ne, o), ne, d-1.0);
SubCubes[4] = new Cube(new point( ne, -ne, ne, o), ne, d-1.0);
SubCubes[5] = new Cube(new point(0.0, -ne, ne, o), ne, d-1.0);
SubCubes[6] = new Cube(new point(-ne, -ne, ne, o), ne, d-1.0);
SubCubes[7] = new Cube(new point(-ne, 0.0, ne, o), ne, d-1.0);
SubCubes[8] = new Cube(new point( ne, ne, 0.0, o), ne, d-1.0);
SubCubes[9] = new Cube(new point( ne, -ne, 0.0, o), ne, d-1.0);
SubCubes[10] = new Cube(new point(-ne, ne, 0.0, o), ne, d-1.0);
SubCubes[11] = new Cube(new point(-ne, -ne, 0.0, o), ne, d-1.0);
SubCubes[12] = new Cube(new point(-ne, ne, -ne, o), ne, d-1.0);
SubCubes[13] = new Cube(new point(0.0, ne, -ne, o), ne, d-1.0);
SubCubes[14] = new Cube(new point( ne, ne, -ne, o), ne, d-1.0);
SubCubes[15] = new Cube(new point( ne, 0.0, -ne, o), ne, d-1.0);
SubCubes[16] = new Cube(new point( ne, -ne, -ne, o), ne, d-1.0);
SubCubes[17] = new Cube(new point(0.0, -ne, -ne, o), ne, d-1.0);
SubCubes[18] = new Cube(new point(-ne, -ne, -ne, o), ne, d-1.0);
SubCubes[19] = new Cube(new point(-ne, 0.0, -ne, o), ne, d-1.0);
}
void Cube::MakeFace(int i, int j, int k, int l)
{
glVertex3f(PointCloud[i]->x, PointCloud[i]->y, PointCloud[i]->z);
glVertex3f(PointCloud[j]->x, PointCloud[j]->y, PointCloud[j]->z);
glVertex3f(PointCloud[k]->x, PointCloud[k]->y, PointCloud[k]->z);
glVertex3f(PointCloud[l]->x, PointCloud[l]->y, PointCloud[l]->z);
}
void Cube::visualize()
{
int i;
if(PointCloud != NULL)
{
glBegin(GL_QUADS);
glColor3f(1.0,0.0,0.0);// top
MakeFace(0,2,3,1);
glColor3f(0.0,1.0,1.0);//bottom
MakeFace(4,6,7,5);
glColor3f(0.0,1.0,0.0);// north
MakeFace(0,2,6,4);
glColor3f(1.0,0.0,1.0);// south
MakeFace(1,3,7,5);
glColor3f(0.0,0.0,1.0);//east
MakeFace(0,4,5,1);
glColor3f(1.0,1.0,0.0);// west
MakeFace(2,6,7,3);
glEnd();
}
if(SubCubes != NULL)
{
for(i = 0; i < 20; i++)
{
SubCubes[i]->visualize();
}
}
}
The implementation of the program is your run-of-the-mill OpenGL boilerplate. The application takes in an argument dictating what order of sponge it should produce. It sets up the camera and positions the sponge at the origin. The sponge is left stationary, while the camera is made to orbit upon each display(). On idle(), a redisplay message is sent back to the OpenGL system in order to achieve the effect that the sponge is spinning.
//=================================================================================
//=================================================================================
Cube* MengerCube;
void idle()
{
glutPostRedisplay();
}
void display()
{
static GLfloat rtri = 0.0;
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
gluLookAt(1.0,1.0,1.0, 0.0,0.0,0.0,0.0,1.0,0.0);
glRotatef((rtri+=0.932), 1.0, 0.5, -1.0);
MengerCube->visualize();
glutSwapBuffers();
}
void reshape(int w, int h)
{
glViewport(0,0,w,h);
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glOrtho(-8.0, 8.0,-8.0, 8.0,-8.0, 8.0);
}
void init()
{
glShadeModel(GL_SMOOTH);
glClearColor(0.0, 0.0, 0.0, 0.0);
glClearDepth(1.0f);
glEnable(GL_DEPTH_TEST);
glColor3f(1.0, 1.0, 1.0);
}
GLfloat getDepth(char* depth)
{
int k = atoi(depth);
if(k <= 1) return 1.0;
else if (k>= 5) return 5.0;
else return (GLfloat) k;
}
int main(int argc, char* argv[])
{
GLfloat depth;
bool viewAsPointCloud = false;
point origin(0.0, 0.0, 0.0);
printf("%d\n",argc);
switch(argc)
{
case 2:
depth = getDepth(argv[1]);
break;
default:
depth = 2.0;
break;
}
MengerCube = new Cube(&origin, 8.0, depth);
glutInit(&argc, argv);
glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB | GLUT_DEPTH);
glEnable(GL_DEPTH_TEST);
glutInitWindowSize(500,500);
glutInitWindowPosition(0,0);
glutCreateWindow(*argv);
glutReshapeFunc(reshape);
glutDisplayFunc(display);
glutIdleFunc(idle);
init();
glutMainLoop();
delete MengerCube;
}
Thoughts on delivering software
Preface
A lot can happen in four years. We get used to its passing as a milestone marking a transition from one stage of life to another. I was thinking about the last four some odd years since I’d graduated from college and came to realize that while I’ve learned a lot since then, I still have a ways to go. One recurring question I have floating around my head all this time is how to deliver a complete product. At the end of college I had the theoretical side of the story, but I only had limited exposure doing internships and consulting.
My first job after college was working for a small company focusing on healthcare. Our development group was just a couple months old and I was the second developer brought on board. We started out as a group of cowboys doing what was necessary to retain customers and bring on new ones. Despite the chaos, lack of requirements and ever shifting priorities, we managed to make the company successful and a leader in its industry. During that time, I feel that I have learned a lot about how to deliver software in the face of uncertainty.
More importantly, I’ve learned how be more than just a developer and grow into someone trusted to autonomously improve the business with my vision and execution. I’m going to cover a few choice topics that I feel have had the biggest impact on how I deliver software. Things that I feel answer that burning question of how to deliver a complete product. This isn’t intended to be comprehensive, but it should give someone where I was four years ago a little insight into how to get a quality product out the door even when it feels like it’ll never happen.
Communication
When you are sitting in front of a computer, it’s easy to get excited and starting writing whatever happens to pop into mind and race its way down your fingers. Few things are more satisfying as a developer than falling into flow and building something to completion- going through the motions of refactoring, testing and becoming convinced that you’ve written something solid. You eagerly present your product to the stakeholders. They look back at you and say, “this isn’t at all what we asked for”.
It is easy to make this mistake. Not taking the time to fully understand the stakeholder’s needs and more importantly, making sure they do too. After all, the biggest reason why software fails to meet expectations is because expectations weren’t established and requirements weren’t well defined. The source of these failures lies in failures in communication. It should seem obvious, but any software endeavor that involves more than one person relies on thorough communication to execute a shared goal- to bring a solution to market in a timely manner.
To make sure that you avoid this pitfall, take the time to work with your stakeholders. An hour of talking and planning can prevent you from wasting a day of misdirected development time. Focus on the domain problem and its solution. How the solution is implemented, is largely irrelevant compared to its completeness of requirements. To ensure that you and your stakeholder have the same view of the product, produce a dossier similar to a user’s guide that outlines how the solution behaves. Once defined, produce a quick prototype and present it to the stakeholder. Upon their approval, implement the product.
Simplicity
Now, simplicity doesn’t imply mediocre no more than complexity implies extraordinary. As Antoine de Saint-Exupéry said in Wind, Sand and Stars, “Il semble que la perfection soit atteinte non quand il n’y a plus rien à ajouter, mais quand il n’y a plus rien à retrancher“. Simplicity is simply the path to perfection and removing that which is unnecessary. Much of what can be said about delivering software can be said about managing complexity. Markets change and so too do the demands on products. As a consequence, so too do the demands on code. How much that code has to change depends largely on how well it was written.
Complexity encroaches on software in one of two ways. A product that grows organically without any direction and second through poor coding practices and design. This lack of direction comes from stakeholders committing to one-off functionality without thinking how it contributes to a uniform product. In the second example, an unorthodox solution results in a product that can never be changed and a bad set of abstractions make it difficult to change anything.
What can be done to avoid complexity and enable simplicity? Focus on established principles of the field. Design components that are modular, capable only of doing one thing and doing it well. Interfaces between systems should be flexible in what they accept and resistant to change. Code should assert its assumptions and not be ad hoc in its solutions. Above all, code should be written with meaningful abstractions that represent the problem and take into consideration how that problem can mutate or be extended to other problems. Doing this intelligently, results in maintainable, reliable and predictable code -but more importantly- code that ships.
Introspection
Being a software professional requires a certain degree of introspection. Solving problems requires us to build complex models in our minds and walk around them to discover insights that lead to solutions. The ability to maintain these mental models is an important factor to our ability to deliver a product. Despite the amount of time spent in one’s mind, there is often a failure to think about how one carries out his or her job. Lucid requirements, minimal designs and all the theory in the world are useless if you don’t know how to keep up with the demands placed upon you.
What happens when you fail to keep up? You might think to spend an extra hour a day trying to catch up. This works for a bit, but not for long. You end up giving something up for that extra hour and usually it means sacrificing sleep, relationships, interests or your health. Doing this for prolonged periods leads to falling further and further behind until deadlines are racing past you, pounds piling up on your body and work invading every aspect of your life. Ultimately, it all leads to burnout.
To keep up with demands, and to get that product out the door, it is important to know yourself and set limitations. Figure out if you are most effective in the morning or night. Understand if you like switching between projects or like focusing on one project at a time. At the end of each day, write down what you’ve accomplished and what you hope to accomplish the next day. A list makes it easy for you to see that things are getting done and what you need to be working on. If you find yourself behind due to reasons beyond your control, negotiate with the stakeholders to extend the timeline. Nothing is ever set in stone.
Epilogue
I doubt there is an all encompassing answer to how to deliver a complete product. Each year I have a different take on the question and no doubt I will continue to in the years to come. I am certain however, that building a product takes more than just a few clever software professionals, it takes a broad set of skills and abilities from people of different backgrounds. Sales, management, marketing, accounting, information technology and many other disciplines contribute to the delivery of a complete product. Building a product with these groups in mind makes it easier to deliver a complete product but also one that is successful.
Do each of these groups have the tools they need to provide a quality service to customers? Is there a well defined process for supporting the product and escalating issues to development? Does sales and marketing have access to usage data to know what customers are doing with the product? Does accounting and payroll have the information they need to send out bills and payments for services rendered? A product can really only be called complete if there is a set of processes, services and tools supporting it.
Take the time to really learn your company and industry. It’s easy enough to just be someone who spends all day writing code, but its just as easy to spend time learning what the business is really all about and what problems it aims to solve in its industry. Spend the time to learn what other groups in the company are up to and how their efforts are contributing to the success of your product. The technical mind is a great platform for spotting opportunities that can result in better products and ultimately a better company. When you have the big picture, there really isn’t anything holding you back from delivering a complete product.
Haskell ecosystem on Windows XP
It’s been fun watching the Haskell ecosystem evolve into a mature system over the years. Seems that about every three months it takes a leap forward and I usually find myself uninstalling what I previously had and putting the latest and greatest on my laptop. To save myself some time in the future, I’ve compiled this post as a reference of basic “stuff” that is good to have for doing Haskell development on a Windows XP machine.
Haskell Platform
It used to be that you had to go find a number of different applications to get packages, compile source code and generate documents (among a handful of other things), then a group of folks identified an opportunity and put together a platform for getting new users a place to start. The result is the Haskell Platform.
- Glasgow Haskell Compiler (GHC) – The Haskell compiler that also comes with a command line interpreter (GHCi). Alternatives are the York Haskell Compiler (yhc) and Hugs
- Cabal – Basic package management system
- Haddock – used for generating documentation based on comments in source code
After installing, you’ll want to go to the command line and run the following commands to make sure that you’ve got the latest version of Cabal and to make sure that it has the latest package list:
C:\cabal install cabal-install C:\cabal update
Leksah
Many developers are probably used to having a quality Integrated Development Environment (IDE) to work with and the Haskell Community’s answer is Leksah. Leksah is still fairly green and has a ways to go before being comparable to Eclipse or Visual Studio, but nonetheless, Leksah is packed with plenty of goodies that will make for easy development of packages and modules for distribution on Cabal.
It is best to use the installer from the Leksah website. Once you’ve installed the latest, you’ll need to run the following from the command-line
C:\ghc-pkg recache
So that the packages on the system (those provided by GHC) will show up when you have the browse pane open.
Gtk2hs
If you plan on doing any Graphical User Interfaces (GUIs), then you’ll want to get the Haskell binding to the GTK+ library. On the page there should be an “All-in-one bundle”- for the purpose of the following, I went with version 2.20.
After extracting the bundle on the machine, make sure that the path you extracted the bundle at along with the child bin directory is added to the PATH environment variable.
Then from the command-line run the following and you should be able to include GTK in your project:
C:\cabal install gtk
HOpenGL
I’ve been working on some basic game programming and I’ve done some stuff in the past with OpenGL so I decided to give the Haskell bindings a try. Windows doesn’t natively ship with the OpenGL library, so you’ll need to get it from here.
Then get the following packages off of Cabal:
c:\cabal install glut C:\cabal install opengl
Wrap-up
I haven’t done a dry run to test all of the above, so if you follow all of the above and come across a problem, post the solution in the comments. I’ll continue to update this post as I identify any problems or come across additional “stuff” that falls into the must-have category.
An Experiment in Optical Character Recognition
Introduction
I’ve always been interested in machine learning and how it can be applied to a number of different problems. I spent some time during college learning some of the techniques used in machine learning, but since then I’ve gotten a bit rusty. So, I revisited the subject by doing some additional research and decided to try my hand at Optical Character Recognition (OCR)- the process of identifying characters within an image and producing a text output. There are a handful of traditional aspects to this process: image acquisition, segmentation, recognition and correction. Acquisition is about correcting an input image so that a segmentation process can be readily applied; segmentation identifies the characters in the image, recognition maps those visual characters to text characters, finally correction takes the text output and rectifies any errors. The following outlines my approach to segmentation and recognition.
Segmentation
Consider the following body of text from one of my earlier posts:

The goal is to extract each of the characters in the image. The simplest way to accomplish this is to implement an algorithm that reads the image much in the same way one might read a block of text: start at the top of the text and scan downward identifying all of the rows of text, then for each row, read all the characters from left to right, then identify words based on white space. Figuring out the word boundaries is done via a simple clustering process. Assuming we have an ordered set of rectangles produced by the first two steps, we can calculate the average distance between consecutive rectangles. Then, once this average has been produced, to then iterate over the list once more adding rectangles to words when the distance between consecutive rectangles is less than the average distance, then creating new words when the distance is exceeded.

This segmentation approach isn’t perfect as it assumes that we are dealing with evenly spaced characters of the same size from the same font. Of course, this isn’t always the case and we have things like kerning and ligatures to deal with. In this example two categories of problems arise: character combinations such as ay, ey and ly that are separable then combinations such as yw, rt and ct that are not separable end up being interpreted as a single character using this method. For the first category, I chose to divide rectangles whenever a line of characters has a single black pixel that does not have a black pixel neighboring ((x-1, y + a) | a <- [-1, 1]) it to the left. The second case isn't as clear cut as all the pixels neighbor one another, in principal one could implement a k-means clustering algorithm, but that assumes you know how many characters (k) you have in the image. I decided to allow the error to propagate through the system.
Recognition
Starting out, I knew that I wanted to use an Artificial neural network (ANN), so I spent some time reading Stuart’s and Norvig’s “Artificial Intelligence: A Modern Approach“, but felt that it wasn’t as comprehensive as I wanted, so I also read MacKay’s “Information Theory, Inference and Learning Algorithms,” which was more in tune with what I had in mind. I also came across a series (1, 2, 3) of pdf files hosted at Aberdeen’s Robert Gordon University that provided a more practical view of applying neural networks to pattern recognition.
A little background on ANNs: The general idea is that an individual neuron aggregates the weighted inputs from other neurons then outputs a signal. The magnitude of the signal is determined as a function of the aggregation called the activation function. Neurons are organized into layers, typically an input layer, one or more hidden layers and finally an output layer. Values from the input layer and feed into the hidden layer, then those outputs feed into the next and so on until all the values have gone through the output layer. The process of getting the network to map an input to an output is accomplished through training, in this case, a method known as Backpropagation. Given an input and an expected output, the input is feed through the network and produces an output. The difference between the two output vectors is the error that then needs to be feed backward through the network updating each node’s input weights such that the net error of the system is reduced. The method is effectively a gradient descent algorithm. I recommend reading the aforementioned references for details on how it all works. Following is my implementation of the Backpropagation algorithm:
using System;
using System.Linq;
using Library.Mathematics;
namespace OCRProject.Recognizers.NeuralNetworks {
public class Neuron {
Func<double, double> activationFunction;
public Vector InputWeights { get; set; }
public Neuron(Func<double, double> activationFunction) {
this.activationFunction = activationFunction;
}
public double Output(Vector input) {
return activationFunction(InputWeights.dot(input));
}
}
public class NeuralNetwork {
private Neuron[] hiddenLayer, outputLayer;
...
public Vector Output(Vector input) {
Vector hiddenLayerOutput = new Vector(hiddenLayer.Length, (i) => hiddenLayer[i].Output(input));
return new Vector(outputLayer.Length, (i) => outputLayer[i].Output(hiddenLayerOutput));
}
public Vector Train(Vector input, Vector desiredOutput, double learningRate) {
Vector hOutput = new Vector(hiddenLayer.Length, (i) => hiddenLayer[i].Output(input));
Vector oOutput = new Vector(outputLayer.Length, (i) => outputLayer[i].Output(hOutput));
Vector oError = new Vector(
oOutput.Dimension,
(i) => oOutput[i] * (1 - oOutput[i]) * (desiredOutput[i] - oOutput[i])
);
for (int n = 0; n < outputLayer.Length; n++) {
outputLayer[n].InputWeights = new Vector(
hiddenLayer.Length,
(i) => outputLayer[n].InputWeights[i] + learningRate * oError[n] * hOutput[i]
);
}
Vector hError = new Vector(
hiddenLayer.Length,
(i) => hOutput[i] * (1 - hOutput[i]) * (oError.dot(new Vector(oError.Dimension, (j) => outputLayer[j].InputWeights[i])))
);
for (int n = 0; n < hiddenLayer.Length; n++) {
hiddenLayer[n].InputWeights = new Vector(
input.Dimension,
(i) => hiddenLayer[n].InputWeights[i] + learningRate * hError[n] * input[i]
);
}
return oError;
}
}
}
In terms of how all of this applies to OCR, I started out with all visible characters and produced a collection of 16×16 images. Each image contained a single character centered in the image. This image was then mapped to a 256 dimensional vector with its corresponding character mapped to an 8 dimensional vector representing the expected output. The question that remains is how many hidden layer units should be used. To figure this out, I conducted a small experiment:

| Hidden Units | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Minimum (%) | 0 | 4 | 7 | 21 | 36 | 48 | 69 | 80 | 87 | 85 | 92 | 88 | 92 | 89 | 93 | 93 |
| Average (%) | 1 | 4 | 12 | 28 | 44 | 56 | 72 | 83 | 89 | 90 | 94 | 93 | 94 | 93 | 95 | 95 |
| Maximum (%) | 1 | 6 | 15 | 31 | 51 | 62 | 75 | 87 | 91 | 93 | 99 | 96 | 98 | 95 | 98 | 99 |
Each run consisted of 10 trials with each trial taking 5000 iterations to train the network. (Normally, one can measure the Mean squared error of the network and halt the training process once it is sufficiently small.) After doing this testing, I found that ANNs with 11 hidden units looked to give the highest accuracy with the fewest number of units. Given the original text that was used, the following text was produced:

As expected, the category of errors that were identified earlier (character combinations ff, rt, ct) were not segmented correctly and consequently not recognized correctly.
Wrap-up
There is a lot of additional work that could be thrown at the project. In the future, it’d be good to modify the solution to accept any sized font as well as any font, adding support for images containing scanned or photographed text rather than computer generated images and some error correction on the output to flag parts of the text that require review. I suspect that I will continue down this road and start investigating how these methods can be applied to general computer vision for some planned robotics projects.
Tree Drawing: Force-based Algorithm
I’ve written a couple times on different data visualization techniques for viewing hierarchical information. One method that I haven’t discussed is the force-based algorithm approach. After digging around a bit, it looks like this method was originally developed in Birbil’s and Fang’s 2003 publication [pdf], “An Electromagnetism-like Mechanism for Global Optimization”. In terms of tree drawing, the idea is fairly simple: treat the tree as a dynamical system composed of springs and point charges, apply the appropriate physical laws and over time, you will get a nicely laid out tree. My implementation in the clip below demonstrates this approach.
Let’s think about why we want a dynamical system. An idealized tree drawing is aesthetically pleasing, compact, nodes are uniformly distributed and edges between nodes do not cross. Under this proposed system, we can’t guarantee these criteria, but we can get “close enough” to produce something that is both nice on paper and sensible in practice. Treating the nodes as point charges will make the nodes repel from one another. The magnitude of the repulsion is determined by Column’s Law which states that the force applied to a point charge by another point charge follows an inverse square law. If all we had was Coulomb’s Law, then all the nodes will fly away from one another and finally come to rest, but the nodes would be too distantly spread across the screen. To alleviate this, we can treat the edges between nodes as springs. Following Hooke’s Law we can tether the nodes together to preserve the compactness criteria. The magnitude of this force is directly linear with respect to the distance between the nodes.
Let’s jump into some quick physics. To determine the net force applied on a given node, we need to calculate the spring force applied to the node, and then we need to calculate all the point charge forces applied to the node. The key to getting this right is making sure that we get the direction and magnitude of the forces correct. To make sure that we have the direction correct, let be the vector representing the distance between the i’th and j’th nodes (
representing the i’th node’s location). Let
be the normalized vector representing the direction that the force that needs to be applied. For the spring force we are looking at
where
is the spring constant,
is the length of the spring at rest and
is the force between the i’th and j’th node. For the point charge we are interested in
.
is the charge of the i’th node and
is the coulomb constant. (I default all constants to 1.0 in my implementation).
We’ve got the net force on each node figured out, so now it is necessary to figure out the velocity of the nodes as well as their location. To do this, we recall that by Newton’s Laws, a force is the product of a mass and acceleration . If we know an acceleration, we can derive (integrate mathematically) the velocity and location each iteration. Thus, if we know a node’s present velocity
, then we know that
. If the node has current location
, then its next location will be
. Once we’ve determined these values between time steps
, we will need to remember to zero out the net force on each node in our implementation. In my implementation I chose a value of 0.1 for my time step) One important aspect to note for this algorithm, is the need to apply dampening to the velocity over time, otherwise the structure continue to move in space and we run into the possibility of running out of numbers on the machine resulting in an overflow error.
To go about implementing all of this you’ll find the time complexity of the algorithm should be on the order . The square order comes from apply Coulomb’s law. By Newton’s third law (equal and opposite forces), you can reduce this down to
, but for significantly sized
you are still going to be looking at
. You could improve this by applying a binary space partitioning (in particular the R-tree and its variants) data structure to the tree and only use those nodes within a certain radius of the node when calculating the Coulomb forces. Testing my implementation (using Lists), I found things get a little sluggish for a real-time render around 100 nodes on my 2.4GHz laptop. Depending on your application (domain and implementation), you millage may vary. Following are some test cases I used in writing my solution.
| Fan | Flat |
|---|---|
|
|
| Each node a number of nodes equivalent to its number of siblings minus one | Tree containing one level of nodes |
| Complete Binary | Random |
|
|
| Every node has exactly two children | Randomly generated tree of depth four |
I chose to implement my solution in C# using the Microsoft .net 3.5 Framework (as a side note, 4.0 was recently released on the 12th of April). I went with my custom mathematics library and the WinForms library of the Framework (in the future I would like to migrate this over to WPF) for displaying and end-user manipulation of the algorithm’s variables. I went with a vanilla UserControl that contains a setter for the tree data structure, the algorithm to apply to the structure and the settings for that algorithm. Upon setting the tree, a Timer instance kicks off (non-reentrant) to invoke the algorithm below every 50ms. To get the tree to display correctly on the control, the algorithm calculates the boundary of the nodes and then the control performs a linear map from the model space to the view space. The first implementation used the DoubleBuffered property of the UserControl, but I found that it was ineffective in reducing flickering so I implemented custom double buffering using the BufferedGraphicsContext class. It’s worth noting that most implementations track the kinetic energy of the system to determine when to terminate the algorithm. I chose not to do this, as I didn’t find value in adding in the additional calculation.
using System;
using System.Collections.Generic;
using System.Drawing;
using Library.Mathematics;
namespace ForceBasedTreeLayout {
public class ForceBasedTreeLayout {
private LayoutSettings settings;
public ForceBasedTreeLayout(LayoutSettings settings) {
this.settings = settings;
}
public RectangleF Evaluate(Node root) {
List<Node> nodes = new List<Node>();
foreachNode(root, (x) => nodes.Add(x));
// Apply Hooke's law
// F = -k x
foreachNode(root, (parent) => {
parent.Children.ForEach((child) => {
Vector dist = parent.Location - child.Location;
Vector restingDist = settings.MinimumSpringLength * dist.Normalize();
Vector force = -settings.SpringConstant * (dist - restingDist);
parent.Acceleration += force;
child.Acceleration -= force;
});
});
// Apply Coulomb's Law
// F = Q1 Q1 / d^2
for (int n = 0; n < nodes.Count; n++) {
for (int m = n + 1; m < nodes.Count; m++) {
Vector dist = nodes[n].Location - nodes[m].Location;
Vector norm = dist.Normalize();
Vector force = new Vector(2, (i) => norm[i] / Math.Pow(dist.Norm() + 1.0, 2.0));
nodes[n].Acceleration += force;
nodes[m].Acceleration -= force;
}
}
Vector xExtrema = new Vector(2);
xExtrema[0] = double.MaxValue;
xExtrema[1] = double.MinValue;
Vector yExtrema = new Vector(2);
yExtrema[0] = double.MaxValue;
yExtrema[1] = double.MinValue;
// update the locations & velocity && figure out new bounding region
foreach (Node node in nodes) {
// p = a0t^2/2 + v0t + p0
node.Location = (settings.TimeStep * settings.TimeStep * 0.5) * node.Acceleration + (settings.TimeStep) * node.Velocity + node.Location;
// v = at + v0
node.Velocity = (settings.TimeStep) * node.Acceleration + node.Velocity;
node.Velocity = (settings.VelocityDampening) * node.Velocity;
node.Acceleration = new Vector(2, (i) => 0.0);
xExtrema[0] = Math.Min(xExtrema[0], node.Location[0]);
xExtrema[1] = Math.Max(xExtrema[1], node.Location[0]);
yExtrema[0] = Math.Min(yExtrema[0], node.Location[1]);
yExtrema[1] = Math.Max(yExtrema[1], node.Location[1]);
}
RectangleF R = new RectangleF();
R.X = (float)xExtrema[0];
R.Y = (float)yExtrema[0];
R.Width = (float)(xExtrema[1] - xExtrema[0]);
R.Height = (float)(yExtrema[1] - yExtrema[0]);
R.X -= R.Width / 2;
R.Y -= R.Height / 2;
R.Width *= 2;
R.Height *= 2;
return R;
}
private void foreachNode(Node root, Action<Node> act) {
Stack<Node> stack = new Stack<Node>();
stack.Push(root);
while (stack.Count > 0) {
Node node = stack.Pop();
act(node);
node.Children.ForEach((x) => stack.Push(x));
}
}
}
}
Edit: 2010-10-21
By popular demand, here is the vector class:
public class Vector {
private double[] V;
public int Dimension { get { return V.Length; } }
public double this[int n] {
get {
if (n < 0 || n >= Dimension)
throw new Exception(string.Format("{0} must be between 0 and {1}", n, Dimension));
return V[n];
}
set {
if (n < 0 || n >= Dimension)
throw new Exception(string.Format("{0} must be between 0 and {1}", n, Dimension));
V[n] = value;
}
}
public Vector() : this(0) { }
public Vector(int n) { V = new double[n]; }
public Vector(int n, VectorInitializer initializer) : this(n) {
for (int i = 0; i < Dimension; i++)
V[i] = initializer(i);
}
public double dot(Vector y) {
if (Dimension != y.Dimension)
throw new Exception();
double d = 0.0;
for (int n = 0; n < Dimension; n++)
d += this[n] * y[n];
return d;
}
public override bool Equals(object obj) {
Vector x = obj as Vector;
if (x == null || x.Dimension != Dimension)
return false;
for (int n = 0; n < Dimension; n++)
if (this[n] != x[n])
return false;
return true;
}
static public Vector operator +(Vector x, Vector y) {
if (x.Dimension != y.Dimension)
throw new Exception();
Vector z = new Vector(x.Dimension);
for (int n = 0; n < z.Dimension; n++)
z[n] = x[n] + y[n];
return z;
}
static public Vector operator -(Vector x, Vector y) {
if (x.Dimension != y.Dimension)
throw new Exception();
Vector z = new Vector(x.Dimension);
for (int n = 0; n < z.Dimension; n++)
z[n] = x[n] - y[n];
return z;
}
static public double operator *(Vector x, Vector y) {
return x.dot(y);
}
static public Vector operator *(double c, Vector x) {
Vector y = new Vector(x.Dimension);
for (int n = 0; n < y.Dimension; n++)
y[n] = c*x[n];
return y;
}
}
Getting a Development Network Started
I’ve been in the process of revamping my home network and have been spending time thinking about how I’d like to set up my development environment for my personal projects that I might try and sell one day. Most of the work I do is C#, ASP.NET, PHP, Java, Haskell,… the list goes on, so I’ve been thinking about what kind of solution will allow me to build against different OSes and platforms. The following is a rundown of this thought process and the considerations and decisions made in bringing up my network.
Anytime I do any planning there are a handful of main points that I try to keep focused on:
- Cost – How much money I’m interested in putting into a project?
- Time – Total time investment to bring up the hardware and software.
- Quality – Am I after a quick solution or one that will have long lasting use?
- Portability – How easy would it be to move the system to another platform.
- Extensibility – How easy it is to add on new entities.
I know I want to keep the project under 1000 USD- this cost includes hardware, operating system licenses, software licenses, utility costs over the lifetime of the solution and opportunity costs etc. Time wise, I want something that could take an hour a day to get setup, working and tweaked to perfection over the course of a week. I want something that is going to be flexible enough to be useful five years down the road, but is also capable of doing what I want today, thus I want a solution that doesn’t look like it was thrown together with duct tape but also doesn’t look like I spent years planning it out. It is important for me to be able to port my solution to new hardware quickly and effortlessly as well as add on new elements as needed. This is especially important if my environment crashes as a result of hardware failure or software malice.
There are a variety of ecosystems that I’m used to working with. the following table summarizes the residents of each:
| Ecosystem | Type of Work | Runtimes | IDE | Databases | Web Server |
|---|---|---|---|---|---|
| .net | Websites, Web Services, Clients, Services | .net Framework 4.0, Mono 2.6 | Visual Studio 2008 Express | SQL Server 2008 Express | Internet Information Services 7.0 |
| Java | Clients | JVE 1.6 | Eclipse Galileo | MySQL Community Server 5.1 | Apache 2.2 |
| Haskell | Clients | NA | vim, yi, Leksah | MySQL Community Server 5.1 | Apache 2.2 |
These ecosystems also have corresponding environments for dealing with source control, build automation, bug tracking and project tracking. Given the ecosystems I’m interested in, I’ve decided on Subversion, Hudson, Mantis and twiki to manage all of my projects’ artifacts.
Having reviewed what a lot of other shops have done, there are a couple common elements that most development networks incorporate:
- repo- Source repository and OS specific Database(s).
- dev- IDEs and frameworks for the development of OS specific applications. (per developer machine, often dual booting)
- web- OS specific web servers hosting platform specific websites.
- build- Dedicated build machines for producing assemblies for specific platforms and OSes.
Given these elements, the languages, platforms and operating systems that I’m interested in, I’ve settled on an ideal network that looks like the following:
Important to notice the use of virtualization here. Being able to store a series of ISOs for each of the element groups on a NAS makes it easy to bring up new instances and make backups, thus satisfying my time and portability criteria. As well as satisfying my cost criteria as it is cheaper to purchase a beefy box running several virtual machines than it is to purchase several physical machines. At the time of writing (2009-12), most quad core machines run for about 1000 USD and, and most (consumer) NASes cost about 100-300 USD. Now, I could load up the server up with 1TB storage for an additional 100 USD. Of course, this means I have a single point of failure- which in a home environment may not be a huge deal. In terms of money these two are the main consumers at the hardware level. Everything else already exists on the network.
Lets take a look at the estimated costs:
| Item | Quantity | Amount (USD) | Extended Amount (USD) |
|---|---|---|---|
| Dell Studio XPS 8000 | 1 | 1100.00 | 1100.00 |
| 1TB Western Digital MyBook | 1 | 120.00 | 120.00 |
| Windows XP Professional Licenses | 4 | 100.00 | 400.00 |
| 1620.00 |
That total amount is a little more that initially desired. It is possible to collapse vm-windows-dev and time-thief down to one machine, and then collapse vm-windows-repo, vm-windows-web and vm-windows-build down to a single machine resulting in a cost savings of 300.00 USD from reduced operating system license costs. If the NAS is removed from the picture, that brings us down to 1200.00 which is about as close as I’m going to get to my initial target of 1000 USD. Not sure if this is the final setup that I will end up going with, per usual, I’ll update this post with any new developments.
Sudoku Solver in Haskell
This month will be a bit of short article since I haven’t had a whole lot of time on my hands lately. Haskell is a wonderful little language that has begun to pick up a bit of moment in the past year that I’ve been playing with on-and-off now for several years. Since I don’t post enough on Haskell, I figured I’d post my bare-bones Haskell Sudoku Solver.
import Data.List import Data.Maybe toRowColumn :: Int -> (Int, Int) toRowColumn index = (r, c) where r = div index 9 c = mod index 9 toIndex :: (Int, Int) -> Int toIndex (r, c) = r * 9 + c toRegion :: (Int, Int) -> Int toRegion (r, c) = (div r 3) * 3 + (div c 3) columnIndicies :: Int -> [Int] columnIndicies c = [c, c + 9..80] regionIndicies :: Int -> [Int] regionIndicies g = [toIndex(r + x, c + y) | x <- [0..2], y <- [0..2] ] where r = (div g 3) * 3 c = (mod g 3) * 3 rowIndicies :: Int -> [Int] rowIndicies r = [9 * r..9 * (r + 1) - 1] values :: [Int] -> [Int] -> [Int] values board indicies = filter (>0) (map (board!!) indicies) possibleValues :: [Int] -> (Int, Int) -> [Int] possibleValues board rowColumn = foldl (\\) [1..9] ( map (values board) ( map (\f -> (fst f . snd f) rowColumn) ( zip [rowIndicies, columnIndicies, regionIndicies] [fst, snd, toRegion] ) ) ) validBoard :: [Int] -> Bool validBoard board = (length board == 81) && (and $ map (==0) l) where l = map length s s = map (\\[1..9]) v v = [values board (xIndicies x) | x <- [0..8], xIndicies <- [rowIndicies, columnIndicies, regionIndicies]] solvedBoard :: [Int] -> Bool solvedBoard board = and $ map (>0) board hasUnassigned :: [Int] -> Bool hasUnassigned board = isJust $ elemIndex 0 board assignFirstUnassigned :: [Int] -> Int -> [Int] assignFirstUnassigned (b:bs) value | b == 0 = value : bs | otherwise = b : (assignFirstUnassigned bs value) possibleBoards :: [Int] -> [Int] -> [[Int]] possibleBoards board possibleAssignments = map (assignFirstUnassigned board) possibleAssignments solve :: [Int] -> [[Int]] solve board | not (validBoard board) = [[]] | solvedBoard board = [board] | not (hasUnassigned board) = [[]] | otherwise = concated where concated = concat mapped mapped = map solve filtered filtered = filter (not . null) possibleSolved possibleSolved = possibleBoards board possibleAssignments possibleAssignments = possibleValues board unassignedRowColumn unassignedRowColumn = toRowColumn unassignedIndex unassignedIndex = fromJust $ elemIndex 0 board demo :: [Int] demo = [2,0,0,0,8,0,3,0,0, 0,6,0,0,7,0,0,8,4, 0,3,0,5,0,0,2,0,9, 0,0,0,1,0,5,4,0,8, 0,0,0,0,0,0,0,0,0, 4,0,2,7,0,6,0,0,0, 3,0,1,0,0,7,0,4,0, 7,2,0,0,4,0,0,6,0, 0,0,4,0,1,0,0,0,3]

