Antimatroid, The

niche for the aesthetics, mathematics and computer science

Posts Tagged ‘C# 2.0

Integer Factorization by Dynamic Programming with Number Theoretic Applications

with 5 comments

Having been a participant of a number of mathematical programming competitions over the years, I’ve had to find a number of efficient ways of implementing many common Number Theoretic Functions. In this write up, I’m going to go over a method I’ve found useful for easily factoring numbers using a sieving method, go over some the implementation of a few number theory functions along with time complexity analysis of each. The cornerstone to many of these implementations relies on the ability to quickly factor integers and find primes.

One of the most popular methods of for finding primes is the Sieve of Eratosthenes. The algorithm starts by populating a table with every positive integer from 2 to a ceiling value. Then find the first integer not yet crossed off, in the case 2, and eliminate every multiple of 2 from the table then return to 2 and find the next positive integer not yet crossed off and repeat the procedure until the end of the table is reached. The method is fine and all, but a lot of really great information is lost in that computation. Here is a sample implementation:

bool[] isPrime = new bool[400];
for (uint n = 2; n < isPrime.Length; n++)
    isPrime[n] = true;
for (uint n = 2; n < isPrime.Length; n++)
    if (isPrime[n])
        for (uint m = 2, c = 0; (c = m * n) < isPrime.Length; m++)
            isPrime[c] = false;

On the other hand, say we approach sieve a little differently. Create an empty table as large as the ceiling value. Start at 2 and for every multiple of 2, create a record that has two parts: 2 and half of the multiple. Return to 2 and find the next integer in the table that has yet to be recorded, in this case 3. For every multiple of 3, create a record that has two parts: 3 and third of the multiple (only if the multiple was not previously recorded. E.g., 6 because 2 previously recorded the record). Return to 3 and find the next integer in the table that has yet to be recorded so on and so forth until every integer in the table has been recorded.

The following graphic demonstrates this process for a ceiling values of 25. If we wish to factor 16, we go to 16’s record (2, 8), follow to 8’s record (2,4), again follow 4’s record (2, 2) and finally 2’s record (2, λ). Thus the prime factorization of 16 is 2, 2, 2, 2.

factor_table_polar

It should be apparent that this algorithm is a simple dynamic programming solution that yields two major results:

  1. We have factored every positive integer up to a ceiling value.
  2. We have found every positive prime integer up to a ceiling value.

And one major draw back

  1. Uses a lot of memory as a trade off for speed.

Let’s get into the C# implementation. To start off, we need a record class that’ll store the information about the first prime that divides an entry and the composite to jump to if the record corresponds to a composite.

public class Record {
    public uint Prime {get; set;}
    public uint? JumpTo {get; set;}
}

We’ll have a class called NumberTheory and assume that it has the following structure. If you want, you could make this a Singleton class but I felt it was unnecessary for the scope of this write up.

public class NumberTheory {
    private Record[] table;

    ...
}

It makes sense to put the core algorithm in the constructor and then have member methods for each of the functions we’d like to have. It should be assumed that for the lifetime of the class that the largest value ever called on the methods will be N otherwise an exception should be thrown by the methods (omitted here for brevity).

public NumberTheory(uint N){
    uint c = 0;
    table = new Record[N+1];
    for(uint n = 2; n < table.Length; n++) {
        if(table[n] != null)
            continue;
        table[n] = new Record() { Prime = n };
        for(uint m = 2; (c = n * m) < table.Length; m++)
            if(table[c] == null)
                table[c] = new Record() { JumpTo = m, Prime = n };
    }
}

The time complexity of the implementation can be derived using some analysis and by having some knowledge of certain identities. Starting from 2 there are N – 2 numbers to check of which 1/2 will be visited by the interior loop, starting from 3 there are N – 3 numbers to check of which 1/3 will be visited visited by the interior loop, so on and so forth leading to the following summation:
\displaystyle T(n) = \sum_{p \le n} \left( \frac{1}{p}(n - p) \right)
\displaystyle T(n) = \sum_{p \le n} \left( \frac{n}{p} - 1 \right)

If we separate the summation into the harmonic series of primes (HSP) and prime counting function (aggregate 1 for every prime less than n) we get:
\displaystyle T(n) = n \sum_{p \le n} \left( \frac{1}{p} \right) - \pi(n)

Asymptotically, the HSP tends towards \ln \ln n and \pi(n) towards \frac{n}{\ln n}.
\displaystyle T(n) = n \left( \ln \ln n \right) - \pi(n)
\displaystyle T(n) = n \left( \ln \ln n \right) - \frac{n}{\ln n}
\displaystyle T(n) = n \frac{ \ln n \cdot \ln \ln(n) - 1}{\ln n}

Giving us our final asymptotic time complexity of O(n \ln \ln n).

The first member method we’ll implement is a trivial check to see if a given number is prime by checking the table’s Record’s JumpTo property for null.

public bool IsPrime(uint N){
    return !table[N].JumpTo.HasValue;
}

The time complexity here is a simple O(1).

From IsPrime, we can easily implement a function that will get every single prime up to a given value by iterating over the table.

public void PrimesLessThan(uint value, Action<uint> actOnPrime) {
    for(int n = 2; n < value; n++)
	if(IsPrime(n))
		actOnPrime(n);
}

We can get the prime counting function \pi(n) but utilizing the PrimeLessThan function.

public uint CountPrimes(uint n) {
	uint count = 0;
	PrimesLessThan(n, (p) => {count++;});
	return count;
}

Here the time complexity is the same as PrimesLessThan: O(n).

We can get the prime factorization of a composite easily. To do so, we simply iterate over the records until we reach a Record with no JumpTo value.

public void PrimeFactorsOf(uint composite, Action actOnFactor) {
    Record temp = table[composite];
    while(temp != null) {
        actOnFactor(temp.Prime);
        if(temp.JumpTo.HasValue)
            temp = table[temp.JumpTo.Value];
        else
            temp = null;
    }
}

The time complexity of this implementation relies on the the prime omega function \Omega(n) which is the number of prime factors (not necessarily distinct) of n. The function tends to O( \ln \ln n).

From PrimeFactorsOf, we can also easily implement Euler’s Totient Function \phi(n)- the function tells us how many positive integers less than n are coprime to n. It is defined as:
\displaystyle \phi(n) = n \prod_{p|n} \left( 1 - \frac{1}{p} \right)
Which essentially states that if you multiply all of the repeated prime factors of n together by all of the non-repeat prime factors – 1 of n together, you will have the result of \phi(n).

public uint EulerTotient(uint n){
    uint phi = 1, last = 0;
    PrimeFactorsOf(n, (p) => {
        if(p != last) {
            phi *= p - 1;
            last = p;
        } else {
            phi *= p;
        }
    });
    return phi;
}

A similar function known as the Dedekind \psi(n) Function defined as
\displaystyle \psi(n) = n \prod_{p|n} \left ( 1 + \frac{1}{p} \right )
can be implemented in a similar way as \phi(n):

public uint DedekindPsi(uint n){
    uint phi = 1, last = 0;
    PrimeFactorsOf(n, (p) => {
        if(p != last) {
            phi *= p + 1;
            last = p;
        } else {
            phi *= p;
        }
    });
    return phi;
}

The Von Mangoldt Function \Lambda(n) is another interesting function, unfortunately I haven’t had a chance to use it, but it is trivial to implement so I will include it here for completeness. It is defined as
\displaystyle \Lambda (n) = \begin{cases} \ln{p} & \text {if n = prime to some positive integer power} \\ 0 & \text{otherwise} \end{cases}

public double VonMangoldt(uint n){
    uint P = 0;
    PrimeFactorsOf(n, (p) => {
        if(P == 0) {
            P = p;
        } else if (P != p) {
            P = 1;
        }
    });
    return Math.Log(P);
}

The Möbius Function \mu(n) is a handy little function for determining if a number is square free or not (among many more interesting things). It is defined as:
 \displaystyle \mu(n) = \begin{cases} 0 & \text{if n has one or more replicated factors} \\ 1 & \text{if n = 1} \\ (-1)^{k} & \text{if n is a product of k distinct primes} \end{cases}

public int MoebiusFunction(uint N){
    if(N == 1)
        return 1;
    bool distinct = true;
    uint last = 0, k = 0;
    PrimeFactorsOf(N, (p) => {
        if(p == last) {
            distinct = false;
        } else {
            k++;
            last = p;
        }
    });

    if(distinct)
        return ((k & 1) == 0) ? 1 : -1;
    return 0;
}

Since EulerTotient, DedekindPsi, VonMangoldt and Moebius each use PrimeFactorsOf without any additional lifting, their time complexities are the same as PrimeFactorsOf – O(\ln \ln n).

The last function I’ll implement is the Mertens Function M(n) which is simply defined as \displaystyle M(n) = \sum_{k = 1}^{n} \mu(k):

public int Mertens(uint n){
    int m = 0;
    for(uint k = 1; k <= n; k++)
        m += MoebiusFunction(k);
    return m;
}

The implementation’s time complexity is simply O(n \ln \ln n).

Written by lewellen

2009-04-01 at 12:00 am

Generalized Interpreter Framework in C#

without comments

Introduction

Like most Computer Science graduates, I had taken a Programming Languages course during college in which we built a compiler and studied the consequences of our design decisions. I had also taken a course in symbolic logic- particularly, looking at the sentential and predicate logic. Lately, I had been thinking about these two courses and thought it would be interesting to write an interpreter that would take in a sentence valid under sentential logic and output the resulting truth table.

After looking at what I had built, it became apparent that I could generalize my solution into a flexible framework where I could easily create a new interpreter for a desired (albeit simple) formal grammar and produce the desired outputs that other applications could then use for whatever purpose. The following is an overview of this framework and an example implementation for generating truth tables for sentences valid under sentential logic.

Framework

Specification

An Interpreter takes a string and produces an object hierarchy that a client can perform functions upon. The Interpreter does so by following the rules of a grammar that the string is assumed (although not garaunteed) to be valid under. The Interpreter will only be valid for context-free-grammars (CFG). A CFG consists of Terminals and Nonterminals. A Terminal is a character or sequence of characters that terminate a definition. A Nonterminal is a rule that defines the relationships between other Nonterminals and Terminals.

Design

interpreter_components

Interpreter designs differs from traditional compiler design, in that only the so-called front-end is implemented. There are three principal components involved in this process: Tokenizer, Parser and Semantic Analyzer.

A Tokenizer takes a string of characters and returns a collection of Tokens, where each Token is a tuple containing the accepted substring and meta data describing what the substring represents. Tokens are identified by looking for Terminals in the input string.

The Parser then builds (formally) a concrete syntax tree (informally a syntax tree) following the rules defined by the grammar. This is done by looking for Nonterminals in the collection of Tokens.

Finally, the Semantic Analyzer builds (formally) an abstract syntax tree (informally a semantic tree) by ignoring the syntax in the parse tree and constructing the appropriate tree for performing functions upon.

In the event of an exception, the Interpreter will write the details of the exception to a text writer (e.g., Console.Error) provided by the client of the Interpreter.

Exceptions are to be thrown under the following conditions:

  • The input string is null or empty, or none of the Terminals are able to accept a character in the input string.
  • None of the Nonterminals are able to accept a set of tokens, or the number of tokens is deficient for the Nonterminals to identify a match.

Implementation

interpreter_uml

abstract public class Tokenizer<T> {
    protected TerminalList<T> Terminals { get; private set; }

    public Tokenizer() {
        Terminals = new TerminalList<T>();
        loadTerminals();
    }

    public TokenList<T> Tokenize(string input) {
        if (string.IsNullOrEmpty(input))
            throw new UnexpectedEndOfInputException();

        TokenList<T> tokens = new TokenList<T>();
        Token<T> candiate;

        for (int n = 0; n < input.Length; n++) {
            if (char.IsWhiteSpace(input[n]))
                continue;

            if (!terminalExists(input, n, out candiate))
                throw new UnexpectedInputException(n);

            tokens.Add(candiate);
            n += candiate.Lexeme.Length - 1;
        }

        return tokens;
    }

    private bool terminalExists(string input, int n, out Token<T> candiate) {
        candiate = null;
        foreach (Terminal<T> terminal in Terminals)
            if (terminal.Exists(input, n, out candiate))
                return true;
        return false;
    }

    abstract protected void loadTerminals();
}
abstract public class Parser<T, P> {
    protected NonterminalList<T, P> Nonterminals { get; private set; }
    protected T[] FirstExpectedTokens { get; private set; }

    public Parser() {
        Nonterminals = new NonterminalList<T, P>();
        loadNonterminals();

        FirstExpectedTokens = new T[Nonterminals.Count];
        for (int n = 0; n < Nonterminals.Count; n++)
            FirstExpectedTokens[n] = Nonterminals[n].FirstExpectedToken;
    }

    public P Parse(TokenList<T> tokens) {
        return Parse(tokens, 0, tokens.Count - 1);
    }

    public P Parse(TokenList<T> tokens, int n, int m) {
        P parsedSentence;
        foreach (Nonterminal<T, P> nonterminal in Nonterminals)
            if (nonterminal.Exists(tokens, n, m, out parsedSentence))
                return parsedSentence;

        throw new UnexpectedTokenException<T>(tokens[n], FirstExpectedTokens);
    }

    abstract protected void loadNonterminals();
}
abstract public class SemanticAnalyzer<P, S> {
    abstract public S Analyze(P sentence);
}
abstract public class Interpreter<T, P, S> {
    private TextWriter externalLogger;

    abstract protected Tokenizer<T> Tokenizer { get; }
    abstract protected Parser<T, P> Parser { get; }
    abstract protected SemanticAnalyzer<P, S> SemanticAnalyzer { get; }

    public Interpreter(TextWriter externalLogger) {
        this.externalLogger = externalLogger;
    }

    public bool Interpret(string candidateSentence, out S sentence, out TokenList<T> tokens) {
        sentence = default(S);
        tokens = null;

        try {
            tokens = Tokenizer.Tokenize(candidateSentence);
            sentence = SemanticAnalyzer.Analyze(Parser.Parse(tokens));

            return true;
        } catch (UnexpectedInputException uie) {
            logger(uie.Message);
            logger("'{0}'", candidateSentence);
            logger(" {0}^", string.Empty.PadLeft(uie.AtIndex, '.'));

        } catch (UnexpectedEndOfInputException ueie) {
            logger(ueie.Message);
            logger("'{0}'", candidateSentence);
            logger(" {0}^", string.Empty.PadLeft(candidateSentence.Length, '.'));

        } catch (UnexpectedTokenException<T> pe) {
            logger(pe.Message);
            logger("'{0}'", candidateSentence);
            logger(" {0}^", string.Empty.PadLeft(pe.Unexpected.StartsAtIndex, '.'));
        }

        return false;
    }

    private void logger(string format, params object[] values) {
        externalLogger.WriteLine(format, values);
    }
}

Example

Specification

SL Backus-Naur Form (BNF)
sentence ::= sentence_letter
| ~ sentence
| ( sentence connective sentence )
 
sentence_letter ::= letter
| letter number
 
connective ::= v
| ^
| ->
| <->
 
letter ::= a
|
| z
| A
|
| Z
 
number ::= number digit
| digit
 
digit ::= 0
|
| 9

Sentential Logic (SL) is a simple formal system that consists of two semantic elements: primitive types called Sentence Letters that may either be true or false, and expressions built upon Sentence Letters called Sentences. A Sentence Letter is any character a-z (lower or upper) followed by an optional number. Sentences are defined in terms of other sentences, the simplest is a plain Sentence Letter. Any negated ¬ sentence is also a sentence. Any two sentences connected by a conjunction ∧, disjunction ∨, material conditional → or material biconditional ↔ and enclosed by parentheses () is also a sentence under SL. No other arrangement of symbols constitutes a sentence under SL.

Connectives and negation Truth Tables
  ¬   T F   T F   T F   T F
T F     T F     T T     T F     T F
F T     F F     T F     T T     F T

A truth table has a column for each of the sentence letter that show up in a sentence and a column for the sentence. Each row is one Cartesian coordinate in the space \{T, F \}^{n} where n is the number of the number sentence letters. By convention, one lists the rows in order all sentence letters being true, to the last row where all sentence letters are false.

Implementation

TerminalSyntax and TerminalSentenceLetter are subclasses of Terminal and implement the logic for determining if a certain syntax or sentence letter exists in an input string. TerminalSyntax takes in the TokenType and string to find in the input as configuration options. If one felt inclined, one could have implemented a TerminalRegularExpression class instead of two separate classes but I felt it was unnecessary.

public class SLTokenizer : Tokenizer<TokenType> {
    protected override void loadTerminals() {
        Terminals.AddRange(new Terminal<TokenType>[] {
            new TerminalSyntax(TokenType.Conjunction, "^"),
            new TerminalSyntax(TokenType.Disjunction, "v"),
            new TerminalSyntax(TokenType.LeftParen, "("),
            new TerminalSyntax(TokenType.MaterialBiconditional, "<->"),
            new TerminalSyntax(TokenType.MaterialConditional, "->"),
            new TerminalSyntax(TokenType.Negation, "~"),
            new TerminalSyntax(TokenType.RightParen, ")"),
            new TerminalSentenceLetter()
        });
    }
}

The three Nonterminal types for Connectives, Negation and Letters are subclasses of Nonterminal and implement the logic for determining if a collection of tokens matches the rules associated with each Nonterminal type.

public class SLParser : Parser<TokenType, ParsedSentence> {
    protected override void loadNonterminals() {
        Nonterminals.AddRange(new Nonterminal<TokenType, ParsedSentence>[] {
            new NonterminalSentenceConnective(this),
            new NonterminalSentenceNegation(this),
            new NonterminalSentenceLetter()
        });
    }
}

Nothing particularly exciting here, simply mapping the parsed sentences to actual sentences that will be used by the client code. Sentence implementations have the logic for evaluating themselves.

public class SLSemanticAnalyzer : SemanticAnalyzer<ParsedSentence, Sentence> {
    public override Sentence Analyze(ParsedSentence sentence) {
        ParsedSentenceLetter letter = sentence as ParsedSentenceLetter;
        if (letter != null)
            return new SentenceLetter(letter);

        ParsedSentenceUnary negation = sentence as ParsedSentenceUnary;
        if (negation != null)
            return new SentenceUnaryNegation(negation, Analyze(negation.Sentence));

        ParsedSentenceBinary connective = sentence as ParsedSentenceBinary;
        if (connective != null) {
            switch (connective.Operation.TokenType) {
                case TokenType.Conjunction:
                    return new SentenceBinaryConjunction(connective, Analyze(connective.Left), Analyze(connective.Right));
                case TokenType.Disjunction:
                    return new SentenceBinaryDisjunction(connective, Analyze(connective.Left), Analyze(connective.Right));
                case TokenType.MaterialBiconditional:
                    return new SentenceBinaryMaterialBiconditional(connective, Analyze(connective.Left), Analyze(connective.Right));
                case TokenType.MaterialConditional:
                    return new SentenceBinaryMaterialConditional(connective, Analyze(connective.Left), Analyze(connective.Right));
            }
        }

        throw new NotImplementedException();
    }
}

Nothing extra to implement in the SLInterpreter to add other than the constructor and properties.

public class SLInterpreter : Interpreter<TokenType, ParsedSentence, Sentence> {
    public SLInterpreter(TextWriter externalLogger) : base(externalLogger) { 

    }

    override protected Tokenizer<TokenType> Tokenizer {
        get { return new SLTokenizer(); }
    }

    override protected Parser<TokenType, ParsedSentence> Parser {
        get { return new SLParser(); }
    }

    override protected SemanticAnalyzer<ParsedSentence, Sentence> SemanticAnalyzer {
        get { return new SLSemanticAnalyzer(); }
    }
}

Once all of the Interpreter code has been completed, it becomes trivial to go and implement the truth table.

private void table(string input) {
    SLInterpreter interpreter = new SLInterpreter(Console.Out);
    TokenList<TokenType> tokens;
    Sentence sentence;

    if (!interpreter.Interpret(input, out sentence, out tokens))
        return;

    List<Variable> variables = tokens
       .Where((x) => x.TokenType == TokenType.Letter)
       .Select((x) => x.Lexeme)
       .Distinct()
       .Select((x) => new Variable(x))
       .ToList();

    foreach (Variable variable in variables)
        Console.Write("{0} ", variable.Identifier);
    Console.WriteLine("| {0}", input);

    for (int n = (1 << variables.Count) - 1; n >= 0; n--) {
        int N = n;
        for (int m = variables.Count - 1; m >= 0; m--) {
            variables[m].Value = (N & 1) == 1;
            N >>= 1;
        }

        IDictionary<Token<TokenType>, bool> tableRow = new Dictionary<Token<TokenType>, bool>();
        sentence.Evaluate(variables, tableRow);

        int offset = 2;
        foreach (Variable variable in variables) {
            Console.Write("{0} ", variable.Value ? "T" : "F");
            offset += 1 + variable.Identifier.Length;
        }

        Console.Write("| ");
        foreach (Token<TokenType> token in tableRow.Keys) {
            Console.CursorLeft = token.StartsAtIndex + offset;
            Console.Write(tableRow[token] ? "T" : "F");
        } Console.WriteLine();
    }
}

Written by lewellen

2009-03-01 at 12:00 am

Name that fraction

with 2 comments

How do we go about converting a floating point number q to a quotient of natural numbers n, m to a given precision p such that \gcd(n, m) = 1?

There are two ways to go about this, one of which is a subset of the other. We could use the Farey Sequence, but that is limited to mapping floating point values to fractions which fall into the range \lvert q \rvert \in [0, 1] or we can use a Stern-Brocot tree, which allows for values to fall into the range \vert q \rvert \in [0, \infty].

If we think of the Stern-Brocot tree as a list whose composition is determined by iteration, we will want to define the initial list as the domain \frac{0}{1}, \frac{1}{0}. We want to know a value that lies between these two values such that the numerator and denominator are coprime. (So that the tree we construct contains only fractions that are in minimal terms.) The mediant achieves that goal and is defined as \frac{a}{c} < \frac{a + b}{c + d} < \frac{b}{d}. If we insert the mediant between every pair in the list until the last pair is accessed we’ve completed an iteration. If we take this process ad infinitum we get the Stern-Brocot tree. For example:

\displaystyle T_{0} = \{ \frac{0}{1}, \frac{1}{0} \}
\displaystyle T_{1} = \{ \frac{0}{1}, \frac{1}{1}, \frac{1}{0} \}
\displaystyle T_{2} = \{ \frac{0}{1}, \frac{1}{2}, \frac{1}{1}, \frac{2}{1}, \frac{1}{0} \}
\displaystyle T_{3} = \{ \frac{0}{1}, \frac{1}{3}, \frac{1}{2}, \frac{2}{3}, \frac{1}{1}, \frac{3}{2}, \frac{2}{1}, \frac{3}{1}, \frac{1}{0} \}

It is worth noting that if we had instead started with \frac{0}{1}, \frac{1}{1} that we would instead be operating over the Farey Sequence where every iteration is denoted as \mathfrak{F}_{n}. Arguably, we could have gone with this sequence by inverting all q > 1 and then inverting the resolved fraction- which would have added unnecessary complexity to the end implementation.

Using the Stern-Brocot structure, we can implement a simple binary search over the list to map a floating point number to a fraction within a given precision. The following C# implementation illustrates how this can be done but it is not complete nor is it optimized in any way.

public static Fraction toFraction(this double q, int precision) {
    if (double.IsNaN(q))
        return null;
    if (double.IsPositiveInfinity(q))
        return new Fraction(1, 0);
    else if (double.IsNegativeInfinity(q))
        return new Fraction(-1, 0);
    else if (q == 0.0)
        return new Fraction(0, 0);

    bool negative = q < 0;
    double val;

    Fraction L = new Fraction(0, 1);
    Fraction R = new Fraction(1, 0);
    Fraction M = null;

    q = Math.Round(Math.Abs(q), precision);

    do {
        M = Fraction.mediant(L, R);
        val = M.getValue(precision);

        if (val < q)
            L = M;
        else if(val > q)
            R = M;
    } while (val != q);

    return new Fraction(negative ? -1 : 1 * M.N, M.D);
}
public class Fraction {
       public long N, D;

       public Fraction(long n, long d) {
           N = n; D = d;
       }

       public double value(int precision) {
           return Math.Round(N / (double)D, precision);
       }

       public static Fraction mediant(Fraction l, Fraction r) {
           return new Fraction(l.N + r.N, l.D + r.D);
       }
}

Written by lewellen

2008-07-27 at 9:30 pm

Posted in Uncategorized

Tagged with , ,