Lexical scanners

How does a scanner work then? Is it hard to write one by hand? I will show you, and no, it is not.

Let’s start with theory.

A scanner is a finite-state machine, or more precisely a Deterministic Finite-State Machine (DFSM). The rules for its tokens are usually described with regular expressions, and the machine maps the characters in the input stream to those rules. Do not worry if this does not make sense right now.
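To make the idea of mapping characters to rules a bit more concrete, here is a minimal sketch in Python. The rule names and the patterns are my own assumptions for illustration, not taken from any particular scanner, and a real set of rules would be larger.

```python
import re

# Hypothetical token rules: each rule name is paired with a regular expression.
TOKEN_RULES = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("WHITESPACE", r"\s+"),
]

# Combine the rules into one pattern with a named group per rule.
MASTER_PATTERN = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_RULES)
)

def match_rules(text):
    """Return (rule name, matched text) pairs, skipping whitespace."""
    tokens = []
    position = 0
    while position < len(text):
        match = MASTER_PATTERN.match(text, position)
        if match is None:
            raise ValueError(f"No rule matches {text[position]!r} at position {position}")
        if match.lastgroup != "WHITESPACE":
            tokens.append((match.lastgroup, match.group()))
        position = match.end()
    return tokens
```

Calling `match_rules` on an input such as "foozie30 23.3" pairs each piece of the input with the name of the rule that matched it. The regular expression engine builds and runs the state machine for you here.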

The DFSM has a starting state, and from there you can fulfill some condition to reach another state, and another from that. If no condition can be satisfied, you simply go back to the initial state.
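The states and conditions described above could be sketched as a small transition function. The state names are hypothetical, and the machine is simplified: it only knows about letters, digits and a decimal point.

```python
# Hypothetical state names, chosen for this sketch only.
START, IDENTIFIER, INTEGER, DECIMAL = "start", "identifier", "integer", "decimal"

def next_state(state, ch):
    """Return the state reached from `state` when reading character `ch`."""
    if state == START:
        if ch.isalpha():
            return IDENTIFIER
        if ch.isdigit():
            return INTEGER
    elif state == IDENTIFIER:
        if ch.isalnum():
            return IDENTIFIER
    elif state == INTEGER:
        if ch.isdigit():
            return INTEGER
        if ch == ".":
            return DECIMAL
    elif state == DECIMAL:
        if ch.isdigit():
            return DECIMAL
    # No condition could be satisfied: go back to the initial state.
    return START

def trace(text):
    """Feed `text` through the machine and record the state after each character."""
    state = START
    states = []
    for ch in text:
        state = next_state(state, ch)
        states.append(state)
    return states
```

`trace` is only there to let you watch the machine move. A real scanner would also emit a token each time it leaves a state such as `IDENTIFIER` or `DECIMAL`.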

We will take a look at this example where we have a string: “foozie30 23.3”. A valid DFSM for this input could be drawn like this:

[Figure: DFA diagram for the example string]

The most common way of implementing this state machine in an imperative programming language is with control-flow statements like if, while and for, or any equivalents. You process each character, and when you have the complete string you return an object whose type symbolizes the kind of symbol that was found. In an object-oriented language it would be even simpler to implement, if you ask me.
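Here is a minimal sketch of such a hand-written scanner in Python, using nothing but if and while. It is not the code from my Expression parser project; the `Token` class and the kind names `IDENTIFIER` and `NUMBER` are assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass
class Token:
    kind: str  # "IDENTIFIER" or "NUMBER" (hypothetical names for this sketch)
    text: str  # the characters that make up the token

def scan(source):
    """Split `source` into identifier and number tokens, skipping whitespace."""
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1
        elif ch.isalpha():
            # Identifier: a letter followed by any letters or digits.
            start = i
            while i < len(source) and source[i].isalnum():
                i += 1
            tokens.append(Token("IDENTIFIER", source[start:i]))
        elif ch.isdigit():
            # Number: digits, optionally followed by a dot and more digits.
            start = i
            while i < len(source) and source[i].isdigit():
                i += 1
            if i + 1 < len(source) and source[i] == "." and source[i + 1].isdigit():
                i += 1
                while i < len(source) and source[i].isdigit():
                    i += 1
            tokens.append(Token("NUMBER", source[start:i]))
        else:
            raise ValueError(f"Unexpected character {ch!r} at position {i}")
    return tokens
```

For the example string, `scan("foozie30 23.3")` gives back two tokens: an identifier with the text "foozie30" and a number with the text "23.3". Each branch of the outer if corresponds to leaving the starting state, and each inner while is a state that loops back to itself.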

If you want to take a look at an implementation, you can download my Expression parser project. :)

Most compiler geeks, as they are called, use parser generators, which take a grammar and create both the scanner and the parser. A parser is far more advanced than a scanner. It is not impossible to write them by hand, as I have shown, but it is generally regarded as better to write a grammar and then let the generator create the scanner and the parser for you.

Compilers are very interesting, both in theory and in their implementations. Because the subject is so large, this article will probably not be enough for you. That is why I am planning to write more articles about compilers and the techniques that can be used while writing one.


Robert Sundström
