Computer scientists have also done a lot of research in order to
find easy mechanisms of computation which can be modified and improved
in a very straightforward way. The easyness of the mechanisms can then
be exploited in order to gain some insight on the abilities of these
mechanisms and also to be able to check with computer programs whether
such mechanisms do what they should. One of these mechanisms is that
of a deterministic finite automaton (dfa). Its task is very simple:
It should check whether a number (given as a string of its digits) or
more general whether some string of symbols belongs to some given language
L or not. This can of course only be done for languages L of a certain form.
A deterministic finite automaton consists of a set of finitely many states,
which one might number 0,1,2,... and where one might assume that
the machine is initially in the state 0.
The second ingredient of the DFA is an update function which tells
for each state and symbol read what the succeeding state is.
Furthermore, some of the states are called accepting states; the
other states are called rejecting states.
Now a DFA processes a word w by starting in the state 0 and
updating the state for each symbol in the word according to
the update function; if the state reached this way is "accepting"
then the DFA accepts the word w else the DFA rejects the word w.
Multiples of 3
An example is the well-known rule to check whether a number
is a multiple of 3. For this one starts with the value 0 in the
state. For each digit, the new value of the state is
the remainder by 3 of the sum of the old value of the state plus
the value of the digit. The whole number is a multiple of 3 iff
the state is 0 after processing all digits.
The set of states is {0,1,2};
If the symbol read is "0","3","6","9" then the new state is the old state;
if the symbol read is "1","4","7" then the new state is one plus the old
state where 2+1 is mapped to 0;
if the symbol read is "2","5","8" then the state 0 is mapped to 2,
the state 1 is mapped to 0 and the state 2 is mapped to 1.
The state 0 is accepting.
Here two sample runs.
If the word is 2557 then the state sequence is
0 (Read "2") 2 (Read "5") 1 (Read "5") 0 (Read "7") 1.
As 1 is a rejecting state, the number 2557 is not a multiple of 3.
If the word is 36543456 then the state sequence is
0 (Read "3") 0 (Read "6") 0 (Read "5") 2 (Read "4") 0 (Read "3")
0 (Read "4") 1 (Read "5") 0 (Read "6") 0.
The following button permits to try out a Javascript program implementing
this automaton.
Multiples of a number p in general
The update function can be implemented more in general to check
whether the input is the multiple of a fixed number p. The adjustment
is the following:
The set of states is {0,1,...,p-1}.
If d is the value of the digit then the state is updated by the formula
state = (state*10+d) % p.
The state 0 is accepting and the initial state.
The reason that this algorithm works are the following rules for the
remainder operation "%":
(a * b) % p = ((a % p) * b) % p ;
(a + b) % p = ((a % p) + b) % p .
Furthermore, the equation
val(wd) = val(w)*10+val(d)
on the value of strings is combined with the above rules on the remainder
to obtain the following rule:
val(wd) % p = ( (val(w) % p) * 10 + val(d) ) % d .
In the automaton, val(w) % p is the old state and val(wd) % p is the
new state. In the case of computing the remainders by 3 and 9, the
multiplication with 10 can be omitted (as 10%3 and 10%9 are both 1),
but in general, it is needed.
The following button permits to try out a Javascript program implementing
this automaton.
Note that the so implemented function also works when the number analised
has much more digits than the precision with which Javascript is computing.
Optimising the size of an automaton
One goal of automata theory is not only to construct automata, but also
to construct automata of small size. To check whether a number is a multiple
of p needs in the above default algorithm p states. For certain numbers, it
is possible to do this with much less states, an example would be to check
whether a number is a multiple of 10000. The idea is here just to count
the trailing zeroes in the input processed so far and when this number is
at least 4 then the automaton accepts; as there are only finitely many
states, the automaton does not go beyong 4 when counting.
The set of states is {0,1,2,3,4}.
If the digit processed is "0" then the new state is the maximum of 0
and the old state minus one else the new state is 4.
The state 0 is accepting and the initial state is 0.
Try out the automaton here.
For the case of checking whether a number is a multiple of 25,
the automaton can do with 5
states, but it is more complicated to be built and the saving of states
is not as large as the case of multiples of 10000. Here the
automaton:
The set of states is
{0,1,2,3,4} and the states have the following meaning.
State 0: The value of the number is 0 or the last two digits
processed were "00" or "50".
State 1: The last two digits processed were "25" or "75".
State 2: The last digit processed was "0" or "5" but the cases
of states 0 and 1 do not apply.
State 3: The last digit processed was "2" or "7".
State 4: All other cases.
The set of accepting states is {0,1} and the starting state is 0.
The following button activates the corresponding automaton.
General pattern matching
One can use finite automata not only for checking whether a number
(given as a string of digits) is a multiple of a fixed number.
Many simple patterns of numbers can be checked with a finite automaton.
For example, does a number have only two non-zero digits? Or does
a number contain a certain sequence of digits like "121"? The two
optimised automata used that the multiples of 10000 and 25 follow
in the decimal system certain patterns; they would not work in the
binary or ternary system.
The automaton to check whether the sequence
"121" has the states {0,1,2,3} where being in state k means that the
first k digits of "121" have just been processed and the automaton remains
in state 3 when it has processed at some point all 3 digits one after
the other. The starting state is 0 and the only accepting state is 3.
The state transition function can be given best by a table.
State
Successor at "1"
Successor at "2"
Successor at other digit
0
1
0
0
1
1
2
0
2
3
0
0
3
3
3
3
The button below permits to run this automaton.
Combining Automata
One could also whether there are automata which combine the search
for certain patterns. Is there, for example, an automaton which
accepts a string w exactly if (w represents a decimal number which is
a multiple of 25) or (w represents a decimal number which is a multiple
of 3 and which contains the pattern "121")? The idea is to run all
three automata in parallel and to combine their sets of states to
one set of triple states. The program to do this would look like this:
function combinedpattern()
{ var stateseq = "000"; var statea=0; var stateb=0; var statec=0;
var word = prompt("Input number to be checked"); var k;
for (k in word)
{ statea = update("mod25",statea,word[k]);
stateb = update("mod3",stateb,word[k]);
statec = update("121",statec,word[k]);
stateseq = stateseq+"-"+statea+stateb+statec; }
if ((acceptance("mod25",statea)) ||
((acceptance("mod3",stateb))&&(acceptance("121",statec))))
{ alert("DFA accepts "+word+" with statesequence "+stateseq); }
else
{ alert("DFA rejects "+word+" with statesequence "+stateseq); }
return; }
Here the function update(..,state.,word[k]) computes for the state of
the corresponding simulated dfa the new value; the first input parameter
tells the function which dfa to use and the variables statea, stateb, statec
refer to the corresponding states. The triple (statea,stateb,statec) is
then the state of the combined automaton. The initial state is (0,0,0).
The variable stateseq keeps a protocol of the states when processing the
input. The acceptance condition is that the first simulated automaton has
to accept or both the second and the third simulated automaton have to accept.
The first automaton uses the states 0,1,2,3,4 where 0,1 are accepting,
the second automaton uses states 0,1,2 where 0 is accepting, the third
automaton uses states 0,1,2,3 where 3 is accepting. So all states of
the forms x03, 0yz, 1yz are accepting. See the sections above
for a more detailed description of these autoamta.
Regular languages
A language L is called regular iff it is recognised by a finite automaton,
that is, there is a finite automaton which accepts the words in L and
rejects the words outside L. The last example of the combined pattern
showed that unions and intersections of regular languages are again
regular and that one can also form more complicated combinations of
regular languages with the result being regular again. Furthermore,
one can also show that the set-difference of regular languages is
regular. There are two more operations with languages which can be done
and which combine regular languages to a new regular language.
The concatenation of two strings x and y is just the string
obtained by putting x and y one after the other. So if x = 110 and y = 212
then xy = 110212. Furthermore, the concatenation of two languages L and H
is the set {xy: x in L and y in H}.
The Kleene Star. Given a language L, the new language L* consists
of all strings formed by concatenating finitely many words from L. Note that
the empty string is formed by concatenating 0 many words and that every member
of L is obtained by forming a concatenation consisting just of one parameter.
Then xy is the concatenation of two strings and xyz the concatenation of
three strings x,y,z and so on. Any finite concatenation of strings from L
is in L*. See the Wikipedia
page of the Kleene star for further information.
Now the regular
languages can be characterised as that class of languages which
can be formed from several base sets which are finite sets of strings
by combining these base sets with the Kleene star, the union, the intersection
and the concatenation.