Talk:Deterministic finite automaton

Technical flag

I have rewritten the intro and added the section "Accept and Generate modes", both of which aim to be a non-technical introduction. Is this sufficient to remove the technical flag? Ounsworth (talk) 18:50, 5 August 2010 (UTC)

What field?

To what field of human endeavor does this relate? Could someone put an introductory sentence in English for the rest of us? Thanks! ;) Mark Richards 06:21, 14 May 2004 (UTC)

Doesn't everyone know automata theory? :-) -- jaredwf 07:27, 14 May 2004 (UTC)

Thanks, Jaredwf. Just a stupid question, though: Finite State Machine says that it is related to 'computer science', while DFST points to 'Theory of computing'. Is this as it should be? Thanks! Mark Richards 15:39, 14 May 2004 (UTC)

I changed the finite state machine to say theory of computation, since theory of computation is a more exact answer. Thanks for noticing. -- jaredwf 15:46, 14 May 2004 (UTC)

Both are correct! Theory of Computation is a subdiscipline of Computer Science. -- Preceding unsigned comment added by Ben 1220 (talk • contribs) 09:36, 16 September 2009 (UTC)

symbols

The symbols used to describe the 5-tuple are inconsistent with those in Automata theory. Would it be better to change them to $\langle Q,\Sigma ,\delta ,q_{0},F\rangle$? My textbook also uses this notation.

I'm in favor of this change. Pkirlin 06:47, 11 November 2005 (UTC)

Agree. We should make this change. Using S conflicts with S for semigroup, using A conflicts with A for automaton, and using M conflicts with M for monoid. linas 14:17, 26 April 2007 (UTC)
Done. I have at least two books that use this format, so I went ahead and made the change. If anyone disagrees, please comment here and let me know. Thanks! ThomasOwens (talk) 00:01, 11 December 2008 (UTC)

I am a student of computer science engineering, and automata theory and formal languages is a very new subject for me. I am not able to get a proper hold of the subject and cannot understand it. Please suggest something. -- Preceding unsigned comment added by 115.248.12.253 (talk) 06:23, 6 March 2010 (UTC)

Merge proposal

Oppose. Other articles reference FSMs without the presumption of determinism. --Ancheta Wis 02:25, 3 December 2005 (UTC)

Regular languages in relation to FSM

Can someone please explain the following entry in the article in more detail? I am not sure what exactly this is in relation to FSMs (the article on regular language is just as perplexing).

The language of M can be described by the regular language given by this regular expression:

   1*(01*01*)*

yusufm 20:51, 13 April 2006 (UTC)

The section on regular language is the best place to parse this. The * is the Kleene star mentioned in that article. So 1* is the set { epsilon, 1, 11, 111, 1111, ... }. The Kleene star applied to a group of alphabet symbols, such as (01)*, is the set { epsilon, 01, 0101, 010101, ... }. Essentially it means zero, one, or more repetitions of the item.
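To make the link between the expression and the machine concrete, here is a minimal Python sketch. It assumes that M is the usual two-state automaton that tracks whether the number of 0s read so far is even; the expression 1*(01*01*)* matches exactly the binary strings with an even number of 0s. The state names "even" and "odd" and the helper names are made up for this illustration and are not taken from the article.

```python
import re
from itertools import product

# Transition table of a two-state DFA over {0, 1}.
# "even"/"odd" record the parity of the number of 0s read so far.
DELTA = {
    ("even", "0"): "odd",
    ("even", "1"): "even",
    ("odd", "0"): "even",
    ("odd", "1"): "odd",
}

def accepts(word):
    """Run the DFA; accept when an even number of 0s was read."""
    state = "even"  # start state (also the only accepting state)
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state == "even"

# The regular expression quoted in the thread.
PATTERN = re.compile(r"1*(01*01*)*")

def agrees_up_to(max_len):
    """Check that the DFA and the regex agree on all binary words up to max_len."""
    for length in range(max_len + 1):
        for letters in product("01", repeat=length):
            word = "".join(letters)
            if accepts(word) != bool(PATTERN.fullmatch(word)):
                return False
    return True
```

Calling agrees_up_to(8) compares the two descriptions on every binary string of length at most 8, which is a sanity check rather than a proof of equivalence.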

DFA search image

I've just fixed up the DFA image Image:DFA search mommy.svg so that it renders again. It's intended for string search algorithm but may be useful here. Dcoetzee 03:57, 2 April 2007 (UTC)

Missing paths - still deterministic?

The definition for DFA states that there should be one and only one transition for each pair of states and inputs. The definition for NFA only addresses the case of a single input leading to multiple, different states. What about the case where an input leads off to a "dummy state" that just loops back to itself on every input? If we just eliminate the dummy state and any inputs leading to it, does the diagram still represent a DFA? If not, is it an NFA? Or something else? Thanks, Maghnus 15:25, 29 October 2007 (UTC)

Ambiguous: The description may be slightly ambiguous in that every pair of states and inputs should have exactly one transition. As I understand it, missing transitions are unacceptable for a DFA, and hence such an automaton would be classed as an NFA. Pyrre (talk) 03:05, 23 December 2007 (UTC)

Missing paths are no big deal; the definition of what it means to run such an automaton will most likely say that if a path is missing, the automaton immediately halts (this can be simulated by adding an extra state and a new path to this state for each missing path in the original automaton). So long as there are never two paths on the same state/input pair, the automaton is deterministic. -- Carl (CBM · talk) 03:21, 23 December 2007 (UTC)

I just added a new section to mention the difference. Rp (talk) 09:47, 11 November 2015 (UTC)

It should be made clear that if a path is missing, the automaton halts and rejects the string. 2600:1700:6EC0:35C0:D5F2:3A1:B655:C5F7 (talk) 22:26, 4 September 2020 (UTC)
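The "extra state" simulation described above can be sketched in Python as follows. This is only an illustration: the function names, the dictionary encoding of the transition function, and the sink name "sink" are assumptions made here (the sink name is assumed not to clash with an existing state), and the example automaton is invented.

```python
from itertools import product

def complete_dfa(states, alphabet, delta, sink="sink"):
    """Return (states, delta) of a complete DFA by routing missing moves to a sink."""
    full = dict(delta)
    all_states = set(states)
    missing = [(q, a) for q in states for a in alphabet if (q, a) not in delta]
    if missing:
        all_states.add(sink)  # non-accepting trap state
        for a in alphabet:
            full[(sink, a)] = sink  # the sink loops on every symbol
        for key in missing:
            full[key] = sink
    return all_states, full

def run_partial(delta, start, accepting, word):
    """Partial DFA semantics: a missing transition halts and rejects."""
    state = start
    for symbol in word:
        if (state, symbol) not in delta:
            return False
        state = delta[(state, symbol)]
    return state in accepting

def run_complete(delta, start, accepting, word):
    """Complete DFA semantics: every (state, symbol) pair has a transition."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting

# Invented example: a partial DFA for the words "a", "ab", "abb", ...
PARTIAL = {("q0", "a"): "q1", ("q1", "b"): "q1"}
STATES, FULL = complete_dfa(["q0", "q1"], "ab", PARTIAL)

def same_language_up_to(max_len):
    """Compare both semantics on all words over {a, b} up to max_len."""
    return all(
        run_partial(PARTIAL, "q0", {"q1"}, "".join(w))
        == run_complete(FULL, "q0", {"q1"}, "".join(w))
        for n in range(max_len + 1)
        for w in product("ab", repeat=n)
    )
```

Because the sink is not accepting and can never be left, running the completed automaton gives the same verdict as halting and rejecting on a missing path.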

Possible error in example

In the section "Advantages and Disadvantages", the example describes the "bracket" language, i.e. properly paired brackets. This is not formally aⁿbⁿ, as stated, but "strings in which every prefix has at least as many a's as b's". Raghunandan ma (talk) 10:18, 16 August 2011 (UTC)

You are correct. It is bad language. Let me try a fix. (Ashutosh Gupta (talk) 13:34, 16 August 2011 (UTC))

Thank you. It looks better now. Also I suggest:

DFAs are equivalent in computing power to nondeterministic finite automata (NFAs). This is because, firstly, any DFA is also an NFA, so an NFA can do what a DFA can do. Also, given an NFA, one can build a DFA that recognizes the same language as the NFA, although the DFA could have an exponentially larger number of states than the NFA. Raghunandan ma (talk) 06:46, 17 August 2011 (UTC)

Your suggestion is good. Please add it yourself. I also suggest citing the powerset construction page for the algorithm that translates an NFA into a DFA. (Ashutosh Gupta (talk) 11:30, 17 August 2011 (UTC))

Have added this. Also made some small changes in the wording for union/intersection/complement at the beginning of this section. Raghunandan ma (talk) 06:53, 18 August 2011 (UTC)
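For readers of this thread, the powerset construction mentioned above can be sketched roughly as follows. This is an illustrative Python sketch for NFAs without epsilon moves, not the article's wording; the function names, the dictionary encoding, and the example NFA (words whose second-to-last symbol is 1) are assumptions chosen for the example.

```python
from collections import deque
from itertools import product

def nfa_to_dfa(nfa_delta, start, accepting, alphabet):
    """Powerset construction: each DFA state is a frozenset of NFA states."""
    start_set = frozenset([start])
    dfa_delta = {}
    dfa_accepting = set()
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        if current & accepting:  # contains at least one accepting NFA state
            dfa_accepting.add(current)
        for symbol in alphabet:
            target = frozenset(
                nxt
                for state in current
                for nxt in nfa_delta.get((state, symbol), ())
            )
            dfa_delta[(current, symbol)] = target
            if target not in seen:  # only reachable subsets are explored
                seen.add(target)
                queue.append(target)
    return dfa_delta, start_set, dfa_accepting

def dfa_accepts(dfa_delta, start, accepting, word):
    state = start
    for symbol in word:
        state = dfa_delta[(state, symbol)]
    return state in accepting

# Invented example NFA: accepts words over {0, 1} whose second-to-last symbol is 1.
NFA_DELTA = {
    ("q0", "0"): {"q0"},
    ("q0", "1"): {"q0", "q1"},
    ("q1", "0"): {"q2"},
    ("q1", "1"): {"q2"},
}
DFA_DELTA, DFA_START, DFA_ACCEPTING = nfa_to_dfa(NFA_DELTA, "q0", {"q2"}, "01")
DFA_STATES = {state for (state, _symbol) in DFA_DELTA}

def matches_spec_up_to(max_len):
    """Compare the DFA with the direct definition of the language."""
    return all(
        dfa_accepts(DFA_DELTA, DFA_START, DFA_ACCEPTING, "".join(w))
        == (len(w) >= 2 and w[-2] == "1")
        for n in range(max_len + 1)
        for w in product("01", repeat=n)
    )
```

In this small example the three-state NFA yields four reachable DFA states. The sketch only builds subsets that are reachable from the start set, so the exponential worst case shows up only for NFAs that actually need it.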

Requested move

The following discussion is an archived discussion of a requested move. Please do not modify it. Subsequent comments should be made in a new section on the talk page. No further edits should be made to this section.

The result of the move request was: Moves made, uncontested. Mike Cline (talk) 15:54, 3 December 2011 (UTC)

– These are more popular names for the automata theory concepts. Even within the articles, the objects are referred to as (non)deterministic finite automata. Ashutosh Gupta (talk) 15:04, 25 November 2011 (UTC)

Support per nom. --Ruud 20:37, 25 November 2011 (UTC)

The above discussion is preserved as an archive of a requested move. Please do not modify it. Subsequent comments should be made in a new section on this talk page. No further edits should be made to this section.

DFA cannot recognize aⁿbⁿ?

The article states that:

Many simple languages, including any problem that requires more than constant space to solve, cannot be recognized by a DFA. [...] Another simpler example is the language consisting of strings of the form aⁿbⁿ — some finite number of a's, followed by an equal number of b's.

I don't think that is correct. Consider the following state diagram:

S₀ → a → S₁ → b → S₂ (ab)

S₀ → a → S₁ → a → S₃ → b → S₄ → b → S₅ (aabb)

S₀ → a → S₁ → a → S₃ → a → S₆ → b → S₇ → b → S₈ → b → S₉ (aaabbb)

And so forth.

As long as n is limited to a specific finite number, it looks like a DFA can indeed be built for the language aⁿbⁿ. Granted, it will take a large number of states (perhaps on the order of 2ⁿ?) to embed the counting of a's and b's, but it still seems possible. Or did I miss something in the text? -- Loadmaster (talk) 17:48, 27 January 2014 (UTC)

The wording is intended to describe the language

$\{a^{n}b^{n}:n\in \mathbf{N}\}$

which has infinitely many words (indeed, exactly one of length 2n for every n≥0) and I think it does so. This is not a regular language, and cannot be recognised by a DFA. Deltahedron (talk) 19:38, 27 January 2014 (UTC)

I thought so, but the wording was not clear, so I changed it (adding "arbitrary" to indicate an unbounded n). -- Loadmaster (talk) 22:01, 27 January 2014 (UTC)
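The bounded case discussed in this thread can be illustrated with a short Python sketch. It builds a DFA for the finite language of words aᵏbᵏ with 0 ≤ k ≤ N, for a fixed bound N. The state encoding and function names are assumptions made for this example. In this particular encoding the number of states is 2N + 2, so it grows with N, and no single finite automaton covers every N at once.

```python
def build_bounded_anbn_dfa(max_n):
    """Build a complete DFA for { a^k b^k : 0 <= k <= max_n }.

    States (hypothetical encoding):
      ("a", i): i a's read so far, no b yet
      ("b", r): at least one b read, r more b's still required
      "dead":   the word can no longer be accepted
    """
    delta = {}
    for i in range(max_n + 1):
        delta[(("a", i), "a")] = ("a", i + 1) if i < max_n else "dead"
        delta[(("a", i), "b")] = ("b", i - 1) if i >= 1 else "dead"
    for r in range(max_n):
        delta[(("b", r), "a")] = "dead"
        delta[(("b", r), "b")] = ("b", r - 1) if r >= 1 else "dead"
    delta[("dead", "a")] = "dead"
    delta[("dead", "b")] = "dead"
    start = ("a", 0)
    accepting = {("a", 0), ("b", 0)}  # empty word, or all required b's consumed
    return delta, start, accepting

def accepts(dfa, word):
    delta, state, accepting = dfa
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting

def state_count(dfa):
    delta, _start, _accepting = dfa
    return len({state for (state, _symbol) in delta})

DFA_3 = build_bounded_anbn_dfa(3)
```

With the bound 3, the automaton accepts "aaabbb" but rejects "aaaabbbb", which shows why the unbounded language needs "arbitrary" n to be stated explicitly.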

Removed "Accept and Generate Modes" section

I've removed the "Accept and Generate Modes" section on the grounds that it's original research by a defunct user. A search through my textbooks, university library search engine and Google didn't reveal anyone else who's talking about generate modes for DFAs. Even if that section isn't original research, we should consider whether it lends undue weight to a rarely-discussed type of DFA. Chip Wildon Forster (talk) 18:42, 31 January 2015 (UTC)

Classifiers

The description of classifiers claims that a classifier "has more than two terminal states", which seems to imply that a DFA cannot have more than two terminal states. Should this be "more than two classes of terminal states"? A DFA can obviously have three terminal states, but just two classes (accept and reject). This is not clear from the current wording on classifiers. -- Preceding unsigned comment added by 131.174.142.107 (talk) 14:16, 9 October 2019 (UTC)

Definition of local automaton

Deterministic_finite_automaton#Local_automata says "A local automaton is a DFA for which all edges with the same label lead to a single vertex." I suggest adding "not necessarily complete" before "DFA". If we require the DFA to be complete as specified in Deterministic_finite_automaton#Formal_definition, then it is not true that local automata can accept all local languages.

For example, consider the local language $A$ consisting of all strings over $\{a,b,c\}$ that do not contain $ab$. If $M$ is a complete DFA in which all edges with the same label lead to a single vertex, then $A$ is not the language accepted by $M$. Proof: Note that $ccc\in A$ but $abc\notin A$. Let $q$ be the state of $M$ such that all edges with label $c$ lead to $q$. Then $M$ terminates at $q$ on input $abc$ and on input $ccc$, so it either accepts both or rejects both. Therefore $A$ is not the language accepted by $M$.

I have not checked the references in this section to see how they define DFA. "Local languages and the Berry-Sethi algorithm" (at https://www.irif.fr/~jep/PDF/BerrySethi.pdf) has a proof that every local language is accepted by a local automaton (Proposition 2.1). Immediately before that, its definition of local automaton includes "not necessarily complete". 2600:1700:6EC0:35C0:D5F2:3A1:B655:C5F7 (talk) 22:12, 4 September 2020 (UTC)

Done. I believe you are right. Although every DFA can be made complete by adding an "error" state, completing a local DFA will destroy locality. Also, your proof looks convincing. - Jochen Burghardt (talk) 17:10, 5 September 2020 (UTC)

Recognizing valid email addresses

The lead section mentions recognizing syntactically valid email addresses as an application for deterministic finite automata. I am not sure if this is supposed to be an in-joke: the syntax and grammar of email addresses, at least as defined by the original RFC 822, is quite complex, and using DFAs (at least in the form of regular expressions) is kind of seen as an anti-pattern, at least as far as I understand it. See e.g. the discussion at Talk:Email address#Complexity of email addresses here on Wikipedia and the top answer to the question "How to validate an email address using a regular expression?" on Stack Overflow. – Tea2min (talk) 11:37, 4 November 2020 (UTC)

Ooops! I found the sentence "... decides whether or not online user input such as email addresses are valid", and just inserted "syntactically" to indicate that no automaton can check whether a specified mailbox actually exists at the provider site. I wasn't aware of the complexity of address rules at Talk:Email_address#Complexity_of_email_addresses (the grammar there isn't even a regular one). I guess you suggest replacing this example by a simpler one, and I agree with that. What about e.g. floating point numbers, integer numbers, or identifiers in some fixed programming language, or password rules (like "at least an upper case, a lower case letter, a digit, and a punctuation char must occur")? The available images at commons:Category:Deterministic finite state automata are not too convincing for that purpose. - Jochen Burghardt (talk) 13:03, 4 November 2020 (UTC)

Identifiers in C are probably a nice example since you have a small alphabet (letters, digits, underscore). Password rules ("at least an upper case, a lower case letter, a digit, and a punctuation char must occur"), while simple to describe textually, would probably already be too big. (I think you need at least 1 + 4 + 6 + 4 + 1 states?) – Tea2min (talk) 15:17, 4 November 2020 (UTC)
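As a rough illustration of how small the identifier example is, here is a Python sketch of such a DFA. It assumes the basic rule that an identifier starts with an ASCII letter or underscore and continues with letters, digits, or underscores; it deliberately ignores reserved keywords and any extended characters a particular C standard may allow. The state and function names are invented for this sketch.

```python
import string

# Character classes; the underscore is grouped with the letters.
LETTERS = set(string.ascii_letters + "_")
DIGITS = set(string.digits)

def char_class(ch):
    """Map a character to the input class used by the DFA."""
    if ch in LETTERS:
        return "letter"
    if ch in DIGITS:
        return "digit"
    return "other"

# Only the non-rejecting transitions are listed; everything else
# goes to the implicit trap state "reject".
TRANSITIONS = {
    ("start", "letter"): "ident",
    ("ident", "letter"): "ident",
    ("ident", "digit"): "ident",
}

def is_identifier(text):
    """Return True if text has the shape of a C identifier (keywords not excluded)."""
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)), "reject")
    return state == "ident"
```

Three states (start, ident, reject) over three input classes are enough here, which is presumably what makes this a more convenient diagram than the password rules.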

Section "Complete and incomplete"

> According to the above definition, deterministic finite automata are always complete: they define a transition for each state and each input symbol.

Isn't the description wrong? I understand the sentence to mean that each state must have a transition and the automaton must have a transition for each input symbol, but as far as I know, each state must have a transition for each input symbol.

My suggestion: "they define from each state a transition for each input symbol."

Sicro --- 46.223.160.111 (talk) 09:19, 24 October 2021 (UTC)

Done. The old description is OK (if read as "given a state and an input symbol, a transition on them is defined"), but apparently can be misleading; your suggestion is clearer, so I changed the text. - Jochen Burghardt (talk) 16:13, 24 October 2021 (UTC)

Simplified diagram representation

It should be mentioned in this article that several transitions labeled with different input symbols leading from state A to state B can be combined into one transition to display the diagram in a simplified way. Technically, however, they are separate transitions, of course.

See at the top of page 3: https://www.cs.toronto.edu/~amir/teaching/csc236f15/materials/lec10.pdf

Sicro --- 46.223.160.163 (talk) 11:40, 28 October 2021 (UTC)

I agree that this should be explained somewhere. Unfortunately, diagram notation isn't explained at all in this article. Instead it links to State diagram, which handles the huge number of existing variants; thus it is not that easy to find the right place to insert your information. - Jochen Burghardt (talk) 12:29, 28 October 2021 (UTC)
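One way to picture the point about combined edges is the following Python sketch. It takes a diagram-style description, in which one edge from a source to a target carries several symbols, and expands it into the separate per-symbol transitions of the formal transition function. The encoding and the function name are assumptions made for this illustration only.

```python
def expand_combined_edges(edges):
    """Expand {(source, target): "symbols"} into {(source, symbol): target}.

    Each symbol on a combined edge becomes its own transition.
    """
    delta = {}
    for (source, target), symbols in edges.items():
        for symbol in symbols:
            if (source, symbol) in delta:
                # Two edges for the same state/symbol pair would break determinism.
                raise ValueError(f"duplicate transition for {(source, symbol)!r}")
            delta[(source, symbol)] = target
    return delta

# Invented example: the edge A -> B is drawn once but labeled "0,1".
DIAGRAM_EDGES = {
    ("A", "B"): "01",
    ("B", "B"): "0",
    ("B", "A"): "1",
}
DELTA = expand_combined_edges(DIAGRAM_EDGES)
```

The three drawn edges correspond to four formal transitions, matching the remark that a combined edge is only a display convention.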