Pushdown Automata Homework Examples For Kindergarten

By:

Presentation on theme: "127 The Chomsky Hierarchy(review) Recursively Enumerable Sets Turing Machines Post System Markov Algorithms,  -recursive Functions Regular Expression."— Presentation transcript:

1 127 The Chomsky Hierarchy(review) Recursively Enumerable Sets Turing Machines Post System Markov Algorithms,  -recursive Functions Regular Expression Context-sensitive Languages Context-free Languages Regular Languages Linear-bounded Automata Pushdown Automata Finite State Automata.......... LanguagesMachinesOther Models

2 128 Proof of the Proper Containment of the Language Hierarchy Now we will prove the vertical relationship (i.e., the proper containment) of the Chomsky hierarchy. We only prove the lower two levels. (Refer the text or other related books for proofs of the remaining upper two levels.) It is trivial to prove that every regular language is also a context-free language, and every language accepted by a finite state machine can also be accepted by a pushdown automaton. Hence, for the proof it is enough to show a context-free language that is not regular. To do this we first show the following popular, so called, pumping lemma. Pumping lemma. Let L be a regular language. Then there exists a constant n such that, for every string z  L with |z|  n, there exist u,v,w   * such that z = uvw with |uv|  n and |v|  1, such that for all i  0, uv i w  L.

3 129 Proof of the lemma. If L is regular, then it is accepted by a DFA M = (Q, , , q 0, F ). Let |Q| = n, and consider a string z   * of length m  n. Let z = a 1 a 2... a m, and for every i, 1  i  m, let  ( q 0, a 1 a 2... a i ) = q i. Since n  m, i.e., the number of states is less than or equal to the length of the string, there should be two integers j and k, 0  j < k  n, such that q j = q k. (We can prove this using the pigeonhole principle or proof by induction.) This implies that path q j... q k is a cycle of the state transition graph. Let u = a 1 a 2... a j, v = a j+1 a j+2... a k and w = a k+1 a k+2... a m. Obviously, |uv|  n and |v|  1. Clearly, if a 1 a 2... a m = uvw  L (i.e, q m is in F), then we have a 1 a 2... a j (a j+1... a k ) i a k+1... a m = uv i w  L, for all i  0. amam akak a1a1 a2a2 ajaj a j+1 qkqk q0q0 q1q1 qjqj qmqm.....

4 130 Application of the Pumping Lemma We need to understand clearly what the lemma says before we use it. The lemma can be rewritten as follows. It shows important logical segments to work with. (1) For all regular languages L, (2) there exists a constant n such that (3) for all z  L with |z|  n, (4) there exist u, v and w such that (i) z = uvw, (ii) |uv|  n, (iii) |v|  1, and (iv) for all i  0, uv i w  L.

5 131 Application of the Pumping Lemma Application Example. L = {a i b i | i  0 } is not regular. Proof (by contradiction). Suppose L is regular. Then it should satisfy parts (2) - (4) of the pumping lemma. Let n be the constant (of part (2)) and pick z = a n b n from L. Clearly, z meets the length condition of part (3). Now, we examine if part (4) above is true for z. If not, L does not satisfies the lemma, which implies that L cannot be a regular language. Consider any u, v and w such that Notice that by conditions (i), (ii) and (iii) above, v = a j, 1  j  n, i.e., string v contains only a’s. We do not know exactly how many a’s are in v. It is enough to know that it has only a’s. Now, consider string z' = uv 2 w = uvvw. Clearly, z' has more a’s than b’s, and hence, z' cannot be a member of L. We are in a contradiction. L is not regular. (i) z = a n b n = uvw (ii) |uv|  n (iii) |v|  1. aa.........abb.........b uvw nn

6 132 Ruminating on the Application of the Pumping Lemma Recall that, to disprove a statement which uses a universal quantifier (for example “for all z”), it is enough to find one counter example. The lemma uses two universal quantifiers in parts (3) and (4)-(iv). For these parts, it is enough to pick one counter example, because our objective is to disprove the statement of the lemma. We picked z = a n b n for part (3), and i = 2 for part (4)-(iv). You may choose others if you like. But for these parts, one counter example is enough. To disprove a statement which uses an existential quantifier (for example “there is n..”, or “there exists u, v, w …”), we need to consider all such cases. Hence, in the proof we use n like a variable. Whatever the value of the constant n is, we are using that value of n. Notice that for u, v and w, we should consider all possible u, v and w which satisfy conditions (i), (ii) and (iii).

7 133 Question: Is the language L = {xyx | x = abc, y  {a,b,c} * } regular? Prove your answer. Answer (wrong): L is not regular. Suppose that it is, and let n be the constant of the pumping lemma. Consider z = abcyabc, where y  {a,b,c} * such that |y| > n. Obviously, |z|  n. Let z = uvw, where u = a, v = b and w = cyabc. Then uv 2 w = abbcyabc is not in L! Contradiction! What is wrong with this proof? Actually L is a regular language. We can easily construct an FA which accepts L. Ruminating on the Application of the Pumping Lemma (cont’ed) The following example shows a typical case of “malicious” application of the pumping lemma, which we frequently come across while grading tests for this course.

8 134 A non-regular language which satisfies the pumping lemma We learned that all regular languages satisfy the pumping lemma. We may ask the following questions: Is there a non-regular language which satisfies the pumping lemma? The answer is positive. Here is an example. Let L 1 = { x| x  {a, b} *, and x has no substring of the form aa or bb}. L 1, which is regular, can also be expressed in terms of a regular expression, for example; (ab) * + b(ab) * + (ab) * a +(ba) *. Note that L 1 has strings of all lengths. Let L 2 = { x| x  {a, b} * and |x| is a prime }, and let = {a,b} * - L 1, the complement of L 1. Consider the language; L = (L 1  L 2 ) . We will show that L is not regular and satisfies the pumping lemma. Suppose L is regular, then L - must be regular because is regular. Use the pumping lemma with a z  L - by pumping z = uvw up to z' = uv p+1 w, where p = |z|. Let j = |v|. Then we have |z'| = | uv p+1 w| = |uvw| + |v p | = p + jp = p(j+1), which is not prime. (Notice that p is a prime number.) This implies that z is not in L -. Hence, L - is not regular. We are in a contradiction. It follows that L cannot be regular. L1L1 L1L1 L1L1 L1L1 L1L1 L1L1 L1L1

9 135 Now, we show that L satisfies the pumping lemma. We choose n = 3 as the constant of the lemma. For z  L such that |z|  3, let z = c 1 c 2 c 3 x, where c 1, c 2, c 3  {a, b}, and x   *. We will show that there exist u, v, w   * such that (i) z = uvw, (ii) |uv|  n, (iii) |v|  1, (iv) uv i w  L, for all i  0. Consider the following three cases depending on c 1, c 2, and c 3, : Case 1: c 1  c 2  c 3. Since c 1, c 2, c 3  {a, b}, we have c 1 = c 3. Pump c 2, i.e., let v = c 2. Clearly, for all i  0, z' = uv i w is in L (actually, is in except for the case i = 1). (Notice that in this case z can be either in L 1  L 2 or.) Case 2: c 1 = c 2. Then z is in. Let v = c 3. Then z' = uv i w is in L (actually, in ), for all i  0. Case 3: c 2 = c 3. Let v = c 1. Then z' = uv i w is in L, for all i  0 as in Case 2. L1L1 L1L1 L1L1 L1L1

10 136 The Pumping lemma for context-free languages (uvwxy Theorem) Theorem. For every CFL L whose size is not finite, there is a positive integer p such that, for every z  L with |z|  p, there exist u, v, w, x, y   * such that z = uvwxy with |vwx|  p and |vx|  1 such that for all i  0, uv i wx i y  L. This lemma can be rewritten as shown below to show the logical context. Notice that in this lemma p has the same role as n has in the pumping lemma for regular languages, and v and x, the “pumping sites”, can be anywhere in z. Recall that the “pumping site” v in the regular language pumping lemma can only be located in the prefix of z of length n. (1) For all context-free language L, (2) there exists a constant p such that (3) for all z  L such that |z|  p, (4) there exist u, v, w, x, y, and z such that (i) z = uvwxy, (ii) |uwx|  p, (iii) |ux|  1, and (iv) for all i  0, uv i wx i y  L.

11 137 Proof of the Pumping lemma for context-free languages Proof. (Sketch. See the text for detailed proof.) Consider the following CFG in the Chomsky normal form. S  AB A  DE B  FD E  FG G  DB | g F  HG D  d H  h The derivation tree for string z = d(hd) 3 hg(d) 3 dghdghd is given in the following page. Notice the pattern of nonterminal sequence F-G-B-F-G-B-F-G-B which occurs on the longest stem of the parse tree. It has a repeating pattern of F-G-B, which has leaves “hd” to the left and “d” to the right. It is easy to see that we can “grow” the stem by adding as many repeating patterns of F-G-B as we want or cut the it short by deleting the pattern. The resulting tree is still a parse tree of the grammar. It follows that the grammar has a parse tree for z' = d(hd) i hg(d) i dghdghd, for all i  0, which implies that z' is also in the language of the grammar.

12 138 Z = d (h d) 3 h g (d) 3 d h g d h g d u v w x y SABADEEFG GDB | gFHG DdHh

13 139 In general, consider a parse tree of a CFG in CNF whose height is greater than the number of nonterminal symbols of the grammar. Along the longest stem of the tree, at least one nonterminal symbol, say A, appears more than once. Suppose that this parse tree derives a sting z. Clearly, repeating (or deleting) the pattern from A to A along the stem will result in another parse tree of the grammar which derives string z' = uv i wx i y, i  0, where uvwxy = z, |vx|  1 and |vwx|  |V N |, the number of nonterminal symbols. Proof of the Pumping lemma for context-free languages(cont’ed)

14 140 Applications of the pumping lemma for CFL's Example: The language L = { a n b n c n | n  0 }is not a CFL. Suppose L is a CFL, and consider z = a p b p c p, where p is the constant of the pumping lemma for CFL's. Clearly, z is in L. Let z = uvwxy, where u,v,w,x and y are some substrings of z which satisfy conditions (i) - (iii) above. Since |vwx|  p, substrings v and x together contain at most two different symbols (see the illustration below). Since |vx|  1, v and x together have at least one symbol. It follows that z' = uv 2 wx 2 y does not have the same number of a's, b's and c's. Hence, z' is not in L. This contradicts the lemma. Thus language L is not a CFL. (i) z = a p b p c p = uvwxy (ii) |vwx|  p (iii) |vx|  1. aa.........abb.........bcc......... c u vwx ppp y

15 141 Notice the difference between the pumping lemma for regular languages and the lemma for CFL's. With the pumping lemma for regular languages, you have privilege of pumping only on the prefix of length n of the string z that you have chosen. With the pumping lemma for CFL's, you can no longer confine the “pumpable site” within the prefix of z of length p. Notice that vwx, under the condition that |uvx|  p, can be positioned anywhere because there is no restriction on the length of the strings u and y. Hence, in general, we should deal with more cases that result from pumping a string of a CFL than those when we pump a string of a regular language. To deal with this difficulty, they developed more complex pumping lemmas for CFL's (for example, see the Ogden's lemma in “Introduction to Automata Theory, Languages, and Computation by Hopcroft, Motwani and Ullman). Ruminating on the Application of the Pumping Lemma for Contex-free Languages

Presentation on theme: "컴파일러 입문 제 3 장 정규 언어."— Presentation transcript:

1 컴파일러 입문제 3 장정규 언어

2 목 차 3.1 정규 문법과 정규 언어 3.2 정규 표현 3.3 유한 오토마타 3.4 정규 언어의 속성
목 차3.1 정규 문법과 정규 언어3.2 정규 표현3.3 유한 오토마타3.4 정규 언어의 속성Regular Language

3 정규 문법과 정규 언어A study of the theory of regular languages is often justified by the fact that they model the lexical analysis stage of a compiler.Type 3 Grammar(N. Chomsky)RLG : A → tB, A → tLLG : A → Bt, A → twhere, A,B ∈ VN and t ∈ VT*.It is important to note that grammars in which left-linear productions are intermixed with right-linear productions are not regular.For example,G : S → aR S → c R → SbL(G) = {ancbn | n  0} is a cfl.Regular Language

4 Definition (1) A grammar is regular if each rule is
i) A ® aB, A ® a, where a Î VT, A, B Î VN.ii) if S ® ε Î P, then S doesn't appear in RHS.우선형 문법 A ® tB, A ® t의 형태에서 t가 하나의 terminal로 이루어진 경우로 정규 문법에 관한 속성을 체계적으로 전개하기 위하여 바람직한 형태이다.(2) A language is said to be a regular language(rl) if it can be generated by a regular grammar.ex) L = { anbm| n, m ≥1 } is rl.S ® aS | aAA ® bA | bRegular Language

5 ⇒ These forms of productions can be easily removed. (Text pp.175-181)
[Theorem] The production forms of regular grammar can be derived from those of RLG.(RLG => RG) (Text p.69)(proof)A ® tB, where t Î VT*.Let t = a1a2... an, ai Î VT.A ® a1A1A1 ® a2A2.An-1 ® anB.If t = e, then A ® B (single production) or A ® e (epsilon production).⇒ These forms of productions can be easily removed.(Text pp )ex) S ® abcA ⇒ S ® aS1, S1 ® bS S2 ® cAA ® bcA ⇒ A ® bA1, A1 ® cAA ® cd ⇒ A ® cA1', A1' ® dRegular Language

6 Equivalence 1. 언어 L은 우선형 문법에 의해 생성된다. 2. 언어 L은 좌선형 문법에 의해 생성된다.
정규 언어[예] L = {anbm | n,m ≥ 1} : rlS ® aS | aAA ® bA | bRegular Language

7 토큰의 구조를 정의하는데 정규 언어를 사용하는 이유
(1) 토큰의 구조는 간단하기 때문에 정규 문법으로 표현할 수 있다.(2) context-free 문법보다는 정규 문법으로부터 효율적인 인식기를 구현할 수 있다.(3) 컴파일러의 전반부를 모듈러하게 나누어 구성할 수 있다.(Scanner + Parser)문법의 형태가 정규 문법이면 그 문법이 나타내는 언어의 형태를 체계적으로 구하여 정규 표현으로 나타낼 수 있다.if G = rg, L: re.Regular Language

8 정규 표현A notation that allows us to describe the structures of sentences in regular language.The methods for specifying the regular languages(1) regular grammar(rg)(2) regular expression(re)(3) finite automata(fa)Regular Language

9 Definition :A regular expression over the alphabet T and the language denoted by that expression are defined recursively as follows :I. Basis : f , e , a Î T.(1) f is a regular expression denoting the empty set.(2) e is a regular expression denoting {e}.(3) a where a Î T is a regular expression denoting {a}.II. Recurse : + , • , *If P and Q are regular expressions denoting Lp and Lqrespectively, then(1) (P + Q) is a regular expression denoting Lp U Lq. (union)(2) (P • Q) is a regular expression denoting Lp  Lq. (concatenation)(3) (P*) is a regular expression denoting (closure){e} U Lp U Lp2 U ... U Lpn ...Note : precedence : + < • < *II. Nothing else is a regular expression.Regular Language

10 (1) L(a*) = {e, a, aa, aaa, … } = {an | n  0}
ex) (0+1)* denotes {0,1}*.(0+1)*011 denotes the set of all strings of 0s and 1sending in 011.Definition : if α is α regular expression, L(α) denotes the language associated with α. (Text p.72)Let a and b be regular expressions. Then,(1) L(α+ β) = L(α)  L(β)(2) L(α β) = L(α) L(β)(3) L(α*) = L(α)*examples :(1) L(a*) = {e, a, aa, aaa, … } = {an | n  0}(2) L((aa)*(bb)*b) = {a2nb2m+1| n,m  0}(3) L((a+b)*b(a+ab)*) --- 연습문제 3.2 (3) - text p.115= { b, ba, bab, ab, bb, aab, bbb, … }Regular Language

11 A1. α+β = β+α A2. (α+β) +γ = α+ (β+γ)
Definition : Two regular expressions are equal if and only if they denote the same language.α= β if L(α) = L(β).Axioms : Some algebraic properties of regular expressions. Let a, b and g be regular expressions. Then, (Text p.73)A1. α+β = β+α A2. (α+β) +γ = α+ (β+γ)A3. (αβ) γ = α (βγ) A4. α(β+γ) = αβ +αγA5. (β + γ) α = βα + γα A6. α+α=αA7. α + f = α A8. αf = f = fαA9. e α = α = α e A10. α* = e +α•α*A11. α* = (e + α)* A12. (α* )* = α*A13. α* + α = α * A14. α* + α+ = α*A15. (α + β)* = (α* β *) *Regular Language

12 Definitions : regular expression equations.
All of these identities(=Axioms) are easily proved by the definition of regular expression.A8. αf = f = f α(proof) αf = { xy | x Î Lα and yÎ Lf }Since y Î Lf is false, (x Î Lα and y Î Lf) is false.Thus αf = f .Definitions : regular expression equations.::= the set of equations whose coefficient areregular expressions.ex) α,β가 정규 표현이면, X = αX+β가 정규 표현식이다. 이때, X의 의미는 nonterminal 심볼이며 우측의 식이 그 nonterminal이 생성하는 언어의 형태이다.Regular Language

13 ▶ The solution of the regular expression equation
X = αX + βWhen we substitute X = α*β in both side of the equation, each side of the equation represents the same language.X = αX + β= α(α*β) + β= αα*β + β = (αα* + ε)β = α*β.fixed point iterationX = αX + β= α(αX + β) + β= α2X + αβ + β = α2X + (ε + α)β.= αk+1X + (ε + α + α αk )β= (ε + α + α αk + ...)β = α*β.Regular Language

14 Not all regular expression equations have unique solution.
X = αX + β(a) If ε is not in α, then X = α*β is the unique solution.(b) If ε is in α, then X = α*(β + L) for some language L.So it has an infinity of solutions.⇒ Smallest solution : X = α*β.ex) X = X + a : not unique solution⇒ X = a + b or X = b*a or X = (a + b)* etc.X = X + a X = X + a= a + b + a = b*a + a= a + a + b = (b* + ε) a= a + b = b*aRegular Language

15 Finding a regular expression denoting L(G) for a given rg G.
L(A) where A  VN denotes the language generated by A.By definition, if S is a start symbol, then L(G)= L(S).Two steps :1. Construct a set of simultaneous equations from G.A ® aB, A ® aL(A) = {a}·L(B) U {a}  A = aB + aIn general, X ® α |β| γ ⇒ X = α + β + γ.2. Solve these equations.X = αX + β Û X = α*β.if G = rg, L: re.Regular Language

16 ex2) S ® aA | bB | b A ®bA | ε B ®bS
ex1) S ® aS S ® bR S ® ε R ® aSL(S) = {a}L(S) U {b}L(R) U{ε}L(R) = {a}L(S)ree: S = aS + bR + εR = aSS = aS + baS + ε= (a + ba)S + ε= (a + ba)* ε = (a + ba)*ex2) S ® aA | bB | b A ®bA | ε B ®bSree: S = aA + bB + bA = bA + ε ⇒ A = b*ε = b*B = bS S = ab* + bbS + b= bbS + ab* + b= (bb)*(ab*+b)Regular Language

17 ex7) A1 = (01* + 1) A1 + A2 ex8) A ® aB | bA
ex3) A ® 0B | 1A ex4) S ® aA | bSB ® 1A | 0C A ® aS | bBC ® 0C | 1C | ε B ® aB | bB | εex5) S ® 0A | 1B | 0 ex6) X1 = 0X2 + 1X1 + εA ® 0A | 0S | 1B X2 = 0X3 + 1X2B ® 1B | 1 | X3 = 0X1 + 1X3ex7) A1 = (01* + 1) A1 + A2 ex8) A ® aB | bAA2 = A1 + 00A B ® aB | bCA3 = A1 + A2 + ε C ® bD | aBD ® bA | aB |εex9) X ® α1X + α2Y + α3 ex10) PR ® b DL SL eY ® β1X + β2Y + β DL ® d ; DL | ε SL ® SL ; s | sRegular Language

18 인식기(Recognizer)☞ A recognizer for a language L is a program that takes as input string x and answers “yes ” if x is a sentence of L and “no ” otherwise.Turing MachineLinear Bounded AutomataPushdown AutomataFinite AutomataRegular Language

19 유한 오토마타 Definition : fa where, Q : finite, non-empty set of states.
A finite automaton M over an alphabet  is a system (Q, , , q0, F)where, Q : finite, non-empty set of states. : finite input alphabet. : mapping function.q0  Q : start(or initial) state.F ⊆ Q : set of final states.mapping  : Q x  ® 2Q.i,e. (q,a) = {p1, p2, ... , pn}DFA , NFA.G = (VN, VT, P, S)re : f, e, a, + , • , *M = (Q, , , q0, F)Regular Language

20 목차 - FA 1. DFA 2. NFA 3. Converting NFA into DFA 4. Minimization of FA
5. Closure Properties of FARegular Language

21 1. Deterministic Finite Automata(DFA)
deterministic if (q,a) consists of one state.We shall write "(q,a) = p " instead of (q,a) = {p} if deterministic.If δ(q,a) always has exactly one number,We say that M is completely specified.extension of  : Q x  ⇒ Q x *(q,  ) = q(q,xa) = ((q,x),a), where x  * and a  .A sentence x is said to be accepted by Mif (q0, x) = p , for some p  F.The language accepted by M :L(M) = { x | (q0,x)  F }Regular Language

22  p q r 1 Input symbols ex) M = ( {p, q, r}, {0, 1}, , p, {r} )
 : (p,0) = q (p,1) = p(q,0) = r (q,1) = p(r,0) = r δ(r,1) = r1001 Î L(M) ?(p,1001) = (p,001) = (q,01) = (r,1) = r  F．∴  L(M).1010 Î L(M) ?(p,1010) = (p,010) = (q,10) = (p,0) = q  F.∴  L(M). : matrix 형태로  transition table.ex)pqr1Input symbolsRegular Language

23 Definition : State (or Transition) diagram for automaton.
The state diagram consists of a node for every stateand a directed arc from state q to state p with labela   if (q,a) = p.Final states are indicated by a double circle and the initial state is marked by an arrow labeled start.Regular Language

24 Algorithm : w  L(M). ? assume M = (Q, , , q0, F); begin
currentstate := q0; (* start state *)get(nextsymbol);while not eof dobegin currentstate := (currentstate, nextsymbol);get(nextsymbol)end;if currentstate in F then write(‘Valid String’)else write(‘Invalid String’);end.?Regular Language

25 2. Nondeterministic Finite Automata(NFA)
nondeterministic if (q,a) = {p1, p2, ..., pn}In state q, scanning input data a, moves input head one symbol right and chooses any one of p1, p2, ..., pn as the next state.ex) NFA (Nondeterministic Finite Automata)M = ( {q0,q1,q2,q3,qf}, {0,1}, , q0, {qf} )if (q,a) = f, then (q,a) is undefined.Regular Language

26 To define the language recognized by NFA, we must extend .
(i)  : Q x * → 2Q( q, ε ) = { q }( q, xa ) = U (p,a), where a  VT and x  VT*.p  ( q, x )(ii)  : 2Q x * → 2Q({p1, p2, ..., pk}, x) =Definition : A sentence x is accepted by Mif there is a state p in both F and (q0, x).ex)  L(M) ?(q0, 1011) = ({q1,q3}, 011) = ({q1,q2},11)= ({q1,q3},1) = {q1,q3,qf} 1011  L(M) ( ∵ {q1,q3,qf} ∩ {qf}  Φ)ex)  L(M) ?Regular Language

27 Nondeterministic behavior
q0q q3q q q q q q qfIf the number of states |Q| = m and input length |x| = n, then there are mn nodes.In general, NFA can not be easily simulated by a simple program, but DFA can be simulated easily.And so we shall see DFA is constructible from the NFA.Regular Language

28 3. Converting NFA into DFA
NFA : easily describe the real world.DFA : easily simulated by a simple program.===> Fortunately, for each NFA we can find a DFA acceptingthe same language.Accepting Sequence(NFA)(q0, a1a2 ... an) = ({q1,q2, … ,qi}, a2a3 ... an)= ({p1,p2, … ,pj}, ai ... an)= {r1,r2, ... ,rk}Since the states of the DFA represent subsets of the set of all states of the NFA, this algorithm is often called the subset construction.Regular Language

29 [Theorem] Let L be a language accepted by NFA. Then
there exists DFA which accepts L. Text p.86(proof) Let M = (Q, , , q0, F) be a NFA accepting L.Define DFA M' = (Q', , ', q0', F') such that(1) Q' = 2Q, {q1, q2, ..., qi} ∈ Q', where qi ∈ Q.denote a set of Q' as [q1, q2, ..., qi].(2) q0' = {q0} = [q0](3) F' = {[r1, r2, ..., rk] | ri ∈ F}(4) ' : ' ([q1, q2, ...,qi], a) = [p1, p2, ..., pj]if ({q1, q2, ..., qj}, a) = {p1, p2, ..., pj}.Now we must prove that L(M) = L(M’) i.e,' (q0',x)  F' Û (q0, x) ∩ F  f.we can easily show that by inductive hypothesis on the length of the input string x.Regular Language

30 ex1) M = ({q0,q1}, {0,1}, , q0, {q1}),Þ dfa M' = (Q', , ', q0', F'),where Q' = 2Q = {[q0], [q1], [q0,q1]}q0' = [q0]F' = {[q1], [q0,q1]}δ' :δ'([q0],0) = δ({q0},0) = {q0,q1} = [q0,q1]δ'([q0],1) = {q0} = [q0]δ' ([q1],0) = δ(q1,0) = fδ' ([q1],1) = δ(q1,1) = {q0,q1} = [q0,q1]δ' ([q0,q1],0) = δ({q0,q1},0) = {q0,q1} = [q0,q1]δ' ([q0,q1],1) = δ({q0,q1},1) = {q0,q1} = [q0,q1]d1q{q, q}fRegular Language

31 State renaming : [q0] = A, [q1] = B, [q0,q1] = C.
Since B is an inaccessible state, it can be removed.Regular Language

32 Definition : we call a state p accessible if there is w such that (q0, w) Þ (p, ε) , where q0 is the initial state.ex2) NFA Þ DFANFA : q0 {q1,q2} {q1,q3}q1 {q1,q2} {q1,q3}q2 {qf} q3  {qf}qf {qf} {qf}DFA : ’q0 q1q2 q1q3q1q2 q1q2qf q1q3q1q3 q1q2 q1q3qfq1q2qf q1q2qf q1q3qfq1q3qf q1q2qf q1q3qf*Regular Language

33 ∪ Definition :  - NFA M = (Q, , , q0, F)  : Q  (   {} )  2Q
 - CLOSURE : 을 보고 갈 수 있는 상태들의 집합 s가 하나의 상태-CLOSURE(s) = {s}{q|(p, )=q, p  -CLOSURE(s)} T가 하나 이상의 상태 집합인 경우-CLOSURE(T) =ex)  - NFA에서 CLOSURE를 구하기CLOSURE (A) = {A, B, D}CLOSURE({A,C}) = CLOSURE(A)  CLOSURE(C) = {A, B, C, D}-CLOSURE(q)∪q∈TRegular Language

34  Ex)  - NFA Þ DFA A = [1,3,4], B = [2], C = [3,4], D = [4] a b  c
CLOSURE(1) = {1,3,4} [1,3,4]aCLOSURE(2) = {2} [2]bCLOSURE(3) = {3,4} [3,4]c[2]CLOSURE(4) = {4} [4][3,4][4]Regular Language

35 4. Minimization of FA State minimization => state merge
Definition :ω Î * distinguishes q1 from q2 if (q1,ω) = q3,(q2,ω) = q4 and exactly one of q3, q4 is in F.Algorithm : equivalence relation(º) ⇒ partition.(1) º : final state인가 아닌 가로 partition.(2) º : input symbol에 따라 다른 equivalence class 로 가는가?그 symbol로 distinguish 된다고 함.:(3) º : 더 이상 partition이 일어나지 않을 때까지.Þ The states that can not be distinguished are merged into asingle state.Regular Language

36 º : {A,F}, {B, C, D, E} : 처음에 final, nonfinal로 분할한다.
Ex)º : {A,F}, {B, C, D, E} : 처음에 final, nonfinal로 분할한다.º : {A,F}, {B,E}, {C,D} : {B, C, D, E} 가 input symbol b에 의해partition 됨º : {A,F}, {B,E}, {C,D}.Regular Language

37 How to minimize the number of states in a fa.
<step 1> Delete all inaccessible states;<step 2> Construct the equivalence relations;<step 3> Construct fa M’ = (Q’, , ’, q0’, F’),(a) Q’ : set of equivalence classes under º Let [p] be the equivalence class of state p under º.(b) ’([p],a) = [q] if (p,a) = q.(c) q0’ is [q0].(d) F' = {[q] | q Î F}.Definition : M is said to be reduced.if (1) no state in Q is inaccessible and(2) no two distinct states of Q are indistinguishableRegular Language

38 º : {A, B, C, D}, {E, F} º : {A}, {C}, {B, D}, {E, F}
ex) Find the minimum state finite automaton for the language specified by the finite automaton M = ({A,B,C,D,E,F}, {0,1}, , A, {E,F}),where  is given byº : {A, B, C, D}, {E, F} º : {A}, {C}, {B, D}, {E, F}Regular Language

39 Programming <연습문제 3.20> --- 교과서 121쪽 Data Structure Input Design
Regular Language

40 5. Closure properties of FA
[Theorem] If L1 and L2 are finite automaton languages (FAL),then so are (i) L1 U L2 (ii) L1 • L2 (iii) L1*.(proof) M1 = (Q1, , 1, q1, F1)M2 = (Q2, , 2, q2, F2), Q1 Ç Q2 = f (∵ renaming)(i) M = (Q1 U Q2 U {q0}, , , q0, F)where, (1) q0 is a new state.(2) F = F1 U F2 if e  L1 U L2.F1 U F2 U {q0} if e Î L1 U L2.(3) (a) (q0,a) = (q1,a) U (q2,a) for all a Î .(b) (q,a) = 1(q,a) for all q Î Q1, a Î .(c) (q,a) = 2(q,a) for all q Î Q2, a Î .새로운 시작 상태를 만들어 각각의 fa에 마치 각 fa의 시작 상태에서 온 것처럼 연결한다. 그리고 e 를 인식하면 새로 만든 시작 상태도 종결 상태로 만든다.ex) p.98 [예 28]Regular Language

41 (ii) M = (Q1 U Q2, , , q0, F) (1) F = F2 if q2 Ï F2
F1 U F2 if q2 Î F2(2) (a) (q,a) = 1(q,a) for all q Î Q1 - F1.(b) (q,a) = 1(q,a) U 2(q2,a) for all q Î F1.(c) (q,a) = 2(q,a) for all q Î Q2.M1의 종결 상태에서 M2의 시작 상태에서 온 것처럼 연결한다. 그리고 M1의 시작 상태가 접속한 오토마타의 시작 상태가 된다.Regular Language

42 ※ re ===> fa : scanner generator
정규 언어의 속성※ re ===> fa : scanner generatorRegular Language

43 목 차 1. RG & FA 2. FA & RE 3. Closure Properties of Regular Language
4. The Pumping Lemma for Regular LanguageRegular Language

44 1. RG & FAGiven rg, there exists a fa that accepts the same language generated by rg and vice versa.rg Þ faGiven rg, G = (VN, VT, P, S) , construct M = (Q, , , q0, F).(1) Q = VN U {f}, where f is a new final state.(2)  = VT.(3) q0 = S.(4) F = {f} if e Ï L(G)= {S, f} otherwise.(5)  : if A ® aB  P then (A,a) ' B.if A ® a  P then (A,a) ' f.Regular Language

45 (proof) If  is accepted by fa then it is accepted in some sequence of
moves through states, ending in f.But if (A,a) = B and B ¹ f , then A ® aB is a productions.Also if (A,a) = f then A ® a is a production.So we can use the same series of productions to generate  in GThus S => .ex) p.101 [예 29]*Regular Language

46 fa Þ rg Given M = (Q, , , q0, F), construct G = (VN, VT, P, S).
(1) VN = Q(2) VT = (3) S = q0(4) P : if (q,a) = r then q ® ar.if p Î F then p ® e.ex)Regular Language