Context Free Languages (CFLs)

Context Free Languages (CFLs)

 

Pumping lemma (for CFLs)

notion image
 
There is repeated variable
Suppose that L is a CFL and G is a grammar in CNF such that L(G) = L\{λ}
  1. Every parse tree T for G is a binary tree since every production is of the form A → BC or A → σ
  1. If height(T) = k, then T has at most 2k−1 leaves (proof by induction) and, thus, yield(T) is a string of length ≤ 2^{k-1}
  1. Suppose that G has k variables. This implies that if yield(T) is “long” (length > 2k−1), then height(T) ≥ k + 1
     
     

    At least one variable must be repeated

    Find repeated variable A
    Suppose that the long string is z and the repeated variable is A.
    Then, S ∗ =⇒uAy ∗ =⇒uvAxy ∗ =⇒uvwxy = z
    This means that A ∗ =⇒vAx ∗ =⇒vvAxx ∗ =⇒v3Ax3 ∗ =⇒··· ∗ =⇒viAxi ∗ =⇒···
    notion image

    Proof Procedure

    example
    notion image
     
    Let L be a CFL. Then, there exists a positive integer n such that if z is any string in L with |z|≥ n, then we can write z = uvwxy, where
    1. |vwx|≤ n: That is the middle part is not too long
    1. vx 6= λ: Since v and x are the pieces to be pumped, this condition says at least one of the strings we pump must not be empty
    1. ∀i ≥ 0, uviwxiy ∈ L when in all kind of vwx cases
    Once again, think of it as a game
    1. Language unabitablily has n
    1. 2You choose z ∈ L such that |z|≥ n and present it by n
    1. then it has u,v,w,x and y that satisfy the properties
    1. You choose contradiction i ≥ 0 such that uviwxiy / ∈ L

    we use contradiction of math: uv^iwx^iy ∈ L

     

    Closure properties

    Closed

    • union
    • catenation
    • Kleene star
    • homomorphisms
    • inverse homomorphisms
    • reversal
    • substitution
     

    Not Closed

    • intersection - by example
      • L intersection R is CFL
    proof
    Let P be a PDA that accepts L by final state and A be a DFA that accepts R. The PDA recognizing L∩R simulates P and R simultaneously and accepts if both P and A accept the input.
    1. Like in the “Cartesian product” construction, states of the new machine are pairs of states, where one element of the pair corresponds to the state of P and the other corresponds to the state of A
    1. Whenever an input symbol is read, both P and A are simulated; if P needs to make a λ move, then A remains stationary
    • complementation Given a CFL L, its complement L is not context-free
    proof
    We have two reasons:
    1. If CFLs are closed under complement, then by De Morgan’s law and by the fact that CFLs are closed under union, CFLs would be closed under intersection. But, we know it’s not true.
    1. L = {x | x not of the form ww} is a CFL. But L = {ww | w ∈{a,b}∗} is not a CFL. • If L were a CFL, then L0 = L∩a∗b∗a∗b∗ = {aibjaibj | i,j ≥ 0} would be a CFL. But L0 is not a CFL! (Notice a∗b∗a∗b∗ is regular.)
    • difference If L1 and L2 are CFLs, then L1 \L2 is not a CFL.
    proof
    CFLs are not closed under complement, and complement is a special case of set difference. (L = Σ∗\L)

    At proving, Consider a PDA A for L. We make three copy of A, call Aeven, Aodd - Method and substitution used a lot
    non-CFL: closed under reversal - reverse the RHS of all production rules if some type of language set(ex. RL, CFL) they are closed under reversal and complementation → non-language also closed under them
    If L is a CFL and R is a regular language, then L\R is a CFL - Proof: L\R = L∩R.
     

    Substitutions

    new operation denoted by s(w)

    we replace each symbol in the strings of one language by an entire language. (a generalization of the homomorphism)
    • S(σ) is set of word
    • s(w) is catanation of s(alphabet of w)
    • s(L) is union of s(string of language)
     

    Substitution Theorem - closed property to CFL

    Substitution theorem: Let L be a CFL over Σ and s(·) be a substitution such that s(a) is a CFL ∀a ∈ Σ. Then, s(L) is a CFL.
    Proof: From s(·), we replace each terminal a by the start symbol of a CFG for language s(a).
    Let G = (V,T,S,P) be a CFG for L and Ga = (Va,Ta,Sa,Pa), for each s(a). We construct G0 = (V 0,T0,S0,P0) for s(L), where
    1. V 0 = ([ a∈Σ Va)∪V
    2. T0 = ([ a∈Σ Ta)
    3. P0 = ([ a∈Σ Pa) plus the productions of P but each terminal a occurrence in the rule is replaced by Sa
     

    Non-DCFL

    L = {aibi}∪{aib2i} for n ≥ 0 is context-free but not deterministic context-free.
    Proof: L has a CFG:
    S → S1 | S2 S1 → aS1b | λ S2 → aS2bb | λ
    and, thus, is context-free. Non-DCFL We show that there is no DPDA recognizing L. For contradiction, assume that there is a DPDA A for L; namely, L(A) = {aibi}∪{aib2i}.
    Now we construct a PDA recognizing aibici using A, which should be impossible.
    notion image
    Now we construct a PDA recognizing L∪aibici using A.
    notion image
    C is a PDA recognizing L∪aibici—contradiction. Therefore, L = {aibi}∪{aib2i} is not a DCFL.
     

    Decision problem

    We can test membership in polynomial time
    notion image

    notion image
    A CFL is defined by a CFG or is accepted by a PDA.
    1. Size of grammar is an upper bound on the number of variables, the number of productions, and the length of each production
    1. Size of a PDA is an upper bound on the number of states, the number of transitions, the number of symbols pushed into the stack in any step and so on
     

    CYK Algorithm

    Testing membership : Decide whether or not a string w in a CFL L.
    The core of CYK algorithm: Given a CFG G = (V,T,S,P) in CNF and an input string w = a1a2···an, Construct a table, where each cell Xij contains all variables A ∈ V such that A ∗ =⇒aiai+1···aj.
    A naive way to test membership takes exponential time in the size of |w|. We discuss an efficient technique based on the idea of dynamic programming: CYK algorithm:
    notion image
    Find X_45 which has double varible(x_44x_55 catenation) CNF form in righht side - if null →cannot catenation
     

    Recommendations