No algorithgm - make higher binding strength terminal to lower part of tree
- no grouping of sequences - make precedence grouping symbol
- Since there is no precedence - solve by binding strength factor : cannot be broken apart by an adjacent term : cannot be broken by small binding strength expressions : all Identifier : operand for operation like terminal
- various left / right most derivation imply various parse tree
→ left, right distinct derivation go along
Simplifications
- Remove λ-productions
- Remove unit productions
- Remove redundant symbols - unreachable, non-terminating
We only consider CFLs that do not have the empty string λ since it makes easy to understand many claims and proofs. This does not lose generality
- just by adding new start symbol method
math: S_0 \rightarrow S \, | \, \lambda
- To any nontrivial results, there is method of obtaining G'
math: L(G') = L(G) \, \backslash \, {\lambda}
Substitution rule
G' constructed by replacing variable A,B with A' in G has same CFL - by first and second line, we can construct final line (y can be sentential form)
Proof - easy
- if
math: w \in L(G),
thenmath: w \in L(G')
- by assumption,
math: S →^*_G w
- if this derivation does not use production
math: A → x_1Bx_2
, it is immediate thatmath: S \rightarrow^*_{G'} w
- if it does, at G,
math: S →^*_G u_1Au_2 →_G u_1x_1Bx_2u_2 →_G u_1x_1y_jx_2u_2
- if it does, at G,
math: S →^*_{G'} u_1Au_2 →_{G'} u_1x_1y_jx_2u_2
- we reach the same sentential form with G and G'
- by induction of rewriting steps, we have then part
- if
math: w \in L(G'),
thenmath: w \in L(G)
case also has same sentential form
2. Removing redundant symbols
All terminals in T are always terminating
- non-terminating : unable to derive any terminal strings
- unreachable : does not appear in any sentential form
Redundant : non-terminating remove by substitution and unreachable removed directly
Algorithms
- detect terminating symbols
A in V has a production that consists solely of terminating symbols then A is terminating - start from terminal and lambda
- detect reachable symbols (terminal is symbol either)
if A is reachable then all symbols in A are reachable - start from S
3. Removing λ-productions
λ-free : G property if G has no null production
λ-variable :
math: A →^+ λ
If no λ-productions → length monotonically increasing
- Find set
math: V_N
of all λ-variables of G (a) Find all λ-productions and put that variable intomath: V_N
(b) Foll all production B →math: A_1A_2 \dots A_n
, wheremath: A_1A_2 \dots A_n \in V_N
then put B intomath: V_N
- For each element A of
math: V_N
- Remove all λ-productions
- Then finally remove all non-terminating symbols
λ-productions = empty production = null production production means of 'or' part, not all of right side of p in P
4. Removing unit productions
Unit production : A → B by substitution rule
Claim
Given a CFG G = (V,T,P,S) without λ-productions, there is an equivalent CFG G' = (V 0,T,S,P') that does not have any unit productions
Method
If a unit production is A → A, then, we can remove it from P immediately without affecting L(G).
In case of A → B, for each variable A, we find all variables such that A* ⇒ B using a dependency graph for variables in V based on P. (A*⇒B holds when there is a path from A to B in the graph.)
1. Put into P' all non-unit productions of P
And then remove unreachable
2. For all A and B of A*⇒B(has transitive property), we add to P' A → y1 | y2 |···| yn, where B → y1 | y2 |···| yn is the set of all rules in P' with B on the left from not dependency which has not B
- in this picture remove B
- final is new + right down - unit production
So by these 4 Simplification, given a CFG, there is an equivalent CFG that does not have any redundant symbols, λ-productions or unit productions.