a cfg is in chomsky normal form if the productions are in the following forms −
- a → a
- a → bc
- s → ε
where a, b, and c are non-terminals and a is terminal.
algorithm to convert into chomsky normal form −
step 1 − if the start symbol s occurs on some right side, create a new start symbol s’ and a new production s’→ s.
step 2 − remove null productions. (using the null production removal algorithm discussed earlier)
step 3 − remove unit productions. (using the unit production removal algorithm discussed earlier)
step 4 − replace each production a → b1…bn where n > 2 with a → b1c where c → b2 …bn. repeat this step for all productions having two or more symbols in the right side.
step 5 − if the right side of any production is in the form a → ab where a is a terminal and a, b are non-terminal, then the production is replaced by a → xb and x → a. repeat this step for every production which is in the form a → ab.
problem
convert the following cfg into cnf
s → asa | ab, a → b | s, b → b | ε
solution
(1) since s appears in r.h.s, we add a new state s0 and s0→s is added to the production set and it becomes −
s0→s, s→ asa | ab, a → b | s, b → b | ∈
(2) now we will remove the null productions −
b → ∈ and a → ∈
after removing b → ε, the production set becomes −
s0→s, s→ asa | ab | a, a → b | s | ∈, b → b
after removing a → ∈, the production set becomes −
s0→s, s→ asa | ab | a | as | sa | s, a → b | s, b → b
(3) now we will remove the unit productions.
after removing s → s, the production set becomes −
s0→s, s→ asa | ab | a | as | sa, a → b | s, b → b
after removing s0→ s, the production set becomes −
s0→ asa | ab | a | as | sa, s→ asa | ab | a | as | sa
a → b | s, b → b
after removing a→ b, the production set becomes −
s0 → asa | ab | a | as | sa, s→ asa | ab | a | as | sa
a → s | b
b → b
after removing a→ s, the production set becomes −
s0 → asa | ab | a | as | sa, s→ asa | ab | a | as | sa
a → b |asa | ab | a | as | sa, b → b
(4) now we will find out more than two variables in the r.h.s
here, s0→ asa, s → asa, a→ asa violates two non-terminals in r.h.s.
hence we will apply step 4 and step 5 to get the following final production set which is in cnf −
s0→ ax | ab | a | as | sa
s→ ax | ab | a | as | sa
a → b |ax | ab | a | as | sa
b → b
x → sa
(5) we have to change the productions s0→ ab, s→ ab, a→ ab
and the final production set becomes −
s0→ ax | yb | a | as | sa
s→ ax | yb | a | as | sa
a → b a → b |ax | yb | a | as | sa
b → b
x → sa
y → a