Creating a probabilistic Context Free Grammar from CFG

In a probabilistic context-free grammar (PCFG), a probability is attached to every production rule of the CFG. For each nonterminal, the probabilities of all rules that share that left-hand side sum to 1. A PCFG generates the same parse structures as the CFG, but it also assigns a probability to each parse tree. The probability of a parse tree is the product of the probabilities of all the production rules used to build it.
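The two properties described above (rule probabilities summing to 1 for each left-hand side, and a tree's probability being the product of its rule probabilities) can be sketched in plain Python without NLTK. The grammar, the tuple-based tree encoding, and the helper names `is_normalized` and `tree_probability` are all hypothetical, chosen only for illustration:

```python
from collections import defaultdict
from math import isclose, prod

# Hypothetical toy PCFG: (left-hand side, right-hand side) -> probability.
RULES = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("she",)): 0.6,
    ("NP", ("fish",)): 0.4,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("V",)): 0.3,
    ("V", ("eats",)): 1.0,
}


def is_normalized(rules):
    """Return True if, for every left-hand side, the rule probabilities sum to 1."""
    totals = defaultdict(float)
    for (lhs, _rhs), probability in rules.items():
        totals[lhs] += probability
    return all(isclose(total, 1.0) for total in totals.values())


def tree_probability(tree, rules):
    """Multiply the probabilities of all rules used in the tree.

    A tree is a tuple (label, child, ...); a leaf is a plain string.
    """
    if isinstance(tree, str):  # a terminal contributes no rule of its own
        return 1.0
    label, *children = tree
    # The rule applied at this node: label -> labels (or words) of its children.
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    return rules[(label, rhs)] * prod(tree_probability(c, rules) for c in children)


# Parse tree for "she eats fish" under the toy grammar.
TREE = ("S", ("NP", "she"), ("VP", ("V", "eats"), ("NP", "fish")))

print(is_normalized(RULES))
print(tree_probability(TREE, RULES))
```

For this tree the rules used are S -> NP VP, NP -> 'she', VP -> V NP, V -> 'eats', and NP -> 'fish', so the product is 1.0 × 0.6 × 0.7 × 1.0 × 0.4 = 0.168, up to floating-point rounding.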

The following NLTK code illustrates how rules are formed in a PCFG:

>>> import nltk
>>> from nltk.corpus import treebank
>>> from itertools import islice
>>> from nltk.grammar import PCFG, induce_pcfg, toy_pcfg1, toy_pcfg2
>>> gram2 = PCFG.fromstring("""
A -> B B [.3] | C B C [.7]
B -> ...
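As a self-contained illustration of the same API, the sketch below builds a small PCFG with `PCFG.fromstring` and parses a sentence with NLTK's `ViterbiParser`. The grammar and the sentence are hypothetical and are not the ones from the listing; they are chosen only so that the result is easy to check by hand:

```python
from collections import defaultdict

from nltk.grammar import PCFG
from nltk.parse import ViterbiParser

# Hypothetical toy grammar; each rule carries its probability in brackets.
grammar = PCFG.fromstring("""
    S -> NP VP [1.0]
    NP -> 'dogs' [0.5] | 'cats' [0.5]
    VP -> V NP [0.8] | V [0.2]
    V -> 'chase' [1.0]
""")

# Sum the rule probabilities for each left-hand side; each total should be 1.
lhs_totals = defaultdict(float)
for production in grammar.productions():
    lhs_totals[str(production.lhs())] += production.prob()

# The Viterbi parser yields the most probable parse of the token list.
parser = ViterbiParser(grammar)
best_tree = next(parser.parse("dogs chase cats".split()))

print(dict(lhs_totals))
print(best_tree)
print(best_tree.prob())
```

The parse uses S -> NP VP, NP -> 'dogs', VP -> V NP, V -> 'chase', and NP -> 'cats', so the tree probability should be 1.0 × 0.5 × 0.8 × 1.0 × 0.5 = 0.2, up to floating-point rounding.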
