WordNet is a semantically oriented dictionary of English, similar to a traditional thesaurus but with a richer structure. NLTK includes the English WordNet, with 155,287 words and 117,659 synonym sets. We’ll begin by looking at synonyms and how they are accessed in WordNet.
Consider the two sentences below. Replacing motorcar in the first with automobile yields the second, and the meaning stays much the same:
Benz is credited with the invention of the motorcar.
Benz is credited with the invention of the automobile.
Since everything else in the sentence has remained unchanged, we can conclude that the words motorcar and automobile have the same meaning, i.e., they are synonyms. We can explore these words with the help of WordNet:
>>> from nltk.corpus import wordnet as wn
>>> wn.synsets('motorcar')
[Synset('car.n.01')]
Thus, motorcar has just one possible meaning and it is identified as car.n.01, the first noun sense of car. The entity car.n.01 is called a synset, or “synonym set,” a collection of synonymous words (or “lemmas”):
>>> wn.synset('car.n.01').lemma_names()
['car', 'auto', 'automobile', 'machine', 'motorcar']
Each word of a synset can have several meanings, e.g., car can also signify a train carriage, a gondola, or an elevator car. However, we are only interested in the single meaning that is common to all words of this synset. Synsets also come with a prose definition and some example sentences:
>>> wn.synset('car.n.01').definition() ...
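The relationship between words and synsets described above (one synset groups several lemmas, and one word may belong to several synsets) can be sketched with a toy model in plain Python. This is only an illustration and not how NLTK stores WordNet: the names TOY_SYNSETS, toy_synsets, and toy_lemma_names are hypothetical, and the lemma lists for the senses other than car.n.01 are a small hand-written sample that may not match WordNet exactly.

```python
# Toy model of synsets. This is NOT NLTK's data structure or API; all names
# here are hypothetical and the data is a hand-written sample.
TOY_SYNSETS = {
    "car.n.01": ["car", "auto", "automobile", "machine", "motorcar"],
    "car.n.02": ["car", "railcar", "railway_car", "railroad_car"],  # train carriage
    "car.n.03": ["car", "gondola"],                                 # gondola
    "car.n.04": ["car", "elevator_car"],                            # elevator car
}


def toy_synsets(word):
    """Return the names of every toy synset whose lemma list contains `word`."""
    return [name for name, lemmas in TOY_SYNSETS.items() if word in lemmas]


def toy_lemma_names(synset_name):
    """Return a copy of the lemma list for one toy synset."""
    return list(TOY_SYNSETS[synset_name])


# "motorcar" appears in only one synset, so it has a single sense here.
print(toy_synsets("motorcar"))

# "car" appears in several synsets, one per distinct meaning.
print(toy_synsets("car"))

# The lemmas of a synset are the words that share that one meaning.
print(toy_lemma_names("car.n.01"))
```

In this sketch, looking up motorcar yields a single synset name, while car yields four, which mirrors the point that a synset captures only the one meaning its lemmas have in common.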