Exercises

  1. ○ Define a string s = 'colorless'. Write a Python statement that changes this to “colourless” using only the slice and concatenation operations.

  2. ○ We can use the slice notation to remove morphological endings on words. For example, 'dogs'[:-1] removes the last character of dogs, leaving dog. Use slice notation to remove the affixes from these words (we’ve inserted a hyphen to indicate the affix boundary, but omit this from your strings): dish-es, run-ning, nation-ality, un-do, pre-heat.

  3. ○ We saw how we can generate an IndexError by indexing beyond the end of a string. Is it possible to construct an index that goes too far to the left, before the start of the string?

  4. ○ We can specify a “step” size for the slice. The following returns every second character within the slice: monty[6:11:2]. It also works in the reverse direction: monty[10:5:-2]. Try these for yourself, and then experiment with different step values.

  5. ○ What happens if you ask the interpreter to evaluate monty[::-1]? Explain why this is a reasonable result.

  6. ○ Describe the class of strings matched by the following regular expressions:

    1. [a-zA-Z]+

    2. [A-Z][a-z]*

    3. p[aeiou]{,2}t

    4. \d+(\.\d+)?

    5. ([^aeiou][aeiou][^aeiou])*

    6. \w+|[^\w\s]+

    Test your answers using nltk.re_show().

  7. ○ Write regular expressions to match the following classes of strings:

    1. A single determiner (assume that a, an, and the are the only determiners)

    2. An arithmetic expression using integers, addition, and multiplication, such as 2*3+8

  8. ○ Write a utility function that takes ...
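As a warm-up for the slicing and regular expression exercises above, here is a minimal sketch of the notation involved. The string `word` and the helper `matches_fully` are hypothetical names chosen for illustration and do not appear in the exercises. The helper uses Python's standard `re.fullmatch` rather than `nltk.re_show()`, so it only reports whether a whole string matches a pattern.

```python
import re

# Hypothetical example string (not one of the exercise words).
word = 'linguistics'

# A negative stop index counts from the end: drop the last three characters.
suffix_removed = word[:-3]      # 'linguist'

# A positive start index drops characters from the front.
prefix_removed = word[3:]       # 'guistics'

# A third slice argument is the step: take indexes 0, 3, 6.
every_third = word[0:9:3]       # 'lgs'

# A negative step walks right to left: take indexes 9, 6, 3.
every_third_back = word[9:2:-3]  # 'csg'


def matches_fully(pattern, text):
    """Return True if the entire text matches the regular expression."""
    return re.fullmatch(pattern, text) is not None


print(suffix_removed, prefix_removed, every_third, every_third_back)
print(matches_fully(r'[0-9]+', '2024'))   # True
print(matches_fully(r'[0-9]+', '20a4'))   # False
```

Checking a pattern against a handful of strings that should and should not match, as in the last two lines, is a quick way to test a description of the class of strings a regular expression accepts.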
