○ Try using the Python interpreter as a calculator, and typing expressions like 12 / (4 + 1).
○ Given an alphabet of 26 letters, there are 26 to the power 10, or 26 ** 10, 10-letter strings we can form. That works out to 141167095653376L (the L at the end just indicates that this is Python’s long-number format). How many hundred-letter strings are possible?
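As a check on the arithmetic in the exercise above, here is a minimal sketch assuming Python 3, where integers have arbitrary precision and no trailing L is printed. The variable names are illustrative only.

```python
# Number of strings of a given length over a 26-letter alphabet.
alphabet_size = 26

ten_letter = alphabet_size ** 10
print(ten_letter)  # 141167095653376

# The same operator handles much larger exponents. Counting the digits
# of the result is one way to get a feel for its size.
hundred_letter = alphabet_size ** 100
print(len(str(hundred_letter)))
```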
○ The Python multiplication operation can be applied to lists. What happens when you type ['Monty', 'Python'] * 20, or 3 * sent1?
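A minimal sketch of list multiplication, using a short stand-in list (an assumption for illustration, not one of the book's texts) so that the result is easy to read:

```python
# Hypothetical stand-in list; substitute the lists from the exercise.
words = ['to', 'be']

repeated = words * 3        # the list's items repeated three times, in order
also_repeated = 3 * words   # the operands may appear in either order

print(repeated)             # ['to', 'be', 'to', 'be', 'to', 'be']
print(len(repeated))        # 6
```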
○ Review Computing with Language: Texts and Words on computing with language. How many words are there in text2? How many distinct words are there?
○ Compare the lexical diversity scores for humor and romance fiction in Table 1-1. Which genre is more lexically diverse?
○ Produce a dispersion plot of the four main protagonists in Sense and Sensibility: Elinor, Marianne, Edward, and Willoughby. What can you observe about the different roles played by the males and females in this novel? Can you identify the couples?
○ Find the collocations in text5.
○ Consider the following Python expression: len(set(text4)). State the purpose of this expression. Describe the two steps involved in performing this computation.
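The nested call can be unpacked into separate steps. The sketch below does this on a small hypothetical token list rather than on text4, so the intermediate values are easy to inspect:

```python
# Hypothetical token list standing in for a full text.
tokens = ['the', 'cat', 'saw', 'the', 'dog', 'and', 'the', 'bird']

vocabulary = set(tokens)      # inner call: duplicates collapse into one entry each
vocab_size = len(vocabulary)  # outer call: count what is left

print(len(tokens))   # 8 tokens in total
print(vocab_size)    # 6 distinct word types
```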
○ Review A Closer Look at Python: Texts as Lists of Words on lists and strings.
Define a string and assign it to a variable, e.g., my_string = 'My String' (but put something more interesting in the string). Print the contents of this variable in two ways, first by simply typing the variable name and pressing Enter, then by using the print statement.
Try adding the string to itself using my_string + my_string, or multiplying it by a number, e.g., my_string * 3. Notice that the strings are joined together without any spaces. How could you fix this?
○ Define a variable my_sent to be a list of words, using the syntax my_sent = ["My", "sent"] (but with your own words, or a favorite saying).
Use ' '.join(my_sent) to convert this into a string.
Use split() to split the string back into the list form you had to start with.
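The round trip between a list of words and a single string can be sketched as follows. The example sentence is an arbitrary stand-in, and a single space is assumed as the separator:

```python
# Arbitrary example list; use your own words in the exercise.
my_sent = ['Colorless', 'green', 'ideas']

joined = ' '.join(my_sent)   # the separator string goes before .join()
restored = joined.split()    # with no argument, split() breaks on whitespace

print(joined)                # Colorless green ideas
print(restored == my_sent)   # True
```

Note that the round trip only restores the original list when no word contains whitespace itself.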
○ Define several variables containing lists of words, e.g., phrase1, phrase2, and so on. Join them together in various combinations (using the plus operator) to form whole sentences. What is the relationship between len(phrase1 + phrase2) and len(phrase1) + len(phrase2)?
○ Consider the following two expressions, which have the same value. Which one will typically be more relevant in NLP? Why?
"Monty Python"[6:12]
["Monty", "Python"][1]
○ We have seen how to represent a sentence as a list of words, where each word is a sequence of characters. What does sent1[2][2] do? Why? Experiment with other index values.
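Because a sentence is a list of strings, an index can be applied twice: once to pick a word and once to pick a character within it. A minimal sketch on a hypothetical sentence (not sent1):

```python
# Hypothetical sentence; sent1 from the book behaves the same way.
sent = ['Colorless', 'green', 'ideas']

word = sent[2]       # first index selects a word from the list
char = sent[2][2]    # second index selects a character from that word

print(word)          # ideas
print(char)          # e
```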
○ The first sentence of text3 is provided to you in the variable sent3. The index of the in sent3 is 1, because sent3[1] gives us 'the'. What are the indexes of the two other occurrences of this word in sent3?
○ Review the discussion of conditionals in Back to Python: Making Decisions and Taking Control. Find all words in the Chat Corpus (text5) starting with the letter b. Show them in alphabetical order.
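One way to combine a condition with sorting is a comprehension wrapped in set() and sorted(). The token list below is a small hypothetical stand-in for a chat corpus, so the result can be checked by eye:

```python
# Hypothetical chat-style tokens; replace with text5 when NLTK is loaded.
tokens = ['bye', 'hello', 'Brb', 'back', 'lol', 'bye', 'brb']

# Keep words starting with lowercase 'b', drop duplicates, then sort.
b_words = sorted(set(w for w in tokens if w.startswith('b')))

print(b_words)   # ['back', 'brb', 'bye']
```

The test is case-sensitive, so 'Brb' is excluded here; lowercasing each word first would change that.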
○ Type the expression range(10) at the interpreter prompt. Now try range(10, 20), range(10, 20, 2), and range(20, 10, -2). We will see a variety of uses for this built-in function in later chapters.
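The exercise above assumes an interpreter in which range() returns a list directly. In Python 3, range() returns a lazy range object instead, so a sketch under that assumption wraps each call in list() to display the values:

```python
# In Python 3, wrap range() in list() to see the numbers it produces.
up_to_ten = list(range(10))
stepped_up = list(range(10, 20, 2))      # start, stop (exclusive), step
stepped_down = list(range(20, 10, -2))   # a negative step counts downwards

print(up_to_ten)     # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(stepped_up)    # [10, 12, 14, 16, 18]
print(stepped_down)  # [20, 18, 16, 14, 12]
```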
Use text9.index() to find the index of the word sunset. You’ll need to insert this word as an argument between the parentheses. By a process of trial and error, find the slice for the complete sentence that contains this word.
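The same index-then-slice approach can be sketched on a short hypothetical token list, where the sentence boundaries are visible at a glance. With a real text, the slice bounds are found by widening the slice until it starts after one full stop and ends on the next:

```python
# Hypothetical tokens; text9 from the book is far longer.
tokens = ['It', 'rained', '.', 'We', 'watched', 'the', 'sunset',
          'together', '.', 'Then', 'we', 'left', '.']

position = tokens.index('sunset')   # index of the first occurrence
sentence = tokens[3:9]              # bounds chosen by inspecting the neighbours

print(position)             # 6
print(' '.join(sentence))   # We watched the sunset together .
```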
What is the difference between the following two lines? Which one will give a larger value?
>>> sorted(set([w.lower() for w in text1]))
>>> sorted([w.lower() for w in set(text1)])
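The two expressions above differ in whether duplicates are removed before or after lowercasing. A minimal sketch on a hypothetical five-token list makes the effect visible:

```python
# Hypothetical tokens containing case variants of the same word.
tokens = ['The', 'the', 'Whale', 'whale', 'sea']

# Lowercase first, then remove duplicates.
lower_then_set = sorted(set([w.lower() for w in tokens]))

# Remove duplicates first, then lowercase: case variants survive as repeats.
set_then_lower = sorted([w.lower() for w in set(tokens)])

print(lower_then_set)   # ['sea', 'the', 'whale']
print(set_then_lower)   # ['sea', 'the', 'the', 'whale', 'whale']
```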
Review the discussion of looping with conditions in Back to Python: Making Decisions and Taking Control. Use a combination of for and if statements to loop over the words of the movie script for Monty Python and the Holy Grail (text6) and print all the uppercase words, one per line.
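Write expressions for finding all words in text6 that meet the following conditions. The result should be in the form of a list of words: ['word1', 'word2', ...].
Ending in ize
Containing the letter z
Containing the sequence of letters pt
All lowercase letters except for an initial capital (i.e., titlecase)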
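Each of the four conditions above corresponds to a built-in string test. The sketch below applies them to a small hypothetical word list (not text6) so the results can be verified by hand:

```python
# Hypothetical word list for illustration only.
tokens = ['Galahad', 'baptize', 'ZOOT', 'zoo', 'kept', 'realize', 'Camelot']

ends_ize = [w for w in tokens if w.endswith('ize')]   # suffix test
has_z = [w for w in tokens if 'z' in w]               # substring test, case-sensitive
has_pt = [w for w in tokens if 'pt' in w]             # two-letter sequence
titlecase = [w for w in tokens if w.istitle()]        # initial capital, rest lowercase

print(ends_ize)    # ['baptize', 'realize']
print(has_z)       # ['baptize', 'zoo', 'realize']
print(has_pt)      # ['baptize', 'kept']
print(titlecase)   # ['Galahad', 'Camelot']
```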
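Define sent to be the list of words ['she', 'sells', 'sea', 'shells', 'by', 'the', 'sea', 'shore']. Now write code to perform the following tasks:
Print all words beginning with sh.
Print all words longer than four characters.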
We have been using sets to store vocabularies. Try the following Python expression: set(sent3) < set(text1). Experiment with this using different arguments to set(). What does it do? Can you think of a practical application for this?
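To experiment with the < operator on sets without loading a corpus, a minimal sketch on two hypothetical vocabularies is enough:

```python
# Hypothetical vocabularies; any two sets of words will do.
small_vocab = set(['the', 'cat'])
large_vocab = set(['the', 'cat', 'sat'])

contained = small_vocab < large_vocab    # every item of small is in large, and large has more
same_set = large_vocab < large_vocab     # a set compared with itself
reverse = large_vocab < small_vocab      # operands swapped

print(contained, same_set, reverse)      # True False False
```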