In the last chapter, I showed some simple patterns that allow you to avoid having to specify exactly what you want to wait for. In this chapter, I will describe how to use patterns that you are probably already familiar with from the shell: glob patterns. I will also describe what happens when patterns do not match. I will go over some other basic situations such as how to handle timeouts. Finally, I will describe what to do at the ends of scripts and processes.
Suppose you want to match all of the input and the only thing you know about it is that hi occurs within it. You are not sure if there is more to it, or even if another hi might appear. You just want to get it all. To do this, use the asterisk (*). The asterisk is a wildcard that matches any number of characters. You can write:
expect "hi*"
send "$expect_out(0,string) $expect_out(buffer)"
Given the input "philosophic\n", hi matched the literal hi while the * matched the string "losophic\n". The first p was not matched by anything in the pattern, so it shows up in expect_out(buffer) but not in expect_out(0,string).
Earlier I said that * matches any number of characters. More precisely, it tries to match the longest string possible while still allowing the pattern itself to match. With the input buffer of "philosophic\n", compare the effects of the following two commands:
expect "hi*"
expect "hi*hi"
In the first one, the * matches losophic\n. This is the longest possible string that the * can match while still allowing the hi to match hi. In the second expect, the * only matches losop, thereby allowing the second hi to match. If the * matched anything else, the entire pattern would fail to match.
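The greedy behavior can be tried out without spawning a process. The following is a minimal sketch in plain Tcl, not Expect's own matcher: it assumes that the regular expression .* is an acceptable stand-in for the glob *, and it uses regexp only because Tcl's string match does not report what a wildcard consumed. The variable names are made up for illustration.

```tcl
# Illustration only: the regexp .* stands in for the glob * (an assumption),
# so that the text consumed by the wildcard can be captured and inspected.
set input "philosophic\n"

# Counterpart of the glob pattern hi* :
# the wildcard takes everything that is left, including the newline.
regexp {hi(.*)} $input whole1 star1

# Counterpart of the glob pattern hi*hi :
# the wildcard must stop early so that the trailing hi can still match.
regexp {hi(.*)hi} $input whole2 star2

# list quoting makes the embedded newline visible when printed
puts [list $star1]
puts [list $star2]
```

In the second command the wildcard is left with losop, which is the only choice that lets both literal hi's match.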
This could conceivably match in two ways corresponding to the two occurrences of “hi” in the string.
What actually happens is possibility (1). The first * matches philosop. As before, each * tries to match the longest string possible allowing the total pattern to match, but the *’s are matched from left to right. The leftmost *’s match strings before the rightmost *’s have a chance. While the outcome is the same in this case (that is, the whole pattern matches), I will show cases later where it is necessary to realize that pattern matching proceeds from left to right.
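To see the left-to-right rule in action, here is a small sketch in plain Tcl. The pattern with an asterisk on each side of hi is an assumption chosen to fit the description above, and the regular expression .* is again used as a stand-in for the glob * so that each wildcard's share of the input can be captured. The variable names are hypothetical.

```tcl
# Illustration only: two wildcards around a literal hi, written as a
# regular expression so each wildcard's match can be captured.
set input "philosophic\n"

# ^ and $ force the whole input to be consumed, as a leading and
# trailing glob * would.
regexp {^(.*)hi(.*)$} $input whole left right

# The leftmost wildcard is served first and takes as much as it can,
# which pushes the literal hi onto the second hi of the input.
puts [list $left]
puts [list $right]
```

The leftmost wildcard ends up with philosop, and the rightmost one only gets what remains after the second hi.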
Patterns match at the earliest possible character in a string. In Chapter 3 (p. 74), I showed how the pattern hi matched the first hi in philosophic. However, in the example above, the subpattern hi matched the second hi. Why the difference?
The difference is that hi was preceded by "*". Since the * is capable of matching anything, the leading * causes the match to start at the beginning of the string. In contrast, the earliest point that the bare hi can match is the first hi. Once that hi has matched, it cannot match anything else, including the second hi.
In practice, a leading * is usually redundant. Most patterns have enough literal letters that there is no choice in how the match occurs. The only remaining difference is that the leading * forces the otherwise unmatched leading characters to be stored in expect_out(0,string). However, the characters will already be stored in expect_out(buffer) so there is little merit on this point alone.
When * appears at the right end of a pattern, it matches everything left in the input buffer (assuming the rest of the pattern matches). This is a useful way of clearing out the entire buffer so that the next expect does not return a mishmash of things that were received previously and things that are brand new.
A pattern consisting solely of * matches anything. This is like saying, “I don’t care what’s in the input. Throw it away.” This pattern always matches, even if nothing is there. Remember that * matches anything, and the empty string is anything! As a corollary of this behavior, an expect command given only this pattern always returns immediately. It never waits for new data to arrive. It does not have to since it matches everything.
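The claim that the empty string counts as "anything" is easy to check with Tcl's built-in glob matcher. This is only a sketch of the matching rule in isolation: string match does not read from a process or wait for input the way expect does, and the variable names are invented for the example.

```tcl
# string match applies a glob pattern to a string and returns 1 or 0.
# It is used here only to show the matching rule, not Expect's buffering.
set star_on_empty  [string match "*" ""]
set star_on_text   [string match "*" "leftover output\n"]

# A pattern with literal characters cannot match an empty buffer,
# which is why such a pattern has to wait for data to arrive.
set hi_on_empty    [string match "hi*" ""]

puts "$star_on_empty $star_on_text $hi_on_empty"
```

Because the lone asterisk already succeeds on an empty buffer, there is never anything left to wait for.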
In the examples demonstrating * so far, each string was entered by a person who pressed return afterwards. This is typical of most programs, because they run in what is called cooked mode. Cooked mode includes the usual line-editing features such as backspace and delete-previous-word. This is provided by the terminal driver, not the program. This simplifies most programs. They see the line only after you have edited it and pressed return.
Unfortunately, output from processes is not nearly so well behaved. When you watch the output of a program such as ftp (or cat for that matter), it may seem as if lines appear on your screen as atomic units. But this is not guaranteed. For example, in the previous chapter, I showed that when ftp starts up it looks like this:
ftp ftp.uu.net
Connected to ftp.uu.net.
220 ftp.UU.NET FTP server (Version 6.34 Thu Oct 22 14:32:01 EDT 1992) ready.
Name (ftp.uu.net:don):
Even though the program may have printed "Connected to ftp.uu.net.\n" all at once (perhaps by a single printf in a C program), the UNIX kernel can break this into small chunks, spitting out a few characters each time to the terminal. For example, it might print out "Conn" and then "ecte" and then "d to" and so on. Fortunately, computers are so fast that humans do not notice the brief pauses in the middle of output. The reason the system breaks up output like this is that programs usually produce characters faster than the terminal driver can display them. The operating system will obligingly wait for the terminal driver to effectively say, “Okay, I’ve displayed that last bunch of characters. Send me a couple more.” In reality, the system does not just sit there and wait. Since it is running many other programs at the same time, the system switches its attention frequently to other programs. Expect itself is one such “other program” in this sense.
When Expect runs, it will immediately ask for all the characters that a program produced only to find something like "Conn". If told to wait for a string that matches "Name*:", Expect will keep asking the computer if there is any more output, and it will eventually find the output it is looking for.
As I said, humans are slow and do not notice this chunking effect. In contrast, Expect is so fast that it is almost always waiting. Thus, it sees most output come as chunks rather than whole lines. With this in mind, suppose you wanted to find out the version of ftp that a host is using. By looking back at the output, you can see that it is contained in the greeting line that begins "220" and ends with "ready.". Naively, you could wait for that line as:
expect "220*" ;# dangerous
If you are lucky, you might get the entire line stored in $expect_out(0,string). You might even get the next line in there as well. But more likely, you will only get a fragment, such as "220 f" or "220 ftp.UU.NE". Since the pattern 220* matches either of these, expect has no reason to wait further and will return. As I stated earlier, expect returns with whatever is the longest string that matches the pattern. The problem here is that the remainder of the line may not have shown up yet!
Leaving off the e would be too short. This would allow the pattern to match the r in server rather than the r in ready. It is possible to make the overall pattern even shorter by looking for more unusual patterns. But quite often you trade off readability. There is an art to choosing patterns that are correct, yet not too long but still readable. A good guideline is to give more priority to readability. The pattern matching performed by Expect is very inexpensive.
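The effect of chunked output on pattern choice can be imitated in plain Tcl. What follows is a rough sketch under several assumptions: the helper first_match is hypothetical, the chunk boundaries are invented, and the patterns "220*ready." and "220*r" are picked only to illustrate the trade-off. Since string match is anchored at both ends, each pattern is wrapped in asterisks to imitate the way Expect matches anywhere in its buffer.

```tcl
# Hypothetical helper: append chunks to a buffer one at a time and return
# the buffer contents at the moment the pattern first matches, or an
# empty string if it never matches.
proc first_match {pattern chunks} {
    set buffer ""
    foreach chunk $chunks {
        append buffer $chunk
        # Wrap in * because string match must match the whole string.
        if {[string match "*$pattern*" $buffer]} {
            return $buffer
        }
    }
    return ""
}

# The greeting line from the ftp example, split at made-up points.
set chunks [list "220 f" "tp.UU.NET FTP ser" \
    "ver (Version 6.34 Thu Oct 22 " "14:32:01 EDT 1992) rea" "dy.\n"]

set risky     [first_match "220*" $chunks]       ;# stops at the first chunk
set too_short [first_match "220*r" $chunks]      ;# stops at the r in server
set safer     [first_match "220*ready." $chunks] ;# waits for the line's end

puts [list $risky]
puts [list $too_short]
puts [list $safer]
```

Only the last pattern holds out until the whole greeting has arrived; the first returns with just "220 f", and the second is satisfied as soon as the r in server shows up.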
 The more likely reason to see scripts that begin many patterns with "
*" is that prior to Expect version 4, all patterns were anchored, with the consequence that most patterns required a leading "