This chapter introduces you to string manipulation in R. You’ll learn the basics of how strings work and how to create them by hand, but the focus of this chapter will be on regular expressions, or regexps for short. Regular expressions are useful because strings usually contain unstructured or semi-structured data, and regexps are a concise language for describing patterns in strings. When you first look at a regexp, you’ll think a cat walked across your keyboard, but as your understanding improves they will soon start to make sense.
This chapter will focus on the stringr package for string manipulation. stringr is not part of the core tidyverse because you don’t always have textual data, so we need to load it explicitly.
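Loading the package might look like the following minimal sketch. It assumes stringr is already installed (for example via `install.packages("stringr")`); the sample word and the `n_chars` variable are only illustrative.

```r
# Assumption: stringr has already been installed on this machine.
library(stringr)

# Quick check that the package is attached: count the characters in a string.
n_chars <- str_length("regexp")
n_chars
```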
You can create strings with either single quotes or double quotes.
Unlike in some other languages, there is no difference in behavior between the two. I recommend always using double quotes ("), unless you want to create a string that itself contains double quotes.
"This is a string"
'To put a "quote" inside a string, use single quotes'
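As a small sketch of the point that the two quote styles behave identically, the following compares strings written both ways. The variable names (`with_double`, `with_single`, `quoted`) are illustrative and not taken from the chapter.

```r
# The same text written with double and with single quotes.
with_double <- "This is a string"
with_single <- 'This is a string'

# R stores both as the same character vector.
same_string <- identical(with_double, with_single)
same_string

# Single quotes on the outside let double quotes appear inside unescaped.
quoted <- 'To put a "quote" inside a string, use single quotes'
has_double_quote <- grepl('"', quoted, fixed = TRUE)
has_double_quote
```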
If you forget to close a quote, you’ll see +, the continuation character:
> "This is a string without a closing quote
+
+
+ HELP I'M STUCK
If this happens to you, press Esc and try again!
To include a literal single or double quote in a string you can use a backslash (\) to “escape” it:
"\"" # or '"'
'\'' # or "'"
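To see what escaping actually stores, here is a short sketch using base R only. The names `double_quote` and `single_quote` are illustrative. It shows that the backslash is part of how the string is typed and printed, not part of the string itself.

```r
# Escaped quotes; the variable names are illustrative.
double_quote <- "\""  # same string as '"'
single_quote <- '\''  # same string as "'"

# Each string holds exactly one character: the quote, without the backslash.
nchar(double_quote)
nchar(single_quote)

# print() shows the escaped representation; writeLines() shows the raw contents.
print(double_quote)
writeLines(double_quote)
```

The backslash here acts purely as the escape character: it tells R to treat the next quote as part of the string rather than as its end.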
That means ...