Character Strings

In R, character strings are defined by double quotation marks:

a<-"abc"
b<-"123"

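A string may consist entirely of digits (as in b, above), in which case it can be coerced to a number, but a string of letters (as in a) cannot: coercing it yields NA and a warning.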

as.numeric(a)

[1] NA
Warning message:
NAs introduced by coercion

as.numeric(b)

[1] 123
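As a minimal sketch of how this coercion behaves on a whole vector, the following uses a hypothetical vector x (not from the text) that mixes numeric-looking and non-numeric strings. The warning is suppressed here only to keep the output tidy; is.na then shows which elements failed to convert.

```r
# x is an illustrative vector, assumed for this example only
x <- c("123", "abc", "4.5")

# as.numeric converts element by element; non-numeric strings become NA
nums <- suppressWarnings(as.numeric(x))
nums

# TRUE marks the elements that could not be interpreted as numbers
is.na(nums)
```

Checking is.na on the result is one simple way to find the entries that need cleaning before a character column is treated as numeric.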

One of the initially confusing things about character strings is the distinction between the length of a character object (a vector) and the number of characters in each of the strings comprising that object. An example should make the distinction clear:

pets<-c("cat","dog","gerbil","terrapin")

Here, pets is a vector comprising four character strings:

length(pets)

[1] 4

and the individual character strings have 3, 3, 6 and 8 characters, respectively:

nchar(pets)

[1] 3 3 6 8
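The same distinction can be seen with a single string, sketched below with hypothetical objects word and nothing: a lone string is a vector of length 1 however many characters it contains, and an empty string still counts as one element.

```r
# word and nothing are illustrative names, assumed for this example only
word <- "gerbil"
length(word)   # number of elements in the vector
nchar(word)    # number of characters in the string

nothing <- ""
length(nothing)  # still one element
nchar(nothing)   # but zero characters
```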

When first defined, character strings are not factors:

class(pets)

[1] "character"

is.factor(pets)

[1] FALSE

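However, if the character vector pets is made part of a dataframe, the result depends on the version of R. Before R 4.0.0, data.frame coerced character variables to factors by default; since R 4.0.0 the default is stringsAsFactors = FALSE, so the conversion has to be requested explicitly:

df<-data.frame(pets,stringsAsFactors=TRUE)
is.factor(df$pets)

[1] TRUE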
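Because the default for stringsAsFactors changed in R 4.0.0, it can be safer to state the choice explicitly rather than rely on the version in use. The sketch below builds two dataframes from the same vector; the names df_chr and df_fac are hypothetical and chosen only for this illustration.

```r
pets <- c("cat", "dog", "gerbil", "terrapin")

# Keep the strings as plain characters
df_chr <- data.frame(pets, stringsAsFactors = FALSE)
is.factor(df_chr$pets)

# Convert the strings to a factor when the dataframe is created
df_fac <- data.frame(pets, stringsAsFactors = TRUE)
is.factor(df_fac$pets)

# A factor stores its distinct values as levels
levels(df_fac$pets)
```

With the argument spelled out, the code gives the same answer on old and new versions of R.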

There are built-in vectors in R that contain the 26 letters of the alphabet in lower case (letters) and in upper case (LETTERS):

letters

 [1] "a"  "b"  "c"  "d"  "e"  "f"  "g"  "h"  "i"  "j"  "k"  "l"  "m"  "n"  "o"  "p"
[17] "q"  "r"  "s"  "t"  "u"  "v"  "w"  "x"  "y"  "z"

LETTERS

 [1] "A"  "B"  "C"  "D"  "E"  "F"  "G"  "H"  "I"  "J"  "K"  "L"  "M"  "N"  "O"  "P"
[17] "Q"  "R"  "S"  "T"  "U"  "V"  "W"  "X"  "Y"  "Z"
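Since letters and LETTERS are ordinary character vectors, they can be subscripted and transformed like any other. The following is a brief sketch of that idea, using only base R functions.

```r
# Both vectors have one element per letter of the alphabet
length(letters)

# Subscripts pick out letters by position
letters[1:5]
LETTERS[24:26]

# Converting the lower-case vector to upper case reproduces LETTERS
identical(toupper(letters), LETTERS)
```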

To discover which number in the alphabet the letter n is, you can use the which ...
