Reading Files with Different Numbers of Values per Line

Here is a case where you might want to use scan because the data are not configured like a dataframe. The file rt.txt has a different number of values on each line (a neighbours file in spatial analysis, for example; see p. 769). In this example, the file contains five lines with 1, 2, 4, 2 and 1 numbers respectively. In general, you will need to find out the number of lines of data in the file by counting the end-of-line control characters "\n", using the length function like this:

line.number<-length(scan("c:\\temp\\rt.txt",sep="\n"))
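If you want to try this without the original file, the following is a minimal sketch under stated assumptions: the temporary file tmp is a hypothetical stand-in for rt.txt, and its five lines are reconstructed from the output printed in this section. It counts the lines with readLines rather than scan, and uses count.fields to report how many values each line holds.

```r
# Hypothetical stand-in for c:\temp\rt.txt; contents assumed from the output in this section
tmp <- tempfile(fileext = ".txt")
writeLines(c("138", "27 44", "19 20 345 48", "115 2366", "59"), tmp)

# Read every line as a character string and count the strings
line.number <- length(readLines(tmp))
line.number

# count.fields gives the number of whitespace-separated values on each line
fields.per.line <- count.fields(tmp)
fields.per.line
```

Reading the lines as text sidesteps any question of how a line holding several numbers is converted, because nothing is parsed as a number at this stage.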

The trick is to combine the skip and nlines options within scan to read one line at a time, skipping no lines to read the first line, skipping one line to read the second line, and so on. Note that since the values are numbers we do not need to specify what:

(my.list<-sapply(0:(line.number-1),
  function(x) scan("c:\\temp\\rt.txt",skip=x,nlines=1,quiet=TRUE)))

[[1]]
[1] 138

[[2]]
[1] 27 44

[[3]]
[1]  19  20 345  48

[[4]]
[1]  115 2366

[[5]]
[1] 59
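The line-at-a-time reading can be reproduced on a temporary file, as in the sketch below. The file tmp and its contents are assumptions standing in for rt.txt. The sketch uses lapply in place of sapply, and it also shows a single-pass variant (not the method used in the text) that reads the lines once with readLines and parses each one with scan(text = ...).

```r
# Hypothetical stand-in for c:\temp\rt.txt; contents assumed from the output above
tmp <- tempfile(fileext = ".txt")
writeLines(c("138", "27 44", "19 20 345 48", "115 2366", "59"), tmp)
line.number <- length(readLines(tmp))

# Skip 0, 1, 2, ... lines and read exactly one line each time
my.list <- lapply(seq_len(line.number) - 1,
  function(x) scan(tmp, skip = x, nlines = 1, quiet = TRUE))

# Single-pass variant: read the lines once, then parse each line from text
my.list2 <- lapply(readLines(tmp),
  function(line) scan(text = line, quiet = TRUE))

# Both approaches should give the same list of numeric vectors
identical(my.list, my.list2)
```

The choice of lapply is deliberate: sapply would simplify the result to a vector or matrix if every line happened to contain the same number of values, whereas lapply always returns a list.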

Because the lines contain different numbers of values, sapply cannot simplify the results from scan into a matrix, so it returns a list of vectors of differing lengths. You might want to know how many numbers there are in each row, using length with lapply like this:

unlist(lapply(my.list,length))

[1] 1 2 4 2 1
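In more recent versions of R, the same counts are available from the base function lengths. The sketch below assumes a my.list built by hand from the values printed earlier, so that it runs without the data file.

```r
# my.list rebuilt by hand from the output shown above (an assumption, not read from rt.txt)
my.list <- list(138, c(27, 44), c(19, 20, 345, 48), c(115, 2366), 59)

# lengths returns the length of each list element as an integer vector
row.counts <- lengths(my.list)
row.counts

# Equivalent to the unlist/lapply form used in the text
all(row.counts == unlist(lapply(my.list, length)))
```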

Alternatively, you might want to create a vector containing the last element from each row:

unlist(lapply(1:length(my.list), function(i) my.list[[i]][length(my.list[[i]])]))

[1]  138   44   48 2366   59
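A shorter way to pull out the last element is to iterate over the vectors themselves rather than over their indices. The sketch below is one possible alternative, again assuming a my.list built by hand from the printed output; it uses tail to take the final value and vapply to insist that each row yields exactly one number.

```r
# my.list rebuilt by hand from the output shown above (an assumption, not read from rt.txt)
my.list <- list(138, c(27, 44), c(19, 20, 345, 48), c(115, 2366), 59)

# tail(v, 1) is the last element of v; numeric(1) declares one number per row
last.values <- vapply(my.list, function(v) tail(v, 1), numeric(1))
last.values
```

Note that vapply stops with an error if any row is empty, since tail then returns a vector of length zero; whether that is preferable to silently dropping the row depends on the data.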
