12.4. How to Process Every Character in a Text File
Problem
You want to open a text file and process every character in the file.
Solution
If performance isn’t a concern, write your code in a straightforward, obvious way:
    val source = io.Source.fromFile("/Users/Al/.bash_profile")
    for (char <- source) {
      println(char.toUpper)
    }
    source.close()
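If processing throws an exception, the `close` call above is never reached. A minimal sketch of one way to guard against that, assuming a hypothetical helper named `upperCased` (not part of the recipe) and using an in-memory `Source.fromString` so the example runs without a file on disk:

```scala
import scala.io.Source

// Hypothetical helper: upper-cases every character of the source and
// always closes the source, even if processing throws.
def upperCased(source: Source): String =
  try source.map(_.toUpper).mkString
  finally source.close()

// In-memory stand-in for a file; swap in Source.fromFile(...) for real input.
val shouted = upperCased(Source.fromString("hello\nworld"))
```

Because `Source` is an `Iterator[Char]`, the usual iterator methods such as `map`, `foreach`, and `count` process one character at a time, just like the `for` loop.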
However, be aware that this code may be slow on large files. For instance, the following method that counts the number of lines in a file takes 100 seconds to run on an Apache access logfile that is ten million lines long:
    // run time: took 100 secs
    def countLines1(source: io.Source): Long = {
      val NEWLINE = 10
      var newlineCount = 0L
      for {
        char <- source
        if char.toByte == NEWLINE
      } newlineCount += 1
      newlineCount
    }
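The recipe does not show how its run times were measured. As a rough sketch only, a wall-clock measurement could look like the following; `timed` and `countNewlines` are hypothetical names introduced for illustration, and a single `System.nanoTime` measurement ignores JVM warm-up, so treat the numbers it produces as approximate:

```scala
import scala.io.Source

// Hypothetical helper: runs `block` once and returns its result
// together with the elapsed wall-clock time in milliseconds.
def timed[A](block: => A): (A, Long) = {
  val start = System.nanoTime()
  val result = block
  (result, (System.nanoTime() - start) / 1000000L)
}

// Counts '\n' characters one Char at a time, in the spirit of countLines1.
def countNewlines(source: Source): Long =
  try source.count(_ == '\n').toLong
  finally source.close()

// In-memory stand-in for a large logfile.
val timedCount = timed(countNewlines(Source.fromString("a\nb\nc\n")))
```

Here `timedCount._1` holds the newline count and `timedCount._2` the elapsed milliseconds.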
The time can be significantly reduced by using the getLines method to retrieve one line at a time, and then working through the characters in each line. The following method works through the same ten-million-line file in just 23 seconds:
    // run time: 23 seconds
    // use getLines, then count the newline characters
    // (redundant for this purpose, I know)
    // Caution: getLines strips line terminators, so the filter below never
    // matches and the method returns 0, although it still visits every character.
    def countLines2(source: io.Source): Long = {
      val NEWLINE = 10
      var newlineCount = 0L
      for {
        line <- source.getLines()
        c <- line
        if c.toByte == NEWLINE
      } newlineCount += 1
      newlineCount
    }
Both algorithms work through each character in the file, but by using getLines in the second algorithm, the run time is reduced dramatically.
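One caveat when combining `getLines` with per-character work: the strings it returns do not include the line terminators, so a check for the newline character inside a line never succeeds. A minimal sketch of a variant that still visits every character but counts lines by incrementing once per line; `countLinesAndChars` is a hypothetical name, not a method from the recipe:

```scala
import scala.io.Source

// Hypothetical variant: one pass over getLines that returns
// (number of lines, number of non-terminator characters).
def countLinesAndChars(source: Source): (Long, Long) = {
  var lineCount = 0L
  var charCount = 0L
  for (line <- source.getLines()) {
    lineCount += 1                        // one increment per line
    line.foreach(_ => charCount += 1)     // visit every character in the line
  }
  (lineCount, charCount)
}

// In-memory stand-in for a file with three lines of two characters each.
val counts = countLinesAndChars(Source.fromString("ab\ncd\nef\n"))

// Lines returned by getLines never contain '\n'.
val anyNewline = Source.fromString("a\nb\n").getLines().exists(_.contains('\n'))
```

For the sample input, `counts` is `(3, 6)` and `anyNewline` is `false`.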
Note
Notice that there’s the equivalent of two for loops in the second example. ...