10.21. Extracting Unique Elements from a Sequence
Problem
You have a collection that contains duplicate elements, and you want to remove the duplicates.
Solution
Call the distinct method on the collection:
scala> val x = Vector(1, 1, 2, 3, 3, 4)
x: scala.collection.immutable.Vector[Int] = Vector(1, 1, 2, 3, 3, 4)

scala> val y = x.distinct
y: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3, 4)
The distinct method returns a new collection with the duplicate values removed; it does not modify the original. Remember to assign the result to a new variable. This is required for both immutable and mutable collections.
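To illustrate the point about mutable collections, here is a minimal sketch using an ArrayBuffer. The object name DistinctDemo and the sample values are chosen only for this example. It shows that the original buffer still contains its duplicates after distinct is called.

```scala
import scala.collection.mutable.ArrayBuffer

object DistinctDemo {
  // A mutable collection with duplicates (illustrative values).
  val buf = ArrayBuffer(1, 1, 2, 3, 3, 4)

  // distinct builds a new ArrayBuffer; buf itself is left unchanged.
  val unique = buf.distinct

  def main(args: Array[String]): Unit = {
    println(buf)     // ArrayBuffer(1, 1, 2, 3, 3, 4)
    println(unique)  // ArrayBuffer(1, 2, 3, 4)
  }
}
```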
If you happen to need a Set, converting the collection to a Set is another way to remove the duplicate elements:
scala> val s = x.toSet
s: scala.collection.immutable.Set[Int] = Set(1, 2, 3, 4)
By definition a Set can only contain unique elements, so converting an Array, List, Vector, or other sequence to a Set removes the duplicates. In fact, this is how distinct works. The source code for the distinct method in GenSeqLike shows that it uses an instance of mutable.HashSet.
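The idea can be sketched roughly as follows. This is not the library source: distinctSketch is a hypothetical helper written for illustration, and it assumes the approach is to track already-seen elements in a mutable.HashSet while copying the first occurrence of each element into a new collection. Unlike a plain conversion to a Set, which makes no general promise about iteration order, this keeps the elements in their original order.

```scala
import scala.collection.mutable

object DistinctSketch {
  // Hypothetical helper, not the library implementation: keeps the first
  // occurrence of each element and preserves the original order.
  def distinctSketch[A](xs: Seq[A]): Vector[A] = {
    val seen = mutable.HashSet.empty[A]  // elements already emitted
    val out = Vector.newBuilder[A]
    for (x <- xs) {
      // add returns true only when x was not already in the set.
      if (seen.add(x)) out += x
    }
    out.result()
  }

  def main(args: Array[String]): Unit = {
    val x = Vector(3, 1, 3, 2, 1)
    println(distinctSketch(x))  // Vector(3, 1, 2)
    println(x.distinct)         // Vector(3, 1, 2)
  }
}
```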
Using distinct with your own classes
To use distinct with your own class, you’ll need to implement the equals and hashCode methods. For example, the following class will work with distinct because it implements those methods:
class Person(firstName: String, lastName: String) {
  override def toString = s"$firstName $lastName"
  def canEqual(a: Any) = a.isInstanceOf[Person]
  override def equals(that: Any): Boolean = that match {
    case that: Person ...
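
For reference, here is a minimal, self-contained sketch of a class that works with distinct. It is not the book's listing: the bodies of equals and hashCode, the use of val fields, the wrapper object PersonDistinctDemo, and the sample names are all assumptions made for illustration. The key requirement it demonstrates is that equals and hashCode agree, so two instances with the same names are treated as duplicates.

```scala
object PersonDistinctDemo {
  // Illustrative class: field names follow the text, method bodies are assumptions.
  class Person(val firstName: String, val lastName: String) {
    override def toString = s"$firstName $lastName"

    def canEqual(a: Any): Boolean = a.isInstanceOf[Person]

    // Two Person instances are equal when both name fields match.
    override def equals(that: Any): Boolean = that match {
      case p: Person =>
        p.canEqual(this) && firstName == p.firstName && lastName == p.lastName
      case _ => false
    }

    // Must be consistent with equals: equal objects yield equal hash codes.
    override def hashCode: Int = (firstName, lastName).hashCode
  }

  // Sample data (hypothetical names) containing one duplicate.
  val people = List(
    new Person("Ann", "Lee"),
    new Person("Bob", "Ray"),
    new Person("Ann", "Lee")
  )

  val unique = people.distinct

  def main(args: Array[String]): Unit = {
    println(unique)  // List(Ann Lee, Bob Ray)
  }
}
```

Without the hashCode override, the two "Ann Lee" instances would almost certainly hash differently, and a hash-based duplicate check could treat them as distinct even though equals reports them as equal.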