Ignoring Duplicate Elements

Problem

You want to select all nodes that are unique in a given context based on uniqueness criteria.

Solution

Selecting unique nodes is a common application of the `preceding` and `preceding-sibling` axes. If the elements you select are not all siblings, then use `preceding`. The following code produces a unique list of products from `SalesBySalesperson.xml`:

```<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="/">
<products>
<xsl:for-each select="//product[not(@sku=preceding::product/@sku)]">
<xsl:copy-of select="."/>
</xsl:for-each>
</products>
</xsl:template>

</xsl:stylesheet>```

If the elements are all siblings, then use `preceding-sibling`:

```<products>
<product sku="10000" totalSales="10000.00"/>
<product sku="10000" totalSales="990000.00"/>
<product sku="10000" totalSales="1110000.00"/>
<product sku="20000" totalSales="50000.00"/>
<product sku="20000" totalSales="150000.00"/>
<product sku="20000" totalSales="150000.00"/>
<product sku="25000" totalSales="920000.00"/>
<product sku="25000" totalSales="2920000.00"/>
<product sku="30000" totalSales="5500.00"/>
<product sku="30000" totalSales="115500.00"/>
<product sku="70000" totalSales="10000.00"/>
</products>```

```<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="/">
<products>
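<!-- The context node is the document root, so step through the products element first -->
<xsl:for-each select="products/product[not(@sku=preceding-sibling::product/@sku)]">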
<xsl:copy-of select="."/>
</xsl:for-each>
</products>
</xsl:template>

</xsl:stylesheet>```

To avoid `preceding`, which can be inefficient, travel up to the ancestors that are siblings, and then use `preceding-sibling` and travel down to the nodes you want to test:

```<xsl:for-each select="//product[not(@sku=../preceding-sibling::*/product/@sku)]">
<xsl:copy-of select="."/>
</xsl:for-each>```
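To make the behavior of this path concrete, here is a small input sketch. It is hypothetical, not the actual contents of `SalesBySalesperson.xml`; the element and attribute names other than `product` and `sku` are assumptions chosen for illustration. The comments show which nodes the expression above would select.

```xml
<!-- Hypothetical input: product elements grouped under sibling salesperson elements -->
<salesBySalesperson>
  <salesperson name="A">
    <product sku="10000"/>  <!-- selected: A has no preceding siblings -->
    <product sku="20000"/>  <!-- selected -->
  </salesperson>
  <salesperson name="B">
    <product sku="10000"/>  <!-- rejected: sku 10000 occurs under preceding sibling A -->
    <product sku="30000"/>  <!-- selected -->
    <product sku="30000"/>  <!-- also selected: the path never looks inside the same parent -->
  </salesperson>
</salesBySalesperson>
```

As the last comment suggests, this shortcut appears safe only when a value cannot repeat within a single parent. If it can, an additional test against `preceding-sibling::product/@sku` would be needed.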

If you are certain that the elements are sorted so that duplicate nodes are adjacent (as in the earlier products), then you only have to consider the immediately preceding sibling:

```<xsl:for-each
select="/salesperson/product[not(@name=preceding-sibling::product[1]/@name)]">
<!-- do something with each uniquely named product -->
</xsl:for-each>```

Discussion

In XSLT Version 2.0 (or Version 1.0 in conjunction with the `node-set( )` extension function), you can also do the following:

```<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="/">

<xsl:variable name="products">
<xsl:for-each select="//product">
<xsl:sort select="@sku"/>
<xsl:copy-of select="."/>
</xsl:for-each>
</xsl:variable>

<products>
<xsl:for-each select="$products/product">
<xsl:variable name="pos" select="position(  )"/>
<xsl:if test="$pos = 1 or
not(@sku = $products/product[$pos - 1]/@sku)">
<xsl:copy-of select="."/>
</xsl:if>
</xsl:for-each>
</products>

</xsl:template>

</xsl:stylesheet>```
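For XSLT 1.0, the same idea relies on a `node-set( )` extension function. The following is a minimal sketch that assumes a processor supporting the EXSLT common module (`exsl:node-set`); other processors expose an equivalent function under a vendor namespace. The variable name `sorted-rtf` is illustrative only.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/">

    <!-- Result tree fragment holding copies of the products, sorted by sku -->
    <xsl:variable name="sorted-rtf">
      <xsl:for-each select="//product">
        <xsl:sort select="@sku"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </xsl:variable>

    <!-- Convert the fragment to a node-set so path expressions can be applied to it -->
    <xsl:variable name="products" select="exsl:node-set($sorted-rtf)"/>

    <products>
      <xsl:for-each select="$products/product">
        <!-- The copies are siblings in sorted order, so only the nearest
             preceding sibling needs to be checked -->
        <xsl:if test="not(@sku = preceding-sibling::product[1]/@sku)">
          <xsl:copy-of select="."/>
        </xsl:if>
      </xsl:for-each>
    </products>

  </xsl:template>

</xsl:stylesheet>
```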

However, I have never found this technique to be faster than using the `preceding` axis. This technique does have an advantage in situations where the duplicate testing is not trivial. For example, consider a case where duplicates are determined by the concatenation of two attributes.

```<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="/">

<xsl:variable name="people">
<xsl:for-each select="//person">
<xsl:sort select="concat(@lastname,@firstname)"/>
<xsl:copy-of select="."/>
</xsl:for-each>
</xsl:variable>

<people>
<xsl:for-each select="$people/person">
<xsl:variable name="pos" select="position(  )"/>
<!-- Compare each person's combined name with that of the previous person in sorted order -->
<xsl:if test="$pos = 1 or
concat(@lastname,@firstname) !=
concat($people/person[$pos - 1]/@lastname,
$people/person[$pos - 1]/@firstname)">
<xsl:copy-of select="."/>
</xsl:if>
</xsl:for-each>
</people>

</xsl:template>

</xsl:stylesheet>```

When you attempt to remove duplicates, the following examples do not work:

```<xsl:template match="/">
<products>
<xsl:for-each select="//product[not(@sku=preceding::product[1]/@sku)]">
<xsl:sort select="@sku"/>
<xsl:copy-of select="."/>
</xsl:for-each>
</products>
</xsl:template>```
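To see why, consider a small hypothetical input (invented here for illustration) in which equal `sku` values are not adjacent. The comments show how the predicate `not(@sku=preceding::product[1]/@sku)` treats each node:

```xml
<!-- Hypothetical input: duplicates of sku 10000 are not all adjacent in document order -->
<products>
  <product sku="10000"/>  <!-- selected: there is no preceding product -->
  <product sku="20000"/>  <!-- selected: the nearest preceding product has sku 10000 -->
  <product sku="10000"/>  <!-- selected although it is a duplicate:
                               the nearest preceding product has sku 20000 -->
  <product sku="10000"/>  <!-- rejected: the nearest preceding product has sku 10000 -->
</products>
```

The `xsl:sort` then merely reorders the three selected nodes, so two products with `sku` 10000 reach the output.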

Do not rely on `xsl:sort` so that you can consider only the immediately preceding element. The sort changes only the order in which the selected nodes are processed; the axis is still evaluated relative to each node's original position in the document. The same applies when using `preceding-sibling`. The following code is also sure to fail:

```<xsl:template match="/">

<xsl:variable name="products">
<xsl:for-each select="//product">
<!-- sort removed from here -->
<xsl:copy-of select="."/>
</xsl:for-each>
</xsl:variable>

<products>
<xsl:for-each select="$products/product">
<xsl:sort select="@sku"/>
<xsl:variable name="pos" select="position(  )"/>
<xsl:if test="$pos = 1 or
@sku != $products/product[$pos - 1]/@sku">
<xsl:copy-of select="."/>
</xsl:if>
</xsl:for-each>
</products>
</xsl:template>```

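This code fails because `position( )` returns the position after sorting, but `$products` itself has not been sorted. The `xsl:sort` affects only the order in which `xsl:for-each` visits the nodes, in effect sorting an inaccessible copy, so `$products/product[$pos - 1]` still indexes the unsorted original.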