Avoid code duplication like the plague! It is insidious: you fix a bug and think you're done, but another developer may have copied and pasted the buggy code elsewhere before you fixed it, so the bug lives on in the copies. Fortunately, tools that detect duplicated code are available, and they have Maven plug-ins. Let's explore how to use the CPD (which stands for Copy Paste Detector) and Simian (http://www.redhillconsulting.com.au/products/simian/) tools.[3]
CPD is actually part of the PMD project, and it's contained in the Maven PMD plug-in. To use it, start by adding maven-pmd-plugin to the reports section of the POM. Then tell the plug-in to generate the CPD report by adding the following property to your project's project.properties file:
maven.pmd.cpd.enable = true
Figure 4-7 shows a typical CPD report.
By default, CPD reports duplicates that share more than 100 tokens. To configure it differently, use the maven.pmd.cpd.minimumtokencount Maven property. For example, to detect duplicates of 50 tokens or more, use this:
maven.pmd.cpd.minimumtokencount = 50
As with the best-practice violation reports, duplication reports on their own do little to get the code fixed. A better strategy is to set a high duplication threshold and fail the build whenever duplicates are found. Then, as your project progresses, gradually lower the threshold to uncover more and more duplicates. Of course, there's a practical minimum that you'll have to find for your project (say, around 5-10 lines); below it, the tool starts flagging code that merely looks alike. Unfortunately, the PMD plug-in does not yet support failing the build upon duplicate detection.
Now let's use the Simian plug-in, which has such a feature. Start by adding maven-simian-plugin to your project's reports section. Then add the following property:
maven.simian.failonduplication = true
Running the Simian plug-in on the same project as the one in Figure 4-7 leads to the report shown in Figure 4-8.
Clicking any duplication link leads to a page showing the duplicated source code. You can configure the duplication threshold using the maven.simian.linecount property. There are other interesting properties in the Simian plug-in reference documentation (http://maven.apache.org/reference/plugins/simian/properties.html).
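The strategy of starting with a high threshold and lowering it over time can be sketched with the two Simian properties named in this section. This is only an illustration: the value 20 is a hypothetical starting point, not a recommendation from the plug-in documentation, and it assumes the threshold is expressed as a number of lines, as the property name suggests.

```properties
# Fail the build when Simian finds duplicated code
maven.simian.failonduplication = true
# Hypothetical starting threshold (assumed to be in lines);
# lower it step by step as existing duplicates are removed
maven.simian.linecount = 20
```

Each time the build stays green at a given threshold, you would reduce the value and fix whatever new duplicates the next run reports.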
[3] Simian is a commercial product, but you can get free licenses for noncommercial/nongovernment projects. Simian has a 15-day evaluation period.