Feature generation

We will perform feature generation using the following steps:

  1. We will create a default pipeline, as described previously:
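import java.io.File;
import java.util.ArrayList;
import java.util.regex.Pattern;
import cc.mallet.pipe.*;

// stopListFilePath is assumed to be defined earlier, as in the previous pipeline
ArrayList<Pipe> pipeList = new ArrayList<Pipe>();
pipeList.add(new Input2CharSequence("UTF-8"));
// A token is a run of Unicode letters, digits, or underscores
Pattern tokenPattern = Pattern.compile("[\\p{L}\\p{N}_]+");
pipeList.add(new CharSequence2TokenSequence(tokenPattern));
pipeList.add(new TokenSequenceLowercase());
// Flags: no default stop list, case-insensitive matching, deletions not marked
pipeList.add(new TokenSequenceRemoveStopwords(new File(stopListFilePath), "utf-8", false, false, false));
pipeList.add(new TokenSequence2FeatureSequence());
pipeList.add(new FeatureSequence2FeatureVector());
pipeList.add(new Target2Label());
SerialPipes pipeline = new SerialPipes(pipeList);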

Note that we added an additional FeatureSequence2FeatureVector ...
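
To make the effect of the pipeline more concrete, the following plain-Java sketch mimics the tokenize, lowercase, remove-stop-words, and count steps on a single string. It does not use Mallet and is not how Mallet implements these pipes; the class name PipelineSketch, the method featurize, and the representation of the feature vector as a map from token to count are assumptions made purely for illustration.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: a Mallet-free approximation of the pipeline steps.
class PipelineSketch {

    // Same token pattern as in the pipeline: letters, digits, or underscores
    private static final Pattern TOKEN = Pattern.compile("[\\p{L}\\p{N}_]+");

    // Returns token -> count for all tokens that are not stop words.
    // The stop words are expected in lowercase.
    static Map<String, Integer> featurize(String text, Set<String> stopwords) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            String token = m.group().toLowerCase();   // lowercase step
            if (stopwords.contains(token)) {          // stop-word removal step
                continue;
            }
            counts.merge(token, 1, Integer::sum);     // counting step
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> fv =
            featurize("The cat saw the other Cat, 2 times", Set.of("the", "other"));
        System.out.println(fv);
    }
}
```

In this sketch, the two occurrences of "cat" (in different cases) collapse into a single feature with a count of 2, which is roughly the kind of bag-of-words information a feature vector carries.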
