Natural Language Processing with Java and LingPipe Cookbook by Krishna Dayanidhi, Breck Baldwin

Tuning sentence detection

Lots of data will resist the charms of IndoEuropeanSentenceModel, so this recipe provides a starting point for modifying sentence detection to handle new kinds of sentences. Unfortunately, this is a very open-ended area of system building, so we will focus on techniques rather than on likely sentence formats.

How to do it...

This recipe will follow a well-worn pattern: create evaluation data, set up evaluation, and start hacking. Here we go:

  1. Haul out your favorite text editor and mark up some data; we will stick to the [ and ] markup approach. The following is an example that runs afoul of our standard IndoEuropeanSentenceModel:
    [All decent people live beyond their incomes nowadays, and those who aren't respectable live beyond ...
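
To make the "create evaluation data" step concrete, here is a minimal sketch of how [ and ] markup might be turned into raw text plus reference sentence spans. It does not use LingPipe at all. The class name BracketMarkup, its Parsed holder, and the parse method are hypothetical names introduced for illustration, and the sketch assumes that literal [ and ] characters never occur in the data itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of LingPipe): reads [ ]-marked sentences. */
final class BracketMarkup {

    /** Text with the markup stripped, plus [start, end) offsets of each sentence. */
    static final class Parsed {
        final String text;
        final List<int[]> spans;

        Parsed(String text, List<int[]> spans) {
            this.text = text;
            this.spans = spans;
        }
    }

    /**
     * Strips the brackets and records where each marked sentence starts and
     * ends in the stripped text. Assumes brackets are markup only, never data.
     */
    static Parsed parse(String marked) {
        StringBuilder text = new StringBuilder();
        List<int[]> spans = new ArrayList<>();
        int start = -1; // offset of the currently open sentence, or -1 if none

        for (int i = 0; i < marked.length(); i++) {
            char c = marked.charAt(i);
            if (c == '[') {
                if (start >= 0) {
                    throw new IllegalArgumentException("Nested [ at " + i);
                }
                start = text.length();
            } else if (c == ']') {
                if (start < 0) {
                    throw new IllegalArgumentException("Unmatched ] at " + i);
                }
                spans.add(new int[] {start, text.length()});
                start = -1;
            } else {
                text.append(c);
            }
        }
        if (start >= 0) {
            throw new IllegalArgumentException("Unclosed [ in input");
        }
        return new Parsed(text.toString(), spans);
    }

    public static void main(String[] args) {
        Parsed parsed = parse("[First sentence.] [Second one.]");
        // Print each reference sentence recovered from the markup.
        for (int[] span : parsed.spans) {
            System.out.println(parsed.text.substring(span[0], span[1]));
        }
    }
}
```

The stripped text is what a sentence detector would be run on, and the recorded spans are the reference boundaries its output can be compared against. Rejecting nested or unbalanced brackets early helps catch markup typos before they distort an evaluation.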
