Instant Jsoup How-to by Pete Houston

Extracting data using DOM (Must know)

Now that the input is ready for extraction, we will begin parsing the HTML using the DOM method.

Note

If you are not familiar with the DOM, you can get a quick start with the DOM tutorial at:

http://www.w3schools.com/htmldom/

Let's move on to the details of how it works in Jsoup.
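
Before turning to the real page, here is a minimal sketch of what DOM-style extraction can look like in Jsoup. It is not the recipe's own code: the class name DomSketch, the helper linkTexts, and the sample markup are hypothetical and exist only for illustration. The sample HTML is made up and is not the actual source of http://jsoup.org.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class DomSketch {

    // Hypothetical sample markup; it is NOT the real source of jsoup.org.
    static final String SAMPLE_HTML =
        "<html><head><title>Sample page</title></head><body>"
      + "<div id=\"nav\"><a href=\"/download\">Download</a>"
      + "<a href=\"/cookbook\">Cookbook</a></div>"
      + "</body></html>";

    // Collect the text of every <a> element inside the element with the given id.
    static List<String> linkTexts(Document doc, String containerId) {
        List<String> texts = new ArrayList<>();
        Element container = doc.getElementById(containerId);
        if (container == null) {
            return texts; // no element with that id: return an empty list
        }
        for (Element link : container.getElementsByTag("a")) {
            texts.add(link.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        // Parse the in-memory string into a DOM tree, then walk it.
        Document doc = Jsoup.parse(SAMPLE_HTML);
        System.out.println(doc.title());
        System.out.println(linkTexts(doc, "nav"));
    }
}
```

The sketch uses getElementById() and getElementsByTag(), which mirror the DOM methods of the same names in browsers. Parsing an in-memory string keeps the example independent of the network.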

Getting ready

This section will parse the content of the page at http://jsoup.org.

If you prefer to use a file as input instead of connecting to the URL, the project provides an index.html file.
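
The two input options can be sketched as follows. This is an illustration under stated assumptions rather than the book's own listing: the class name InputLoader and its helper methods are hypothetical, the local file is assumed to be UTF-8 encoded, and index.html is assumed to sit in the working directory.

```java
import java.io.File;
import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class InputLoader {

    // The page named in this section; also used as the base URI for the local copy.
    static final String BASE_URL = "http://jsoup.org";

    // Fetch the page over the network and parse it.
    static Document fromUrl(String url) throws IOException {
        return Jsoup.connect(url).get();
    }

    // Parse a local copy. The base URI lets relative links resolve
    // as if the page had been fetched from BASE_URL.
    // Assumption: the file is UTF-8 encoded.
    static Document fromFile(File file) throws IOException {
        return Jsoup.parse(file, "UTF-8", BASE_URL);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: index.html, if present, is in the working directory.
        File local = new File("index.html");
        Document doc = local.exists() ? fromFile(local) : fromUrl(BASE_URL);
        System.out.println(doc.title());
    }
}
```

Either way the result is the same kind of Document object, so the extraction code that follows does not need to know where the HTML came from.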

How to do it...

The following screenshot shows the page that is going to be parsed:

[Screenshot: the page at http://jsoup.org]

By viewing the source code for this HTML page, we know the site structure. ...
