Cover by Andrea Tagarelli


Chapter 7

XML Document Clustering:

An Algorithmic Perspective

Panagiotis Antonellis

University of Patras, Greece

Abstract

The wide use of XML as the de facto standard for storing and exchanging information over the Internet has led a broad spectrum of heterogeneous applications to adopt it as their information representation model. The heterogeneity of XML data sources has raised the problem of efficiently clustering a set of XML documents. However, traditional clustering algorithms cannot be applied because XML is semistructured: a document carries both structure and content features. Hence, special techniques that take XML semantics into account are needed to address the problem of XML clustering. The described approaches, ...
