10.1 Introduction

One of the central topics in analyzing structured data is the detection of dense substructures, often called clusters, modules, or communities. Such sets of strongly interrelated entities provide interesting information in many different contexts, for instance, interaction of proteins, webpage linking, citation of articles, email communication between individuals, spatial proximity of objects in an image, 3D-contact of atoms in a macromolecule, co-occurrence of words in documents, or similarity of genomic sequences. In graph terminology, the entities are referred to as nodes, and interactions are represented by edges between the nodes. Different strengths of interactive relationships can be indicated by edge weights.

This chapter presents an enumerative approach to find node sets that satisfy an explicit interaction density criterion [1], conceptually generalizing traditional clique search [2]. Beyond that, the described framework can enumerate dense cluster patterns from asymmetric, bipartite graph structures and from higher-order associations that involve more than two entities at the same time, forming a hypergraph [3,4]. Moreover, the method allows to integrate additional user-defined constraints in order to systematically discover relevant substructures. The density of an entity set is generally defined as the total interaction weight between entities within the set divided by the maximum possible amount of interaction. We describe a reverse search approach ...

Get Statistical and Machine Learning Approaches for Network Analysis now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.