CLASSIFICATION AND CLUSTERING FOR HOMELAND SECURITY APPLICATIONS

JIAWEI HAN AND XIAOLEI LI

University of Illinois at Urbana-Champaign, Champaign, Illinois

1 REPRESENTATION

Proper representation is the first step in applying classification and clustering methods [1]. Put plainly, one has to take information from the real world (the analog world, so to speak) and store it inside a computer (the digital world). Only then can classification and clustering algorithms operate on the real-world problem. This may seem like a simple step, but it is often the most difficult part of the problem. A proper representation must be an accurate, concise, and static description of something that can be dynamic and fluid in the real world. Without a good representation, even the best algorithms cannot operate effectively.

FIGURE 1 Feature space with “color” and “type”.

To illustrate, consider a computer system that observes vehicles at a border crossing. The goal of the system might be to automatically flag suspicious vehicles for border agents to examine more closely. For this system to work, it must first represent the features of the vehicles inside the computer. This is not like how a border agent might describe a vehicle to his or her colleague. Some features he or she might use include the vehicle's brand, year, color, size, ...
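
As a rough illustration of what storing vehicle features inside a computer can mean, the sketch below encodes the two features named in Figure 1, "color" and "type", as a numeric vector. The category lists, the Vehicle class, and the one-hot encoding are assumptions made for this example only; the text does not prescribe a particular encoding.

```python
from dataclasses import dataclass

# Hypothetical category lists, chosen only for illustration.
COLORS = ["red", "white", "black"]
VEHICLE_TYPES = ["sedan", "truck", "van"]


@dataclass(frozen=True)
class Vehicle:
    """A vehicle described by the two features from Figure 1."""
    color: str
    vehicle_type: str


def one_hot(value, categories):
    """Return a list with 1.0 at the position of value and 0.0 elsewhere."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]


def to_feature_vector(vehicle):
    """Concatenate the one-hot encodings of color and vehicle type."""
    return one_hot(vehicle.color, COLORS) + one_hot(vehicle.vehicle_type, VEHICLE_TYPES)


vector = to_feature_vector(Vehicle(color="white", vehicle_type="truck"))
print(vector)  # [0.0, 1.0, 0.0, 0.0, 1.0, 0.0]
```

Each vehicle becomes a point in a fixed-length feature space, which is the form most classification and clustering algorithms expect as input. One-hot encoding is only one possible choice; it avoids implying an order among categories such as colors.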

From: Wiley Handbook of Science and Technology for Homeland Security, 4 Volume Set.
