6. Parallel Computing for Support Vector Machines

6.1. Introduction

As introduced previously, the essential computation of SVMs is to solve a quadratic programming problem, which is costly in both time and memory. This presents a challenge when solving large-scale problems. Several optimizing or heuristic methods have been proposed, such as shrinking, chunking [OSU 97], kernel caching [JOA 99], approximation of the kernel matrix [CHA 07], sequential minimal optimization (SMO) [PLA 99a] and a primal estimated subgradient solver [SHA 07]; nevertheless, a more sophisticated and satisfactory resolution of this challenging problem is still expected. As stated previously, a building's energy system is extremely complex and involves a large number of influencing factors, so large-scale datasets are common in this application.
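As a reminder, the quadratic problem in question is the dual optimization problem of SVM training; in its standard form (the notation below follows common convention rather than an exact formula from this chapter) it reads

\[
\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i \;-\; \frac{1}{2} \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j)
\quad \text{subject to} \quad \sum_{i=1}^{n} \alpha_i y_i = 0, \;\; 0 \le \alpha_i \le C,
\]

where K is the kernel function and C the regularization parameter. The n × n kernel matrix is the source of the quadratic memory cost, which is why methods such as chunking, shrinking and SMO all operate on small subsets of the α variables at a time.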

With the development of chip technologies, computers with multiple cores or multiple processors are becoming more available and affordable. This chapter therefore investigates and demonstrates how SVMs can benefit from this modern platform when predicting building energy consumption. A new parallel SVM that is particularly suited to this platform is proposed; a sketch of the overall structure is given below. The main training procedure consists of a decomposition method with an inner SMO solver. A shared cache is designed to store the kernel columns. To achieve an easy implementation without sacrificing performance, the MapReduce parallel programming framework is chosen to perform the underlying parallelism. ...
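To make the structure concrete, the following is a minimal sketch of the decomposition/SMO/shared-cache idea, not the chapter's actual implementation: it assumes an RBF kernel, uses a Python multiprocessing pool as a stand-in for the MapReduce "map" phase, selects the working set at random rather than by a violating-pair rule, and omits the bias term. All names here (kernel_column, KernelCache, train_sketch) are hypothetical.

import numpy as np
from multiprocessing import Pool

X = None  # training inputs, copied into each worker by the pool initializer

def init_worker(data):
    global X
    X = data

def kernel_column(args):
    """Map task: compute one column of the RBF kernel matrix."""
    i, gamma = args
    d = X - X[i]                                    # differences against row i
    return i, np.exp(-gamma * np.einsum('ij,ij->i', d, d))

class KernelCache:
    """Shared store for kernel columns that have already been computed."""
    def __init__(self):
        self.cols = {}
    def get_missing(self, idx):
        return [i for i in idx if i not in self.cols]
    def insert(self, results):
        self.cols.update(dict(results))

def train_sketch(data, y, C=1.0, gamma=0.5, n_iter=200, workers=4):
    n = len(y)
    alpha = np.zeros(n)
    cache = KernelCache()
    with Pool(workers, initializer=init_worker, initargs=(data,)) as pool:
        for _ in range(n_iter):
            # Decomposition step: pick a small working set (here: a random pair).
            i, j = np.random.choice(n, size=2, replace=False)
            # Map phase: compute the kernel columns the inner solver will need.
            missing = cache.get_missing([i, j])
            cache.insert(pool.map(kernel_column, [(k, gamma) for k in missing]))
            # Inner SMO-style update on the pair (bias term omitted in this sketch).
            s = y[i] * y[j]
            Kii, Kjj, Kij = cache.cols[i][i], cache.cols[j][j], cache.cols[i][j]
            eta = Kii + Kjj - 2.0 * Kij
            if eta <= 1e-12:
                continue
            Ei = cache.cols[i] @ (alpha * y) - y[i]     # prediction error on i
            Ej = cache.cols[j] @ (alpha * y) - y[j]     # prediction error on j
            # Box bounds that keep sum(alpha * y) unchanged.
            if s < 0:
                L, H = max(0.0, alpha[j] - alpha[i]), min(C, C + alpha[j] - alpha[i])
            else:
                L, H = max(0.0, alpha[i] + alpha[j] - C), min(C, alpha[i] + alpha[j])
            aj_new = np.clip(alpha[j] + y[j] * (Ei - Ej) / eta, L, H)
            alpha[i] += s * (alpha[j] - aj_new)
            alpha[j] = aj_new
    return alpha

In this sketch the pool.map call plays the role of the MapReduce map phase, distributing kernel-column computations over the available cores, while the master process keeps the shared cache and performs the lightweight SMO update; this mirrors the division of labor described above, though the real implementation selects working sets and handles the cache far more carefully.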
