Using Weka with Greenplum

As seen in Chapter 3, Advanced Analytics – Paradigms, Tools, and Techniques, Weka is a Java-based analytics framework and an alternative to R. Because it is Java-based, it can connect to any database that has a JDBC driver. Weka ships with support for a wide range of databases. To connect to Greenplum, we need the DatabaseUtils.props.postgresql properties file, which should be extracted to the HOME directory.

To connect to Postgres/Greenplum from Weka, configure the following properties in the DatabaseUtils.props.postgresql properties file:

jdbcDriver=org.postgresql.Driver
jdbcURL=jdbc:postgresql://<<domain>>:<<port>>/<<dbName>>
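To make the placeholders concrete, here is a minimal sketch in plain Java that fills in the two properties shown above and prints them in properties-file form. The class name GreenplumJdbcProps, the helper methods, and the sample values (localhost, 5432, analytics) are hypothetical and chosen only for illustration; substitute the host, port, and database name of your own Greenplum master.

```java
import java.util.Properties;

// Illustrative helper (not part of Weka): builds the two settings that
// DatabaseUtils.props.postgresql expects for a Postgres/Greenplum connection.
class GreenplumJdbcProps {

    // Assembles a JDBC URL of the form jdbc:postgresql://<domain>:<port>/<dbName>.
    static String buildJdbcUrl(String host, int port, String dbName) {
        return "jdbc:postgresql://" + host + ":" + port + "/" + dbName;
    }

    // Returns the driver class and URL as a Properties object.
    static Properties buildProps(String host, int port, String dbName) {
        Properties props = new Properties();
        props.setProperty("jdbcDriver", "org.postgresql.Driver");
        props.setProperty("jdbcURL", buildJdbcUrl(host, port, dbName));
        return props;
    }

    public static void main(String[] args) {
        // Sample values are assumptions; replace them with your own settings.
        Properties props = buildProps("localhost", 5432, "analytics");

        // Print each entry as key=value, ready to paste into the properties file.
        System.out.println("jdbcDriver=" + props.getProperty("jdbcDriver"));
        System.out.println("jdbcURL=" + props.getProperty("jdbcURL"));
    }
}
```

The two printed lines can be copied into DatabaseUtils.props.postgresql in place of the template entries.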

Weka has an API InstanceQuery that can be ...
