Summary

The concepts presented in this chapter are just the beginning of the road to using Blaze. There are many other ways it can be used and data sources it can connect with. Treat this as a starting point to build your understanding of polyglot persistence.

Note, however, that these days most of the concepts explained in this chapter can be achieved natively within Spark, as you can use SQLAlchemy directly within Spark, making it easy to work with a variety of data sources. The advantage of doing so, despite the initial investment of learning the SQLAlchemy API, is that the data returned will be stored in a Spark DataFrame and you will have access to everything that PySpark has to offer. This by no means implies that you should never ...
