Manipulating DataFrames

In the previous recipe, we saw how to create a DataFrame. The natural next step is to work with the data inside it. Besides the many functions for manipulating that data, the DataFrame API also offers functions for tasks such as sampling the data and printing its schema. We'll look at them one by one in this recipe.

Note

The code and the sample file for this recipe can be found at https://github.com/arunma/ScalaDataAnalysisCookbook/blob/master/chapter1-spark-csv/src/main/scala/com/packt/scaladata/spark/csv/DataFrameCSV.scala.

How to do it...

Now, let's see how we can manipulate DataFrames using the following subrecipes:

Printing the schema of the DataFrame
Sampling data in ...
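
As a rough illustration of the first two subrecipes, the following sketch prints a DataFrame's schema and draws a random sample from it. It is not the code from the recipe's repository: it assumes Spark 2.x or later (for SparkSession), and the object name DataFrameBasics, the helper sampleStudents, and the student records and column names are hypothetical stand-ins for the recipe's sample file.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object DataFrameBasics {

  // Hypothetical in-memory records standing in for the recipe's sample CSV file.
  def sampleStudents(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(
      (1, "Burke", "burke@example.com"),
      (2, "Kamal", "kamal@example.com"),
      (3, "Olga", "olga@example.com"),
      (4, "Belle", "belle@example.com")
    ).toDF("id", "name", "email")
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation; assumes Spark 2.x or later on the classpath.
    val spark = SparkSession.builder()
      .appName("DataFrameBasics")
      .master("local[*]")
      .getOrCreate()

    val students = sampleStudents(spark)

    // Print each column's name, type, and nullability as a tree.
    students.printSchema()

    // Draw a random sample without replacement. The fraction is a per-row
    // probability, so the number of rows returned is only approximately 50%.
    val sampled = students.sample(withReplacement = false, fraction = 0.5, seed = 42L)
    sampled.show()

    spark.stop()
  }
}
```

Passing a fixed seed makes the sample repeatable across runs on the same data, which helps when comparing results while experimenting.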

Scala Data Analysis Cookbook by Arun Manivannan
