Data I/O
Reading & Writing Data
The following code illustrates how to read and write files using DataFrames:
The details of the code block above are as follows:
First, SparkConf and SparkSession are imported.
Since Spark is being used locally, both a SparkContext and a SparkSession are already running. Parameters such as our application's name can still be updated.
Next, the SparkSession configuration is printed.
After that, data is read from a JSON file into a Spark DataFrame and then written out to a CSV file.
Imperative vs Declarative Programming
There are two different ways to manipulate data in Spark. The first is imperative programming, using the DataFrame API in Python. The second is declarative programming, using SQL.
Imperative programming is concerned with the "how", while declarative programming cares about the "what". In most cases, a declarative system is an abstraction layer over an imperative one that takes care of figuring out the necessary steps to achieve the result.