A DataFrame is a Dataset organized into named columns. It is conceptually equivalent to a table in a relational database or a data frame in R/Python, but with richer optimizations under the hood. In this post I walk through creating a DataFrame from a CSV file, manipulating the data, and storing the processed records in another file or table for further processing. Transforming data with Spark DataFrames is straightforward, and Spark provides many built-in functions to help with it.
Please refer to the Spark documentation for more detail.
I used Databricks Community Edition for this demo.