DataFrame introduction



A DataFrame is a Dataset organized into named columns. It is conceptually equivalent to a table in a relational database or a data frame in R/Python, but with richer optimizations under the hood. In this post I explain how to create a DataFrame from a CSV file, manipulate the data, and store the processed records in another file or table for further processing. Transforming data with Spark DataFrames is straightforward, and Spark provides many built-in functions to help with it.




I used Databricks Community Edition for this demo.
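The flow described above (read a CSV file, transform it, store the result) might look roughly like the sketch below. It is only an illustration under stated assumptions: the paths, the column names (id, name, salary), the sample rows, and the 50000 threshold are all hypothetical and not taken from the original demo. The sample CSV is written first so that the sketch does not depend on an existing file.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, upper}

// In a Databricks notebook a session named `spark` already exists and
// getOrCreate() returns it; elsewhere this builds a local session.
val spark = SparkSession.builder()
  .appName("dataframe-intro")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical locations, chosen only for this sketch.
val inputPath  = "/tmp/dataframe_intro/employees_csv"
val outputPath = "/tmp/dataframe_intro/employees_processed"

// Write a small sample CSV so the example is self-contained.
Seq((1, "alice", 42000), (2, "bob", 58000), (3, "carol", 61000))
  .toDF("id", "name", "salary")
  .write.mode("overwrite").option("header", "true").csv(inputPath)

// 1. Create a DataFrame from the CSV file.
val employees = spark.read
  .option("header", "true")       // first line holds the column names
  .option("inferSchema", "true")  // let Spark guess the column types
  .csv(inputPath)

// 2. Transform: keep higher salaries and upper-case the name column.
val processed = employees
  .filter(col("salary") > 50000)
  .withColumn("name", upper(col("name")))
  .select("id", "name", "salary")

processed.show()

// 3. Store the processed records in another file for further processing.
processed.write.mode("overwrite").option("header", "true").csv(outputPath)

// Alternatively, store them as a table (the table name is hypothetical):
// processed.write.mode("overwrite").saveAsTable("employees_processed")
```

Note that Spark writes the CSV output as a directory of part files rather than a single file, and that inferSchema costs an extra pass over the data, so an explicit schema is usually preferable for larger inputs.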






