Spark Dataframe Cheat Sheet



Having a good cheatsheet at hand can significantly speed up the development process. One of the best cheatsheets I have come across is sparklyr’s cheatsheet.

A few other DataFrame methods worth knowing: df.distinct returns the distinct rows in a DataFrame, df.sample returns a sampled subset, df.sampleBy returns a stratified sample without replacement, and df.select applies expressions and returns a new DataFrame.

For my work, I’m using Spark’s DataFrame API in Scala to create data transformation pipelines. These are some functions and design patterns that I’ve found to be extremely useful.

Load data
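
As a minimal sketch of loading data, assuming a CSV source with a header row: the snippet writes a tiny temporary file first so it can run on its own. The file contents, the app name and the local master setting are all placeholders, not something from the original post.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Local session for experimentation; on a cluster the master is usually set externally.
val spark = SparkSession.builder()
  .appName("cheatsheet")
  .master("local[*]")
  .getOrCreate()

// Hypothetical sample file, created only so the example is self-contained.
val path = Files.createTempFile("people", ".csv")
Files.write(path, "name,age\nAlice,30\nBob,25\n".getBytes("UTF-8"))

// Read the CSV, treating the first line as column names and inferring column types.
val df = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv(path.toString)
```

Other formats follow the same shape, for example spark.read.parquet(...) or spark.read.json(...).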

Get SparkContext information
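
One way to inspect the SparkContext, sketched under the assumption that you start from a SparkSession (the app name and local master here are placeholders):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("cheatsheet")
  .master("local[*]")
  .getOrCreate()

// The SparkContext is reachable from the session.
val sc = spark.sparkContext

println(sc.appName)                // application name
println(sc.master)                 // master URL, e.g. local[*]
println(sc.applicationId)          // unique id assigned to this application
println(sc.defaultParallelism)     // default number of partitions for parallel operations
println(sc.getConf.toDebugString)  // all explicitly set configuration entries
```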

Get Spark version
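
The version is exposed on both the session and the context. A short sketch, with a placeholder local session:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("cheatsheet")
  .master("local[*]")
  .getOrCreate()

// Both calls report the version of Spark the application is running on.
val sessionVersion = spark.version
val contextVersion = spark.sparkContext.version

println(sessionVersion)
```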

Get number of partitions
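
A DataFrame does not expose its partition count directly, so a common approach is to go through the underlying RDD. The sample data below is made up for illustration:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("cheatsheet")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data.
val df = Seq(("Alice", 30), ("Bob", 25), ("Carol", 41)).toDF("name", "age")

// Number of partitions backing the DataFrame.
val numPartitions = df.rdd.getNumPartitions
println(numPartitions)

// Repartitioning to a fixed count changes it.
val repartitioned = df.repartition(4)
println(repartitioned.rdd.getNumPartitions)
```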

Count number of rows
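
Counting rows is a single action. A sketch with made-up sample data:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("cheatsheet")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data.
val df = Seq(("Alice", 30), ("Bob", 25), ("Carol", 41)).toDF("name", "age")

// count() triggers a job and returns the number of rows as a Long.
val rowCount: Long = df.count()
println(rowCount)
```

Keep in mind that count() scans the data, so on large inputs it is not free.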

Print schema
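
A sketch of inspecting the schema, again on made-up sample data. printSchema() prints a tree view, while schema, columns and dtypes return the same information as values:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("cheatsheet")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data.
val df = Seq(("Alice", 30), ("Bob", 25)).toDF("name", "age")

// Print the schema as a tree to stdout.
df.printSchema()

// Programmatic access to the same information.
val columnNames = df.columns   // Array of column names
val columnTypes = df.dtypes    // Array of (name, type) pairs
val schema = df.schema         // StructType
```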

Preview top 20 rows
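
Calling show() with no arguments prints the first 20 rows. The sketch below uses a generated range of ids as stand-in data:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("cheatsheet")
  .master("local[*]")
  .getOrCreate()

// Stand-in data: 100 rows with a single "id" column.
val df = spark.range(100).toDF("id")

// Prints the first 20 rows; long cell values are truncated by default.
df.show()

// Print a different number of rows without truncating cell values.
df.show(5, truncate = false)

// take(n) returns the rows to the driver instead of printing them.
val firstRows = df.take(20)
```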

Design pattern for constructing a data transformation pipeline
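
One common pattern, which may or may not be the one originally intended here, is to write each step as a function from DataFrame to DataFrame and chain the steps with Dataset.transform. The function names, columns and data below are hypothetical:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, upper}

val spark = SparkSession.builder()
  .appName("cheatsheet")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Each step takes a DataFrame and returns a new one.
def withUpperName(df: DataFrame): DataFrame =
  df.withColumn("name_upper", upper(col("name")))

// A parameterised step: the first parameter list configures it,
// the second receives the DataFrame when transform applies it.
def olderThanOrEqual(minAge: Int)(df: DataFrame): DataFrame =
  df.filter(col("age") >= minAge)

// Hypothetical sample data.
val people = Seq(("Alice", 30), ("Bob", 12)).toDF("name", "age")

// The pipeline reads top to bottom in the order the steps run.
val result = people
  .transform(withUpperName)
  .transform(olderThanOrEqual(18))

result.show()
```

Keeping each step as a small named function makes the steps easy to unit test and reorder.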

Drop duplicate rows
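
A sketch of removing duplicates on made-up data. dropDuplicates() with no arguments compares whole rows, while passing column names deduplicates on those columns only:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("cheatsheet")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data with one exact duplicate row.
val df = Seq(
  ("Alice", 30),
  ("Alice", 30),
  ("Alice", 31),
  ("Bob", 25)
).toDF("name", "age")

// Remove rows that are identical across all columns.
val uniqueRows = df.dropDuplicates()

// Keep one row per distinct value of "name".
val uniqueNames = df.dropDuplicates("name")

uniqueRows.show()
```

When deduplicating on a subset of columns, which of the matching rows is kept is not guaranteed.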

For an exhaustive list of the functions, you can check out Spark’s Dataset class documentation.

Hope you’ve found this cheatsheet useful. Thank you!