This post is meant to help you take your first steps into data processing with Apache Spark using its Python API. In the age of Big Data processing, Hadoop MapReduce (an open-source implementation of Google's MapReduce model) laid the foundation for processing “embarrassingly parallel” operations on distributed machines. Sadly, it shows programmability limitations and degradation in […]