Welcome to Code & Create!
In this workshop, we'll look at Apache Spark with guest speaker Neeraj Bhadani, Data Scientist at Expedia Group.
Neeraj has more than a decade of experience building software and currently works in the AI & Data Science team at Expedia Group. Before joining Expedia Group, he worked on various big data projects, dealt directly with clients as a technical specialist, and migrated various ETL pipelines to Apache Spark. He also received a gold medal for placing first in his batch as an undergraduate.
Apache Spark is a general-purpose computing engine with in-memory computing capabilities. It can be used for a variety of workloads, such as batch processing, iterative computations, and stream processing. It is designed to be highly scalable and provides APIs in Scala, Python, R, Java, and SQL. It also integrates easily with other big data tools.
In this workshop, we cover the following topics:
Introduction
Architecture
RDD: Resilient Distributed Dataset
Demo
Q & A