This course covers big data tools on the Google Cloud Platform. Students will begin by studying transformations and actions on Spark RDDs, then move on to Spark DataFrames and Spark SQL. Most of the programming will be in Python with PySpark, with an introduction to Scala. Other topics include Spark Streaming, BigQuery ML, Apache Beam, and possibly Pub/Sub. All work will be done on the Google Cloud Platform and in Google Data Studio.