Apache Spark examples on AWS & Valohai
Run Apache Spark applications on AWS EMR clusters using Valohai automation.
Overview
This example demonstrates:
Launching AWS EMR clusters from Valohai
Running Spark batch jobs remotely
Managing EMR configuration via Valohai parameters
Steps
1
Set up AWS IAM users
Create a new IAM role that grants access to EMR and S3.
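As a rough illustration of what such a role might allow, the sketch below builds an IAM policy document as a plain Python dictionary. The bucket name my-valohai-spark-bucket and the chosen set of actions are assumptions for illustration only; the permissions actually required are defined by the example repository's instructions.

```python
import json


def build_emr_s3_policy(bucket_name: str) -> dict:
    """Return an IAM policy document granting basic EMR and S3 access.

    The action list is an illustrative assumption, not the exact set
    required by the Valohai example.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Launch, inspect, and terminate EMR clusters and their steps.
                "Sid": "EmrClusterManagement",
                "Effect": "Allow",
                "Action": [
                    "elasticmapreduce:RunJobFlow",
                    "elasticmapreduce:DescribeCluster",
                    "elasticmapreduce:AddJobFlowSteps",
                    "elasticmapreduce:DescribeStep",
                    "elasticmapreduce:TerminateJobFlows",
                ],
                "Resource": "*",
            },
            {
                # Read and write job scripts, inputs, outputs, and logs.
                "Sid": "S3Access",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",      # bucket itself (ListBucket)
                    f"arn:aws:s3:::{bucket_name}/*",    # objects in the bucket
                ],
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket name; replace with your own.
    print(json.dumps(build_emr_s3_policy("my-valohai-spark-bucket"), indent=2))
```

The printed JSON could be pasted into the IAM console as an inline policy. Launching EMR clusters usually also requires permission to pass the EMR service and instance roles (iam:PassRole), which this sketch leaves out.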
2
Import and run the examples on Valohai
Start by running the run-debug-with-minimal-configuration example step.
3
Running your own Spark applications
The valohai.yaml in the project includes both a minimal configuration example and a maximal one. The maximal example should cover most of the configuration options.
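To give a feel for what submitting your own Spark application to EMR involves, here is a minimal sketch that assembles an EMR step definition wrapping spark-submit. How the repository's Valohai steps actually submit jobs is not described here, so treat this as a generic illustration. The function name build_spark_step, the S3 paths, and the configuration values are hypothetical.

```python
def build_spark_step(name, script_uri, app_args=None, spark_conf=None,
                     deploy_mode="cluster"):
    """Build an EMR step dictionary that runs a Spark script via spark-submit.

    The returned dictionary follows the step shape accepted by the EMR API
    (for example, boto3's add_job_flow_steps). All argument values are
    supplied by the caller; nothing here is specific to Valohai.
    """
    args = ["spark-submit", "--deploy-mode", deploy_mode]

    # Each Spark setting becomes a "--conf key=value" pair; sorting keeps
    # the generated command stable between runs.
    for key, value in sorted((spark_conf or {}).items()):
        args += ["--conf", f"{key}={value}"]

    # The application script comes after the spark-submit options,
    # followed by the application's own arguments.
    args.append(script_uri)
    args += list(app_args or [])

    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {"Jar": "command-runner.jar", "Args": args},
    }


if __name__ == "__main__":
    # Hypothetical script location, arguments, and Spark settings.
    step = build_spark_step(
        name="word-count",
        script_uri="s3://my-valohai-spark-bucket/jobs/word_count.py",
        app_args=["s3://my-valohai-spark-bucket/input/",
                  "s3://my-valohai-spark-bucket/output/"],
        spark_conf={"spark.executor.memory": "2g"},
    )
    print(step["HadoopJarStep"]["Args"])
```

Values such as the executor memory or the script location are natural candidates for Valohai parameters, so they can be changed per execution without editing code.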
GitHub Repository
The repository walks you through the steps above:
