This post is the first step before running any notebook from this stack: we verify that Spark starts, the UI responds, and you can write and read Parquet.
At a glance
- Confirm Spark starts without errors.
- Verify Spark UI and version.
- Write/read Parquet on the local volume.
Run it yourself
Use the Spark Docker stack from this blog.
1) Start Spark and check version
This confirms Spark is alive.
Expected output (example):
'3.5.1'
Open the UI at http://localhost:4040 and confirm the app name.
2) Simple count
A basic count validates jobs execute correctly.
Expected output:
1000000
3) Write and read Parquet
This validates that local volumes are mounted correctly.
Expected output:
1000000
Notes from practice
- If the UI does not load, check that port 4040 is mapped in your Docker configuration.
- If the Parquet path fails, review the volume mounts in your Docker setup.
- This post is the baseline for the Delta Table 101 post that follows.
Downloads
If you do not want to copy the code, download the notebook or the .py file.