Series: Kafka 101
1/3. Local Kafka with CLI, your first run
2/3. Kafka consumer groups, explained
3/3. Read Kafka with Spark Streaming
This post connects Spark Structured Streaming to a local Kafka topic and reads messages in real time. Ref: Structured Streaming + Kafka.
Downloads are at the end of this post, in the Downloads section.
Quick takeaways
- Spark can read Kafka topics directly using the Kafka connector.
- You can validate end-to-end streaming locally.
- Reading Kafka from Spark is the bridge between ingestion (Kafka) and processing (Spark).
Run it yourself
- Local Docker: default path for this blog.
Produce messages
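As one way to generate test traffic, here is a minimal Python sketch. It assumes the `kafka-python` package is installed, a broker listening on `localhost:9092`, and a topic named `demo-topic`; none of these values come from this post, so adjust them to your setup. The console producer from part 1 of the series would work just as well.

```python
import json
import time


def build_messages(count):
    """Return `count` small JSON payloads encoded as UTF-8 bytes."""
    return [
        json.dumps({"id": i, "text": f"hello {i}"}).encode("utf-8")
        for i in range(count)
    ]


def produce(bootstrap_servers="localhost:9092", topic="demo-topic",
            count=10, delay_s=1.0):
    """Send `count` messages, pausing between sends so rows arrive gradually."""
    # Imported here so build_messages() stays usable without kafka-python.
    from kafka import KafkaProducer

    # Broker address and topic name are assumed values, not from the post.
    producer = KafkaProducer(bootstrap_servers=bootstrap_servers)
    try:
        for payload in build_messages(count):
            producer.send(topic, value=payload)
            time.sleep(delay_s)
        # Block until every buffered message has been delivered.
        producer.flush()
    finally:
        producer.close()
```

Call `produce()` once the broker is up. The one-second delay is only there so that the rows show up one by one on the Spark side.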
Read with Spark Structured Streaming
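The read side can be sketched in PySpark as follows. This is an illustration under stated assumptions, not necessarily the exact code in the notebook: the broker address, the topic name `demo-topic`, the app name, and the helper names `kafka_source_options` and `start_console_stream` are all hypothetical.

```python
def kafka_source_options(bootstrap_servers="localhost:9092", topic="demo-topic"):
    """Options for Spark's Kafka source; broker and topic are assumed values."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Only read messages produced after the query starts.
        "startingOffsets": "latest",
    }


def start_console_stream(connector_package):
    """Start a streaming query that prints Kafka messages to the console."""
    # Imported here so the options helper can be used without PySpark.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    spark = (
        SparkSession.builder.appName("kafka-101-stream")
        # Maven coordinate of the Kafka connector; must match your Spark build.
        .config("spark.jars.packages", connector_package)
        .getOrCreate()
    )

    raw = spark.readStream.format("kafka").options(**kafka_source_options()).load()

    # Kafka delivers key and value as binary; cast them to strings for display.
    messages = raw.select(
        col("key").cast("string"),
        col("value").cast("string"),
        "topic", "partition", "offset", "timestamp",
    )

    return (
        messages.writeStream.format("console")
        .outputMode("append")
        .option("truncate", "false")
        .start()
    )
```

A typical call would be `query = start_console_stream("org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.1")` followed by `query.awaitTermination()`. The coordinate shown is only an example and has to match the Spark and Scala versions you actually run.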
Expected output: You should see new rows in the console when you produce messages.
What to verify
- Messages appear in the Spark console sink.
- The streaming query stays active while you produce data.
- Stopping the producer does not crash the query.
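The checks above can also be made from code rather than by watching the console. The sketch below assumes `query` is the `StreamingQuery` returned by `writeStream...start()`; the helper name `summarize_query` is hypothetical, and it assumes `lastProgress` can be read like a dictionary, as in PySpark 3.x.

```python
def summarize_query(query):
    """Snapshot of a StreamingQuery's health, matching the checks above."""
    progress = query.lastProgress  # None until the first micro-batch finishes
    return {
        # Should stay True while you produce data and after the producer stops.
        "active": query.isActive,
        # Human-readable state, e.g. whether Spark is waiting for new data.
        "status": query.status["message"],
        # Rows read in the most recent micro-batch, if one has run.
        "last_batch_rows": progress["numInputRows"] if progress else None,
    }
```

Because `query.awaitTermination()` blocks, one option is to call `query.awaitTermination(10)` in a loop and print `summarize_query(query)` between calls. After the producer stops, `active` should remain `True` and `last_batch_rows` should stop growing.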
Downloads
If you want to run this without copying code, download the notebook or the .py export.