Series: Kafka 101
1/3. Local Kafka with CLI, your first run
2/3. Kafka consumer groups, explained
3/3. Read Kafka with Spark Streaming
This post shows how consumer groups distribute work and how offsets move. It is the mental model you need before streaming with Spark. Ref: Consumer groups.
Downloads are at the end of the post, in the Downloads section.
Quick takeaways
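- A consumer group splits a topic's partitions across its consumer instances; each partition is read by only one consumer in the group at a time.
- Committed offsets track, per partition, how far each consumer group has read in the topic.
- Rebalancing is normal: partitions are reassigned whenever a consumer joins or leaves the group.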
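To make the first takeaway concrete, here is a minimal sketch of how partitions might be dealt out to the members of a group. It is a toy model, not Kafka's code: the function name `assign_round_robin`, the consumer names, and the three-partition topic are all assumptions for illustration, and Kafka's real assignors (range, round-robin, sticky) differ in detail.

```python
from typing import Dict, List


def assign_round_robin(partitions: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Deal partitions out to consumers one at a time (hypothetical helper)."""
    if not consumers:
        return {}
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    members = sorted(consumers)  # fixed order so the result is deterministic
    for i, partition in enumerate(sorted(partitions)):
        owner = members[i % len(members)]
        assignment[owner].append(partition)
    return assignment


if __name__ == "__main__":
    partitions = [0, 1, 2]  # assumed: a topic with 3 partitions
    for group in (["c1"], ["c1", "c2"], ["c1", "c2", "c3", "c4"]):
        print(len(group), "consumer(s):", assign_round_robin(partitions, group))
```

With four consumers and three partitions, one consumer ends up with an empty list: adding more consumers than partitions does not add parallelism.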
Run it yourself
- Local Docker: main path for this blog.
Start two consumers in the same group
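Open a second terminal and run the same consumer command, with the same group id. Produce a few messages and observe how they are split between the two consumers. The split is only visible if the topic has more than one partition; with a single partition, one consumer receives everything and the other stays idle.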
Check group offsets
Expected output (example):
TOPIC        PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG
demo-events  0          42              42              0
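The LAG column is simply LOG-END-OFFSET minus CURRENT-OFFSET for each partition. The sketch below recomputes it from the example output above. The `parse_offsets` helper is hypothetical and only handles the simplified five-column layout shown in this post; real tool output has additional columns and may show `-` where nothing has been committed yet.

```python
from typing import Dict, List

# The example output from this post, as whitespace-separated text.
SAMPLE = """\
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG
demo-events 0 42 42 0
"""


def parse_offsets(text: str) -> List[Dict[str, object]]:
    """Parse the simplified table and recompute lag per partition."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    parsed: List[Dict[str, object]] = []
    for row in rows:
        rec = dict(zip(header, row))
        current = int(rec["CURRENT-OFFSET"])
        end = int(rec["LOG-END-OFFSET"])
        parsed.append({
            "topic": rec["TOPIC"],
            "partition": int(rec["PARTITION"]),
            "current": current,
            "end": end,
            "lag": end - current,  # messages written but not yet consumed
        })
    return parsed


if __name__ == "__main__":
    for rec in parse_offsets(SAMPLE):
        print(rec["topic"], rec["partition"], "lag =", rec["lag"])
```

A lag of 0 means the group has caught up with everything written to that partition; a growing lag means consumers are falling behind producers.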
What to verify
- Each consumer receives a subset of partitions.
- Offsets advance as messages are consumed.
- Rebalancing happens when a consumer stops.
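The last two checks fit together: because offsets belong to the group and not to an individual consumer, whoever takes over a partition after a rebalance resumes from the committed position. The toy model below sketches that behavior. `GroupSim` and its methods are hypothetical names, the assignment is a plain round-robin, and each poll commits immediately, which is a simplification of how real consumers commit.

```python
from typing import Dict, List


class GroupSim:
    """Toy model of one consumer group reading one topic (not Kafka's API)."""

    def __init__(self, log_end: Dict[int, int]) -> None:
        self.log_end = dict(log_end)              # partition -> log end offset
        self.committed = {p: 0 for p in log_end}  # partition -> group's committed offset
        self.members: List[str] = []
        self.owner: Dict[int, str] = {}           # partition -> consumer that reads it

    def _rebalance(self) -> None:
        # Reassign every partition across the current members.
        members = sorted(self.members)
        if not members:
            self.owner = {}
            return
        self.owner = {
            p: members[i % len(members)]
            for i, p in enumerate(sorted(self.log_end))
        }

    def join(self, consumer: str) -> None:
        self.members.append(consumer)
        self._rebalance()

    def leave(self, consumer: str) -> None:
        self.members.remove(consumer)
        self._rebalance()

    def poll(self, consumer: str, max_records: int) -> Dict[int, int]:
        """Read up to max_records from each owned partition, then commit."""
        read: Dict[int, int] = {}
        for p, owner in self.owner.items():
            if owner != consumer:
                continue
            n = min(max_records, self.log_end[p] - self.committed[p])
            self.committed[p] += n  # simplification: commit right after reading
            read[p] = n
        return read


if __name__ == "__main__":
    sim = GroupSim({0: 10, 1: 10})  # assumed: 2 partitions, 10 messages each
    sim.join("c1")
    sim.join("c2")
    print("owners:", sim.owner)
    print("c1 reads:", sim.poll("c1", 4))
    sim.leave("c1")  # c1 stops, which triggers a rebalance
    print("owners after c1 stops:", sim.owner)
    print("c2 reads:", sim.poll("c2", 100))
    print("committed:", sim.committed)
```

After `c1` stops, `c2` picks up partition 0 at offset 4, not at 0, so it reads only the six messages `c1` had not yet consumed.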
Downloads
If you want to run this without copying code, download the notebook or the .py export.