This post shows how Kafka consumer groups distribute work across consumers and how offsets advance as messages are read. It is the mental model you need before streaming with Spark. Ref: Consumer groups.

Downloads are at the end of the post, in the Downloads section.

Quick takeaways

  • Consumer groups split partitions across instances.
  • Offsets track, per partition, how far each consumer group has read in the topic.
  • Rebalancing is normal when consumers start or stop.
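
To make the first takeaway concrete, here is a minimal sketch of how partitions could be spread over the consumers in a group. It is a simplified round-robin illustration, not Kafka's actual assignor code (the real assignors, such as range, round-robin, and sticky, differ in detail). The function name `assign_round_robin`, the four partitions, and the consumer names are assumptions made up for this example.

```python
def assign_round_robin(partitions, consumers):
    """Give every partition exactly one owner, cycling through the consumers."""
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    for index, partition in enumerate(sorted(partitions)):
        owner = members[index % len(members)]
        assignment[owner].append(partition)
    return assignment


# Hypothetical topic with 4 partitions (0..3).
one = assign_round_robin(range(4), ["consumer-a"])
two = assign_round_robin(range(4), ["consumer-a", "consumer-b"])

print(one)  # {'consumer-a': [0, 1, 2, 3]}
print(two)  # {'consumer-a': [0, 2], 'consumer-b': [1, 3]}
```

The point to take away is that a partition is never shared inside a group: adding a consumer shrinks each member's share, it does not duplicate messages.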

Run it yourself

  • Local Docker: main path for this blog.
docker compose up

Start two consumers in the same group

kafka-console-consumer.sh --topic demo-events --bootstrap-server localhost:9092 --group demo-group

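Open a second terminal and run the same command so that both consumers join demo-group. Produce a few messages and observe how they are split between the two terminals. The split follows partitions, so if demo-events has only one partition, one consumer receives every message and the other stays idle.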


Check group offsets

kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group demo-group

Expected output (example):

TOPIC        PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG
demo-events  0          42              42              0
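
The LAG column is simply LOG-END-OFFSET minus CURRENT-OFFSET. The sketch below parses the example row above and recomputes it. The helper names `parse_describe_row` and `compute_lag` are hypothetical, and the parsing assumes the abbreviated columns shown in this post; the real tool prints additional columns (for example CONSUMER-ID and HOST) and can show "-" when no offset has been committed yet, which this sketch does not handle.

```python
def parse_describe_row(header_line, row_line):
    """Pair each whitespace-separated column name with its value."""
    return dict(zip(header_line.split(), row_line.split()))


def compute_lag(current_offset, log_end_offset):
    """Messages written to the partition that the group has not yet consumed."""
    return log_end_offset - current_offset


header = "TOPIC  PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG"
row = "demo-events 0  42  42  0"

record = parse_describe_row(header, row)
lag = compute_lag(int(record["CURRENT-OFFSET"]), int(record["LOG-END-OFFSET"]))

print(lag)  # 0, matching the LAG column in the example
```

A lag of 0 means the group has caught up on that partition; if eight more messages were produced before the group read them, the same arithmetic would give 50 - 42 = 8.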

What to verify

  • Each consumer receives a subset of partitions.
  • Offsets advance as messages are consumed.
  • Rebalancing happens when a consumer stops.
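
The checks above can be pictured with a small simulation: offsets are committed per group and per partition, so when a consumer stops and the group rebalances, the remaining consumer picks up the orphaned partition from the last committed offset. This is a toy model under stated assumptions (two partitions, three messages consumed per partition, a hypothetical `assign_owners` helper), not a reproduction of the broker's rebalance protocol.

```python
def assign_owners(partitions, consumers):
    """Map each partition to one consumer, cycling through sorted members."""
    members = sorted(consumers)
    return {
        partition: members[index % len(members)]
        for index, partition in enumerate(sorted(partitions))
    }


# Committed offset per partition, stored for the group (not per consumer).
committed = {0: 0, 1: 0}

owners_before = assign_owners(committed, {"consumer-a", "consumer-b"})

# Each owner reads 3 messages from its partition and commits the new position.
for partition in owners_before:
    committed[partition] += 3

# consumer-b stops: the group rebalances and consumer-a takes over partition 1.
owners_after = assign_owners(committed, {"consumer-a"})

print(owners_before)  # {0: 'consumer-a', 1: 'consumer-b'}
print(owners_after)   # {0: 'consumer-a', 1: 'consumer-a'}
print(committed)      # {0: 3, 1: 3}
```

Because the committed offsets survive the rebalance, consumer-a resumes partition 1 at offset 3 instead of starting over.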

Downloads

If you want to run this without copying code, download the notebook or the .py export.