Build a Spark streaming Data Source

Implement a minimal Data Source API reader with real offsets, a clear schema, and a usable format. You will compare the naive batch approach with real streaming, then run it end to end.

February 1, 2026 · 3 min · 441 words · pw

Kafka consumer groups, explained

Explains offsets, partitions, and rebalances with a runnable example that shows how consumption is split across consumers and what happens when the group scales or a consumer fails.

February 1, 2026 · 1 min · 194 words · pw

Local Kafka with CLI, your first run

Kafka CLI first steps: create topics, produce events, and consume them from the console in a reproducible local environment. Perfect for practice without cloud dependencies.

February 1, 2026 · 1 min · 207 words · pw

Read Kafka with Spark Streaming

Connect local Kafka to Spark Structured Streaming, define a schema, and run a continuous read. Includes simple metrics and validations to confirm the stream is working.

February 1, 2026 · 1 min · 210 words · pw