What If We Could Rebuild Kafka From Scratch?
Update April 25: This post is being discussed on Hacker News, lobste.rs, and /r/apachekafka The last few days I spent some time digging into the recently announced KIP-1150 ("Diskless Kafka"), as well AutoMQ’s Kafka fork, tightly integrating Apache Kafka and object storage, such as S3. Following the example set by WarpStream, these projects aim to substantially improve the experience of using Kafka in cloud environments, providing better elasticity, drastically reducing cost, and paving the way towards native lakehouse integration. This got me thinking, if we were to start all over and develop a durable cloud-native event log from scratch—Kafka.next if you will—which traits and characteristics would be desirable for this to have? Separating storage and compute and object store support would be table stakes, but what else should be there? Having used Kafka for many years for building event-driven applications as well as for running realtime ETL and change data capture pipelines, here’s my personal wishlist:
https://www.morling.dev/blog/what-if-we-could-rebuild-kafka-from-scratch/