Seattle Apache Kafka Meetup – LinkedIn Edition
August 14 @ 5:30 pm - 8:30 pm
Greetings! Mark your calendars for a special edition of the Seattle Apache Kafka meetup featuring 3 great talks by speakers from LinkedIn, where it all began for Apache Kafka. Speakers for this session are traveling from the Bay Area to present in the Seattle area – please make the best use of this opportunity! Please RSVP.
Date: August 14, 2019
Location: Salesforce Bellevue ([masked]th Ave NE) – 1st Floor Conference Room
05:30 PM – Doors open
05:30 PM – 06:00 PM – Check-in, food and drink
06:00 PM – 08:00 PM – Talks
1) Cruise Control: Effortless management of Kafka clusters
Presenter: Efe Gencer, Senior Engineer, LinkedIn
Kafka has become the de facto standard for streaming data with high throughput, low latency, and fault tolerance. However, its rising adoption raises new challenges. In particular, growing cluster sizes, increasing volume and diversity of user traffic, and aging network and server components induce an overhead in managing the system. This overhead makes it infeasible for human operators to constantly monitor, identify, and mitigate issues. The resulting utilization imbalance across brokers leads to unpredictable client performance due to the high variation in their throughput and latency. Finally, properly expanding, shrinking, or upgrading clusters also incurs a management overhead. Hence, adopting a principled approach to managing Kafka clusters is integral to the sustainability of the infrastructure.
This talk will describe how LinkedIn alleviates the management overhead of large-scale Kafka clusters using Cruise Control. First, we will discuss the reactive and proactive techniques that Cruise Control uses to support admin operations for cluster maintenance, enable anomaly detection with self-healing, and provide real-time monitoring for Kafka clusters. Next, we will examine how Cruise Control performs in production. Finally, we will conclude with questions and further discussion.
2) More Data, More Problems: Scaling Kafka-Mirroring Pipelines at LinkedIn
Presenter: Celia Kung, Engineering Manager, LinkedIn
For several years, LinkedIn has been using Kafka MirrorMaker as the mirroring solution for copying data between Kafka clusters across data centers. However, as LinkedIn data continued to grow, mirroring trillions of Kafka messages per day across data centers uncovered the scale limitations and operability challenges of Kafka MirrorMaker. To address such issues, we have developed a new mirroring solution, built on top of our stream ingestion service, Brooklin. Brooklin’s mirroring solution aims to provide improved performance and stability, while facilitating better management through finer control of data pipelines. Through flushless Kafka produce, dynamic management of data pipelines, per-partition error handling and flow control, we are able to increase throughput, better withstand consume and produce failures, and reduce overall operating costs. As a result, we have eliminated the major pain points of Kafka MirrorMaker.
In this talk, we will dive deeper into the challenges LinkedIn has faced with Kafka MirrorMaker, how we tackled them with Brooklin, and our plans for iterating further on this new mirroring solution.
3) Apache Samza 1.0: What’s New and What’s Next in Stream Processing
Presenter: Prateek Maheshwari, Staff Engineer, LinkedIn
Apache Samza reached a major milestone with its recent 1.0 release. In this talk, we take a look at the major new features and enhancements in Samza 1.0. We also take a sneak peek at what’s next on our roadmap. Both Stream Processing veterans and developers new to Stream Processing will discover useful new features to leverage for their applications.