Podcasts about Kafka Streams

  • 30 PODCASTS
  • 108 EPISODES
  • 42m AVG DURATION
  • 1 MONTHLY NEW EPISODE
  • Jun 8, 2026 LATEST


Latest podcast episodes about Kafka Streams

Streaming Audio: a Confluent podcast about Apache Kafka
10 Years of Kafka Streams with Matthias J. Sax | Ep. 32

Jun 8, 2026 • 29:41


Tim Berglund talks to Matthias J. Sax (Confluent) about 10 whole years of Kafka Streams! Matthias' first job: electrician-in-training on BMW's assembly lines. His challenge: reflecting on 10 years of Kafka Streams growth, major milestones, and what comes next.

SEASON 2. Hosted by Tim Berglund, Adi Polak and Viktor Gamov. Produced and edited by Noelle Gallagher, Peter Furia and Nurie Mohamed. Music by Coastal Kites. Artwork by Phil Vo.

GOTO - Today, Tomorrow and the Future
Kafka for Architects • Ekaterina Gorshkova & Viktor Gamov

May 19, 2026 • 28:48


This interview was recorded for the GOTO Book Club: http://gotopia.tech/bookclub

Ekaterina Gorshkova - Apache Kafka Engineer at SOFTEC & Author of "Kafka for Architects"
Viktor Gamov - Principal Developer Advocate at Confluent & Co-Author of "Kafka in Action"

Check out more here: https://gotopia.tech/episodes/440

RESOURCES
Ekaterina
https://www.linkedin.com/in/ekaterina-gorshkova-978bb6
https://medium.com/@katyagorshkova

Viktor
https://bsky.app/profile/gamussa.dev
https://x.com/gAmUssA
https://github.com/gamussa
https://www.linkedin.com/in/vikgamov
https://gamov.io

Links
45% off discount code (expires on 25 May 2026): GOTOKGKafka
Affiliate link: https://hubs.la/Q044HgTv
https://current.confluent.io/london

DESCRIPTION
Apache Kafka has evolved far beyond a simple message broker: it has become a foundational layer for modern enterprise software. In this GOTO Book Club episode, Ekaterina Gorshkova, author of "Kafka for Architects", shares how her decade-long journey with Kafka, starting in a Czech bank's integration team in 2015, shaped her understanding of what it really takes to design Kafka-based systems at scale. The conversation covers core architectural decisions, real-world patterns for enterprise integration, the role of Kafka Streams, and how to avoid the classic pitfalls of building systems that "only three engineers understand".

The episode also looks forward: Ekaterina and host Viktor Gamov explore how Kafka is increasingly becoming the connective tissue for AI-driven systems, acting as an orchestration layer between intelligent agents, real-time data, and business workflows. Her book's central argument is that while AI and tooling change fast, the fundamental knowledge of how to design robust, event-driven systems is durable and career-proof.

Kafka for Architects is framed not just as a technical manual, but as a roadmap for architects who want to get Kafka right from day one: requirements, design, testing, and all.

RECOMMENDED BOOKS
Ekaterina Gorshkova • Kafka for Architects • https://amzn.to/42mDarU
Dylan Scott, Viktor Gamov & Dave Klein • Kafka in Action • https://amzn.to/4vJ3Kcj
Viktor Gamov, Tartakovsky, Rasputnis & Fain • Enterprise Web Development • https://amzn.to/3CezL0R
Shapira, Palino, Sivaram & Petty • Kafka: The Definitive Guide • https://amzn.to/3RPtdLP
Bill Bejeck • Kafka Streams in Action • https://amzn.to/3CGJiiM

The GeekNarrator
Many Databases 1 LSM Engine - OpenData

May 1, 2026 • 74:02


The episode explores why modern databases keep reinventing the same distributed-systems machinery and argues that a major part of database cost is the operational tax of running replication-heavy systems. Our guest, Almog Gavra, co-founder of Responsive, explains how his team pivoted from operating Kafka Streams as a service to building SlateDB and the “Open Data” manifesto: an object-storage-native LSM foundation that can power multiple database types (vector, time series, logs, key-value) with shared tuning knobs and failure modes. They discuss why distributed-systems complexity is often harder than query engines, how LSM trees provide a tunable tradeoff between read/write/space amplification, caching layers and cost transparency, separating readers/writers, stateless ingest, single-writer availability and fencing via S3 compare-and-set, offloading compaction, and how the architecture enables near-free snapshots. They also cover when this approach doesn't fit: OLTP that can stay on Postgres and ultra-low-latency workloads where cold object-store misses are unacceptable.

Chapters:
00:00 Introduction
08:36 Open Data Manifesto
18:34 Specialized vs General
25:10 SlateDB Architecture
32:51 LSM Trees as Tuning Dial
38:58 Tuning Without Overload
39:46 Cost Aware Config Knobs
41:51 Latency Cost Durability Tradeoffs
46:46 Caching Strategies And Layers
50:23 Split Readers And Writers
52:43 Single Writer Versus Multi Writer
55:16 Scaling And Partitioning Writes
58:58 Failure Modes And Fencing
01:05:23 Compaction As Separate Worker
01:09:28 Snapshots And Garbage Collection
01:10:25 When Open Data Is Not Fit

Important links and references:
OpenData: http://github.com/opendata-oss/opendata
OpenData manifesto: https://www.opendata.dev/blog/manifesto
Reach out to Almog: https://www.linkedin.com/in/agavra/ or https://x.com/almoggavra
Dostoevsky paper on LSM: https://nivdayan.github.io/dostoevsky.pdf
Latency/Cost/Durability Triad: https://materializedview.io/p/cloud-storage-triad-latency-cost-durability
SlateDB: https://github.com/slatedb/slatedb
"how SSTs work": https://www.bitsxpages.com/p/sorted-string-tables-sst-from-first

Database internals series: https://youtu.be/yV_Zp0Mi3xs

Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN

Stay Curious! Keep Learning!
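
The description mentions single-writer availability with fencing via compare-and-set. As a rough illustration only, the sketch below simulates that idea in memory: a new writer bumps an epoch with a compare-and-set, and any older writer then sees that its epoch is stale. The `ManifestStore` and `Writer` classes are hypothetical names made up for this example; they are not SlateDB's or S3's API, and a real system would make each write itself conditional on the epoch.

```python
class ManifestStore:
    """Toy stand-in for one object-store key that supports compare-and-set.

    Hypothetical class for illustration; not a real S3 or SlateDB interface.
    """

    def __init__(self):
        self.epoch = 0

    def compare_and_set(self, expected, new):
        # Succeeds only if nobody changed the epoch since we read it.
        if self.epoch != expected:
            return False
        self.epoch = new
        return True


class Writer:
    """A writer that must own the latest epoch to be allowed to write."""

    def __init__(self, store):
        self.store = store
        self.epoch = None

    def acquire(self):
        # Read the current epoch, then try to claim the next one atomically.
        current = self.store.epoch
        if self.store.compare_and_set(current, current + 1):
            self.epoch = current + 1
            return True
        return False

    def can_write(self):
        # A writer whose epoch was superseded is fenced off.
        return self.epoch == self.store.epoch


store = ManifestStore()
old_writer = Writer(store)
new_writer = Writer(store)

old_writer.acquire()   # old writer owns epoch 1
new_writer.acquire()   # new writer takes over with epoch 2

print(old_writer.can_write())  # False: the old writer has been fenced
print(new_writer.can_write())  # True
```

The point of the pattern, under these assumptions, is that availability does not need a consensus group of replicas: the object store's conditional write decides which single writer is current.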

Streaming Audio: a Confluent podcast about Apache Kafka
Building Banking Systems with Kafka Streams with Mateo Rojas | Ep. 28

Apr 20, 2026 • 44:46


Adi Polak talks to Mateo Rojas (LittleHorse) about his career working with Kafka Streams. Mateo's first job: building a real-money policy management platform on early Kafka Streams. His challenge: working at LittleHorse with Kafka as a workflow engine and deciding whether it should be the source of truth.

Streaming Audio: a Confluent podcast about Apache Kafka
Hacking Kafka Streams with Sophie Blee‑Goldman | Ep. 15

Jan 19, 2026 • 34:33


Tim Berglund talks to Sophie Blee-Goldman (Responsive) about her career in container orchestration and Kafka Streams. Sophie's first job: interning at Google. Her challenge: helping a hyper-growth customer whose Kafka Streams app was about to hit partition-based scalability limits.

Streaming Audio: a Confluent podcast about Apache Kafka
Reimagining Stream Processing with Matthias J. Sax | Ep. 9

Nov 17, 2025 • 36:42


Viktor Gamov talks to Matthias J. Sax (Confluent) about his career in stream processing and, specifically, Kafka Streams. Matthias' first job: an electrician-in-training on BMW's assembly lines. His challenge: building Kafka Streams at Confluent with a focus on API design, backward compatibility, and a library-first approach that also fits microservices.

airhacks.fm podcast with adam bien
LittleHorse Likes Sun

May 4, 2025 • 63:46


An airhacks.fm conversation with Colt McNealy (@coltmcnealy) about: first computing experience with Sun workstations and network computing, background in hockey and other sports, using System76 Linux laptops for development, starting programming in high school with Java and later learning C, Fortran, assembly, C++ and Python, working at a real estate company with Kubernetes and Kafka, the genesis of LittleHorse from experiencing challenges with distributed microservices and workflow management, LittleHorse as an open source workflow orchestration engine using Kafka as a commit log rather than a message queue, building a custom distributed database optimized for workflow orchestration, the recent move to fully open source licensing, comparison with AWS Step Functions but with more capabilities and open source benefits, using RocksDB and Kafka Streams for the underlying implementation, performance metrics of 12-40ms latency between tasks and hundreds of tasks per second, the multi-tenant architecture allowing for serverless offerings, integration with Kafka for event-driven architectures, the distinction between orchestration and choreography in distributed systems, using Java 21 with benefits from virtual threads and generational garbage collection, plans for Java 25 adoption, the naming story behind "Little Horse" and its competition with MuleSoft, the Sun Microsystems legacy and innovation culture, recent adoption of Quarkus for some components, the "Know Your Customer" flow as the Hello World example for Little Horse, the importance of observability and durability in workflow management, plans for serverless offerings and multi-tenant architecture, the balance between open source core and commercial offerings.

Colt McNealy on Twitter: @coltmcnealy

Engineering Kiosk
#177 Stream Processing & Kafka: The Foundation of Modern Data Pipelines with Stefan Sprenger

Jan 7, 2025 • 67:40


Data streaming and stream processing with Apache Kafka and its ecosystem.

A whole lot of processes in software development, and in data processing in general, do not have to be handled at runtime but can be processed asynchronously or in a decentralized way. Terms like batch processing or message queueing / pub-sub are common for this. But there is a third player in this game: stream processing. Apache Kafka is the flagship here, the distributed event streaming platform that is often named first.

But what actually is stream processing, and how does it differ from batch processing or message queuing? How does Kafka work, and why is it so successful and performant? What are brokers, topics, partitions, producers and consumers? What does change data capture mean, and what is a sliding window? What do you have to watch out for, and what can go wrong when you want to write and read a message?

Our guest Stefan Sprenger provides the answers and much more.

Bonus: how to describe stream processing with a breakfast table for 5-year-olds.

You can find our current advertising partners at https://engineeringkiosk.dev/partners

javaswag
#61 - Grigoriy Skobelev - Kafka, sharding and the role of a tech lead in a startup

May 21, 2024 • 91:49


In episode 61 of the Javaswag podcast we talked with Grigoriy Skobelev about Kafka, sharding Postgres and the role of a tech lead in a startup.

00:00:00 Introduction and working with shaders
00:03:49 Developing in Java and working on billing
00:07:54 An out-of-the-box solution for rating and event processing
00:09:23 Requirements for working in telecommunications companies
00:13:04 Kafka Streams and working with streaming data
00:15:13 CDC (Change Data Capture) and using Kafka Streams
00:21:13 Public speaking and its role in a developer's growth
00:22:09 Engineering culture at Yandex.Money
00:25:54 Development tools: plugins and tooling
00:28:36 Building plugins for Gradle and Maven
00:31:49 Useful tools for speeding up work
00:36:34 Database sharding: problems and applications
00:39:21 Sharding in PostgreSQL and its advantages
00:43:39 Using user identifiers to route requests
00:50:00 The role of a tech lead in a company and their responsibilities
00:53:16 Translating business requirements into technical ones
00:56:33 Preparing the architecture for growth and increased load
00:57:57 Load testing and resource optimization
00:59:32 Cross-language collaboration in the team and choosing a programming language
01:06:32 Choosing technologies and tools for microservices
01:07:00 The database-per-service approach
01:09:43 Communication between microservices
01:11:09 The contract-based approach
01:14:29 Warming up applications
01:16:42 Sharing experience with other tech leads
01:19:56 Uptime problems and possible solutions
01:20:53 Evaluating a tech lead's work and their influence on the team
01:22:19 The importance of growing across different technologies
01:27:00 Response to the previous unpopular opinion
01:29:31 Unpopular opinion

Guest: https://www.linkedin.com/in/grigoriy-skobelev-757030167/

Links:
The «Между скобок» ("Between the Brackets") podcast: https://youtube.com/@mezhdu_skobok
Grisha's GitHub with his talks: https://github.com/GSkoba/talks
Telegram group for discussing books: https://t.me/backend_megdu_skobkah
Gradle course: https://www.youtube.com/watch?v=Ajs8pTbg8as&list=PLWQK2ZdV4Yl2k2OmC_gsjDpdIBTN0qqkE

Keep safe! 🖖

Developer Voices
ByteWax: Rust's Research Meets Python's Practicalities (with Dan Herrera)

May 8, 2024 • 61:54


Bytewax is a curious stream processing tool that blends a Python surface with a Rust core to produce something that's in a similar vein to Kafka Streams or Apache Flink, but with a fundamentally different implementation. This week we're going to take a look at what it does, how it works in theory, and how the marriage of Python and Rust works in practice…

The original Naiad Paper: https://dl.acm.org/doi/10.1145/2517349.2522738
Timely Dataflow: https://github.com/TimelyDataflow/timely-dataflow
Bytewax the Library: https://github.com/bytewax/bytewax
Bytewax the Service: https://bytewax.io/
PyO3, for calling Rust from Python: https://pyo3.rs/v0.21.2/
Kris on Mastodon: http://mastodon.social/@krisajenkins
Kris on LinkedIn: https://www.linkedin.com/in/krisjenkins/
Kris on Twitter: https://twitter.com/krisajenkins

#softwaredevelopment #dataengineering #apachekafka #timelydataflow

The Data Stack Show
184: Kafka Streams and Operationalizing Event Driven Applications with Apurva Mehta of Responsive

Apr 3, 2024 • 58:27


Highlights from this week's conversation include:
Apurva's background in streaming technology (0:48)
Developer experience and Kafka Streams (2:47)
Motivation to bootstrap a startup (4:09)
Meeting the Confluent founders and early work at Confluent (6:59)
Projects at Confluent and transition to engineering management (10:34)
Overview of Responsive and event-driven applications (12:55)
Defining event-driven applications (15:33)
Importance of latency and state in event-driven applications (18:54)
Low Latency and Stateful Processing (21:52)
In-Memory Storage and Evolution of Kafka (25:02)
Motivation for KSQL and Kafka Streams (29:46)
Category Creation and Database-like Interface (34:33)
Developer Experience with Kafka and Kafka Streams (38:50)
Kafka Streams Functionality and Operational Challenges (41:44)
Metrics and Tuning Configurations (43:33)
Architecture and Decoupling in Kafka Streams (45:39)
State Storage and Transition from RocksDB (47:48)
Future of Event-Driven Architectures (56:30)
Final thoughts and takeaways (57:36)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

Developer Voices
Bringing Pure Python to Apache Kafka (with Tomáš Neubauer)

Apr 3, 2024 • 66:29


The “big data infrastructure” world is dominated by Java, but the data-analysis world is dominated by Python. So if you need to analyse and process huge amounts of data, chances are you're in for a less-than-ideal time. The impedance mismatch will probably make your life hard somehow. So there are a lot of projects and companies trying to solve that problem and bridge those two worlds seamlessly, and many of the popular solutions see SQL as the glue. But this week we're going to look at another solution: ignore Java, treat Kafka as a protocol, and build up all the infrastructure tools you need with a pure Python library. It's a lot of work, but in theory it would make Python the one language for data storage, analysis and processing, at scale. Tempting, but is it feasible?

Joining me to discuss the pros, cons, and massive scope of that approach is Tomáš Neubauer. He started off doing real time data analysis for the McLaren F1 team, and is now deep in the Python mines effectively rewriting Kafka Streams in Python. But how? How much work is actually involved in porting those ideas to Python-land, and how do you even get started? And perhaps most fundamental of all: even if you succeed, will that be enough to make the job easy, or will you still have to scale the mountain of teaching people how to use the new tools you've built? Let's find out.

Quix Streams on GitHub: https://github.com/quixio/quix-streams
Quix Streams getting started guide: https://quix.io/get-started-with-quix-streams
Quix: https://quix.io/
Tomáš on LinkedIn: https://www.linkedin.com/in/tom%C3%A1%C5%A1-neubauer-a10bb144
Tomáš on Twitter: https://twitter.com/TomasNeubauer0
Kris on Mastodon: http://mastodon.social/@krisajenkins
Kris on LinkedIn: https://www.linkedin.com/in/krisjenkins/
Kris on Twitter: https://twitter.com/krisajenkins

#podcast #softwaredevelopment #datascience #apachekafka #streamprocessing

Real-Time Analytics with Tim Berglund
Kafka Streams Enhancements with Confluent's Matthias Sax | Ep. 45

Mar 18, 2024 • 30:16


Follow: https://stree.ai/podcast | Sub: https://stree.ai/sub | Today, Tim dives into the world of Kafka Streams with Matthias Sax, Software Engineer at Confluent and core contributor to Apache Kafka. Matthias updates us on the latest in Interactive Queries, their enhancements in recent releases, insights on stream processing and how Kafka Streams stands out in the real-time analytics landscape. Remember to use the 30% discount Tim mentioned for the Real-Time Analytics Summit: https://stree.ai/rtapod30 (Code: RTAPOD30)

Real-Time Analytics with Tim Berglund
Best of 2023: A Gentle Introduction to Kafka Streams with Anna McDonald

Jan 2, 2024 • 26:58


Follow: https://stree.ai/podcast | Sub: https://stree.ai/sub | Looking back at our favorite episodes from 2023, Tim Berglund chats with Anna McDonald about the fascinating world of Kafka Streams. Anna, a customer success technical architect at Confluent, shares her insights on the core concepts of Kafka Streams, including the all-important table and stream abstractions. They delve into the benefits of statefulness and durability, such as active and standby tasks, which ensure seamless failover, and how Kafka Streams stores state in RocksDB and in Kafka itself. New episodes every Monday resume on January 8, 2024!

The New Stack Podcast
How Apache Flink Delivers for Deliveroo

Sep 20, 2023 • 20:38


Deliveroo, a prominent food delivery company, relies on Apache Flink, a distributed processing engine, to enhance its three-sided marketplace, connecting delivery drivers, restaurants, and customers. Seeking to improve real-time data streaming and gain insights into customer behavior, Deliveroo transitioned to Flink, comparing it to alternatives like Apache Spark and Kafka Streams. Flink, with feature parity to their previous platform, offered stability and scalability. They initially experimented with Flink on Kubernetes but turned to the Amazon Managed Service for Flink (MSF) for enhanced support and maintenance.

Engineers from Deliveroo, Felix Angell and Duc Anh Khu, emphasized the need for flexibility in data modeling to accommodate their fast-paced product development. However, flexibility can be complex, often requiring data model adjustments. They expressed the desire for a self-serve configuration feature in MSF, allowing easy customization of low-level settings and auto-scaling based on application metrics. This move to Flink and MSF has empowered Deliveroo to focus on core responsibilities like continuous integration and delivery while efficiently managing their data processing needs.

Learn more from The New Stack about Apache Flink and AWS:
Kinesis, Kafka and Amazon Managed Service for Apache Flink
Apache Flink for Real Time Data Analysis
Apache Flink for Unbounded Data Streams

The Cloud Pod
227: The Cloud Pod Peeps at Azure's Explicit Proxy

Sep 14, 2023 • 51:58


FINOS Open Source in Fintech Podcast
Enabling Real Time Regulatory Compliance with Kafka Streams and Morphir - Anna McDonald, Technical Voice of the Customer, Confluent

Aug 30, 2023 • 26:04


In this episode of the podcast, Grizz sits down with Anna McDonald, Technical Voice of the Customer at Confluent, to talk about her OSFF talk: "Enabling Real Time Regulatory Compliance with Kafka Streams and Morphir". We talk about Kafka Streams, Morphir, Open Regulation, and what it's like to figure out your passion for coding at 5 years old. She will be speaking at the Open Source in Finance Forum on November 1st in New York: https://sched.co/1PzH7

Anna McDonald LinkedIn: https://www.linkedin.com/in/jbfletch/
NYC November 1 - Open Source in Finance Forum: https://events.linuxfoundation.org/open-source-finance-forum-new-york/
2022 State of Open Source in Financial Services Download: https://www.finos.org/state-of-open-source-in-financial-services-2022
All Links on Current Newsletter Here: https://www.finos.org/newsletter
- more show notes to come

A huge thank you to all our sponsors for Open Source in Finance Forum New York (https://events.linuxfoundation.org/open-source-finance-forum-new-york/), which will take place this November 1st at the New York Marriott Marquis. This event wouldn't be possible without our sponsors. A special thank you to our Leader sponsors: Databricks, where you can unify all your data, analytics, and AI on one platform. And Red Hat - Open to change: yesterday, today, and tomorrow. And our Contributor and Community sponsors: Adaptive/Aeron, Discover, FinOps Foundation, instaclustr, mend.io, Open Mainframe Project, OpenJS Foundation, OpenLogic by Perforce, Orkes, Red Hat, Sonatype, and Tidelift. If you would like to sponsor or learn more about this event, please send an email to sponsorships@linuxfoundation.org.

Grizz's Info | https://www.linkedin.com/in/aarongriswold/ | grizz@finos.org
►► Visit FINOS www.finos.org
►► Get In Touch: info@finos.org

SaaS for Developers
Building SaaS on Kafka Streams

Aug 7, 2023 • 53:48


Colt McNealy is re-imagining the future of microservices orchestration, and he decided to build it entirely on Kafka Streams. In this conversation we discuss how Kafka Streams provides the low latency, reliability, availability and elasticity needed for the next generation of microservices orchestration. Colt also shares the most exciting upcoming improvements in the Kafka Streams community and the roadmap he'd dictate if he were the benevolent dictator of Kafka Streams.

Real-Time Analytics with Tim Berglund
Digging Into Interactive Queries in Kafka Streams with Bill Bejeck | Ep. 16

Jul 24, 2023 • 30:01


Follow: https://stree.ai/podcast | Sub: https://stree.ai/sub | New episodes every Monday! On today's episode, Tim Berglund sits down for a chat with Bill Bejeck, a prominent figure in the world of Kafka and real-time analytics. They dig into Apache Kafka, Kafka Streams and interactive queries in depth. Bill describes interactive queries as a way to scrutinize the state of a Kafka Streams application, whether that's a simple key lookup or an analysis of complex aggregations. The conversation also explores the functionality of KTables and how Kafka Streams manages state. If you've ever wondered about interactive queries or Kafka Streams at large, this is the episode for you.

Anna's previous episodes: https://youtu.be/K14Kn0D-I4Y and https://youtu.be/nCLN15W_WOc
Bill's book, Kafka Streams in Action: https://www.manning.com/books/kafka-streams-in-action
Kafka Streams 101 course: https://developer.confluent.io/courses/kafka-streams/get-started/

Real-Time Analytics with Tim Berglund
Kafka Streams and the Complexity of Time with Anna McDonald, Confluent | Ep. 12

Jun 20, 2023 • 29:46


In this episode of the Real-Time Analytics Podcast, host Tim Berglund continues his conversation with Anna McDonald about Kafka Streams and the complexities of stream processing related to time. They explore the different types of windows available in Kafka Streams, including hopping, tumbling, session, and sliding windows. Anna provides insightful explanations and examples of each window type, highlighting their unique features and use cases. Don't miss out on this informative and engaging conversation on real-time analytics and Kafka Streams.

Part 1 of Anna's episode: https://youtu.be/K14Kn0D-I4Y
Anna's Real-Time Analytics Summit 2023 presentation: https://youtu.be/tratRsV1TiI
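
To make the window types named above a little more concrete, here is a minimal sketch in plain Python (not the Kafka Streams Java API) of how a record timestamp could map to tumbling, hopping, and session windows. The function names are made up for this example, and details such as treating a gap exactly equal to the inactivity gap as the same session are assumptions rather than something stated in the episode. Sliding windows are left out to keep the sketch short.

```python
def tumbling_window(ts_ms, size_ms):
    """Return the single [start, end) window that contains ts_ms."""
    start = ts_ms - (ts_ms % size_ms)
    return (start, start + size_ms)


def hopping_windows(ts_ms, size_ms, advance_ms):
    """Return every [start, end) window containing ts_ms.

    Windows start at multiples of advance_ms and overlap when
    advance_ms is smaller than size_ms.
    """
    windows = []
    start = ts_ms - (ts_ms % advance_ms)  # latest window start <= ts_ms
    while start > ts_ms - size_ms and start >= 0:
        windows.append((start, start + size_ms))
        start -= advance_ms
    return sorted(windows)


def session_windows(timestamps, gap_ms):
    """Group timestamps into sessions separated by more than gap_ms of inactivity."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][1] <= gap_ms:
            sessions[-1][1] = ts          # extend the current session
        else:
            sessions.append([ts, ts])     # start a new session
    return [tuple(s) for s in sessions]


print(tumbling_window(12, 10))            # (10, 20)
print(hopping_windows(12, 10, 5))         # [(5, 15), (10, 20)]
print(session_windows([0, 3, 20, 22], 5)) # [(0, 3), (20, 22)]
```

A tumbling window is just a hopping window whose advance equals its size, which is why one record lands in exactly one tumbling window but possibly several hopping windows.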

Streaming Audio: a Confluent podcast about Apache Kafka
Apache Kafka 3.5 - Kafka Core, Connect, Streams, & Client Updates

Jun 15, 2023 • 11:25


Apache Kafka® 3.5 is here with the capability of previewing migrations from ZooKeeper clusters to KRaft mode. Follow along as Danica Fine highlights key release updates.

Kafka Core:
KIP-833 provides an updated timeline for KRaft.
KIP-866 is now in preview and allows migration from an existing ZooKeeper cluster to KRaft mode.
KIP-900 introduces a way to bootstrap the KRaft controllers with SCRAM credentials.
KIP-903 prevents a data loss scenario by preventing replicas with stale broker epochs from joining the ISR list.
KIP-915 streamlines the process of downgrading Kafka's transaction and group coordinators by introducing tagged fields.

Kafka Connect:
KIP-710 provides the option to use a REST API for internal server communication that can be enabled by setting `dedicated.mode.enable.internal.rest` equal to true.
KIP-875 offers support for native offset management in Kafka Connect. Connect cluster administrators can now read offsets for both source and sink connectors. This KIP adds a new STOPPED state for connectors, enabling users to shut down connectors and maintain connector configurations without utilizing resources.
KIP-894 makes the `IncrementalAlterConfigs` API available for use in MirrorMaker 2 (MM2), adding a new `use.incremental.alter.configs` configuration which takes values “requested,” “never,” and “required.”
KIP-911 adds a new source tag for metrics generated by the `MirrorSourceConnector` to help monitor mirroring deployments.

Kafka Streams:
KIP-339 improves Kafka Streams' error-handling capabilities by addressing serialization errors that occur before message production and extending the interface for custom error handling.
KIP-889 introduces versioned state stores in Kafka Streams for temporal join semantics in stream-to-table joins.
KIP-904 simplifies table aggregation in Kafka by proposing a change in serialization format to enable one-step aggregation and reduce noise from events with old and new keys/values.
KIP-914 modifies how versioned state stores are used in Kafka Streams. Versioned state stores may impact different DSL processors in varying ways; see the documentation for details.

Kafka Client:
KIP-881 is now complete and introduces new client-side assignor logic for rack-aware consumer balancing for Kafka Consumers.
KIP-887 adds the `EnvVarConfigProvider` implementation to Kafka so custom configurations stored in environment variables can be injected into the system by providing the map returned by `System.getenv()`.
KIP-641 introduces the `RecordReader` interface to Kafka's clients module, replacing the deprecated MessageReader Scala trait.

EPISODE LINKS
See release notes for Apache Kafka 3.5
Read the blog to learn more
Download and get started with Apache Kafka 3.5
Watch the video version of this podcast
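
The release notes mention versioned state stores (KIP-889) giving temporal join semantics to stream-to-table joins. The idea, roughly, is that a stream record is joined with the table value that was valid at the record's timestamp rather than with whatever value happens to be latest. The sketch below shows that lookup in plain Python; `VersionedStore`, `put` and `get_as_of` are hypothetical names for this illustration and are not the Kafka Streams API.

```python
import bisect


class VersionedStore:
    """Toy versioned key-value store: keeps every value with its timestamp.

    Hypothetical class for illustration only.
    """

    def __init__(self):
        # key -> (sorted list of timestamps, values in the same order)
        self._versions = {}

    def put(self, key, value, ts):
        timestamps, values = self._versions.setdefault(key, ([], []))
        i = bisect.bisect_right(timestamps, ts)
        timestamps.insert(i, ts)
        values.insert(i, value)

    def get_as_of(self, key, ts):
        """Return the value that was current at time ts, or None if none existed yet."""
        timestamps, values = self._versions.get(key, ([], []))
        i = bisect.bisect_right(timestamps, ts)
        return values[i - 1] if i else None


rates = VersionedStore()
rates.put("EUR", 1.10, ts=100)
rates.put("EUR", 1.20, ts=200)

# A stream record stamped 150 joins against the rate valid at 150,
# even if it is processed after the update at 200 has arrived.
print(rates.get_as_of("EUR", 150))  # 1.1
print(rates.get_as_of("EUR", 250))  # 1.2
print(rates.get_as_of("EUR", 50))   # None
```

With a plain (unversioned) store only the latest value survives, so a late-processed record stamped 150 would be joined with 1.20 instead.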

Real-Time Analytics with Tim Berglund
A Gentle Introduction to Kafka Streams with Anna McDonald (Confluent) | Ep. 11

Jun 12, 2023 • 25:59


Follow: https://stree.ai/podcast | Sub: https://stree.ai/sub | New episodes every Monday! Join Tim Berglund as he chats with Anna McDonald about the fascinating world of Kafka Streams. Anna, a customer success technical architect at Confluent, shares her insights on the core concepts of Kafka Streams, including the all-important table and stream abstractions. They delve into the benefits of statefulness and durability, such as active and standby tasks, which ensure seamless failover, and how Kafka Streams stores state in RocksDB and in Kafka itself. With a teaser for the next episode, this conversation promises an exciting exploration of data ingestion and time management in Kafka Streams. Don't miss out on this insightful discussion!

Starting with Apache Kafka: https://developer.confluent.io/learn-kafka/apache-kafka/events/
KIP-392 information: https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica
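
The table and stream abstractions mentioned above can be pictured with a very small sketch: a table is what you get by replaying a stream of keyed updates and keeping the latest value per key. The plain-Python function below illustrates that, assuming (as a simplification made for this example) that a `None` value marks a deletion. It is not the Kafka Streams API, and `materialize` is a made-up name.

```python
def materialize(changelog):
    """Fold a stream of (key, value) updates into a table of latest values.

    A value of None is treated as a delete marker (an assumption of this sketch).
    """
    table = {}
    for key, value in changelog:
        if value is None:
            table.pop(key, None)   # delete the key if present
        else:
            table[key] = value     # later updates overwrite earlier ones
    return table


changelog = [
    ("alice", 1),
    ("bob", 2),
    ("alice", 3),   # update for alice
    ("bob", None),  # delete bob
]

print(materialize(changelog))  # {'alice': 3}
```

Seen this way, keeping the update stream durably in Kafka is what lets a replacement instance rebuild its local state by replaying it, which is one way to read the episode's point about state living both in RocksDB and in Kafka itself.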

Streaming Audio: a Confluent podcast about Apache Kafka
Apache Kafka 3.4 - New Features & Improvements

Feb 7, 2023 | 5:13


Apache Kafka® 3.4 is released! In this special episode, Danica Fine (Senior Developer Advocate, Confluent) shares highlights of the Apache Kafka 3.4 release. This release introduces new KIPs in Kafka Core, Kafka Streams, and Kafka Connect. In Kafka Core: KIP-792 expands the metadata each group member passes to the group leader in its JoinGroup subscription to include the highest stable generation that consumer was a part of. KIP-830 includes a new configuration setting that allows you to disable the JMX reporter for environments where it's not being used. KIP-854 introduces changes to clean up producer IDs more efficiently, to avoid excess memory usage. It introduces a new timeout parameter that affects the expiry of producer IDs and updates the old parameter to only affect the expiry of transaction IDs. KIP-866 (early access) provides a bridge to migrate from existing ZooKeeper clusters to new KRaft mode clusters, enabling the migration of existing metadata from ZooKeeper to KRaft. KIP-876 adds a new property that defines the maximum amount of time that the server will wait to generate a snapshot; the default is 1 hour. KIP-881, an extension of KIP-392, makes consumers rack-aware when it comes to partition assignments and consumer rebalancing. In Kafka Streams: KIP-770 updates some Kafka Streams configs and metrics related to the record cache size. KIP-837 allows users to multicast result records to every partition of downstream sink topics and adds functionality for users to choose to drop result records without sending. And finally, for Kafka Connect: KIP-787 allows users to run MirrorMaker2 with custom implementations for the Kafka resource manager and makes it easier to integrate with your ecosystem. Tune in to learn more about the Apache Kafka 3.4 release! EPISODE LINKS: See release notes for Apache Kafka 3.4 | Read the blog to learn more | Download Apache Kafka 3.4 and get started | Watch the video version of this podcast | Join the Community

Engenharia de Dados [Cast]
Confluent Community Catalysts Brazukas: Dissecting Apache Kafka [Round 1]

Feb 2, 2023 | 77:12


In this episode, Luan Moreno and Mateus Oliveira interview João Bosco, currently Software & Solution Strategist at Nubank, and Marcelo Costa, currently Head of IT at Cia. Hering. Both the guests and the hosts are Confluent Community Catalysts. Confluent Community Catalysts are professionals who invest their time in promoting and contributing to Apache Kafka, whether through code or by actively answering questions in the forums and on Stack Overflow, and who are recognized by the community and by Confluent for that work. In this round table we cover the following topics: Apache Kafka concepts; the evolution from messaging technologies to streaming platforms; stories from the trenches and curiosities about Apache Kafka; challenges of initial implementation and adoption of Apache Kafka. Learn from the experience of professionals who have worked with Apache Kafka daily, using industry best practices to build a robust real-time streaming platform that is currently a market leader. Marcelo Costa | João Bosco | Confluent Catalyst Luan Moreno = https://www.linkedin.com/in/luanmoreno/

Software Defined Talk
Episode 396: Aloha to your strategy

Jan 13, 2023 | 80:21


This week we discuss digital transformation at Southwest and Delta Airlines, Shopify cancels all meetings, Salesforce's M&A strategy, and A.I. is everywhere. Plus, thoughts on bike lanes… Watch the YouTube Live Recording of Episode 396 (https://youtu.be/tmm8rH9fZEE) Runner-up Titles Work trying to get on my personal calendar Traveling with an infant =BLACKSWAN(A1:G453) Socks in a Costco Can't do the business case on savings until you loose it. Pay transparency for you, not me We don't pay for things on the Internet Semper Nimbus Privatus Rundown Dutch residents are the most physically active on earth, (https://twitter.com/BrentToderian/status/1611901297552396289) Digital Transformation Travel Edition Delta plans to offer free Wi-Fi starting Feb. 1 (https://www.cnbc.com/2023/01/05/delta-plans-to-offer-free-wi-fi-starting-feb-1.html) The Southwest Airlines Meltdown (https://www.nytimes.com/2023/01/10/podcasts/the-daily/the-southwest-airlines-meltdown.html) Southwest's Meltdown Could Cost It Up to $825 Million (https://www.nytimes.com/2023/01/06/business/southwest-airlines-meltdown-costs-reimbursement.html) Southwest pilots union writes scathing letter to airline executives after holiday travel fiasco (https://www.yahoo.com/now/southwest-pilots-union-writes-scathing-011720946.html) Southwest makes frequent flyer miles offer while lots of luggage remains in limbo (https://www.cnn.com/travel/article/southwest-airlines-frequent-flyer-miles-meltdown/index.html) Point of Sale: Scan and Pay (https://twitter.com/pitdesi/status/1602843962602975233?s=20&t=YdGNYzReSf4r1twJ1hRfbA) Work Life Shopify Tells Employees to Just Say No to Meetings (https://www.bloomberg.com/news/articles/2023-01-03/shopify-ceo-tobi-lutke-tells-employees-to-just-say-no-to-meetings) Netflix Revokes Some Staff's Access to Other People's Salary Information (https://apple.news/A--bGmZgJTQCgHQ-9QdWu4w) U.S. 
Moves to Bar Noncompete Agreements in Labor Contracts (https://www.nytimes.com/2023/01/05/business/economy/ftc-noncompete.html) Gartner HR expert: Quiet hiring will dominate U.S. workplaces in 2023 (https://www.cnbc.com/2023/01/04/gartner-hr-expert-quiet-hiring-will-dominate-us-workplaces-in-2023.html) Netflix revokes some staff's access to other people's salary information (https://www.marketwatch.com/story/netflix-revokes-some-staffs-access-to-other-peoples-salary-information-11673384493) SFDC Salesforce: There's no more Slack left to cut (https://www.theregister.com/2023/01/10/salesforce_comment/) Salesforce to Lay Off 10 Percent of Staff and Cut Office Space (https://www.nytimes.com/2023/01/04/business/salesforce-layoffs.html) After layoffs, Salesforce CEO still blasts worker productivity (https://www.sfgate.com/tech/article/salesforce-ceo-blasts-worker-productivity-17708474.php) AI is everywhere Google execs warn company's reputation could suffer if it moves too fast on AI-chat technology (https://www.cnbc.com/2022/12/13/google-execs-warn-of-reputational-risk-with-chatgbt-like-tool.html) Microsoft and OpenAI Working on ChatGPT-Powered Bing in Challenge to Google (https://www.theinformation.com/articles/microsoft-and-openai-working-on-chatgpt-powered-bing-in-challenge-to-google?utm_source=newsletter&utm_medium=email&utm_campaign=newsletter_axioslogin&stream=top) Microsoft eyes $10 billion bet on ChatGPT (https://www.semafor.com/article/01/09/2023/microsoft-eyes-10-billion-bet-on-chatgpt) Wolfram|Alpha as the Way to Bring Computational Knowledge Superpowers to ChatGPT (https://writings.stephenwolfram.com/2023/01/wolframalpha-as-the-way-to-bring-computational-knowledge-superpowers-to-chatgpt/) Relevant to your Interests 2023 Bum Steer of the Year: Austin (https://www.texasmonthly.com/news-politics/2023-bum-steer-of-year-austin/) Twitter's Rivals Try to Capitalize on 
Musk-Induced Chaos (https://www.nytimes.com/2022/12/07/technology/twitter-rivals-alternative-platforms.html) On Organizational Structures and the Developer Experience (https://redmonk.com/sogrady/2022/12/13/org-structure-devx/) KubeCon + CloudNativeCon North America 2022 Transparency Report | Cloud Native Computing Foundation (https://www.cncf.io/reports/kubecon-cloudnativecon-north-america-2022-transparency-report/) Inside the chaos at Washington's most connected military tech startup (https://www.vox.com/recode/23507236/inside-disruption-rebellion-defense-washington-connected-military-tech-startup) Elon Musk Starts Week As World's Second Richest Person (https://www.forbes.com/sites/mattdurot/2022/12/12/elon-musk-starts-week-as-worlds-second-richest-person/) 10 Tesla Investors Lose $132.5 Billion From Musk's Twitter Fiasco (https://www.investors.com/etfs-and-funds/sectors/tesla-stock-investors-lose-132-5-billion-from-musks-twitter-fiasco/) Rackspace's ransomware messaging dilemma (https://www.axios.com/newsletters/axios-login-83146574-380f-4e37-965d-7fd79bce7278.html?chunk=2&utm_term=emshare#story2) Heads-Up: Amazon S3 Security Changes Are Coming in April of 2023 (https://aws.amazon.com/blogs/aws/heads-up-amazon-s3-security-changes-are-coming-in-april-of-2023/) A MultiCloud Rant (https://www.lastweekinaws.com/blog/a_multicloud_rant/) Great visualization of the revenue breakdown of the 4 largest tech companies. 
(https://twitter.com/Carnage4Life/status/1603012861017862144?s=20&t=HC2UuMCHBB408xae6tZpbQ) AG Paxton's Google Suit Makes the Perfect the Enemy of the Good (https://truthonthemarket.com/2022/12/14/ag-paxtons-google-suit-makes-the-perfect-the-enemy-of-the-good/) AWS simplifies Simple Storage Service to prevent data leaks (https://www.theregister.com/2022/12/14/aws_simple_storage_service_simplified/) Creating the ultimate smart map with new map data initiative launched by Linux Foundation (https://venturebeat.com/virtual/creating-the-ultimate-smart-map-with-new-map-data-initiative-launched-by-linux-foundation/) Spotify's grand plan to monetize developers via its open source Backstage project (https://techcrunch.com/2022/12/15/spotifys-plan-to-monetize-its-open-source-backstage-developer-project/?guccounter=1&guce_referrer=aHR0cHM6Ly93d3cubGlua2VkaW4uY29tLw&guce_referrer_sig=AQAAAAlyOmdhogtX6nuQkNHQ7mVSyci6aMv7X6QwRTvS9PHGJmjO_wjCqsJXXPKI36A9MkIclSIQoHQ_dz7wJ-WzfaYQT_clMcUijiC28ZQhEau4NOcU-70wy5m0Q9LLmtvWuQbWQQEccEbQH2Lvg4_GqfnQBYNPZWRcgpx7XMLas_2R) VMware offers subs for server consolidation vSphere cut (https://www.theregister.com/2022/12/15/vsphere_plus_standard/) Senior execs to leave VMware before acquisition by Broadcom (https://www.bizjournals.com/sanjose/news/2022/12/13/three-senior-execs-to-leave-vmware.html#:~:text=Mark%20Lohmeyer%2C%20who%20heads%20cloud,Raghuram%20announced%20in%20a%20memo) China Bans Exports of Loongson CPUs to Russia, Other Countries: Report (https://www.tomshardware.com/news/china-bans-exports-of-its-loongson-cpus-to-russia-other-countries) Dropbox buys form management platform FormSwift for $95M in cash (https://techcrunch.com/2022/12/16/dropbox-buys-form-management-platform-formswift-for-95m-in-cash/) Sweep, a no-code config tool for Salesforce software, raises $28M (https://techcrunch.com/2022/12/15/sweep-a-no-code-config-tool-for-salesforce-software-raises-28m/) Twitter Aided the Pentagon in its Covert Online Propaganda Campaign 
(https://theintercept.com/2022/12/20/twitter-dod-us-military-accounts/) Okta's source code stolen after GitHub repositories hacked (https://www.bleepingcomputer.com/news/security/oktas-source-code-stolen-after-github-repositories-hacked/) Workday appoints VMware veteran as co-CEO (https://www.theregister.com/2022/12/21/workday_co_ceo/) Top Paying Tools (https://softwaredefinedtalk.slack.com/archives/C04EK1VBK/p1671635825838769) Winging It: Inside Amazon's Quest to Seize the Skies (https://www.wired.com/story/amazon-air-quest-to-seize-the-skies/) CIS Benchmark Framework Scanning Tools Comparison (https://www.armosec.io/blog/cis-kubernetes-benchmark-framework-scanning-tools-comparison/) MSG defends using facial recognition to kick lawyer out of Rockettes show (https://arstechnica.com/tech-policy/2022/12/facial-recognition-flags-girl-scout-mom-as-security-risk-at-rockettes-show/) OpenAI releases Point-E, an AI that generates 3D models (https://techcrunch.com/2022/12/20/openai-releases-point-e-an-ai-that-generates-3d-models/) No, You Haven't Won a Yeti Cooler From Dick's Sporting Goods (https://www.wired.com/story/email-scam-dicks-sporting-goods-yeti-cooler/) The Lastpass hack was worse than the company first reported (https://www.engadget.com/the-lastpass-hack-was-worse-than-the-company-first-reported-000501559.html?utm_source=facebook&utm_medium=news_tab) IRS delays tax reporting change for 1099-K on Venmo, Paypal business payments (https://www.cnbc.com/2022/12/23/irs-delays-tax-reporting-change-for-1099-k-on-venmo-paypal-payments.html) Cyber attacks set to become ‘uninsurable', says Zurich chief (https://www.ft.com/content/63ea94fa-c6fc-449f-b2b8-ea29cc83637d) Google Employees Brace for a Cost-Cutting Drive as Anxiety Mounts (https://www.nytimes.com/2022/12/28/technology/google-job-cuts.html) IBM beat all its large-cap tech peers in 2022 as investors shunned growth for safety (https://www.cnbc.com/2022/12/27/ibm-stock-outperformed-technology-sector-in-2022.html) 
Europe Taps Tech's Power-Hungry Data Centers to Heat Homes (https://www.wsj.com/articles/europe-taps-techs-power-hungry-data-centers-to-heat-homes-11672309944?mod=djemalertNEWS) List of defunct social networking services (https://en.wikipedia.org/wiki/List_of_defunct_social_networking_services) 2023 Predictions | No Mercy / No Malice (https://www.profgalloway.com/2023-predictions/) Twitter rival Mastodon rejects funding to preserve nonprofit status (https://arstechnica.com/tech-policy/2022/12/twitter-rival-mastodon-rejects-funding-to-preserve-nonprofit-status/) TSMC Starts Next-Gen Mass Production as World Fights Over Chips (https://www.bloomberg.com/news/articles/2022-12-29/tsmc-mass-produces-next-gen-chips-to-safeguard-global-lead) Microsoft and FTC pre-trial hearing set for January 3rd (https://www.engadget.com/pre-trial-hearing-between-microsoft-and-ftc-set-for-january-3rd-203320387.html) The infrastructure behind ATMs (https://www.bitsaboutmoney.com/archive/the-infrastructure-behind-atms/) Apple is increasing battery replacement service charges for out-of-warranty devices (https://techcrunch.com/2023/01/03/apple-is-increasing-battery-replacement-service-charges-for-out-of-warranty-devices/) Snowflake's business and how the weakening economy is impacting cloud vendors (https://twitter.com/LiebermanAustin/status/1607376944873754626) Shift Happens: A book about keyboards (https://shifthappens.site/) Amazon to cut 18,000 jobs (https://www.axios.com/2023/01/05/amazon-layoffs-18000-jobs) CircleCI security alert: Rotate any secrets stored in CircleCI (https://circleci.com/blog/january-4-2023-security-alert/) Video game workers form Microsoft's first U.S. 
labor union (https://www.nbcnews.com/tech/tech-news/video-game-workers-form-microsofts-first-us-labor-union-rcna64103) World's Premier Investors Line Up to Partner with Netskope as the SASE Security and Networking Platform of Choice (https://www.prnewswire.com/news-releases/worlds-premier-investors-line-up-to-partner-with-netskope-as-the-sase-security-and-networking-platform-of-choice-301712417.html) omg.lol - A lovable web page and email address, just for you (https://home.omg.lol/) Alphabet led a $100 million funding of Chronosphere, a startup that helps companies monitor and cut cloud bills. (https://twitter.com/theinformation/status/1611165698868367360) Confluent expands Kafka Streams capabilities, acquires Apache Flink vendor (https://venturebeat.com/enterprise-analytics/confluent-acquires-apache-flink-vendor-immerok-to-expand-data-stream-processing/) Excel & Google Sheets AI Formula Generator - Excelformulabot.com (https://excelformulabot.com/) Has the Internet Reached Peak Clickability? 
(https://tedgioia.substack.com/p/has-the-internet-reached-peak-clickability) Adobe's CEO Sizes Up the State of Tech Now (https://www.wsj.com/articles/adobes-ceo-sizes-up-the-state-of-tech-now-11673151167?mod=djemalertNEWS) Researchers Hacked California's Digital License Plates, Gaining Access to GPS Location and User Info (https://jalopnik.com/researchers-hacked-californias-digital-license-plates-1849966295) Microsoft's New AI Can Simulate Anyone's Voice With 3 Seconds of Audio (https://slashdot.org/story/23/01/10/0749241/microsofts-new-ai-can-simulate-anyones-voice-with-3-seconds-of-audio?utm_source=slashdot&utm_medium=twitter) Observability platform Chronosphere raises another $115M at a $1.6B valuation (https://techcrunch.com/2023/01/10/observability-platform-chronosphere-raises-another-115m-at-a-1-6b-valuation/) Why IBM is no longer interested in breaking patent records–and how it plans to measure innovation in the age of open source and quantum computing (https://fortune.com/2023/01/06/ibm-patent-record-how-to-measure-innovation-open-source-quantum-computing-tech/) New research aims to analyze how widespread COBOL is (https://www.theregister.com/2022/12/14/cobol_research/) Companies are still waiting for their cloud ROI (https://www.infoworld.com/article/3675374/companies-are-still-waiting-for-their-cloud-roi.html) What TNS Readers Want in 2023: More DevOps, API Coverage (https://thenewstack.io/what-tns-readers-want-in-2023-more-devops-api-coverage/) Tech Debt Yo-Yo Cycle. 
(https://twitter.com/wardleymaps/status/1605860426671177728) How a single developer dropped AWS costs by 90%, then disappeared (https://scribe.rip/@maximetopolov/how-a-single-developer-dropped-aws-costs-by-90-then-disappeared-2b46a115103a) A look at the 2022 velocity of CNCF, Linux Foundation, and top 30 open source projects (https://www.cncf.io/blog/2023/01/11/a-look-at-the-2022-velocity-of-cncf-linux-foundation-and-top-30-open-source-projects/) The golden age of the streaming wars has ended (https://www.theverge.com/2022/12/14/23507793/streaming-wars-hbo-max-netflix-ads-residuals-warrior-nun) YouTube exec says NFL Sunday Ticket will have multiscreen functionality (https://awfulannouncing.com/youtube/nfl-sunday-ticket-multiscreen-mosaic-mode.html) Nonsense The $11,500 toilet with Alexa inside can now be put inside your home (https://www.theverge.com/2022/12/19/23510864/kohler-numi-smart-toilet-alexa-ces-2022) Starbucks updating its loyalty program starting in February (https://www.axios.com/2022/12/28/starbucks-rewards-program-changes-coming) The revenue model of a popular YouTube channel about Lego. (https://paper.dropbox.com/doc/SDT-396--BwhY9F5kpz_BI2kkdw63ZpJ~Ag-MVMKwqqBEH5SzYKqYO2Jc) Conferences THAT Conference Texas Speakers and Schedule (https://that.us/events/tx/2023/schedule/), Round Rock, TX Jan 15th-18th Use code SDT for 5% off SpringOne (https://springone.io/), Jan 24–26. Coté speaking at cfgmgmtcamp (https://cfgmgmtcamp.eu/ghent2023/), Feb 6th to 8th, Ghent. 
State of Open Con 2023, (https://stateofopencon.com/sponsors/) London, UK, February 7th-8th 2023 CloudNativeSecurityCon North America (https://events.linuxfoundation.org/cloudnativesecuritycon-north-america/), Seattle, Feb 1 – 2, 2023 Southern California Linux Expo, (https://www.socallinuxexpo.org/scale/20x) Los Angeles, March 9-12, 2023 DevOpsDays Birmingham, AL 2023 (https://devopsdays.org/events/2023-birmingham-al/welcome/), April 20 - 21, 2023 SDT news & hype Join us in Slack (http://www.softwaredefinedtalk.com/slack). Get a SDT Sticker! Send your postal address to stickers@softwaredefinedtalk.com (mailto:stickers@softwaredefinedtalk.com) and we will send you free laptop stickers! Follow us on Twitch (https://www.twitch.tv/sdtpodcast), Twitter (https://twitter.com/softwaredeftalk), Instagram (https://www.instagram.com/softwaredefinedtalk/), LinkedIn (https://www.linkedin.com/company/software-defined-talk/) and YouTube (https://www.youtube.com/channel/UCi3OJPV6h9tp-hbsGBLGsDQ/featured). Use the code SDT to get $20 off Coté's book, Digital WTF (https://leanpub.com/digitalwtf/c/sdt), so $5 total. Become a sponsor of Software Defined Talk (https://www.softwaredefinedtalk.com/ads)! Recommendations Brandon: Industrial Garage Shelves (https://www.homedepot.com/p/Husky-5-Tier-Industrial-Duty-Steel-Freestanding-Garage-Storage-Shelving-Unit-in-Black-90-in-W-x-90-in-H-x-24-in-D-N2W902490W5B/319132842) Matt: Oxide and Friends: Breaking it down with Ian Brown (https://oxide.computer/podcasts/oxide-and-friends/1150480) Wu Tang Saga (https://www.imdb.com/title/tt9113406/) Season 3 coming next month! Coté: Mouth to Mouth (https://www.goodreads.com/en/book/show/58438631-mouth-to-mouth) by Antoine Wilson (https://www.goodreads.com/en/book/show/58438631-mouth-to-mouth). Photo Credits Header (https://unsplash.com/photos/euaDCtB_jyw) CoverArt (https://unsplash.com/photos/9xdho4stJQ8)

The GeekNarrator
Understanding ksqlDB with Matthias J. Sax

Jan 13, 2023 | 61:55


Hey everyone, in this episode Matthias and I talk about ksqlDB. We cover the topic in depth: its history, architecture, core concepts, use cases, limitations, comparison to Kafka Streams, and more. References: ksqlDB: https://ksqldb.io/ | Exactly-once semantics podcast: https://youtu.be/twgbAL_EaQw | Matthias Sax: https://twitter.com/MatthiasJSax and https://www.linkedin.com/in/mjsax/ Cheers, The GeekNarrator

The GeekNarrator
Kafka, Realtime analytics and Apache Pinot with Tim Berglund Part-2

Jan 3, 2023 | 39:38


Hey everyone, this is part 2 of our episode with Tim Berglund. We cover some advanced topics on Kafka, Kafka Streams and Apache Pinot. I hope you like the discussion. Cheers, The GeekNarrator

The GeekNarrator
Kafka Streams Exactly Once Semantics With Matthias Sax

Jan 3, 2023 | 85:57


Hey everyone, in this episode I am joined by Matthias Sax, who works at Confluent building Kafka. We go deep into Kafka Streams and how exactly-once semantics (EOS) is implemented. This episode gives you the details you need to understand how Kafka implements EOS. I hope you like the episode. Cheers, The GeekNarrator

Streaming Audio: a Confluent podcast about Apache Kafka
Build a Real Time AI Data Platform with Apache Kafka

Oct 20, 2022 | 37:18


Is it possible to build a real-time data platform without using stateful stream processing? Forecasty.ai is an artificial intelligence platform for forecasting commodity prices, imparting insights into the future valuations of raw materials for users. Nearly all AI models are batch-trained once, but precious commodities are linked to ever-fluctuating global financial markets, which require real-time insights. In this episode, Ralph Debusmann (CTO, Forecasty.ai) shares their journey of migrating from a batch machine learning platform to a real-time event streaming system with Apache Kafka® and delves into their approach to making the transition frictionless. Ralph explains that Forecasty.ai was initially built on top of batch processing, however, updating the models with batch-data syncs was costly and environmentally taxing. There was also the question of scalability—progressing from 60 commodities on offer to their eventual plan of over 200 commodities. Ralph observed that most real-time systems are non-batch, streaming-based real-time data platforms with stateful stream processing, using Kafka Streams, Apache Flink®, or even Apache Samza. However, stateful stream processing involves resources, such as teams of stream processing specialists to solve the task. With the existing team, Ralph decided to build a real-time data platform without using any sort of stateful stream processing. They strictly keep to the out-of-the-box components, such as Kafka topics, Kafka Producer API, Kafka Consumer API, and other Kafka connectors, along with a real-time database to process data streams and implement the necessary joins inside the database. Additionally, Ralph shares the tool he built to handle historical data, kash.py—a Kafka shell based on Python; discusses issues the platform needed to overcome for success, and how they can make the migration from batch processing to stream processing painless for the data science team. 
EPISODE LINKS: Kafka Streams 101 course | The Difference Engine for Unlocking the Kafka Black Box | GitHub repo: kash.py | Watch the video version of this podcast | Kris Jenkins' Twitter | Streaming Audio Playlist | Join the Confluent Community | Learn more with Kafka tutorials, resources, and guides at Confluent Developer | Live demo: Intro to Event-Driven Microservices with Confluent | Use PODCAST100 to get an additional $100 of free Confluent Cloud usage (details)

Streaming Audio: a Confluent podcast about Apache Kafka
Apache Kafka 3.3 - KRaft, Kafka Core, Streams, & Connect Updates

Oct 3, 2022 | 6:42


Apache Kafka® 3.3 is released! With over two years of development, KIP-833 marks KRaft as production ready for new AK 3.3 clusters only. On behalf of the Kafka community, Danica Fine (Senior Developer Advocate, Confluent) shares highlights of this release, with KIPs from Kafka Core, Kafka Streams, and Kafka Connect. To reduce request overhead and simplify client-side code, KIP-709 extends the OffsetFetch API requests to accept multiple consumer group IDs. This update has three changes: extending the wire protocol, changing response handling, and enhancing the AdminClient to use the new protocol. Log recovery is an important process that is triggered whenever a broker starts up after an unclean shutdown. Since there was no way to know the log recovery progress other than checking if the broker log is busy, KIP-831 adds metrics for the log recovery progress, `RemainingLogsToRecover` and `RemainingSegmentsToRecover`, for each recovery thread. These metrics allow the admin to monitor the progress of the log recovery. Additionally, updates on Kafka Core include KIP-841: Fenced replicas should not be allowed to join the ISR in KRaft; KIP-835: Monitor KRaft Controller Quorum Health; and KIP-859: Add metadata log processing error-related metrics. KIP-834 for Kafka Streams adds the ability to pause and resume topologies. This feature lets you reduce resource usage when processing is not required, when modifying the logic of Kafka Streams applications, or when responding to operational issues. KIP-820, meanwhile, extends KStream process with the new Processor API. Previously, KIP-98 added support for exactly-once delivery guarantees with Kafka and its Java clients. In the AK 3.3 release, KIP-618 offers exactly-once semantics support to Confluent's source connectors. To accomplish this, a number of new connector- and worker-based configurations have been introduced, including `exactly.once.source.support`, `transaction.boundary`, and more. 
Image attribution: Apache ZooKeeper™: https://zookeeper.apache.org/ and Raft logo: https://raft.github.io/ EPISODE LINKS: See release notes for Apache Kafka 3.3.0 and Apache Kafka 3.3.1 for the full list of changes | Read the blog to learn more | Download Apache Kafka 3.3 and get started | Watch the video version of this podcast

Streaming Audio: a Confluent podcast about Apache Kafka
Capacity Planning Your Apache Kafka Cluster

Aug 30, 2022 | 61:54


How do you plan Apache Kafka® capacity and Kafka Streams sizing for optimal performance? When Jason Bell (Principal Engineer, Dataworks and founder of Synthetica Data) begins to plan a Kafka cluster, he starts with a deep inspection of the customer's data itself, determining its volume as well as its contents: Is it JSON, straight pieces of text, or images? He then determines if Kafka is a good fit for the project overall, a decision he bases on volume, the desired architecture, and potential cost. Next, the cluster is conceived in terms of some rule-of-thumb numbers. For example, Jason's minimum number of brokers for a cluster is three or four. This means he has a leader, a follower and at least one backup. A ZooKeeper quorum is also a set of three. For other elements, he works with pairs, an active and a standby; this applies to Kafka Connect and Schema Registry. Finally, there's Prometheus monitoring and Grafana alerting to add. Jason points out that these numbers are different for multi-data-center architectures. Jason never assumes that everyone knows how Kafka works, because some software teams include specialists working on a producer or a consumer, who don't work directly with Kafka itself. They may not know how to adequately measure their Kafka volume themselves, so he often begins the collaborative process of graphing message volumes. He considers, for example, how many messages there are daily, and whether there is a peak time. Each industry is different, with some focusing on daily batch data (banking), and others fielding incredible amounts of continuous data (IoT data streaming from cars). Extensive testing is necessary to ensure that the data patterns are adequately accommodated. Jason sets up a short-lived system that is identical to the main system. He finds that teams usually have not adequately tested across domain boundaries or the network. Developers tend to think in terms of numbers of messages, but not in terms of overall network traffic, or in how many consumers they'll actually need, for example. Latency must also be considered: for example, if compression on the producer's side doesn't match compression on the consumer's side, latency will increase. Kafka Connect sink connectors require special consideration when Jason is establishing a cluster. Failure strategies need to be well thought out, including retries and how to deal with the potentially large number of messages that can accumulate in a dead letter queue. He suggests that more attention should generally be paid to the Kafka Connect elements of a cluster, something that can actually be addressed with bash scripts. Finally, Kris and Jason cover his preference for Kafka Streams over ksqlDB from a network perspective. EPISODE LINKS: Capacity Planning and Sizing for Kafka Streams | Tales from the Frontline of Apache Kafka DevOps | Watch the video version of this podcast | Kris Jenkins' Twitter | Streaming Audio Playlist | Join the Confluent Community | Learn more on Confluent Developer | Use PODCAST100 to get $100 of free Cloud usage (details)
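The step from "number of messages" to "network traffic" is simple arithmetic, sketched below in Python. Every number here is made up for illustration and none comes from the episode; the peak factor, average message size, and replication factor are assumptions you would replace with measurements from your own system.

```python
SECONDS_PER_DAY = 86_400


def average_messages_per_second(messages_per_day: float) -> float:
    """Spread the daily volume evenly over the day."""
    return messages_per_day / SECONDS_PER_DAY


def peak_ingress_bytes_per_second(messages_per_day: float,
                                  avg_message_bytes: float,
                                  peak_factor: float) -> float:
    """Producer-side traffic at peak: average rate x peak multiplier x message size."""
    return average_messages_per_second(messages_per_day) * peak_factor * avg_message_bytes


def replicated_write_bytes_per_second(ingress_bytes_per_second: float,
                                      replication_factor: int) -> float:
    """Total bytes written across brokers once every replica has its copy."""
    return ingress_bytes_per_second * replication_factor


# Hypothetical workload: 86.4 million messages a day, 1,000 bytes each,
# a peak three times the average, and three replicas per partition.
avg_rate = average_messages_per_second(86_400_000)
peak_ingress = peak_ingress_bytes_per_second(86_400_000, 1_000, 3)
cluster_writes = replicated_write_bytes_per_second(peak_ingress, 3)
```

With these inputs the average is 1,000 messages per second, but the brokers collectively write about 9 MB/s at peak, which is the figure that matters for network and disk sizing. Consumer fan-out would add read traffic on top and is not modelled here.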

Streaming Audio: a Confluent podcast about Apache Kafka
Blockchain Data Integration with Apache Kafka

Jul 7, 2022 | 50:59


How is Apache Kafka® relevant to blockchain technology and cryptocurrency? Fotios Filacouris (Staff Solutions Engineer, Confluent) has been working with Kafka for close to five years, primarily designing architectural solutions for financial services; he also has expertise in blockchain. In this episode, he joins Kris to discuss how blockchain and Kafka are complementary, and he also highlights some of the use cases he has seen emerging that use Kafka in conjunction with traditional, distributed ledger technology (DLT) as well as blockchain technologies. According to Fotios, Kafka and the notion of blockchain share many traits, such as immutability, replication, distribution, and the decoupling of applications. This complementary relationship means that they can function well together if you are looking to extend the functionality of a given DLT through sidechain or off-chain activities, such as analytics, integrations with traditional enterprise systems, or even the integration of certain chains and ledgers. Based on Fotios' observations, Kafka has become an essential piece of the puzzle in many blockchain-related use cases, including settlement, logging, analytics and risk, and volatility calculations. For example, a bitcoin trading application may use Kafka Streams to provide analytics on top of the price action of various crypto assets. Fotios has also seen use cases where a crypto platform leverages Kafka as its infrastructure layer for real-time logging and analytics. EPISODE LINKS: Modernizing Banking Architectures with Apache Kafka | New Kids On the Bloq | Watch the video version of this podcast | Kris Jenkins' Twitter | Streaming Audio Playlist | Join the Confluent Community | Learn more with Kafka tutorials, resources, and guides at Confluent Developer | Live demo: Intro to Event-Driven Microservices with Confluent | Use PODCAST100 to get an additional $100 of free Confluent Cloud usage (details)

Streaming Audio: a Confluent podcast about Apache Kafka
How I Became a Developer Advocate

Streaming Audio: a Confluent podcast about Apache Kafka

Play Episode Listen Later Jun 9, 2022 29:48 Transcription Available


What is a developer advocate and how do you become one? In this episode, seasoned developer advocates Kris Jenkins (Senior Developer Advocate, Confluent) and Danica Fine (Senior Developer Advocate, Confluent) answer the question by diving into how they got into the world of developer relations, what they enjoy the most about their roles, and how you can become one.

Developer advocacy is at the heart of a developer community: helping developers and software engineers get the most out of a given technology by providing support in the form of blog posts, podcasts, conference talks, video tutorials, meetups, and other mediums.

Before stepping into the world of developer relations, both Danica and Kris were hands-on developers. Alongside his professional work, Kris also devoted personal time to supporting fellow developers, such as running local meetups, writing blogs, and organizing hackathons.

Danica found her calling after learning more about Apache Kafka® and successfully implementing a mission-critical application for a financial services company, transforming 2,000 lines of code into Kafka Streams. She enjoys building and sharing her knowledge with the community to make technology as accessible and as fun as possible.

Additionally, the duo previews their developer advocacy trip to Singapore and Australia in mid-June, where they will attend local conferences and host in-person meetups on Kafka and event streaming.

EPISODE LINKS
In-person meetup: Singapore | Sydney | Melbourne
Coding in Motion: Building a Data Streaming App with JavaScript
Practical Data Pipeline: Build a Plant Monitoring System with ksqlDB
How to Build a Strong Developer Community ft. Robin Moffatt and Ale Murray
Designing Event-Driven Systems
Watch the video version of this podcast
Danica Fine's Twitter
Kris Jenkins' Twitter
Streaming Audio Playlist
Join the Confluent Community
Learn more with Kafka tutorials, resources, and guides at Confluent Developer
Use PODCAST100 to get an additional $100 of free Confluent Cloud usage

Streaming Audio: a Confluent podcast about Apache Kafka
Flink vs Kafka Streams/ksqlDB: Comparing Stream Processing Tools

Streaming Audio: a Confluent podcast about Apache Kafka

Play Episode Listen Later May 26, 2022 55:55 Transcription Available


Stream processing can be hard or easy depending on the approach you take and the tools you choose. This sentiment is at the heart of the discussion with Matthias J. Sax (Apache Kafka® PMC member; Software Engineer, ksqlDB and Kafka Streams, Confluent) and Jeff Bean (Sr. Technical Marketing Manager, Confluent). With immense collective experience in Kafka, ksqlDB, Kafka Streams, and Apache Flink®, they delve into the types of stream processing operations and explain the different ways of solving for their respective issues.

The stream processing tools they consider are Flink, along with the options from the Kafka ecosystem: Java-based Kafka Streams and its SQL-wrapped variant, ksqlDB. Flink and ksqlDB tend to be used by divergent types of teams, since they differ in terms of both design and philosophy.

Why Use Apache Flink?
The teams using Flink are often highly specialized, with deep expertise and an absolute focus on stream processing. They tend to be responsible for unusually large, industry-outlying amounts of both state and scale, and they usually require complex aggregations. Flink can excel in these use cases, which potentially makes the difficulty of its learning curve and implementation worthwhile.

Why Use ksqlDB/Kafka Streams?
Conversely, teams employing ksqlDB/Kafka Streams require less expertise to get started and also less expertise and time to manage their solutions. Jeff notes that the skills of a developer may not even be needed in some cases; those of a data analyst may suffice. ksqlDB and Kafka Streams integrate directly with Kafka itself, as well as with external systems through Kafka Connect. In addition to being easy to adopt, ksqlDB is also deployed in production stream processing applications requiring large scale and state.

There are also other considerations beyond the strictly architectural. Local support availability, the administrative overhead of using a library versus a separate framework, and the availability of stream processing as a fully managed service all matter. Choosing a stream processing tool is a fraught decision partially because switching between them isn't trivial: the frameworks are different, the APIs are different, and the interfaces are different.

In addition to the high-level discussion, Jeff and Matthias also share lots of details you can use to understand the options, covering deployment models, transactions, batching, and parallelism, as well as a few interesting tangential topics along the way, such as the tyranny of state and the Turing completeness of SQL.

EPISODE LINKS
The Future of SQL: Databases Meet Stream Processing
Building Real-Time Event Streams in the Cloud, On Premises
Kafka Streams 101 course
ksqlDB 101 course
Watch the video version of this podcast
Kris Jenkins' Twitter
Streaming Audio Playlist
Join the Confluent Community
Learn more on Confluent Developer
Use PODCAST100 for an additional $100 of Confluent Cloud usage

Streaming Audio: a Confluent podcast about Apache Kafka
Apache Kafka 3.2 - New Features & Improvements

Streaming Audio: a Confluent podcast about Apache Kafka

Play Episode Listen Later May 17, 2022 6:54 Transcription Available


Apache Kafka® 3.2 delivers new KIPs in three different areas of the Kafka ecosystem: Kafka Core, Kafka Streams, and Kafka Connect. On behalf of the Kafka community, Danica Fine (Senior Developer Advocate, Confluent) shares release highlights.

More than half of the KIPs in the new release concern Kafka Core. KIP-704 addresses unclean leader elections by allowing for further communication between the controller and the brokers. KIP-764 takes on the problem of a large number of client connections in a short period of time during preferred leader election by adding the configuration `socket.listen.backlog.size`. KIP-784 adds an error code field to the response of the `DescribeLogDirs` API, and KIP-788 improves network traffic by allowing you to set the pool size of network threads individually per listener on Kafka brokers. Finally, in accordance with the imminent KRaft protocol, KIP-801 introduces a built-in `StandardAuthorizer` that doesn't depend on ZooKeeper.

There are five KIPs related to Kafka Streams in the Apache Kafka 3.2 release. KIP-708 brings rack-aware standby assignment by tag, which improves fault tolerance. Then there are three projects related to Interactive Queries v2: KIP-796 specifies an improved interface for Interactive Queries; KIP-805 allows state to be queried over a specific range; and KIP-806 adds two implementations of the Query interface, `WindowKeyQuery` and `WindowRangeQuery`. The final Kafka Streams project, KIP-791, enhances `StateStoreContext` with `recordMetadata`, which may be accessed from state stores.

Additionally, this Kafka release introduces Kafka Connect-related improvements, including KIP-769, which extends the `/connector-plugins` API, letting you list all available plugins, and not just connectors as before. KIP-779 lets `SourceTasks` handle producer exceptions according to `errors.tolerance`, rather than instantly killing the entire connector by default. Finally, KIP-808 lets you specify precisions with respect to TimestampConverter single message transforms.

Tune in to learn more about the Apache Kafka 3.2 release!

EPISODE LINKS
Apache Kafka 3.2 release notes
Read the blog to learn more
Download Apache Kafka 3.2.0
Watch the video version of this podcast
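The range-query idea behind KIP-805 (asking a state store for every key between two bounds) can be illustrated with a small sketch. This is plain Python, not the Kafka Streams Java API: the `SortedStore` class, its method names, and the inclusive bounds are all assumptions made for illustration.

```python
import bisect


class SortedStore:
    """Toy key-value store that keeps keys sorted so ranges can be scanned.

    Hypothetical stand-in for a state store; not a Kafka Streams class.
    """

    def __init__(self):
        self._keys = []      # sorted list of keys
        self._values = {}    # key -> latest value

    def put(self, key, value):
        # Insert the key in sorted position only the first time it is seen.
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def range(self, lower, upper):
        """Return (key, value) pairs with lower <= key <= upper, in key order."""
        start = bisect.bisect_left(self._keys, lower)
        end = bisect.bisect_right(self._keys, upper)
        return [(k, self._values[k]) for k in self._keys[start:end]]


store = SortedStore()
for key, value in [("a", 1), ("c", 3), ("b", 2), ("e", 5)]:
    store.put(key, value)

in_range = store.range("b", "d")
```

Keeping keys sorted is what makes a range scan cheap here: two binary searches find the slice, and only the matching entries are read.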

Quarkus Insights
Quarkus Insights #84: Quarkus Testing

Quarkus Insights

Play Episode Listen Later Mar 23, 2022 60:13


Quarkus has full support for Kafka Streams, with the ability to run in JVM mode, native mode, and dev mode.

Data on Kubernetes Community
Dok Talks #119 - Cloud-Native Data Pipelines // Hakan Lofcali

Data on Kubernetes Community

Play Episode Listen Later Mar 4, 2022 53:25


https://go.dok.community/slack
https://dok.community

ABSTRACT OF THE TALK
This talk walks you through our stack, architecture, and processes. We develop tools to deploy and run data-driven applications in a cloud-native environment. We will give a whirlwind tour of developing a Java Quarkus application, a CI/CD stack powered by GitHub Actions / ArgoCD, and building and deploying containerized Kafka Streams applications at runtime with the Jib container builder. Having established this common understanding, we will give a high-level overview of how we utilize modern Kubernetes and cloud tooling to manage multiple clusters in different organizations together with our customers.

BIO
DataCater commoditizes the data pipeline development lifecycle by applying software engineering and cloud-native practices to data work. Hakan is a Software / Data Engineer and CTO of DataCater. He built his knowledge of software, data engineering, and cloud-native computing in very different environments: from early-stage start-ups to the hyperscaler AWS, and from sports media companies to highly regulated FSI enterprises. The experiences gained, problems encountered, and solutions found led him to co-found DataCater to enhance tooling in the data space.

Streaming Audio: a Confluent podcast about Apache Kafka
Serverless Stream Processing with Apache Kafka ft. Bill Bejeck

Streaming Audio: a Confluent podcast about Apache Kafka

Play Episode Listen Later Mar 3, 2022 42:23 Transcription Available


What is serverless?

Having worked as a software engineer for over 15 years and as a regular contributor to Kafka Streams, Bill Bejeck (Integration Architect, Confluent) is an Apache Kafka® committer and author of “Kafka Streams in Action.” In today's episode, he explains what serverless is and the architectural concepts behind it.

To clarify, serverless doesn't mean you can run an application without a server. There are still servers in the architecture, but they are abstracted away from your application development. In other words, you can focus on building and running applications and services without any concerns over infrastructure management. Using a cloud provider such as Amazon Web Services (AWS) enables you to allocate machine resources on demand while the provider handles provisioning, maintenance, and scaling of the server infrastructure.

There are a few important terms to know when implementing serverless functions with event stream processors:
Functions as a service (FaaS)
Stateless stream processing
Stateful stream processing

Serverless commonly falls into the FaaS cloud computing service category; for example, AWS Lambda is the classic example of a FaaS offering. You have a greater degree of control to run a discrete chunk of code in response to certain events, and it lets you write code to solve a specific issue or use case.

Stateless processing is simpler than stateful processing, which involves keeping the state of an event stream and needs a key-value store. ksqlDB allows you to perform both stateless and stateful processing, but its strength lies in stateful processing to answer complex questions, while AWS Lambda is better suited for stateless processing tasks. Together, ksqlDB and AWS Lambda deliver serverless event streaming and analytics at scale.

EPISODE LINKS
Serverless Stream Processing with Apache Kafka, AWS Lambda, and ksqlDB
Stateful Serverless Architectures with ksqlDB and AWS Lambda
Serverless GitHub repository
Kafka Streams in Action
Watch the video version of this podcast
Join the Confluent Community
Learn more with Kafka tutorials, resources, and guides at Confluent Developer
Live demo: Intro to Event-Driven Microservices with Confluent
Use PODCAST100 to get an additional $100 of free Confluent Cloud usage
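The stateless versus stateful distinction from this episode can be sketched in a few lines. This is a plain-Python illustration rather than ksqlDB or AWS Lambda code; the event shape, the `stateless_filter` function, and the `RunningTotal` class are hypothetical names chosen for the example.

```python
from collections import defaultdict

# Hypothetical events: (user_id, amount). Not taken from the episode.
events = [("alice", 5), ("bob", 120), ("alice", 300), ("bob", 40)]


def stateless_filter(event, threshold=100):
    """Stateless: the decision depends only on the event in hand."""
    _user, amount = event
    return amount >= threshold


class RunningTotal:
    """Stateful: each result depends on earlier events for the same key."""

    def __init__(self):
        # A dict stands in for the key-value store that stateful processing needs.
        self.store = defaultdict(int)

    def process(self, event):
        user, amount = event
        self.store[user] += amount
        return user, self.store[user]


# Stateless pass: every event is judged in isolation.
large = [e for e in events if stateless_filter(e)]

# Stateful pass: the store accumulates a per-user total across events.
totals = RunningTotal()
updates = [totals.process(e) for e in events]
```

The filter could run anywhere, in any order, with no memory between calls, which is why it fits a function-as-a-service model well. The running total only works if the store survives between events, which is the part a stream processor manages for you.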

Engenharia de Dados [Cast]
Use Cases and Field Experience with Apache Kafka

Engenharia de Dados [Cast]

Play Episode Listen Later Feb 24, 2022 63:34


In this episode we bring in specialist João Bosco Seixas, Community Catalyst, to talk about Apache Kafka. In this conversation we cover the software development and data engineering perspectives and how each area can use Apache Kafka more effectively.
* Apache Kafka for Software Development
* Data Engineering with Apache Kafka and Real-Time Analytics
* The Technology's Learning Curve
* Use Cases
* Field Experience
* Tips for Beginners
The main goal is to show data engineers that Apache Kafka is used not only for analytics but across the whole company, especially for building microservices.

Luan Moreno = https://www.linkedin.com/in/luanmoreno/

Engenharia de Dados [Cast]
Is Apache Kafka a Relational Database?

Engenharia de Dados [Cast]

Play Episode Listen Later Feb 14, 2022 53:57


Apache Kafka is a data streaming platform capable of ingesting and processing millions of events per second. However, some important points usually go unexplained, such as:
* Whether Apache Kafka is a database
* Transactions in Apache Kafka
* Decoupled storage and processing
* Databases vs. Apache Kafka
This episode sets out to demystify Apache Kafka once and for all, answering your questions about its strengths and weaknesses and how you can get the most out of this open-source technology from the Apache Software Foundation.

Luan Moreno = https://www.linkedin.com/in/luanmoreno/

Streaming Audio: a Confluent podcast about Apache Kafka
Apache Kafka 3.1 - Overview of Latest Features, Updates, and KIPs

Streaming Audio: a Confluent podcast about Apache Kafka

Play Episode Listen Later Jan 24, 2022 4:43 Transcription Available


Apache Kafka® 3.1 is here with exciting new features and improvements! On behalf of the Kafka community, Danica Fine (Senior Developer Advocate, Confluent) shares release highlights that you won't want to miss, including foreign-key joins in Kafka Streams and improvements that will provide consistency for Kafka latency metrics.

KAFKA-13439 deprecates the eager rebalancing protocol. The cooperative protocol has been the default since Kafka 2.4, and it's advised to upgrade your applications to it, as the eager protocol will no longer be supported in future releases.

Previously, foreign-key joins in Kafka Streams only worked if both the primary and foreign-key tables used the default partitioner. This release adds support for foreign-key joins on tables with custom partitioners, which will be passed in as part of a new `TableJoined` object, comparable to the existing `Joined` and `StreamJoined` objects.

With the goal of making Kafka more intuitive, KIP-773 enhances naming consistency for three new client metrics with millis and nanos. For example, `io-waittime-total` is reintroduced as `io-wait-time-ns-total`. The previously introduced metrics without ns will be deprecated but available for backward compatibility.

KIP-768 continues the work started in KIP-255 to implement the necessary interfaces for a production-grade way to connect to an OpenID identity provider for authentication and token retrieval. This update provides an out-of-the-box implementation of an `AuthenticateCallbackHandler` that can be used to communicate with OAuth/OIDC.

Additionally, this Kafka release introduces two new broker metrics, `ActiveBrokerCount` and `FencedBrokerCount`. These two metrics expose the number of active brokers in the cluster known by the controller and the number of fenced brokers known by the controller.

Tune in to learn more about the Apache Kafka 3.1 release!

EPISODE LINKS
Apache Kafka 3.1 release notes
Read the blog to learn more
Download Apache Kafka 3.1
Watch the video version of this podcast

Streaming Audio: a Confluent podcast about Apache Kafka
Modernizing Banking Architectures with Apache Kafka ft. Fotios Filacouris

Streaming Audio: a Confluent podcast about Apache Kafka

Play Episode Listen Later Dec 28, 2021 34:59 Transcription Available


It's been said that financial services organizations have been early Apache Kafka® adopters due to the strong delivery guarantees and scalability that Kafka provides. With experience working and designing architectural solutions for financial services, Fotios Filacouris (Senior Solutions Engineer, Enterprise Solutions Engineering, Confluent) joins Tim to discuss how Kafka and Confluent help banks build modern architectures, highlighting key emerging use cases from the sector.

Previously, Kafka was often viewed as a simple pipe that connected databases together, which allows for easy and scalable data migration. As the Kafka ecosystem evolves with added components like ksqlDB, Kafka Streams, and Kafka Connect, the implementation of Kafka goes beyond being just a pipe: it's an intelligent pipe that enables real-time, actionable data insights.

Fotios shares a couple of use cases showcasing how Kafka solves the problems that many banks are facing today. One of his customers transformed retail banking by using Kafka as the architectural base for storing all data permanently and indefinitely. This approach enables data in motion and a better user experience for frontend users while scrolling through their transaction history by eliminating the need to download old statements that have been offloaded in the cloud or a data lake. Kafka also provides the best of both worlds with increased scalability and strong message delivery guarantees that are comparable to queuing middleware like IBM MQ and TIBCO.

In addition to use cases, Tim and Fotios talk about deploying Kafka for banks within the cloud and drill into the profession of being a solutions engineer.

EPISODE LINKS
Watch the video version of this podcast
Join the Confluent Community
Learn more with Kafka tutorials, resources, and guides at Confluent Developer
Live demo: Intro to Event-Driven Microservices with Confluent
Use PODCAST100 to get an additional $100 of free Confluent Cloud usage

Streaming Audio: a Confluent podcast about Apache Kafka
Running Hundreds of Stream Processing Applications with Apache Kafka at Wise

Streaming Audio: a Confluent podcast about Apache Kafka

Play Episode Listen Later Dec 21, 2021 31:08 Transcription Available


What's it like building a stream processing platform with around 300 stateful stream processing applications based on Kafka Streams? Levani Kokhreidze (Principal Engineer, Wise) shares his experience building such a platform that the business depends on for multi-currency movements across the globe. He explains how his team uses Kafka Streams for real-time money transfers at Wise, a fintech organization that facilitates international currency transfers for 11 million customers.

Getting to this point and expanding the stream processing platform is not, however, without its challenges. One of the major challenges at Wise is to aggregate, join, and process real-time event streams to transfer currency instantly. To accomplish this, Wise relies on Apache Kafka® as an event broker, as well as Kafka Streams, the accompanying Java stream processing library. Kafka Streams lets you build event-driven microservices for processing streams, which can then be deployed alongside the Kafka cluster of your choice. Wise also uses the Interactive Queries feature in Kafka Streams to query internal application state at runtime.

The Wise stream processing platform has gradually moved them away from a monolithic architecture to an event-driven microservices model with around 400 total microservices working together. This has given Wise the ability to independently shape and scale each service to better serve evolving business needs. Their stream processing platform includes a domain-specific language (DSL) that provides libraries and tooling, such as Docker images for building your own stream processing applications with governance. With this approach, Wise is able to store 50 TB of stateful data based on Kafka Streams running in Kubernetes.

Levani shares his own experiences in this journey and provides guidance that may help you follow in Wise's footsteps. He covers how to properly delegate ownership and responsibilities for sourcing events from existing data stores, and outlines some of the pitfalls they encountered along the way. To cap it all off, Levani also shares some important lessons in organization and technology, with some best practices to keep in mind.

EPISODE LINKS
Kafka Streams 101 course
Real-Time Stream Processing with Kafka Streams ft. Bill Bejeck
Watch the video version of this podcast
Join the Confluent Community
Learn more with Kafka tutorials, resources, and guides at Confluent Developer
Live demo: Intro to Event-Driven Microservices with Confluent
Use PODCAST100 to get an additional $100 of free Confluent Cloud usage

Streaming Audio: a Confluent podcast about Apache Kafka
ksqlDB Fundamentals: How Apache Kafka, SQL, and ksqlDB Work Together ft. Simon Aubury

Streaming Audio: a Confluent podcast about Apache Kafka

Play Episode Listen Later Dec 1, 2021 30:42 Transcription Available


What is ksqlDB and how does Simon Aubury (Principal Data Engineer, Thoughtworks) use it to track down the plane that wakes his cat Snowy in the morning? Experienced in building real-time applications with ksqlDB since its genesis, Simon provides an introduction to ksqlDB by sharing some of his projects and use cases.

ksqlDB is a database purpose-built for stream processing applications and lets you build real-time data streaming applications with SQL syntax. ksqlDB reduces the complexity of having to code with Java, making it easier to achieve outcomes through declarative programming, as opposed to procedural programming.

Before ksqlDB, you could use the producer and consumer APIs to get data in and out of Apache Kafka®; however, when it comes to data enrichment, such as joining, filtering, mapping, and aggregating data, you would have to use the Kafka Streams API, a robust and scalable programming interface influenced by the JVM ecosystem that requires Java programming knowledge. This presented scaling challenges for Simon, who was at a multinational insurance company that needed to stream loads of data from disparate systems with a small team to scale and enrich data for meaningful insights. Simon recalls discovering ksqlDB during a practice fire drill, and he considers it a memorable moment for turning a challenge into an opportunity.

Leveraging your familiarity with relational databases, ksqlDB abstracts away complex programming that is required for real-time operations both for stream processing and data integration, making it easy to read, write, and process streaming data in real time.

Simon is passionate about ksqlDB and Kafka Streams as well as getting other people inspired by the technology. He's been using ksqlDB for projects, such as taking a stream of information and enriching it with static data. One of Simon's first ksqlDB projects was using Raspberry Pi and a software-defined radio to process aircraft movements in real time to determine which plane wakes his cat Snowy up every morning. Simon highlights additional ksqlDB use cases, including e-commerce checkout interaction to identify where people are dropping out of a sales funnel.

EPISODE LINKS
ksqlDB 101 course
A Guide to ksqlDB Fundamentals and Stream Processing Concepts
ksqlDB 101 Training with Live Walkthrough Exercise
KSQL-ops! Running ksqlDB in the Wild
Articles from Simon Aubury
Watch the video version of this podcast
Join the Confluent Community
Learn more with Kafka tutorials, resources, and guides at Confluent Developer
Live demo: Intro to Event-Driven Microservices with Confluent
Use PODCAST100 to get $100 of free Confluent Cloud usage
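Enriching a stream of information with static data, as described in this episode, amounts to looking each event up in a reference table. The sketch below shows that idea in plain Python rather than ksqlDB SQL; the field names, the `aircraft_table` contents, and the `enrich` function are invented for illustration and are not from Simon's project.

```python
# Static reference data keyed by a hypothetical aircraft identifier.
aircraft_table = {
    "7C6B2D": {"operator": "Example Air", "model": "A320"},
}

# A hypothetical stream of position events arriving from a radio receiver.
position_events = [
    {"icao": "7C6B2D", "altitude_ft": 3200},
    {"icao": "UNKNOWN", "altitude_ft": 9000},
]


def enrich(event, table):
    """Attach static details to an event, or return None when no match exists.

    Dropping unmatched events mirrors inner-join behavior; keeping them with
    empty details would mirror a left join instead.
    """
    details = table.get(event["icao"])
    if details is None:
        return None
    return {**event, **details}


enriched = [
    result
    for result in (enrich(event, aircraft_table) for event in position_events)
    if result is not None
]
```

In a streaming system the table side is usually kept up to date from its own topic, so the lookup always sees the latest reference data; here it is a fixed dictionary to keep the example short.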

Streaming Audio: a Confluent podcast about Apache Kafka
Explaining Stream Processing and Apache Kafka ft. Eugene Meidinger

Streaming Audio: a Confluent podcast about Apache Kafka

Play Episode Listen Later Nov 23, 2021 29:28 Transcription Available


Many of us find ourselves in the position of equipping others to use Apache Kafka® after we've gained an understanding of what Kafka is used for. But how do you communicate and teach others event streaming concepts effectively? As a Pluralsight instructor and business intelligence consultant, Eugene Meidinger shares tips for creating consumable training materials for conveying event streaming concepts to developers and IT administrators who are trying to get on board with Kafka and stream processing.

Eugene's background as a database administrator (DBA) and immense knowledge of event streaming architecture and data processing shows as he reveals his learnings from years of working with Microsoft Power BI, Azure Event Hubs, data processing, and event streaming with ksqlDB and Kafka Streams.

Eugene mentions the importance of understanding your audience, their pain points, and their questions, such as: Why was Kafka invented? Why does ksqlDB matter? It also helps to use metaphors where appropriate. For example, when explaining what a processing topology is in Kafka Streams, Eugene uses the analogy of a highway where people getting on a bus are the blocking operations: after the grace period, the bus leaves even without passengers, meaning that after the window closes, the processor continues even without events. He also likes to inject a sense of humor in his training and keeps empathy in mind.

Here is the structure that Eugene uses when building courses:
The first module is usually fundamentals, which lays out the groundwork and the objectives of the course
It's critical to repeat and summarize core concepts or major points; for example, a key capability of Kafka is the ability to decouple data in both network space and in time
Provide variety and different modalities that allow people to consume content through multiple avenues, such as screencasts, slides, and demos, wherever it makes sense

EPISODE LINKS
Building ETL Pipelines from Streaming Data with Kafka and ksqlDB
Don't Make Me Think | Steve Krug
Design for How People Learn | Julie Dirksen
Watch the video version of this podcast
Join the Confluent Community
Learn more with Kafka tutorials, resources, and guides at Confluent Developer
Live demo: Intro to Event-Driven Microservices with Confluent
Use PODCAST100 to get $100 of free Confluent Cloud usage

Streaming Audio: a Confluent podcast about Apache Kafka
Confluent Platform 7.0: New Features + Updates

Streaming Audio: a Confluent podcast about Apache Kafka

Play Episode Listen Later Nov 9, 2021 12:16 Transcription Available


Confluent Platform 7.0 has launched and includes Apache Kafka® 3.0, plus new features introduced by KIP-630: Kafka Raft Snapshot, KIP-745: Connect API to restart connector and task, and KIP-695: Further improve Kafka Streams timestamp synchronization. Reporting from Dubai, Tim Berglund (Senior Director, Developer Advocacy, Confluent) provides a summary of new features, updates, and improvements to the 7.0 release, including the ability to create a real-time bridge from on-premises environments to the cloud with Cluster Linking.

Cluster Linking allows you to create a single cluster link between multiple environments from Confluent Platform to Confluent Cloud, which is available on public clouds like AWS, Google Cloud, and Microsoft Azure, removing the need for numerous point-to-point connections. Consumers reading from a topic in one environment can read from the same topic in a different environment without risks of reprocessing or missing critical messages. This provides operators the flexibility to make changes to topic replication smoothly and byte for byte without data loss. Additionally, Cluster Linking eliminates any need to deploy MirrorMaker2 for replication management while ensuring offsets are preserved.

Furthermore, the release of Confluent for Kubernetes 2.2 allows you to build your own private cloud in Kafka. It completes the declarative API by adding cloud-native management of connectors, schemas, and cluster links to reduce the operational burden and manual processes so that you can instead focus on high-level declarations. Confluent for Kubernetes 2.2 also enhances elastic scaling through the Shrink API.

As part of the ongoing effort to remove ZooKeeper from Apache Kafka, Confluent Platform 7.0 introduces KRaft in preview to make it easier to monitor and scale Kafka clusters to millions of partitions. There are also several ksqlDB enhancements in this release, including foreign-key table joins and the support of new data types, DATE and TIME, to account for time values that aren't TIMESTAMP. This results in consistent data ingestion from the source without having to convert data types.

EPISODE LINKS
Download Confluent Platform 7.0
Check out the release notes
Read the Confluent Platform 7.0 blog post
Watch the video version of this podcast
Join the Confluent Community
Learn more with Kafka tutorials, resources, and guides at Confluent Developer
Live demo: Intro to Event-Driven Microservices with Confluent
Use PODCAST100 to get $100 of free Confluent Cloud usage

Streaming Audio: a Confluent podcast about Apache Kafka
Real-Time Stream Processing with Kafka Streams ft. Bill Bejeck

Streaming Audio: a Confluent podcast about Apache Kafka

Play Episode Listen Later Nov 4, 2021 35:32 Transcription Available


Kafka Streams is a native streaming library for Apache Kafka® that consumes messages from Kafka to perform operations like filtering a topic's messages and producing output back into Kafka. After working as a developer in stream processing, Bill Bejeck (Apache Kafka Committer and Integration Architect, Confluent) has found his calling in sharing knowledge and authoring his book, “Kafka Streams in Action.” As a Kafka Streams expert, Bill is also the author of the Kafka Streams 101 course on Confluent Developer, where he delves into what Kafka Streams is, how to use it, and how it works.

Kafka Streams provides an abstraction over Kafka consumers and producers by minimizing administrative details like the need to code and manage frameworks required when using plain Kafka consumers and producers to process streams. Kafka Streams is declarative: you can state what you want to do, rather than how to do it. Kafka Streams leverages the KafkaConsumer protocol internally; it inherits its dynamic scaling properties and the consumer group protocol to dynamically redistribute the workload. When Kafka Streams applications are deployed separately but have the same application.id, they are logically still one application.

Kafka Streams has two processing APIs. The declarative API, or domain-specific language (DSL), is a high-level language that enables you to build anything needed with a processor topology, whereas the Processor API lets you specify a processor topology node by node, providing the ultimate flexibility. To underline the differences between the two APIs, Bill says it's almost like using an object-relational mapping (ORM) framework versus SQL.

The Kafka Streams 101 course is designed to get you started with Kafka Streams and to help you learn the fundamentals of:
How streams and tables work
How stateless and stateful operations work
How to handle time windows and out-of-order data
How to deploy Kafka Streams

EPISODE LINKS
Kafka Streams 101 course
A Guide to Kafka Streams and Its Uses
Kafka Streams 101 meetup
Watch the video version of this podcast
Join the Confluent Community
Learn more with Kafka tutorials, resources, and guides at Confluent Developer
Live demo: Intro to Event-Driven Microservices with Confluent
Use podcon19 to get 40% off "Kafka Streams in Action"
Use podcon19 to get 40% off "Event Streaming with Kafka Streams and ksqlDB"
Use PODCAST100 to get $100 of free Confluent Cloud usage
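The contrast between a declarative DSL and wiring a topology node by node can be shown with a toy model. This sketch is plain Python and deliberately does not reproduce the Kafka Streams Java API: the `Stream` class, its `filter` and `map_values` methods, and the two node functions are hypothetical names that only mimic the shape of the idea.

```python
class Stream:
    """Toy pipeline: an ordered list of nodes, each mapping a record to 0..n records."""

    def __init__(self, nodes=None):
        self.nodes = nodes or []

    def filter(self, predicate):
        # Declarative step: say what to keep; the node is built for you.
        return Stream(self.nodes + [lambda rec: [rec] if predicate(rec) else []])

    def map_values(self, fn):
        # Declarative step: say how values change; keys pass through untouched.
        return Stream(self.nodes + [lambda rec: [(rec[0], fn(rec[1]))]])

    def run(self, records):
        # Push every record through each node in order.
        for node in self.nodes:
            records = [out for rec in records for out in node(rec)]
        return records


# Declarative style: describe the result and let the builder create the nodes.
dsl_pipeline = Stream().filter(lambda rec: rec[1] > 0).map_values(lambda v: v * 2)


# Node-by-node style: write each processing node yourself and wire them in order.
def drop_non_positive(rec):
    return [rec] if rec[1] > 0 else []


def double_value(rec):
    return [(rec[0], rec[1] * 2)]


manual_pipeline = Stream([drop_non_positive, double_value])

records = [("a", 1), ("b", -3), ("c", 4)]
dsl_result = dsl_pipeline.run(records)
manual_result = manual_pipeline.run(records)
```

Both pipelines produce the same output. The difference is who builds the nodes: the builder methods hide them in the first style, while the second style exposes each node, which is where the extra flexibility comes from.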

Streaming Audio: a Confluent podcast about Apache Kafka
Getting Started with Spring for Apache Kafka ft. Viktor Gamov

Oct 19, 2021 32:44


What's the distinction between the Spring Framework and Spring Boot? If you are building a car, the Spring Framework is the engine, while Spring Boot gives you the vehicle that you ride in. With experience teaching and answering questions on how to use Spring and Apache Kafka® together, Viktor Gamov (Principal Developer Advocate, Kong) designed a free course on Confluent Developer and previews it in this episode. He also explains why the opinionated Spring Framework would make a good Marvel hero. Spring is an ever-evolving framework that embraces modern, cloud-native technologies with cross-language options, such as Kotlin integration. Unlike its predecessors, the Spring Framework supports a modern version of Java and the requirements of the Twelve-Factor App manifesto, so that you can move an application between environments without changing the code. With that engine in place, Spring Boot introduces a microservices architecture. Spring Boot includes integrations with databases and messaging systems, reducing development time and increasing overall productivity. Spring for Apache Kafka applies best practices of the Spring community to the Kafka ecosystem, including features that abstract away infrastructure code so that you can focus on the programming logic that matters for your application. Spring for Apache Kafka provides a wrapper around the producer and consumer to ease Kafka configuration with APIs, including KafkaTemplate, MessageListenerContainer, @KafkaListener, and TopicBuilder.

The Spring Framework and Apache Kafka course will equip you with the knowledge you need to build event-driven microservices using Spring and Kafka on Confluent Cloud. Tim and Viktor also discuss Spring Cloud Stream as well as Spring Boot integration with Kafka Streams and more.
EPISODE LINKS
Spring Framework and Apache Kafka course
Spring for Apache Kafka 101
Bootiful Stream Processing with Spring and Kafka
LiveStreams with Viktor Gamov
Use kafkaa35 to get 30% off "Kafka in Action"
Watch the video version of this podcast
Join the Confluent Community
Learn more with Kafka tutorials, resources, and guides at Confluent Developer
Live demo: Intro to Event-Driven Microservices with Confluent
Use PODCAST100 to get an additional $100 of free Confluent Cloud usage (details)

Streaming Audio: a Confluent podcast about Apache Kafka
Apache Kafka 3.0 - Improving KRaft and an Overview of New Features

Sep 21, 2021 15:17


Apache Kafka® 3.0 is out! To spotlight major enhancements in this release, Tim Berglund (Apache Kafka Developer Advocate) provides a summary of what's new in the Kafka 3.0 release from Krakow, Poland, including API changes and improvements to the early-access Kafka Raft (KRaft). KRaft is a built-in Kafka consensus mechanism that's replacing Apache ZooKeeper going forward. It is recommended to try out new KRaft features in a development environment, as KRaft is not advised for production yet. One of the major features in Kafka 3.0 is the ability of KRaft controllers and brokers to efficiently store, load, and replicate snapshots for the metadata topic partition. The Kafka controller is now responsible for generating Kafka producer IDs in both ZooKeeper and KRaft modes, easing the transition from ZooKeeper to KRaft on the Kafka 3.X version line. This update also moves us closer to the ZooKeeper-to-KRaft bridge release. Additionally, this release includes metadata improvements, exactly-once semantics, and KRaft reassignments. To enable a stronger record delivery guarantee, Kafka producers now turn on idempotence by default, together with acknowledgment of delivery by all replicas. This release also comprises enhancements to Kafka Connect task restarts, Kafka Streams timestamp-based synchronization, and more flexible configuration options for MirrorMaker2 (MM2). The first version of MirrorMaker has been deprecated, and MirrorMaker2 will be the focus for future developments. Besides that, this release drops support for the older message formats, V0 and V1, and begins the removal of Java 8 and Scala 2.12 across all components in Apache Kafka.
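The producer defaults mentioned above correspond to two producer settings, enable.idempotence and acks. The sketch below spells them out explicitly, which can be useful when a client library older than 3.0 is in play. It uses only java.util.Properties so that it runs without Kafka on the classpath; the class name, method name, and broker address are hypothetical, and the resulting Properties object is what you would hand to a producer constructor in a real application.

```java
import java.util.Properties;

public class ProducerDeliverySettings {

    // Writes out the stronger delivery guarantee explicitly instead of
    // relying on client defaults. The keys are real producer settings;
    // the broker address is a hypothetical placeholder.
    static Properties strongDeliveryConfig() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // hypothetical broker
        props.setProperty("enable.idempotence", "true"); // retries do not create duplicates
        props.setProperty("acks", "all");                // wait for all in-sync replicas
        return props;
    }

    public static void main(String[] args) {
        Properties props = strongDeliveryConfig();
        System.out.println("enable.idempotence=" + props.getProperty("enable.idempotence"));
        System.out.println("acks=" + props.getProperty("acks"));
    }
}
```

Setting the values explicitly keeps behavior the same regardless of which client version supplies the defaults.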
The universal Java 8 and Scala 2.12 deprecation is anticipated to complete in the future Apache Kafka 4.0 release.

Apache Kafka 3.0 is a major release and step forward for the Apache Kafka project!

EPISODE LINKS
Apache Kafka 3.0 release notes
Read the blog to learn more
Download Apache Kafka 3.0
Watch the video version of this podcast
Join the Confluent Community Slack

Streaming Audio: a Confluent podcast about Apache Kafka
Using Apache Kafka and ksqlDB for Data Replication at Bolt

Aug 26, 2021 29:15


What does a ride-hailing app that offers micromobility and food delivery services have to do with data in motion? In this episode, Ruslan Gibaiev (Data Architect, Bolt) shares Bolt's road to adopting Apache Kafka® and ksqlDB for stream processing to replicate data from transactional databases to analytical warehouses. Rome wasn't built overnight, nor was the adoption of Kafka and ksqlDB at Bolt. Initially, Bolt noticed the need to standardize its systems and replace an unreliable query-based change data capture (CDC) process. As an experienced Kafka developer, Ruslan believed that Kafka was the right foundation for making change data capture a company-wide event streaming solution. Persuading the team at Bolt to buy in was hard at first, but Ruslan made it possible. Eventually, the team replaced query-based CDC with log-based CDC from Debezium, built on top of Kafka. Shortly after the implementation, developers at Bolt began to see precise, correct, and real-time data. As Bolt continued to grow, the team saw the need to implement a data lake or a data warehouse for OLTP system data replication and stream processing. After carefully considering several solutions and frameworks, including ksqlDB, Apache Flink®, Apache Spark™, and Kafka Streams, they found that ksqlDB best matched their business requirements. Bolt adopted ksqlDB because it is native to the Kafka ecosystem and fits their use case. They found ksqlDB to be a particularly good fit for replicating all their data to a data warehouse for a number of reasons, including:
Easy to deploy and manage
Linearly scalable
Natively integrates with Confluent Schema Registry

Tune in to find out more about Bolt's adoption journey with Kafka and ksqlDB.
EPISODE LINKS
Inside ksqlDB Course
ksqlDB 101 Course
How Bolt Has Adopted Change Data Capture with Confluent Platform
Analysing Changes with Debezium and Kafka Streams
No More Silos: How to Integrate Your Databases with Apache Kafka and CDC
Change Data Capture with Debezium ft. Gunnar Morling
Announcing ksqlDB 0.17.0
Real-Time Data Replication with ksqlDB
Watch the video version of this podcast
Join the Confluent Community
Learn more with Kafka tutorials, resources, and guides at Confluent Developer
Live demo: Kafka streaming in 10 minutes on Confluent Cloud
Use PODCAST100 to get an additional $100 of free Confluent Cloud usage (details)