Podcasts about apache pulsar

  • 35 PODCASTS
  • 53 EPISODES
  • 54m AVG DURATION
  • 1 MONTHLY NEW EPISODE
  • Mar 20, 2025 LATEST

POPULARITY (chart by year, 2017 to 2024)


Latest podcast episodes about apache pulsar

DMRadio Podcast
Just In Time: How Streaming Architectures Enable Business

Mar 20, 2025 (54:12)


In an era where real-time decision-making is a serious competitive advantage, streaming-first architectures are revolutionizing how organizations process and act on data. Unlike traditional batch-oriented systems, streaming platforms like Apache Kafka, Redpanda, Apache Flink, and Apache Pulsar enable continuous data ingestion, transformation, and analysis at scale. These technologies empower businesses to break free from the limitations of periodic data updates, unlocking the ability to react instantly to events, personalize customer experiences in real-time, and drive automation with high-velocity insights. By decoupling producers and consumers through scalable, event-driven pipelines, streaming-first architectures not only enhance system resilience but also pave the way for a more agile, intelligence-driven enterprise. Register for this episode of DM Radio to learn how today's innovators are leveraging this rapidly evolving discipline.

The Ravit Show
StreamNative's Partnership with Ververica

Jan 1, 2025 (13:22)


Thrilled to be covering Flink Forward Berlin 2024 hosted by Ververica | Original creators of Apache Flink®, where I had a fascinating conversation with Sijie Guo, CEO of StreamNative! As the first partner in Ververica's new “Powered by Ververica” program, StreamNative is making waves by integrating Ververica's VERA engine into its platform. Sijie shared exciting insights into how this collaboration brings real-time data processing to life across industries like financial services, automotive, and IoT. We discussed use cases where the blend of Apache Pulsar and Flink simplifies data processing while enhancing operational efficiency and scalability for their customers. With a shared commitment to open-source innovation, StreamNative and Ververica are pushing boundaries to bring advanced data capabilities to businesses worldwide. Thank you, Sijie, for a visionary conversation! #data #ververica #flinkforward #apacheflink #datastreaming #theravitshow

Open at Intel
AI, Community, and the Future of Generative Applications

Nov 27, 2024 (20:53)


In this engaging conversation at the All Things Open conference, Tim Spann, Principal Developer Advocate at Zilliz, discusses the importance of community collaboration in advancing AI technologies. He emphasizes the need for diverse perspectives in solving complex problems and highlights his work with the Milvus open source vector database. Tim also explains the evolving landscape of retrieval augmented generation (RAG) and its applications and shares insights into the future of AI development. The conversation concludes on a lighter note with Tim describing his creative use of Milvus in a fun Halloween project to catalog and identify ghosts.

  • 00:00 Introduction
  • 00:41 Meet Tim Spann: Principal Developer Advocate
  • 01:35 The Importance of Community in AI
  • 02:56 Advanced RAG and Multimodal Models
  • 06:17 The Future of Agentic RAG
  • 09:04 Challenges and Excitement in AI Development
  • 13:35 Building AI the Right Way
  • 17:50 Fun with AI: Capturing Ghosts
  • 19:24 Conclusion and Final Thoughts

Guest: Tim Spann is a Principal Developer Advocate for Zilliz and Milvus. He works with Apache NiFi, Apache Kafka, Apache Pulsar, Apache Flink, Flink SQL, Apache Pinot, Trino, Apache Iceberg, DeltaLake, Apache Spark, Big Data, IoT, Cloud, AI/DL, machine learning, and deep learning. Tim has over ten years of experience with the IoT, big data, distributed computing, messaging, streaming technologies, and Java programming. Previously, he was a Principal Developer Advocate at Cloudera, Developer Advocate at StreamNative, Principal DataFlow Field Engineer at Cloudera, a Senior Solutions Engineer at Hortonworks, a Senior Solutions Architect at AirisData, a Senior Field Engineer at Pivotal and a Team Leader at HPE. He blogs for DZone, where he is the Big Data Zone leader, and runs a popular meetup in Princeton & NYC on Big Data, Cloud, IoT, deep learning, streaming, NiFi, the blockchain, and Spark. Tim is a frequent speaker at conferences such as ApacheCon, DeveloperWeek, Pulsar Summit and many more. He holds a BS and MS in computer science.

Spring Office Hours
S3E32 - Streaming Data with Chris Bono

Oct 1, 2024 (56:34)


In this episode of Spring Office Hours, hosts Dan Vega and DeShaun Carter interview Chris Bono, a Spring team member who works on Spring Cloud Dataflow and Spring Pulsar. They discuss streaming data, comparing Apache Kafka and Apache Pulsar, and explore the features and use cases of Spring Cloud Stream applications. Chris provides insights into the architecture of streaming applications, explains key concepts, and highlights the benefits of using Spring's abstraction layers for working with messaging systems.

Show Notes:
  • Introduction to Chris Bono and his work on Spring Cloud Dataflow and Spring Pulsar
  • Comparison between Apache Kafka and Apache Pulsar
  • Overview of Spring Cloud Stream and its binders
  • Explanation of source, processor, and sink concepts in streaming applications
  • Introduction to Spring Cloud Stream Applications project
  • Discussion on Change Data Capture (CDC) and its importance in streaming
  • Exploration of various sources, processors, and sinks available in Spring Cloud Stream Applications
  • Mention of KEDA (Kubernetes Event-driven Autoscaling) and its potential use with Spring Cloud applications
  • Upcoming features in Spring Pulsar 1.2 release
  • Importance of community feedback and using GitHub discussions for feature requests and issue reporting

The podcast provides a comprehensive overview of streaming data concepts and how Spring projects can be used to build efficient streaming applications.
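The source, processor, and sink roles mentioned in the show notes can be pictured with a small, framework-free sketch. This is not Spring Cloud Stream code and uses none of its APIs; the function names and the sample events are hypothetical, chosen only to show how the three stages chain together.

```python
from typing import Callable, Iterable, Iterator, List


def source() -> Iterator[str]:
    """Source: the stage that emits raw events (hypothetical sample data)."""
    yield from ["order:42", "order:7", "heartbeat", "order:19"]


def processor(events: Iterable[str]) -> Iterator[int]:
    """Processor: filters and transforms events while they are in flight."""
    for event in events:
        kind, _, value = event.partition(":")
        if kind == "order" and value.isdigit():
            yield int(value)


def make_sink(store: List[int]) -> Callable[[Iterable[int]], None]:
    """Sink: the terminal stage that writes results somewhere (here, a list)."""
    def sink(items: Iterable[int]) -> None:
        for item in items:
            store.append(item)
    return sink


received: List[int] = []
make_sink(received)(processor(source()))
print(received)
```

In a real deployment each stage would typically be a separate application connected through a broker topic rather than a direct function call; the chaining here only mirrors the data flow.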

airhacks.fm podcast with adam bien
High-Performance Java, Or How JVector Happened

May 18, 2024 (61:16)


An airhacks.fm conversation with Jonathan Ellis (@spyced) about: Jonathan's first computer experiences with IBM PC 8086 and Thinkpad laptop with Red Hat Linux, becoming a key contributor to Apache Cassandra and founding datastax, starting DataStax to provide commercial support for Cassandra, early experiences with Java, C++, and python, discussion about the evolution of Java and its ecosystem, the importance of vector databases for semantic search and retrieval augmented generation, the development of JVector for high-performance vector search in Java, the potential of integrating JVector with LangChain for Java / langchain4j and quarkus for serverless deployment, the advantages of Java's productivity and performance for building concurrent data structures, the shift from locally installed software to cloud-based services, the challenges of being a manager and the benefits of taking a sabbatical to focus on creative pursuits, the importance of separating storage and compute in cloud databases, Cassandra's write-optimized architecture and improvements in read performance, DataStax's investment in Apache Pulsar for stream processing, the llama2java project for high-performance language models in Java Jonathan Ellis on twitter: @spyced

The GeekNarrator
Messaging and Streaming with Apache Pulsar - with Matteo Merli

Jan 27, 2024 (63:46)


In this video I talk about Apache Pulsar with Matteo Merli, CTO at StreamNative. This episode provides good insight into how Apache Pulsar works and, more importantly, how it differs from the most popular pub/sub and streaming platform, Apache Kafka. We cover questions such as: What makes 1 million topics possible? Why is rebalancing not required? How does the decoupled storage and compute architecture work? How does Pulsar use the concept of subscriptions to avoid retaining data unnecessarily? And much more.

Chapters:
  • 00:00 Introduction and Guest Introduction
  • 00:08 Understanding Apache Pulsar and its Origin
  • 01:22 The Problem Apache Pulsar was Designed to Solve
  • 02:35 The Evolution of Apache Pulsar
  • 05:15 Understanding Basic Concepts of Apache Pulsar
  • 09:27 Deep Dive into Apache Pulsar's Architecture
  • 21:16 Understanding the Flow of Data in Apache Pulsar
  • 28:54 Understanding Subscriptions in Apache Pulsar
  • 31:57 Understanding End-to-End Latency and Subscription Creation
  • 32:32 Broker's Role and Handling Metadata
  • 33:05 Memory Management and Consumer Handling
  • 34:07 Message Processing and Flow Control
  • 34:32 Message Storage and Retrieval
  • 36:00 Comparing Pulsar with Kafka
  • 43:52 Understanding Multi-Tenancy in Pulsar
  • 49:17 Exploring Tiered Storage and Future Developments

Important links:
  • StreamNative: https://streamnative.io/
  • Apache Pulsar: https://pulsar.apache.org/
  • Matteo Merli: https://twitter.com/merlimat

Follow me on LinkedIn and Twitter: https://www.linkedin.com/in/kaivalyaapte/ and https://twitter.com/thegeeknarrator

Database internals series: https://youtu.be/yV_Zp0Mi3xs

Popular playlists:
  • Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
  • Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
  • Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
  • Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN

Stay curious! Keep learning!

Engenharia de Dados [Cast]
Challenge in Bulding an Open-Source Community with Aaron Williams

Nov 21, 2023 (78:58)


In today's episode, Luan Moreno and Mateus Oliveira interview Aaron Williams, currently Community Manager / Developer Advocate at Ampare. Aaron is passionate about bringing new technologies to today's developers and to the next generation through hacking and hands-on training.

In this podcast, you will learn about:
  • Challenges in building an open-source community
  • A management view of data communities
  • Companies that are investing in promoting Pulsar

In this conversation we also talk about the following topics:
  • Technologies such as Kubernetes
  • Apache Pulsar

Learn more about data communities and about the main technologies on the market.

Aaron Williams = https://www.linkedin.com/in/aaron-don-williams/
Luan Moreno = https://www.linkedin.com/in/luanmoreno/

Enterprise Java Newscast
Stackd 66: Streams, Messages, Events, and a Java User Group

Aug 11, 2023 (121:43)


Ian, Kito, and Josh are joined by Java Champion, Streaming Developer Advocate at DataStax, and President of Chicago-JUG, Mary Grygleski. They discuss news about Capacitor, Angular, PrimeNG Designer for Tailwind, JetBrains Compose Multiplatform for iOS, JDK 21, AI developer tools, Jakarta EE 10, and more. Kito announces the work he is doing on the Jakarta EE Tutorial, and then they delve into Mary's background and event streaming with Apache Pulsar, plus tools like Apache Pinot, Apache Flink, RisingWave, ByteWax and Apache Cassandra.

We Thank DataDog for sponsoring this podcast! https://www.pubhouse.net/datadog

Front End
 - Announcing Capacitor 5.0 - Ionic Blog (https://ionic.io/blog/announcing-capacitor-5)
 - Angular v16 is here! (https://blog.angular.io/angular-v16-is-here-4d7a28ec680d)
 - Compose Multiplatform (https://blog.jetbrains.com/kotlin/2023/05/compose-multiplatform-for-ios-is-in-alpha/)
 - PrimeNG Designer - Tailwind (Q3 2023) (https://www.primefaces.org/primeng-theme-designer-with-tailwind/)

Server Side Java
 - Kito is working with Bauke Scholtz and Arjan Tijms to refresh the Jakarta EE Tutorial
    - Eclipse Documentation for Jakarta EE (https://projects.eclipse.org/projects/ee4j.jakartaee-documentation)
    - Antora (https://antora.org)
    - Asciidoc (http://asciidoc.org)
 - Jakarta EE 10; MicroProfile 6; Java SE 20; Open Liberty (https://openliberty.io/blog/2023/04/04/23.0.0.3.html)
 - Jakarta EE Starter (https://start.jakarta.ee/)

AI/ML
 - Phind - AI search engine for developers (https://www.phind.com/)
 - 92% of devs using AI coding assistants (https://www.zdnet.com/article/github-developer-survey-finds-92-of-programmers-using-ai-tools/)

Java Platform
 - JDK 21, the next LTS release, due out in September (https://www.infoworld.com/article/3689880/jdk-21-the-new-features-in-java-21.html)

IDE and Tools
 - Grazie Professional - IntelliJ IDEs Plugin | Marketplace (https://plugins.jetbrains.com/plugin/16136-grazie-professional)

Chat w/Mary
 - Twitter: @mgrygles (https://twitter.com/mgrygles)
 - Discord server: https://discord.gg/RMU4Juw
 - LinkedIn: https://www.linkedin.com/in/mary-grygleski/
 - Apache Pulsar (https://pulsar.apache.org/)
 - Apache Pinot (https://pinot.apache.org/)
 - Apache Flink (https://flink.apache.org/)
 - RisingWave (https://www.risingwave.dev/)
 - ByteWax (https://bytewax.io/)
 - Apache Cassandra (https://cassandra.apache.org/)
 - Apache Kafka (https://kafka.apache.org/)

Picks
 - Quantum Energy Squares (Kito) (https://quantumsquares.com/)
 - JBOSS EAP on Azure (Josh) (https://learn.microsoft.com/en-us/azure/developer/java/ee/jboss-on-azure)
 - Interstellar (Mary) (https://www.imdb.com/title/tt0816692/)
 - Black Mirror Season 6 Episode 1 - Joan Is Awful - Netflix (Ian) (https://www.rottentomatoes.com/tv/black_mirror/s06/e01)

Other Pubhouse Network podcasts
 - Breaking into Open Source (https://www.pubhouse.net/breaking-into-open-source)
 - OffHeap (https://www.javaoffheap.com/)
 - Java Pubhouse (https://www.javapubhouse.com/)

Events
 - Lone Star Software Symposium - July 14 - 15, Austin, TX, USA (https://nofluffjuststuff.com/austin)
 - ÜberConf - July 18 - 21, Denver, CO, USA (https://uberconf.com/)
 - Nebraska.code() - July 19-20, Lincoln, NE, USA (https://nebraskacode.amegala.com/)

The Hacking Open Source Business Podcast
The Story of Apache Pulsar and StreamNative through Sijie Guo's Perspective - Ep. 28

May 11, 2023 (52:45)


Join HOSS Matt Yonkovit and Scarf CEO Avi Press on the Hacking Open Source Business Podcast as they talk with Sijie Guo, CEO of StreamNative, who shares his journey with Apache Pulsar and the future of open-source tech. Explore the power of Pulsar, community building, documentation importance, and the balance between open source and commercial value. They also discuss the challenges and learning curves of growing an open-source business. Prioritizing customer demands, delivering value to the majority, and staying aligned with strategic direction are explored as essential considerations.

Chapters:
  • 00:00 - Introduction
  • 01:22 - Rapid Fire Questions
  • 09:03 - Getting to know Sijie Guo
  • 13:08 - The Impact of Yahoo, Twitter, and StreamNative on Open Source
  • 21:15 - Day Zero of Apache Pulsar Community
  • 24:18 - Open Source Commercialization
  • 32:51 - Splitting between SaaS and Enterprise Offerings
  • 36:07 - When would you say no to customer asks?
  • 38:24 - Experience growing the company and the community
  • 44:55 - The Importance of OSS Community and How To Measure It
  • 50:55 - The Value of User-Focused Improvements in Open Source Business
  • 52:59 - Advice to Startup Founders

Sijie's LinkedIn Profile: https://www.linkedin.com/in/sijieg/
Check out our other interviews, clips, and videos: https://l.hosbp.com/Youtube
Don't forget to visit the open-source business community at: https://opensourcebusiness.community/
Visit our primary sponsor, Scarf, for tools to help analyze your #opensource growth and adoption: https://about.scarf.sh/

Subscribe to the podcast on your favorite app:
  • Spotify: https://l.hosbp.com/Spotify
  • Apple: https://l.hosbp.com/Apple
  • Google: https://l.hosbp.com/Google
  • Buzzsprout: https://l.hosbp.com/Buzzsprout

airhacks.fm podcast with adam bien
Star Trek, Star Wars, Transactions, SQL, NoSQL and almost Streaming

Feb 5, 2023 (62:10)


An airhacks.fm conversation with Mary Grygleski (@mgrygles) about: 808X as first computer, Hong Kong was high tech, enjoying space missions, Star Trek and Star Wars, the intriguing registration terminal, writing code in Pascal, 3 GL programming languages and SQL, set theory and SQL, the seven layers of OSI, OSI model, IBM MVS, AS 400 is the opposite of micro services, developers get bored too early, learning X-Windows, working with early Oracle databases, using dBASE, clipper and FoxPro, transarc, stratos tx, Transarc the transaction file system, Transaction Processing: Concepts and Techniques, working on SMTP / MTA, CouchDB and Lotus Notes, the Sun Ultra 30 workstation, starting at Sybase, EA server Sybase / Jaguar, using emacs for Java development, then netbeans, Java EE and the hierarchical class loaders, working on EJB 3 specs, mobile apps with Apache Cordova, reactive systems at IBM, using akka, Eclipse Vert.x and MicroProfile, working for DataStax and Pulsar, DataStax provides support for Apache Cassandra and Apache Pulsar, separating the compute from the storage, astra the managed cloud platform Mary Grygleski on twitter: @mgrygles

Engenharia de Dados [Cast]
Enabling User-Facing Analytics using Apache Pinot with Kishore Gopalakrishna

Dec 29, 2022 (52:11)


In this episode we interview Kishore Gopalakrishna, co-founder and CEO of StarTree. Luan Moreno and Mateus Oliveira chat with the co-creator of the powerful tool called Apache Pinot.

Pinot is an OLAP datastore built to answer analytical queries with response times in the milliseconds, so it can be considered a database for real-time queries. It can ingest from batch data sources (Hadoop HDFS, Amazon S3, Azure ADLS, Google Cloud Storage) as well as streaming data sources (Apache Kafka, Apache Pulsar, Amazon Kinesis).

Pinot was designed to run OLAP queries in real time, with low latency over large volumes of events, to deliver the concept of user-facing analytics. It was created and developed by engineers at LinkedIn and Uber and designed to scale up and out without limits.

Apache Pinot, Kishore Gopalakrishna, StarTree

Luan Moreno = https://www.linkedin.com/in/luanmoreno/

Data on Kubernetes Community
Architecting Your First Event Driven Serverless Streaming Applications on K8 // Timothy Spann (DoK Day North America 2022)

Nov 2, 2022 (13:29)


From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) ABSTRACT Once you have built a topic in Apache Pulsar, you will quickly see the need to build event-driven applications. This can require a lot of decisions on what framework to use, where to run it, how to deploy it, and how to manage these applications on Kubernetes cloud natively. I will walk you through step-by-step in building Pulsar Functions which is the easy way to design, test, develop, integrate, deploy, monitor, and manage serverless streaming applications in Java and Python. Together we will build a full application as an Apache Pulsar function and enjoy the power of running it in the cloud for IoT events and add any routing, transformation, or machine learning that we need to accomplish our business requirements. Through FunctionMesh we run on Kubernetes natively. In this talk, you will deploy ML functions to transform real-time data on Kubernetes.
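As a rough illustration of the kind of function the abstract describes, here is a minimal sketch in the style of Pulsar's native Python function interface, where a module exposes a `process(input)` function and its return value is what the runtime would publish to the configured output topic. The payload fields (`device_id`, `temp_f`), the threshold, and the transformation are hypothetical and are not taken from the talk; topic names and deployment settings are deliberately left out.

```python
import json

ALERT_THRESHOLD_C = 80.0  # hypothetical alert threshold, chosen for this example


def process(input):
    """Handle one IoT message payload and return the transformed payload."""
    reading = json.loads(input)
    # Transformation step: convert Fahrenheit to Celsius.
    temp_c = round((reading["temp_f"] - 32) * 5.0 / 9.0, 2)
    # Enrichment step: flag hot readings so a downstream consumer can route them.
    return json.dumps({
        "device_id": reading["device_id"],
        "temp_c": temp_c,
        "alert": temp_c > ALERT_THRESHOLD_C,
    })


if __name__ == "__main__":
    # Local smoke test; no broker is needed to exercise the logic.
    print(process('{"device_id": "sensor-1", "temp_f": 212}'))
```

Because the function is plain Python with no broker dependency, it can be unit tested locally before it is packaged and deployed.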

Cyber and Technology with Mike
04 October 2022 Cyber and Tech News

Oct 4, 2022 (10:26)


In today's podcast we cover four crucial cyber and technology topics, including:
1. Apache Pulsar flaw discovered; patch available
2. Ferrari confirms ownership of leaked data; begins investigation
3. New Zealand healthcare organization dealing with apparent ransomware
4. FBI arrests former NSA employee for trying to sell stolen data

I'd love feedback; feel free to send your comments to cyberandtechwithmike@gmail.com

A Bootiful Podcast
Big data legend, former Pivot, and friend to the Spring community, Tim Spann

Sep 15, 2022 (88:16)


Hi, Spring fans! In this installment, Josh Long (@starbuxman) talks to big data legend, former Pivot, and friend to the Spring community, Tim Spann (@PaaSDev), about big data, StreamNative, and Apache Pulsar. Get your notebooks ready for this one, class!

Code Story
S6 Bonus: Addison Higham, StreamNative

Aug 4, 2022 (26:02)


Addison Higham is a father of 2 kids, and has been married for 12 years. His family is of the utmost importance to him, and he has been happy to be able to balance his career and personal life along his journey. As he puts it, he has been a nerd since day 1, building, fixing and playing with computers from a young age. Along with his CS interests, his family was very entrepreneurial growing up. During schooling, he joined some startups to learn the ropes of building solutions. Outside of tech, he loves to ski and be outdoors.

The makers of Apache Pulsar, an open source project, decided to build a cloud-native event streaming platform. Early on in the venture, Addison joined his team as a Chief Architect, in order to enable enterprises to easily access data as real-time event streams.

This is the creation story of StreamNative.

Sponsors: Immediate, Orbit, Postmark, Stytch, Verb Data, Webapp.io

Links:
  • Website: https://streamnative.io/
  • LinkedIn: https://www.linkedin.com/in/addisonj/

Support this podcast at: https://redcircle.com/code-story/donations
Advertising Inquiries: https://redcircle.com/brands
Privacy & Opt-Out: https://redcircle.com/privacy

Chinchilla Squeaks
Apache Pulsar with Patrick McFadin

Aug 3, 2022 (28:29)


I have Patrick McFadin back on the show to discuss using Apache Pulsar for distributed cloud-native streaming and how it fits into DataStax's plans and business goals.

Engenharia de Dados [Cast]
Apache Pulsar: A Plataforma de Streaming Distribuída mais Completa do Mercado com Samuel Matioli

Jul 29, 2022 (59:57)


Apache Pulsar is the Fortune 500's new favorite streaming platform, and Samuel Matioli, Data Architect at DataStax, brings all his field experience to talk about this topic on our podcast.

In this episode we talk about:
  • The data market today
  • Batch vs. streaming solutions
  • The killer features of Apache Pulsar
  • Astra Streaming, a self-managing streaming service
  • Apache Kafka vs. Apache Pulsar
  • Kubernetes as a deployment type for real-time data solutions

Samuel Matioli = https://www.linkedin.com/in/samuelmatioli/
Astra Streaming = https://www.datastax.com/products/astra-streaming

On YouTube we have a Data Engineering channel covering the most important topics in this area, with live streams every Wednesday.
https://www.youtube.com/channel/UCnErAicaumKqIo4sanLo7vQ

Want to stay up to date on this area with weekly posts and updates? Then head to LinkedIn so you don't miss any news.
https://www.linkedin.com/in/luanmoreno/

Available on Spotify and Apple Podcasts:
https://open.spotify.com/show/5n9mOmAcjra9KbhKYpOMqY
https://podcasts.apple.com/br/podcast/engenharia-de-dados-cast/

Luan Moreno = https://www.linkedin.com/in/luanmoreno/

Data on Kubernetes Community
Serverless Event Streaming Applications as Functions on K8 (DoK Day EU 2022) // Timothy Spann

May 27, 2022 (8:43)


https://go.dok.community/slack https://dok.community/ From the DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE) We will walk through how to build serverless event streaming applications as functions running in a function mesh on Kubernetes with cloud native messaging via Apache Pulsar. In this talk, you will deploy ML functions to transform real-time data on Kubernetes. Tim Spann is a Developer Advocate @ StreamNative where he works with Apache Pulsar, Apache Flink, Apache NiFi, Apache MXNet, TensorFlow, Apache Spark, big data, the IoT, machine learning, and deep learning. Tim has over a decade of experience with the IoT, big data, distributed computing, streaming technologies, and Java programming. Previously, he was a Principal Field Engineer at Cloudera, a Senior Solutions Architect at AirisData and a senior field engineer at Pivotal. He blogs for DZone, where he is the Big Data Zone leader, and runs a popular meetup in Princeton on big data, the IoT, deep learning, streaming, NiFi, the blockchain, and Spark. Tim is a frequent speaker at conferences such as IoT Fusion, Strata, ApacheCon, Data Works Summit Berlin, DataWorks Summit Sydney, and Oracle Code NYC. He holds a BS and MS in computer science. https://www.datainmotion.dev/p/about-me.html https://dzone.com/users/297029/bunkertor.html https://conferences.oreilly.com/strata/strata-ny-2018/public/schedule/speaker/185963

Orchestrate all the Things podcast: Connecting the Dots with George Anadiotis
Evaluating the streaming data ecosystem: StreamNative releases benchmark comparing Apache Pulsar to Apache Kafka. Featuring Chief Architect & Head of Cloud Engineering Addison Higham

Apr 7, 2022 (30:23)


Processing data in real-time is on the rise. The streaming analytics market (which depending on definitions, may just be one segment of the streaming data market) is projected to grow from $15.4 billion in 2021 to $50.1 billion in 2026, at a Compound Annual Growth Rate (CAGR) of 26.5% during the forecast period as per Markets and Markets. A multitude of streaming data alternatives, each with its own focus and approach, has emerged in the last few years. One of those alternatives is Apache Pulsar. In 2021, Pulsar ranked as a Top 5 Apache Software Foundation project and surpassed Apache Kafka in monthly active contributors. In another episode in the data streaming saga, StreamNative just released a report comparing Apache Pulsar to Apache Kafka in terms of performance benchmarks. We caught up with StreamNative Chief Architect & Head of Cloud Engineering Addison Higham to discuss the report's findings, as well as the bigger picture in data streaming. Article published on VentureBeat
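The growth figures quoted above can be sanity-checked with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below plugs in the numbers from the blurb; the function name is only illustrative, and the five-year span assumes the 2021 and 2026 values are year-end to year-end.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.25 means 25% per year)."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Market size in billions of USD, as quoted in the episode description.
rate = cagr(start_value=15.4, end_value=50.1, years=2026 - 2021)
print(f"Implied CAGR: {rate:.1%}")
```

The result comes out near 26.6%, which lines up with the cited 26.5% once rounding of the two endpoint figures is taken into account.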

Into the Hopper
Apache Pulsar with Jowanza Joseph

Feb 6, 2022


I talk with Jowanza Joseph about his new book Mastering Apache Pulsar: Cloud Native Event Streaming at Scale

DMRadio Podcast
You *Can* Step in the Same Stream Twice

Dec 18, 2021 (53:58)


Streaming technology has upended the data business, largely thanks to Apache Kafka, but also because of other technologies such as Apache Flink and Apache Pulsar. In this episode, Host @eric_kavanagh interviews Paul Brebner, Instaclustr; along with Tim Spann and David Kjerrumgaard of StreamNative. 

airhacks.fm podcast with adam bien
Debezium, Server, Engine, UI and the Outbox

Nov 28, 2021 (67:11)


An airhacks.fm conversation with Gunnar Morling (@gunnarmorling) about: debezium as analytics enablement, enriching events with quarkus, ksqlDB and PrestoDB and trino, cloud migrations with Debezium, embedded Debezium Engine, debezium server vs. Kafka Connect, Debezium Server with sink connectors, Apache Pulsar, Redis Streams are supporting Debezium Server, Debezium Server follows the microservice architecture, pluggable offset stores, JDBC offset store is Apache Iceberg connector, DB2, MySQL, PostgreSQL, MongoDB change streams, Cassandra, Vitess, Oracle, Microsoft SQL Server scylladb is cassandra compatible and provides external debezium connector, debezium ui is written in React, incremental snapshots, netflix cdc system, DBLog: A Watermark Based Change-Data-Capture Framework, multi-threaded snapshots, internal data leakage and the Outbox pattern, debezium listens to the outbox pattern, OpenTracing integration and the outbox pattern, sending messages directly to transaction log with PostgreSQL, Quarkus outbox pattern extension, the transaction boundary topic Gunnar Morling on twitter: @gunnarmorling and debezium.io
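The outbox pattern mentioned in this episode boils down to writing the business row and the event row in the same database transaction, so a change-data-capture tool can later publish the event from the outbox table. The sketch below shows only that transactional write. It uses SQLite purely so it runs anywhere; the table layout, column names, and `place_order` helper are hypothetical, and no Debezium API is involved.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE outbox ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "aggregate_id INTEGER NOT NULL, "
    "event_type TEXT NOT NULL, "
    "payload TEXT NOT NULL)"
)


def place_order(order_id, item, details):
    """Insert the order and its outbox event atomically."""
    # The connection context manager commits on success and rolls back on error.
    with conn:
        conn.execute("INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item))
        payload = json.dumps({"order_id": order_id, "item": item, **details})
        conn.execute(
            "INSERT INTO outbox (aggregate_id, event_type, payload) VALUES (?, ?, ?)",
            (order_id, "OrderPlaced", payload),
        )


place_order(1, "keyboard", {"qty": 2})

try:
    # object() is not JSON-serializable, so this fails after the order insert
    # and the whole transaction, including that order row, is rolled back.
    place_order(2, "mouse", {"qty": object()})
except TypeError:
    pass

orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
events = conn.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
print(orders, events)
```

The point of the design is that an order can never exist without its event, or the other way round, which is what avoids the dual-write problem of updating a database and a message broker separately.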

Data on Kubernetes Community
DoK Talks #79- Running Apache Pulsar in Kubernetes // Chris Bartholomew

Aug 26, 2021 (66:45)


https://go.dok.community/slack https://dok.community/ ABSTRACT OF THE TALK When I founded Kesque in 2019, my goal was to use Kubernetes as the base platform for all our software. Because Kesque was a cloud-based SaaS powered by Apache Pulsar, that meant getting Apache Pulsar, a high-performance streaming solution and Kafka alternative, up and running in Kubernetes. In this talk, I will give an overview of Apache Pulsar and describe how we got Pulsar up and running in Kubernetes. We will cover some of the features of Pulsar that make it "cloud-native" and easy to run in Kubernetes as well as some of the challenges we faced and how we solved them. Kesque was acquired by DataStax, which is a strong supporter of Kubernetes. I will also cover how we continue to use Kubernetes as the foundation for the work we are doing at DataStax around Apache Pulsar. BIO Chris Bartholomew is a Streaming Engineering Leader at DataStax. He has been working with high-performance pub–sub systems for over a decade. He has tested, supported, and operated messaging systems that are deployed in banking, capital markets, and transportation industries. He was the founder and CEO of Kesque, a cloud-based managed service built around Apache Pulsar that was acquired by DataStax.

Software Daily
Pulsar Rerevisted with Enrico Olivelli

Jul 26, 2021


In the previous episode, Pulsar Revisited, we discussed how the company DataStax has added to their product stack Astra Streaming, their cloud-native messaging and event streaming service that's built on top of Apache Pulsar. We discussed Apache Pulsar and the added features DataStax offers like injecting machine learning into your data streams and viewing real-time

Podcast – Software Engineering Daily
Pulsar Rerevisted with Enrico Olivelli

Jul 26, 2021 (56:17)


In the previous episode, Pulsar Revisited, we discussed how the company DataStax has added to their product stack Astra Streaming, their cloud-native messaging and event streaming service that's built on top of Apache Pulsar. We discussed Apache Pulsar and the added features DataStax offers like injecting machine learning into your data streams and viewing real-time

Software Engineering Daily
Pulsar Rerevisted with Enrico Olivelli

Software Engineering Daily

Play Episode Listen Later Jul 26, 2021 48:21


In the previous episode, Pulsar Revisited, we discussed how the company DataStax has added to their product stack Astra Streaming, their cloud-native messaging and event streaming service that's built on top of Apache Pulsar. We discussed Apache Pulsar and the added features DataStax offers like injecting machine learning into your data streams and viewing real-time

Data – Software Engineering Daily
Pulsar Rerevisted with Enrico Olivelli

Data – Software Engineering Daily

Play Episode Listen Later Jul 26, 2021 48:21


In the previous episode, Pulsar Revisited, we discussed how the company DataStax has added to their product stack Astra Streaming, their cloud-native messaging and event streaming service that's built on top of Apache Pulsar. We discussed Apache Pulsar and the added features DataStax offers like injecting machine learning into your data streams and viewing real-time

Podcast – Software Engineering Daily
Pulsar Revisited with Jonathan Ellis

Podcast – Software Engineering Daily

Play Episode Listen Later Jul 23, 2021 53:25


Apache Pulsar is a cloud-native, distributed messaging and streaming platform originally created at Yahoo! and now a top-level Apache Software Foundation project (pulsar.apache.org). Pulsar is used by many large companies like Yahoo!, Verizon Media, Tencent, and Splunk. The company DataStax, an open, multi-cloud stack for modern data apps, has added to their product stack Astra

Software Daily
Pulsar Revisited with Jonathan Ellis

Software Daily

Play Episode Listen Later Jul 23, 2021


Apache Pulsar is a cloud-native, distributed messaging and streaming platform originally created at Yahoo! and now a top-level Apache Software Foundation project (pulsar.apache.org). Pulsar is used by many large companies like Yahoo!, Verizon Media, Tencent, and Splunk. The company DataStax, an open, multi-cloud stack for modern data apps, has added to their product stack Astra

Cloud Engineering – Software Engineering Daily
Pulsar Revisited with Jonathan Ellis

Cloud Engineering – Software Engineering Daily

Play Episode Listen Later Jul 23, 2021 53:25


Apache Pulsar is a cloud-native, distributed messaging and streaming platform originally created at Yahoo! and now a top-level Apache Software Foundation project (pulsar.apache.org). Pulsar is used by many large companies like Yahoo!, Verizon Media, Tencent, and Splunk. The company DataStax, an open, multi-cloud stack for modern data apps, has added to their product stack Astra

Software Engineering Daily
Pulsar Revisited with Jonathan Ellis

Software Engineering Daily

Play Episode Listen Later Jul 23, 2021 45:33


Apache Pulsar is a cloud-native, distributed messaging and streaming platform originally created at Yahoo! and now a top-level Apache Software Foundation project (pulsar.apache.org). Pulsar is used by many large companies like Yahoo!, Verizon Media, Tencent, and Splunk. The company DataStax, an open, multi-cloud stack for modern data apps, has added to their product stack Astra

SplunkTalk Podcast
S2E27 - Checking The Pulse On Pulsar

SplunkTalk Podcast

Play Episode Listen Later Jun 29, 2021 63:00


In this episode we are super excited to go deep on Apache Pulsar and the challenges of stream processing at scale! We are joined by Jerry Peng, Principal Software Engineer at Splunk, a very active committer and PMC member to the Apache Heron, Storm, and Pulsar projects, and co-creator of several important Pulsar components. Check it out! Video version: https://youtu.be/QgokB1FlnTg

The Data Stack Show
41: Doing MLOps on Top of Apache Pulsar and Trino with Joshua Odmark of Pandio

The Data Stack Show

Play Episode Listen Later Jun 23, 2021 50:21


Highlights from this week's episode:
Joshua started his first company at age 15 and then sold two more startups after that (2:15)
Embracing the open source movement and not reinventing the wheel if you don't have to (12:15)
Pulsar seemed built to address Kafka's weaknesses (17:23)
Using Redis as a coordinator for federated learning and taking advantage of its portability (23:05)
The pillars of Pandio and some practical use cases (31:24)
Feature stores and model versioning (38:23)
Seeing Pulsar as the future because of the ability to run tens of millions of topics (41:04)

The Data Stack Show is a weekly podcast powered by RudderStack. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

Les Cast Codeurs Podcast
LCC 256 - jTerrasse

Les Cast Codeurs Podcast

Play Episode Listen Later May 24, 2021 80:36


Antonio and Emmanuel discuss, among other things, JavaDoc, Quarkus, crypto mining in CI, Bootstrap 5, Grafana, and the "cloud de confiance" (trusted cloud), not forgetting the crowdcasts on Cypress and on hack.commit.push on May 29. Recorded May 21, 2021. Episode download: LesCastCodeurs-Episode–256.mp3

News

Languages

A JEP to improve JavaDoc. You will be able to reference, for example, pieces of code in another file, such as a test, and include them in the JavaDoc of a method or a class. This keeps the code snippets in the docs truly up to date, since it is always the real, running code that gets inserted into the JavaDoc. There may also be syntax highlighting, and a way to define regions that should be highlighted so they stand out. It will be possible to modify parts of a code snippet, for example to hide a test string whose value does not matter when explaining that piece of code. Hyperlinks can be added to parts of the code, for example to point to the JavaDoc of a method used in the snippet. Let's hope they reuse as much as possible of the Asciidoctor syntax, which has already solved this problem: Asciidoclet.

Discussion of the reasons behind the need for Loom. The article stays at an introductory level; you have to dig deeper to find the real benefits. IO and synchronization block a thread, which limits scalability. Asynchronous code is harder to understand. Virtual threads are a good fit for tasks that spend a lot of time waiting. Blocking IO APIs park the virtual thread while they are waiting. A poller (event loop) watches the IO and its state and unparks the corresponding virtual threads. The mechanism is similar to non-blocking frameworks such as Vert.x, but with a blocking API.

Libraries

Quarkus 2.0 alpha 1, 2 and 3 are out. It is Quarkus 2 because of Vert.x 4 and MicroProfile 4; there are no "big" breaking changes, but there are a few, mostly for extensions. Continuous Testing: the console shows the tests that fail, and when you change the code, only the impacted tests are rerun (flow analysis). It also starts dependencies in a dedicated container (e.g. a database for tests that use Hibernate). The container for continuous tests is different from the one used by the running quarkus:dev (no pollution). JDK 11 minimum.

Micronaut 2.5 is out: support for @java 16 and @graalvm 21.1 on Micronaut Launch, huge improvements to Micronaut Data from @DenisStepanov, improved @OracleCloud integration and many other small improvements.

Infrastructure

Crypto miners are killing free CI. Cryptocurrency miners abuse CI services that offer free build capacity. One of the new tricks is to use tools like Puppeteer to automate a web browser and mine cryptocurrency in the browser running headless on the CI machine. Back in the heyday of OpenShift Online and OpenShift.io, we learned a lot about detecting Bitcoin miners :) We had the problem on Codeship (CloudBees' SaaS CI). They spent a huge amount of time kicking miners out and protecting the builds. I saw that GitHub had the problem too.

The 19 easy steps to write a Dockerfile: check the order of your commands, limit the scope of COPY, align the package-installation RUN instructions, use official images or even create your own base images, use specific tags for more reproducible images, clear the package manager cache, build in an image that offers a consistent environment, fetch your dependencies in a separate step, do multi-stage builds… Or use Cloud Native Buildpacks! (which Joe works on). The article explains the complexity and the impossible trade-offs, and therefore why buildpacks are indispensable.

Comparison of Apache Kafka and Apache Pulsar. Pulsar has stateless brokers, and behind them are bookkeepers (which store the data). This gives more flexibility to scale the number of brokers up or down, but with more "moving parts" and an extra network hop. The architecture is more flexible, notably for Kubernetes. Tiered storage and geo-replication are easier in Pulsar (by default). Tiered storage means, for example, storing data in S3 once it is old. Pulsar is multi-tenant by design. Pulsar accepts large messages and can chunk them as needed. Kafka has the bigger community, but some components are not open source (Confluent).

Cloud

Red Hat OpenShift Streams for Apache Kafka: a managed Kafka cloud service. This is what Emmanuel has been working on for the past 9 months. Try Red Hat's Managed Kafka. Red Hat OpenShift Streams for Apache Kafka: a managed Kafka cloud service https://twitter.com/emmanuelbernard/status/1387686420903563264 Great integration with Quarkus, and it uses Quarkus internally.

Web

Bootstrap 5 is out: New offcanvas component. New accordion. New and updated forms. RTL is here. Overhauled utilities. New snippet examples. Improved customizing. Browser support: Dropped Microsoft Edge Legacy, Dropped Internet Explorer 10 and 11, Dropped Firefox < 60, Dropped Safari < 10, Dropped iOS Safari < 10, Dropped Chrome < 60, Dropped Android < 6. JavaScript: No more jQuery! The migration guide is here.

Crowdcast on Cypress by Emmanuel Demey.

The end of Google AMP, or at least its appeal should decline. AMP had one major advantage: being first in the search engine results. Media outlets switched to AMP just for that, because traffic from the dominant search engine is essential. But AMP posed many technical and ethical problems. Content was hosted and cached on edge proxies, in practice Google's. So audience measurement was more complicated, and ads were also biased toward Google's ad network. Upcoming Google Search scoring will be neutral, which is likely to push AMP pages down. AMP pages had to reinvent many web concepts.

Tooling

JFrog keeps Bintray JCenter in read-only mode, including the Maven Central mirror. This smells like a planned move to get people to migrate and bring traffic down, then show up as the good Samaritan afterwards. That said, they were good Samaritans with the free version. At least old builds will not break.

Docker Desktop: skipping an update becomes a paid option. Starting with Docker Desktop 3.3, you can avoid installing a new version with a Pro or Team subscription, if I understood correctly. You can ask to be reminded later, but you effectively cannot permanently decline a given version without paying, otherwise they nag you (I do not know how often) to upgrade. Basically, if you do not pay, you have to be on latest. They will not support old versions for free customers, which makes sense.

Spock 2.0: Spock is rebased on JUnit Platform. Support for parallel execution of test specs and test features. Groovy 3 support. Improvements to tests with tabular data.

Security

Denial-of-service bug in SnakeYAML. It is due to the ability to create references that contain a reference to an element higher up. Boom, infinite recursion. At one point our YAML support in Groovy used SnakeYAML, I believe, but I just checked and we have moved to Jackson.

Law, society and organization

Grafana, Loki and Tempo move from ASL 2 to AGPL. The AGPL is the GPL, except that offering a service counts as distribution. Inspired by MongoLab, CockroachDB, etc. At least it remains open source, even if there are differing interpretations of linking and therefore risks. Does a service that uses Grafana have to be entirely AGPL?

When a patent troll attacks, Cloudflare counterattacks. Cloudflare is being attacked by a patent troll and counterattacks, for the second time, by paying for prior-art research on that entity's entire patent portfolio, in order to make it lose a good part of its value. « You do not negotiate with terrorists or children »

Basecamp loses 30% of its employees after banning societal conversations. The list of "funny" employee names resurfaced, with racist overtones. Employees evidently had a debate about it. DHH and Fried wrote a memo banning political and societal conversations because they did the company no good (resentment, etc.). But employees see it as a way of refusing to face important topics and the impact of tech products on society. They offered a golden parachute to anyone who wanted to leave. And boom, 30% said yes.

French national cloud strategy: cloud as the default hosting solution for the state's digital services; protected from non-EU regulation; against the CLOUD Act and other laws; the "Cloud de confiance" (trusted cloud) label, it's like "porc salut"; an update of ANSSI's SecNumCloud; a hybrid solution, with a French or European company using the software building blocks of American groups; servers in France; operated by European companies; owned by Europeans; "the Americans are the most advanced"; Google and Microsoft have signed the licensing agreement; so not Amazon. "Cloud de Confiance en qui ?" by Laurent Doguin.

Tools of the episode

MuseGroup buys Audacity. Well, the brand. It promises designers for the interface and contributors, and to stay open source. We'll see.

Conferences

Devoxx France moves to September 29 and 30 and October 1. Agathe's crowdcast on hack.commit.push is on Saturday, May 29. Sign up!

Contact us

Support Les Cast Codeurs on Patreon https://www.patreon.com/LesCastCodeurs Do a crowdcast or a crowdquestion. Contact us via Twitter https://twitter.com/lescastcodeurs, on the Google group https://groups.google.com/group/lescastcodeurs or on the website https://lescastcodeurs.com/
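
The Kafka and Pulsar comparison in these show notes rests on one architectural point: Pulsar brokers are stateless, and the data lives in bookkeepers behind them. As a rough illustration only, here is a toy Python model of that split. The `Bookie` and `Broker` classes and their methods are hypothetical names invented for this sketch; they are not the Apache Pulsar or BookKeeper API, and a real deployment replicates entries across several storage nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Bookie:
    """Hypothetical stand-in for a storage node: it owns the message data."""
    entries: dict[str, list[bytes]] = field(default_factory=dict)

    def append(self, topic: str, payload: bytes) -> int:
        log = self.entries.setdefault(topic, [])
        log.append(payload)
        return len(log) - 1  # position of the entry in the topic's log

    def read(self, topic: str, position: int) -> bytes:
        return self.entries[topic][position]


@dataclass
class Broker:
    """Hypothetical stateless front end: routes requests, stores nothing."""
    name: str
    storage: Bookie

    def publish(self, topic: str, payload: bytes) -> int:
        # This call models the extra network hop mentioned in the notes.
        return self.storage.append(topic, payload)

    def consume(self, topic: str, position: int) -> bytes:
        return self.storage.read(topic, position)


storage = Bookie()
brokers = [Broker("broker-1", storage), Broker("broker-2", storage)]

pos = brokers[0].publish("orders", b"order #1")

# Scale down: drop broker-1. No data moves, because it never held any.
brokers.pop(0)
# Scale up: a fresh broker can serve the topic immediately.
brokers.append(Broker("broker-3", storage))

recovered = brokers[-1].consume("orders", pos)
print(recovered)
```

In this sketch, the broker that accepted the message is gone by the time it is read back, yet a newly added broker can still serve it. That is the flexibility the notes attribute to the design, paid for with one more component and one more hop.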

BadGeek
Les Cast Codeurs n°256 du 24/05/21 - LCC 256 - jTerrasse (81min)

BadGeek

Play Episode Listen Later May 24, 2021 81:50


Antonio and Emmanuel discuss, among other things, JavaDoc, Quarkus, crypto mining in CI, Bootstrap 5, Grafana, and the "cloud de confiance" (trusted cloud), not forgetting the crowdcasts on Cypress and on hack.commit.push on May 29.

Recorded May 21, 2021

Episode download: [LesCastCodeurs-Episode-256.mp3](https://traffic.libsyn.com/lescastcodeurs/LesCastCodeurs-Episode-256.mp3)

## News

### Languages

[A JEP to improve JavaDoc](https://openjdk.java.net/jeps/413)

* You will be able to reference, for example, pieces of code in another file, such as a test, and include them in the JavaDoc of a method or a class. This keeps the code snippets in the docs truly up to date, since it is always the real, running code that gets inserted into the JavaDoc.
* There may also be syntax highlighting
* and a way to define regions that should be highlighted so they stand out
* It will be possible to modify parts of a code snippet, for example to hide a test string whose value does not matter when explaining that piece of code
* Hyperlinks can be added to parts of the code, for example to point to the JavaDoc of a method used in the snippet
* Let's hope they reuse as much as possible of the Asciidoctor syntax, which has already solved this problem: [Asciidoclet](https://github.com/asciidoctor/asciidoclet)

[Discussion of the reasons behind the need for Loom](https://inside.java/2021/05/10/networking-io-with-virtual-threads/)

* The article stays at an introductory level; you have to dig deeper to find the real benefits
* IO and synchronization block a thread, which limits scalability. Asynchronous code is harder to understand.
* Virtual threads are a good fit for tasks that spend a lot of time waiting
* Blocking IO APIs park the virtual thread while they are waiting
* A poller (event loop) watches the IO and its state and unparks the corresponding virtual threads
* The mechanism is similar to non-blocking frameworks such as Vert.x, but with a blocking API

### Libraries

[Quarkus 2.0 alpha 1, 2 and 3 are out](https://quarkus.io/blog/quarkus-2-0-0-alpha1-released/)

* It is Quarkus 2 because of Vert.x 4 and MicroProfile 4; there are no "big" breaking changes, but there are a few, mostly for extensions
* Continuous Testing: the console shows the tests that fail, and when you change the code, only the impacted tests are rerun (flow analysis).
* It also starts dependencies in a dedicated container (e.g. a database for tests that use Hibernate). The container for continuous tests is different from the one used by the running quarkus:dev (no pollution).
* JDK 11 minimum

[Micronaut 2.5 is out](https://docs.micronaut.io/latest/guide/#whatsNew)

* support for @java 16 and @graalvm 21.1 on Micronaut Launch,
* huge improvements to Micronaut Data from @DenisStepanov,
* improved @OracleCloud integration
* and many other small improvements

### Infrastructure

[Crypto miners are killing free CI](https://layerci.com/blog/crypto-miners-are-killing-free-ci/)

* Cryptocurrency miners abuse CI services that offer free build capacity
* One of the new tricks is to use tools like Puppeteer to automate a web browser and mine cryptocurrency in the browser running headless on the CI machine
* Back in the heyday of OpenShift Online and OpenShift.io, we learned a lot about detecting Bitcoin miners :)
* We had the problem on Codeship (CloudBees' SaaS CI). They spent a huge amount of time kicking miners out and protecting the builds. I saw that GitHub had the problem too

[The 19 easy steps to write a Dockerfile](https://jkutner.github.io/2021/04/26/write-good-dockerfile.html)

* Check the order of your commands, limit the scope of COPY, align the package-installation RUN instructions, use official images or even create your own base images, use specific tags for more reproducible images, clear the package manager cache, build in an image that offers a consistent environment, fetch your dependencies in a separate step, do multi-stage builds... Or use Cloud Native Buildpacks! (which Joe works on)
* The article explains the complexity and the impossible trade-offs, and therefore why buildpacks are indispensable

[Comparison of Apache Kafka and Apache Pulsar](https://blog.bigdataboutique.com/2021/03/apache-kafka-vs-apache-pulsar-video-fd3fi2)

* Pulsar has stateless brokers, and behind them are bookkeepers (which store the data).
* This gives more flexibility to scale the number of brokers up or down, but with more "moving parts" and an extra network hop.
* The architecture is more flexible, notably for Kubernetes
* Tiered storage and geo-replication are easier in Pulsar (by default). Tiered storage means, for example, storing data in S3 once it is old.
* Pulsar is multi-tenant by design.
* Pulsar accepts large messages and can chunk them as needed
* Kafka has the bigger community, but some components are not open source (Confluent).

### Cloud

[Red Hat OpenShift Streams for Apache Kafka: a managed Kafka cloud service](https://twitter.com/emmanuelbernard/status/1387687197621563396)

* This is what Emmanuel has been working on for the past 9 months
* [Try Red Hat's Managed Kafka](https://red.ht/TryKafka)
* Red Hat OpenShift Streams for Apache Kafka: a managed Kafka cloud service https://twitter.com/emmanuelbernard/status/1387686420903563264
* Great integration with Quarkus, and it uses Quarkus internally

### Web

[Bootstrap 5 is out](https://blog.getbootstrap.com/2021/05/05/bootstrap-5/)

* New offcanvas component
* New accordion
* New and updated forms
* RTL is here
* Overhauled utilities
* New snippet examples
* Improved customizing
* Browser support
  * Dropped Microsoft Edge Legacy
  * Dropped Internet Explorer 10 and 11
  * Dropped Firefox < 60
  * Dropped Safari < 10
  * Dropped iOS Safari < 10
  * Dropped Chrome < 60
  * Dropped Android < 6
* JavaScript
  * No more jQuery!
* The [migration guide is here](https://getbootstrap.com/docs/5.0/migration/)

Crowdcast on [Cypress](https://www.cypress.io/) by Emmanuel Demey

[The end of Google AMP, or at least its appeal should decline](https://www.lafoo.com/the-end-of-amp/)

* AMP had one major advantage: being first in the search engine results.
* Media outlets switched to AMP just for that, because traffic from the dominant search engine is essential
* But AMP posed many technical and ethical problems. Content was hosted and cached on edge proxies, in practice Google's.
* So audience measurement was more complicated
* And ads were also biased toward Google's ad network.
* Upcoming Google Search scoring will be neutral, which is likely to push AMP pages down
* AMP pages had to reinvent many web concepts

### Tooling

[JFrog keeps Bintray JCenter in read-only mode, including the Maven Central mirror](https://jfrog.com/blog/into-the-sunset-bintray-jcenter-gocenter-and-chartcenter/)

* This smells like a planned move to get people to migrate and bring traffic down, then show up as the good Samaritan afterwards. That said, they were good Samaritans with the free version
* At least old builds will not break

[Docker Desktop: skipping an update becomes a paid option](https://www.docker.com/blog/changing-how-updates-work-with-docker-desktop-3-3/)

* Starting with Docker Desktop 3.3, you can avoid installing a new version with a Pro or Team subscription, if I understood correctly.
* You can ask to be reminded later, but you effectively cannot permanently decline a given version without paying, otherwise they nag you (I do not know how often) to upgrade.
* Basically, if you do not pay, you have to be on latest. They will not support old versions for free customers
* Which makes sense.

[Spock 2.0](https://spockframework.org/spock/docs/2.0/release_notes.html)

* Spock is rebased on JUnit Platform
* Support for parallel execution of test specs and test features
* Groovy 3 support
* Improvements to tests with tabular data

### Security

[Denial-of-service bug in SnakeYAML](https://snyk.io/blog/java-yaml-parser-with-snakeyaml/)

* It is due to the ability to create references that contain a reference to an element higher up. Boom, infinite recursion.
* At one point our YAML support in Groovy used SnakeYAML, I believe, but I just checked and we have moved to Jackson

### Law, society and organization

[Grafana, Loki and Tempo move from ASL 2 to AGPL](https://grafana.com/blog/2021/04/20/grafana-loki-tempo-relicensing-to-agplv3/)

* The AGPL is the GPL, except that offering a service counts as distribution
* Inspired by MongoLab, CockroachDB, etc.
* At least it remains open source, even if there are differing interpretations of linking and therefore risks
* Does a service that uses Grafana have to be entirely AGPL?

[When a patent troll attacks, Cloudflare counterattacks](https://techcrunch.com/2021/04/26/cloudflare-rallies-the-troops-to-fight-off-another-so-called-patent-troll/?guccounter=1&guce_referrer=aHR0cHM6Ly9kdWNrZHVja2dvLmNvbS8&guce_referrer_sig=AQAAAEKNBJxidgIYvuXxPu-69VCJuD9nzkRUHMT62_2SS9vEox3eoMhFekoDHrH4ZSrjpsithr74uN62VF-i-6mt4MRqRREcR7NOFjiGy1T5VARNkaXcxG6F3zXxBqCyBUSxaoECUB1yCMc7XChZ6BKwEjdbUPIQtzmraWENdciwdYja)

* Cloudflare is being attacked by a patent troll and counterattacks, for the second time, by paying for prior-art research on that entity's entire patent portfolio.
* The aim is to make it lose a good part of its value. « You do not negotiate with terrorists or children »

[Basecamp loses 30% of its employees after banning societal conversations](https://www.google.com/amp/s/marker.medium.com/amp/p/d487bed43155)

* The list of "funny" employee names resurfaced, with racist overtones
* Employees evidently had a debate about it
* DHH and Fried wrote a memo banning political and societal conversations because they did the company no good (resentment, etc.)
* But employees see it as a way of refusing to face important topics and the impact of tech products on society
* They offered a golden parachute to anyone who wanted to leave
* And boom, 30% said yes

[French national cloud strategy](https://www.lemonde.fr/economie/article/2021/05/17/cloud-la-france-se-veut-plus-souveraine_6080442_3234.html)

* cloud as the default hosting solution for the state's digital services
* protected from non-EU regulation
* against the CLOUD Act and other laws
* the "Cloud de confiance" (trusted cloud) label, it's like "porc salut"
* an update of ANSSI's SecNumCloud
* a hybrid solution: a French or European company using the software building blocks of American groups
* servers in France
* operated by European companies
* owned by Europeans
* "the Americans are the most advanced"
* Google and Microsoft have signed the licensing agreement
* so not Amazon

[Cloud de Confiance en qui ? by Laurent Doguin](https://ldoguin.name/fr/2021/05/quoi-cloud/)

## Tools of the episode

[MuseGroup buys Audacity](https://www.minimachines.net/actu/muse-group-rachete-le-logiciel-audacity-99063)

* Well, the brand
* It promises designers for the interface and contributors
* And to stay open source
* We'll see

## Conferences

[Devoxx France moves to September 29 and 30 and October 1](https://twitter.com/DevoxxFR/status/1389489979978563584)

Agathe's crowdcast on [hack.commit.push](https://paris2021.hack-commit-pu.sh/) is on Saturday, May 29. Sign up!

## Contact us

Support Les Cast Codeurs on Patreon

[Do a crowdcast or a crowdquestion](https://lescastcodeurs.com/crowdcasting/)

Contact us via Twitter, on the Google group, or on the website
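
The security item in these notes attributes the YAML denial of service to a reference that points back at an element higher up, which makes a naive walk recurse forever. The Python sketch below only illustrates that class of bug on plain nested lists; it is not SnakeYAML's code, and the function names `depth_naive` and `depth_safe` are made up for this example. The guard shown (tracking the containers on the current path) is one common mitigation, assumed here rather than taken from the linked article.

```python
def depth_naive(node):
    """Nesting depth of lists; never terminates if a list contains itself."""
    if not isinstance(node, list):
        return 0
    deepest = 0
    for child in node:
        deepest = max(deepest, depth_naive(child))
    return 1 + deepest


def depth_safe(node, path=None):
    """Same walk, but rejects a container that is already on the current path."""
    if not isinstance(node, list):
        return 0
    path = set() if path is None else path
    if id(node) in path:
        raise ValueError("recursive reference detected")
    path.add(id(node))
    deepest = 0
    for child in node:
        deepest = max(deepest, depth_safe(child, path))
    path.discard(id(node))  # leaving this container: siblings may reuse it
    return 1 + deepest


# Build a structure that refers to itself, like an alias to its own anchor.
anchor = ["leaf"]
anchor.append(anchor)

try:
    depth_naive(anchor)
    naive_result = "finished"
except RecursionError:
    naive_result = "RecursionError"

try:
    depth_safe(anchor)
    safe_result = "finished"
except ValueError as err:
    safe_result = str(err)

# Sharing the same node twice is fine; only true cycles are rejected.
shared = ["x"]
print(naive_result, "|", safe_result, "|", depth_safe([shared, shared]))
```

Python stops the naive version with a `RecursionError`; a parser without such a safety net would keep consuming stack or CPU, which is what turns the bug into a denial of service.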

Streaming Audio: a Confluent podcast about Apache Kafka
Examining Apache Kafka Performance Metrics ft. Alok Nikhil

Streaming Audio: a Confluent podcast about Apache Kafka

Play Episode Listen Later Feb 1, 2021 50:30 Transcription Available


Coming up with an honest test built on open source tools in an easily documented, replicable environment for a distributed system like Apache Kafka® is not simple. Alok Nikhil (Cloud Native Engineer, Confluent) shares about getting Kafka in the cloud and how best to leverage Confluent Cloud for high performance and scalability. His blog post “Benchmarking Apache Kafka, Apache Pulsar, and RabbitMQ: Which is the Fastest?” discusses how Confluent tested Kafka’s performance on the latest cloud hardware using research-based methods to answer this question. Alok and Tim talk through the vendor-neutral framework OpenMessaging Benchmark used for the tests, which is Pulsar’s standardized benchmarking framework for event streaming workloads. Alok and his co-author Vinoth Chandar helped improve that framework, evaluated messaging systems in the event streaming space like RabbitMQ, and talked about improvements to those existing platforms. Later in this episode, Alok shares what he believes would help move Kafka forward and what he predicts to come soon, like KIP-500, the removal of ZooKeeper dependency in Kafka.

EPISODE LINKS
Benchmarking Apache Kafka, Apache Pulsar, and RabbitMQ: Which is the Fastest?
Join the Confluent Community Slack
Learn more with Kafka tutorials, resources, and guides at Confluent Developer
Live demo: Kafka streaming in 10 minutes on Confluent Cloud
Use 60PDCAST to get an additional $60 of free Confluent Cloud usage (details)

The Cloudify Tech Talk Podcast
Episode Nine - Machine Learning & AI (ft. Joshua Odmark, Pandio)

The Cloudify Tech Talk Podcast

Play Episode Listen Later Dec 17, 2020 66:52


In this unique edition of the Cloudify Tech Talk Podcast, we are taking a dive into machine learning and AI when it comes to DevOps, all through the eyes of our special guest Joshua Odmark, CTO and Founder of Pandio. In this discussion Josh walks us through his experience with different distributed messaging systems such as Apache Kafka and SQS, and explains why he chose Apache Pulsar over many other machine learning and messaging related systems.

Message à caractère informatique
Message À Caractère Informatique #4 - Faire Du SQL Static Et Distribué Avec Des Fonctions Rust

Message à caractère informatique

Play Episode Listen Later Jun 24, 2020 53:08


All the notes are available at https://www.clever-cloud.com/fr/podcast/episode4
With, in order of appearance: @waxzce @ldoguin @urcadox @kannarfr
Ahana, a new vendor behind Presto https://www.nextplatform.com/2020/06/05/presto-is-the-third-time-charm-for-federated-databases/
Spark 3 https://databricks.com/blog/2020/06/18/introducing-apache-spark-3-0-now-available-in-databricks-runtime-7-0.html
Pulsar 2.6.0 https://pulsar.apache.org/blog/2020/06/18/Apache-Pulsar-2-6-0/
Introduction to Datalog https://x775.net/2019/03/18/Introduction-to-Datalog.html
Pulsar Summit featuring Geal and Kannar https://twitter.com/streamnativeio/status/1272616113772228608
FaaS at Clever Cloud https://www.youtube.com/watch?v=wchehMIsu80&t=8s
Redis Raft https://jepsen.io/analyses/redis-raft-1b3fbf6
Hugo on Clever, with or without Cellar https://www.clever-cloud.com/blog/engineering/2020/06/18/deploy-static-site-hugo/ https://www.clever-cloud.com/blog/engineering/2020/06/24/deploy-cellar-s3-static-site/
WWDC recap https://www.engadget.com/apple-wwdc2020-event-roundup-203226945.html
The world's biggest supercomputer uses ARM https://www.top500.org/news/japan-captures-top500-crown-arm-powered-supercomputer/
TSMC announces a 5 nm process node https://www.notebookcheck.net/TSMC-officially-begins-5-nm-production-Snapdragon-875-SoC-Snapdragon-X60-5G-modem-A14-Bionic-and-a-5-nm-AMD-high-end-GPU-incoming.477119.0.html
The new Clever Cloud laptops https://www.tuxedocomputers.com/index.php https://system76.com/ https://www.coreboot.org/
The rustls report https://github.com/ctz/rustls/blob/master/audit/TLS-01-report.pdf
CLI tools in Rust https://towardsdatascience.com/awesome-rust-powered-command-line-utilities-b5359c38692
Sd https://github.com/chmln/sd
Amp https://github.com/jmacdonald/amp
Dust https://github.com/bootandy/dust
Kannar's music pick: https://www.youtube.com/watch?v=SwYN7mTi6HM

Software Daily
Pravega: Storage for Streams with Flavio Junqueira

Software Daily

Play Episode Listen Later May 7, 2020


“Data stream” is a term that can be used in multiple ways. A stream can refer to data in motion or data at rest. When a stream is data in motion, an endpoint is receiving new pieces of data on a continual basis. Each new data point is sent over the wire and captured by the other end. Another way a stream can be represented is as a sequence of events that have been written to a storage medium. This is a stream at rest. Pravega is a system for storing large streams of data. Pravega can be used as an alternative to systems like Apache Kafka or Apache Pulsar. Flavio Junqueira is an engineer at Dell EMC who works on Pravega. He joins the show to talk about the history of stream processing and his work on Pravega.
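The distinction between a stream in motion and a stream at rest can be illustrated with a small Python sketch. This is only a toy model under stated assumptions: `StreamAtRest` and `stream_in_motion` are hypothetical names invented here, an in-memory list stands in for the storage medium, and none of it reflects Pravega's actual API.

```python
from typing import Iterator, List


class StreamAtRest:
    """Hypothetical append-only store: a sequence of events already written down."""

    def __init__(self) -> None:
        self._events: List[str] = []  # stands in for a durable storage medium

    def append(self, event: str) -> int:
        """Store one event and return its offset in the sequence."""
        self._events.append(event)
        return len(self._events) - 1

    def read_from(self, offset: int) -> List[str]:
        """Replay every stored event starting at the given offset."""
        return self._events[offset:]


def stream_in_motion(source: List[str]) -> Iterator[str]:
    """Data in motion: events are handed to the receiver one at a time."""
    for event in source:
        yield event


# The receiving end captures each event as it arrives and writes it to storage.
store = StreamAtRest()
for event in stream_in_motion(["e0", "e1", "e2"]):
    store.append(event)

# Once at rest, the same events can be re-read from any offset.
replayed = store.read_from(1)
```

The point of the sketch is that the same events exist in both forms: the generator models delivery over time, while the stored sequence can be replayed later from an arbitrary position.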

Bigdata Hebdo
Episode 99: Apache Pulsar and Kafka on Pulsar

Bigdata Hebdo

Play Episode Listen Later May 6, 2020 79:58


We talk about Apache Pulsar and Kafka on Pulsar with our guests. Full show notes at: https://trkit.io/s/BDHEP99 Steven: https://twitter.com/GwinizDu Pierre: https://twitter.com/PierreZ Quentin: https://twitter.com/waxzce Vincent: https://twitter.com/vhe74 Nicolas: https://www.cerenit.fr/ and https://twitter.com/_CerenIT and https://twitter.com/nsteinmetz Jérôme: https://twitter.com/jxerome This publication is sponsored by Affini-Tech and Cerenit. Need to design, industrialize or automate your platforms? Write to us at contact@cerenit.fr (https://www.cerenit.fr/ and https://twitter.com/_CerenIT). Affini-Tech supports you in all your Cloud and Data projects, to imagine, experiment with and run your services! (http://affini-tech.com https://twitter.com/affinitech) We're hiring! Come crunch data with us! Write to us at recrutement@affini-tech.com

The Data Exchange with Ben Lorica
Taking messaging and data ingestion systems to the next level

The Data Exchange with Ben Lorica

Play Episode Listen Later Jan 23, 2020 38:00


Sijie Guo on how Apache Pulsar is able to handle both queuing and streaming, and both online and offline applications. In this episode of the Data Exchange I speak with Sijie Guo, founder of StreamNative, a new startup focused on making enterprise messaging technologies - specifically Apache Pulsar - easy to use on the cloud. Sijie was previously a cofounder of Streamlio (acquired by Splunk) and prior to that he led the messaging team at Twitter. He is also the main organizer behind the Pulsar Summit (April in San Francisco), a new conference whose Call for Speakers closes on January 31st. Our conversation spanned many topics, including: The role of messaging in modern data applications and platforms. The two main types of messaging applications: queuing and streaming. Apache Pulsar as a unified messaging platform, able to handle both queuing and streaming, and both online and offline applications. A status update on Apache Pulsar. Detailed show notes can be found on The Data Exchange web site.
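As a rough illustration of the two messaging styles named above, the Python sketch below contrasts queuing (each message is handled by exactly one consumer) with streaming (every consumer reads the whole ordered sequence). The function names and the round-robin dispatch are assumptions made for this example; it does not use or depict the Apache Pulsar client API.

```python
from typing import Dict, List

messages = ["m0", "m1", "m2", "m3"]


def deliver_as_queue(msgs: List[str], consumers: List[str]) -> Dict[str, List[str]]:
    """Queuing: work is shared, so each message reaches exactly one consumer.

    Round-robin dispatch is an assumption made for this sketch.
    """
    received: Dict[str, List[str]] = {name: [] for name in consumers}
    for index, msg in enumerate(msgs):
        received[consumers[index % len(consumers)]].append(msg)
    return received


def deliver_as_stream(msgs: List[str], consumers: List[str]) -> Dict[str, List[str]]:
    """Streaming: every consumer independently reads the full ordered log."""
    return {name: list(msgs) for name in consumers}


queue_view = deliver_as_queue(messages, ["a", "b"])
stream_view = deliver_as_stream(messages, ["a", "b"])
```

In the queue view the two consumers split the four messages between them, while in the stream view each consumer sees all four in order.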

javaswag
#3 - Sergei Egorov - Pivotal, Testcontainers, Reactor

javaswag

Play Episode Listen Later Nov 4, 2019 76:48


#3 - Sergei Egorov - Pivotal, Testcontainers, Reactor 00:00:49 Talk about https://github.com/testcontainers/ 00:06:17 Game development 00:09:30 The Haxe language - https://haxe.org/ - https://www.youtube.com/watch?v=XQLNAx9DGmk 00:14:18 Apache Groovy - https://www.youtube.com/watch?v=Ujuz-D-ekXE 00:16:30 Groovy macro methods https://github.com/bsideup/MacroGroovy 00:22:50 First talk in English at 00:25:10 ZeroTurnaround JRebel, XRebel 00:30:40 From zero to the clouds. Bringing up production while the pizza is on its way - https://www.youtube.com/watch?v=9lpDjZUGhKA 00:33:25 Berlin, Zalando 00:36:36 The history of Testcontainers - Migrating a service to Spring Boot in an hour 00:40:30 Vivy https://www.vivy.com/ - Startup - An architecture you are not ashamed of: Event Sourcing, CQRS - Winning a tender against IBM - Pivotal is like finding a smiling, positive horse's head in your bed :) - Liiklus - https://github.com/bsideup/liiklus - The boom of event-based infrastructures - Apache Kafka, Apache Pulsar - 50 microservices 00:58:40 Spring - Pivotal offices - Staff Software Engineer - Reactor & Reactive Spring - https://pivotal.io/careers/openings/staff-software-engineer-reactor-reactive-spring/1077260 - Why the reactive approach was chosen at Vivy - Ownership & trust - The s1p conference https://springoneplatform.io/ - Java agent to detect blocking calls from non-blocking threads https://github.com/reactor/BlockHound - Talk from Blizzard https://www.youtube.com/watch?v=xCu73WVg8Ps 00:67:10 Jabel - unlock Javac 12+ syntax when targeting Java 8 - https://github.com/bsideup/jabel 00:70:45 The Two Devs One Ops podcast https://www.2d1o.ru/ 00:72:10 The nickname bsideup 00:74:00 Dreadlocks, conflicts at university Guest - twitter.com/bsideup Telegram channel t.me/javaswag Chat t.me/javaswag_chat The podcast was recorded at the conference https://jokerconf.com/ Voice of the podcast - t.me/volyx Podcast production - t.me/pahaus

airhacks.fm podcast with adam bien
DBs-ium, CDC and Streaming

airhacks.fm podcast with adam bien

Play Episode Listen Later Oct 13, 2019 71:14


An airhacks.fm conversation with Gunnar Morling (@gunnarmorling) about: The first Debezium commit, Randall Hauch, DBs-ium, Java Content Repository JCR / ModeShape, exploring the Change Data Capture (CDC), how Debezium started, the MySQL binlog, the logical decoding in Postgres, Oracle Advanced Queuing, update triggers, Java Message Service (JMS), there is no read detection, switching the current user at JDBC connection for audit purposes, helping Debezium with additional metadata table, using Kafka Streams to join the metadata and the payload, installing the logical decoding plugins into PostgreSQL, logical decoding plugin exposes the data from the write ahead log, decoding into protocol buffers with decoderbufs, in cloud environments like e.g. Amazon RDS you are not allowed to install any plugins, wal2json is verbose but comes preinstalled on RDS, pgoutput is responsible for the actual decoding of the events, Debezium only sees committed transactions, Debezium is mainly written in Java, decoderbufs was written by community and included to Debezium, Debezium communicates with Postgres via the JDBC / Postgres API, pgoutput format is converted into Kafka Connector source format, Kafka Connect is a framework for running connectors, Kafka Connect comes with sink and source connectors, Kafka Connect comes with connector-specific converters like e.g. StringConverter, Converters are not Serializers, Debezium ships as Kafka Connect plugin, Kafka Connector runs as standalone process, running Debezium in embedded mode, JPA cache invalidation with Debezium, converting Debezium events into CDI events, converting database changes to WebSockets events, database polling vs the Debezium approach, DB2 will support Debezium, Oracle support is "on the horizon", Oracle LogMiner, Oracle XStream, Debezium supports Microsoft SQL Server (starting with Enterprise license), Apache Pulsar comes with Debezium out-of-the-box, Pulsar IO, running Debezium as standalone service with outbound APIs, MongoDB supports the "Debezium Change Event Format", Kafka Sink connectors are easy to implement, Debezium embedded mode and offsets, embedded connector has to remember the offset, an offset API is available for embedded Debezium connectors, combining CDC with Kafka Streams, Quarkus supports Kafka Streams and Reactive Messaging, Quarkus and Kafka Streams, Quarkus supports Kafka Streams in dev mode, replacing Hibernate Envers with Debezium, Messaging vs. Streaming or JMS vs. Kafka, Kafka is a database, the possible Debezium features, Cassandra support is coming, Outbox pattern is going to be better supported, transactional event grouping, dedicated topic for transaction demarcations, commercial support for Debezium, Debezium exposes JMX metrics, Five Advantages of Log-Based Change Data Capture, Reliable Microservices Data Exchange With the Outbox Pattern, Automating Cache Invalidation With Change Data Capture. Gunnar Morling on twitter: @gunnarmorling and github: https://github.com/gunnarmorling. Gunnar's blog: https://morling.dev/.

airhacks.fm podcast with adam bien
Plugging Things Together With Reactive Programming

airhacks.fm podcast with adam bien

Play Episode Listen Later Jul 7, 2019 71:39


An airhacks.fm conversation with Gordon Hutchison (@hutchig) about: Playing chess with ZX81, huge computer scene in Glasgow, BBC Micro then saving for Acorn Electron -- the cheaper BBC Micro, programming text adventure games, Forth on RML 380 Z, Sun's OpenBoot was written in Forth, Dragon 32, controlling the computer world with 13, programming colourful fractals, "do whatever you have permission to", then accessing the printer queue, transactions research and Java, IBM develops Java Transaction Service (JTS), travelling to Javasoft in Silicon Valley to transfer the JTS knowledge, moving from JTS to JVM implementation group at JDK 1.2 timeframe, having fun with IBM Java classloader, heap corruption, "lighter" experience with Eclipse RCP, Java Transaction API, Java Transaction Service and CORBA's Object Transaction Service, transactions are a gift, just learn databases, "we don't need your transactions" in 2006, reused blog post from 15 years ago will be a big hit, IT became fashion -- everything is just reframed, implementing RAID algorithms, enjoying Java EE experience with OpenLiberty, deploying 50 times a conference session with wad.sh, having more coffee with classic WebSphere, OpenLiberty loose applications, OpenLiberty guide to loose applications, starting TX at facade level, JPA and transactions, getting two copies of the same object in the same request, every request is a transaction, losing the thread context, project Loom, transactions are making the developer's life simple, the pre-prepare phase, errors on CICS vs. MTS, solving the transaction diamond problem, reactive programming and backpressure, application servers and backpressure, you are not Google, reactive platform at Uber, too much sophistication, too complex to debug, and the human problem, functional reactive programming, plugging things together in reactive programming is appealing, the simple interface between publisher and subscriber, reactive programming as integration hub, learn Java streams first and reactive concepts will come easily, HTTP request / response model does not fit well with reactive programming, backpressure and Kafka, Kafka's configuration, reactive streams operators as enabling layer, MicroProfile reactive messaging is similar to Message Driven Beans, Event Sourcing with debezium.io and Apache Kafka, event sourcing with gRPC, Apache Pulsar the "Kafka.next", SmallRye, CloudEvents and MicroProfile, SOAP envelope. Gordon Hutchison on twitter: @gordhut, on GitHub: https://github.com/hutchig

The New Stack Context
Cloud Providers vs. Open Source, the Open Source Leadership Summit

The New Stack Context

Play Episode Listen Later Mar 18, 2019 35:35


This week, on The New Stack Context podcast, we talk about how cloud providers are affecting open source companies with Karthik Ramasamy, co-founder and CEO of Streamlio. Streamlio offers cloud native messaging, processing and event storage as a service, powered by Apache Pulsar. This week, everyone was buzzing about Amazon's new distribution of Elasticsearch. AWS was careful to say it wasn't a fork of the project on which the company Elasticsearch runs a lucrative business. Instead, it justified the move by saying it was too difficult for users of the project to distinguish between the open source code and the proprietary code which the company charges for. Karthik will talk in the first half of the show with us about this battle between the cloud providers and commercial open source companies and how business models are evolving. He wrote about this very topic a few weeks ago on The New Stack in a post called “2019: The Cloud's Impact on Open Source.” Then later in the show, we catch up on all the news from the Linux Foundation's Open Source Leadership Summit, in Half Moon Bay, California.

Roaring Elephant
Episode 103 – Apache Pulsar version 2.0 with Matteo and Sijie from Streamlio

Roaring Elephant

Play Episode Listen Later Aug 28, 2018 43:31


Matteo and Sijie from Streamlio reached out to us and let us know they had an update on Apache Pulsar. It turned out they had a lot to talk about so we cut the interview in two parts, the first of which was published in episode 101. Here is the second part with information on version 2.0 and the future of the Apache Pulsar project. The first subject taken on by Sijie is Pulsar Functions, followed by Matteo talking about the new schema registry and Topic Compaction. With a new major version being released, users will probably want to upgrade so we asked the guys about the upgrade path. The rest of the episode, Matteo and Sijie share what they can regarding the future Pulsar Roadmap. Matteo Merli (https://www.linkedin.com/in/matteomerli/) Co-Founder - Software Engineer Sijie Guo (https://www.linkedin.com/in/samuelguo/) Co-Founder Apache Pulsar (incubating) https://pulsar.apache.org/ Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Roaring Elephant
Episode 101 – Apache Pulsar update with Matteo and Sijie from Streamlio

Roaring Elephant

Play Episode Listen Later Aug 14, 2018 65:48


Matteo and Sijie from Streamlio reached out to us and let us know they had an update on Apache Pulsar. It turned out they had a lot to talk about so we cut the interview in two parts and here is the first part where they introduce Apache Pulsar, go in depth on the correct deployment scaling of a stable Pulsar cluster and clarify Pulsar's "at least once vs exactly once" strategy. Part two will go in more depth on what's new. Stay tuned! Matteo Merli (https://www.linkedin.com/in/matteomerli/) Co-Founder - Software Engineer Sijie Guo (https://www.linkedin.com/in/samuelguo/) Co-Founder Apache Pulsar (incubating) https://pulsar.apache.org/ Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

THE ARCHITECHT SHOW
Ep. 59: Streamlio founders on why the world needs a new streaming data platform

THE ARCHITECHT SHOW

Play Episode Listen Later Jun 6, 2018 66:36


In this episode of the ARCHITECHT Show, Streamlio co-founders Karthik Ramasamy and Matteo Merli discuss their company's new streaming data platform, which is built atop Apache Heron, Apache Pulsar and Apache BookKeeper -- technologies the two helped develop while at Twitter and Yahoo, respectively. They explain how the underlying technologies differ from more well-known open source projects -- including Apache Kafka -- and the ideal use cases for the type of performance Streamlio claims. Additionally, GeekWire cloud and enterprise editor Tom Krazit is on to discuss Microsoft's $7.5 billion acquisition of GitHub. Tom and host Derrick Harris analyze why the deal happened and what this might mean for Microsoft, its cloud competitors and the world of GitHub alternatives. This week's episode is sponsored by MongoDB, Neo4j and Replicated.

Open Source – Software Engineering Daily
Pulsar Messaging with Lewis Kaneshiro

Open Source – Software Engineering Daily

Play Episode Listen Later May 17, 2018 60:53


Message broker systems decouple the consumers and producers of a message channel. In previous shows, we have explored ZeroMQ, PubNub, Apache Kafka, and NATS. In this episode, we talk about another message broker: Apache Pulsar. Pulsar is an open source distributed pub-sub message system originally created at Yahoo. It was used to scale products with ...
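The decoupling a message broker provides can be sketched in a few lines of Python. This is a minimal in-process toy, assuming a hypothetical `Broker` class with synchronous delivery; it is not how Apache Pulsar or any of the other brokers mentioned is implemented, and a real broker would add networking, persistence and acknowledgements.

```python
from typing import Callable, Dict, List

Handler = Callable[[str], None]


class Broker:
    """Hypothetical in-memory broker: producers and consumers only know topic names."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = {}

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a consumer callback for a topic."""
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Deliver a message to every subscriber of the topic; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


inbox: List[str] = []
broker = Broker()

# The consumer registers interest in a topic without knowing who will produce.
broker.subscribe("orders", inbox.append)

# The producer publishes to the topic without knowing who consumes.
delivered = broker.publish("orders", "order-1")
```

Neither side holds a reference to the other; both depend only on the broker and the topic name, which is the decoupling the description refers to.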

Roaring Elephant
Episode 67 – Roaring News

Roaring Elephant

Play Episode Listen Later Dec 26, 2017 43:12


It's here: the final news episode for 2017! We finish off the year talking about Apache Pulsar, Hadoop Delegation tokens (aka Kerberos), the Hadoop on Container hype (or is it?), Apache Hadoop 3.0 release and all you need to know about Data Prepping (or at least all we can tell you in about 10 minutes, that is). Breaking News Jhon Comparing Pulsar and Kafka: unified queuing and streaming https://streaml.io/blog/pulsar-streaming-queuing/ Hadoop Delegation Tokens Explained http://blog.cloudera.com/blog/2017/12/hadoop-delegation-tokens-explained/ Hadoop and Containers Big Data and Container Orchestration with Kubernetes (K8s) https://www.bluedata.com/blog/2017/12/big-data-container-orchestration-kubernetes-k8s/ Spark on Kubernetes series https://banzaicloud.com/blog/spark-k8s/ https://banzaicloud.com/blog/scaling-spark-k8s/ https://banzaicloud.com/blog/zeppelin-spark-k8/ Data Prepping in the clouds Google Cloud Dataprep: Spreadsheet-Style Data Wrangling Powered by Google Cloud Dataflow https://medium.com/mark-rittman/google-cloud-dataprep-spreadsheet-style-data-wrangling-powered-by-google-cloud-dataflow-a48c405d81c Data Transformations “By Example” in the Azure Machine Learning Workbench https://blogs.technet.microsoft.com/machinelearning/2017/09/25/by-example-transformations-in-the-azure-machine-learning-workbench/ Dave Hadoop 3.0 Released on December 13th 2017 http://hadoop.apache.org/docs/r3.0.0/index.html http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/release/3.0.0/RELEASENOTES.3.0.0.html http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/release/3.0.0/CHANGES.3.0.0.html Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Roaring Elephant
Episode 64 – Talking Apache Pulsar with Matteo and Sijie from Streamlio

Roaring Elephant

Play Episode Listen Later Dec 5, 2017 82:44


A while ago, the all-knowing oracle that is twitter pointed out that we really did not do justice to the Apache Pulsar project when we covered it in our Roaring News episode. The good people at Streamlio reached out to us and here is the 80+ minutes long discussion we had with Matteo Merli and Sijie Guo, going in depth on the merits and technical details, setting the Roaring Pulsar record straight! Matteo Merli (https://www.linkedin.com/in/matteomerli/) Co-Founder - Software Engineer Sijie Guo (https://www.linkedin.com/in/samuelguo/) Co-Founder Apache Pulsar (incubating) https://pulsar.apache.org/ Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Roaring Elephant
Episode 59 – Roaring News

Roaring Elephant

Play Episode Listen Later Oct 31, 2017 35:18


It's another installment of Roaring News! This time, we talk about the ensemble recommendation system allegedly used by Spotify, not-so-new kid-on-the-block-after-all Apache Pulsar, the ever so popular "Hadoop is dead" and end with a quick shout-out to the Tokyo Data Platform Conference. Dave Apache Pulsar https://pulsar.apache.org/ https://www.slideshare.net/ydn/october-2016-hug-pulsar-a-highly-scalable-low-latency-pubsub-messaging-system https://streaml.io/blog/apache-pulsar-geo-replication/ https://streaml.io/blog/geo-replication-patterns-practices/ https://news.ycombinator.com/item?id=12453080 Data Platform Conference Tokyo http://dataplatform.jp/ Jhon Spotify’s Discover Weekly: How machine learning finds your new music https://hackernoon.com/spotifys-discover-weekly-how-machine-learning-finds-your-new-music-19a41ab76efe Hadoop Was Hard to Find at Strata This Week https://www.datanami.com/2017/09/29/hadoop-hard-find-strata-week/ Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.