Podcasts about clickhouse

71 PODCASTS
125 EPISODES
48m AVG DURATION
1 EPISODE EVERY OTHER WEEK
Nov 19, 2025 LATEST

POPULARITY (chart, 2017 to 2024)

Best podcasts about clickhouse

Data Engineering Podcast

12 episodes with clickhouse

Contributor

4 episodes with clickhouse

DevZen Podcast

3 episodes with clickhouse

The Swyx Mixtape

5 episodes with clickhouse

Screaming in the Cloud

2 episodes with clickhouse

Postgres FM

2 episodes with clickhouse

The Data Stack Show

2 episodes with clickhouse

Can I get that software in blue?

2 episodes with clickhouse

The Data Engineering Show

2 episodes with clickhouse

Latest podcast episodes about clickhouse

Why was the global internet down on November 18?

Choses à Savoir TECH

Nov 19, 2025 2:46

On Tuesday, November 18, shortly after noon, the internet coughed... then collapsed intermittently. Within minutes, Cloudflare, one of the pillars of the global web's infrastructure, dragged an avalanche of services down with it: ChatGPT, X/Twitter, Canva, Clubic, and thousands of other platforms. At first the dominant hypothesis was a massive cyberattack. In reality, the truth is more mundane, and far more worrying.

It all starts at 12:05, when Cloudflare deploys an update to a ClickHouse database cluster. The change is meant to strengthen security by making access permissions explicit. A minor adjustment, on the surface. Except that it triggers an unexpected bug: every data column is duplicated in the metadata. The anomaly is invisible to users but catastrophic for one key component: the file used by the Bot Management system, which analyzes traffic to tell humans from bots.

Normally this file contains about sixty fingerprints. With the duplicates, it holds more than 200. The problem? The software that processes it is designed to reject any file with more than 200 entries, to avoid memory overload. The result: when the corrupted file propagates to thousands of servers worldwide, the machines crash one after another and return 500 errors to users around the world.

The nightmare gets worse. The file is regenerated every five minutes. Depending on whether a server picks up a healthy or a faulty version, Cloudflare swings between normal operation and blackout. Diagnosing the outage becomes a puzzle. Matthew Prince, the CEO, even raises the possibility of a "show of force" by a botnet, following the huge DDoS attacks in June.

Only at 14:04 does an internal lead emerge. At 14:37, the teams finally identify the culprit: the Bot Management file. At 15:24, its automatic generation is stopped. At 15:30, the internet comes back. Well... almost. The dashboard goes down in turn, crushed by the flood of queued connections. A full recovery has to wait until 18:06. In an unusually blunt mea culpa, Matthew Prince admits: "An outage like this is unacceptable." Cloudflare promises faster circuit breakers, stricter validation of internal files, and limits on its debugging tools, which were themselves responsible for a massive slowdown.

Hosted by Acast. Visit acast.com/privacy for more information.
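The failure mode described in this summary (duplicated columns push a generated file past a hard 200-entry limit, and the consumer crashes instead of degrading) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Cloudflare's code: the entry names, the duplication factor of 4, and the "keep the last known-good file" fallback are all hypothetical; only the limit of 200 and the baseline of about sixty entries come from the summary.

```python
MAX_ENTRIES = 200  # hard limit mentioned in the episode summary


class FileTooLarge(Exception):
    """Raised when a generated file exceeds the consumer's entry limit."""


def duplicate_columns(entries, copies):
    """Simulate the metadata bug: every column appears `copies` times."""
    return [name for name in entries for _ in range(copies)]


def load_strict(entries):
    """Reject any file over the limit; an unhandled raise models the crash."""
    if len(entries) > MAX_ENTRIES:
        raise FileTooLarge(f"{len(entries)} entries exceeds {MAX_ENTRIES}")
    return list(entries)


def load_with_fallback(entries, last_good):
    """Hypothetical safer variant: keep serving the last known-good file."""
    try:
        return load_strict(entries)
    except FileTooLarge:
        return last_good


# About sixty entries in normal operation (names are made up).
healthy = [f"entry_{i}" for i in range(60)]
# Duplication factor is an assumption; the summary only says "more than 200".
corrupted = duplicate_columns(healthy, copies=4)
```

With these definitions, `load_strict(corrupted)` raises, while `load_with_fallback(corrupted, healthy)` keeps returning the previous file. That is roughly the kind of stricter validation and fail-safe behavior the summary says Cloudflare promised, though the real remediation may look quite different.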


#231 - Breaking down the 2025 news with Blef: Snowflake and Databricks summits, ClickHouse, DuckDB, BigQuery

Data Gen

Oct 20, 2025 29:40

Christophe Blefari is the creator of Blef.fr, the best-known data newsletter in France. He has been Head of Data, Head of Data Engineering, and Staff Data Engineer at startups and large corporations, and is in my view one of the leading data experts in France. He recently co-founded Nao Labs, a code editor for data teams that use AI.

We break down the data news you should not have missed in 2025.

We cover:


Episode 42 | Tanya Bragin | VP Product & Marketing @ Clickhouse

Can I get that software in blue?

Oct 12, 2025 105:31

Episode #42 brings Tanya Bragin back to join Chad and Steve for the first sequel episode of the pod! Tanya's first episode was Episode 29, and it is the most watched episode so far.

Tanya recently took on the VP of Marketing role at ClickHouse in addition to running Product, so we dive deep into her experience and how she's approaching Marketing differently from most companies, since she's coming in as a technical product leader already. Lots of great insights here about how to do Demand Generation and Community Events differently from how companies typically run them.

Episode #42 of "Can I get that software in blue?", a podcast by and for people engaged in technology sales. If you are in the technology presales, solution architecture, sales, support or professional services career paths then this show is for you!

Our website: https://softwareinblue.com
Twitter: https://twitter.com/softwareinblue
LinkedIn: https://www.linkedin.com/showcase/softwareinblue

Make sure to subscribe or follow us to get notified about our upcoming episodes:
Youtube: https://www.youtube.com/channel/UC8qfPUKO_rPmtvuB4nV87rg
Apple Podcasts: https://podcasts.apple.com/us/podcast/can-i-get-that-software-in-blue/id1561899125
Spotify: https://open.spotify.com/show/25r9ckggqIv6rGU8ca0WP2

Links mentioned in the episode:
ClickHouse HyperDX acquisition: https://clickhouse.com/blog/clickhouse-acquires-hyperdx-the-future-of-open-source-observability
ClickHouse PeerDB acquisition: https://clickhouse.com/blog/clickhouse-acquires-peerdb-to-boost-real-time-analytics-with-postgres-cdc-integration
Debugging story from ClickHouse SRE: https://clickhouse.com/blog/a-case-of-the-vanishing-cpu-a-linux-kernel-debugging-story
Essentialism Book: https://www.amazon.com/Essentialism-Disciplined-Pursuit-Greg-McKeown/dp/0804137382


E182: The Rise of ClickHouse

Open Source Startup Podcast

Oct 8, 2025 47:02

In the episode, we sat down with ClickHouse Co-Founder Yury Izrailevsky to unpack how one of the fastest open-source databases in the world became the analytics engine of choice for 2,000 customers including Harvey, Canva, HP, and Supabase. From its Yandex origins to powering AI observability, Yury shares how ClickHouse balances open-source roots, cloud innovation, and a remote-first culture moving at breakneck speed.

ClickHouse's Series C valued the company at $6.35B earlier this year, and just yesterday they announced an extension to that round, just months after it was raised. In this episode, we dig into:

Origins & Founding Story
- ClickHouse began as an internal project at Yandex to power a Google Analytics-style platform, focused on performance and scale.
- Open-sourced in 2016; rapid global adoption laid the foundation for ClickHouse the company.
- Yury first discovered ClickHouse while at Google; impressed by its speed, he later co-founded the company in 2021 alongside Aaron Katz (ex-Elastic) and the original creator Alexey Milovidov.

Why ClickHouse Stands Out
- Column-oriented, open source OLAP database designed for massive-scale analytical processing.
- Excels in performance, efficiency, and cost; ideal for large data volumes and real-time analytics (and now AI workloads).
- Architectural choices: columnar storage = better compression and faster execution; separation of compute and storage enables elasticity, scalability, and resilience in the cloud.

Open Source vs. Cloud
- Open-source version offers freedom and flexibility.
- Cloud product delivers much lower total cost of ownership and a fully managed experience.
- Architectural parity between the two ensures no vendor lock-in for customers.
- Customers can run the same queries on both; most stay with cloud due to simplicity and cost efficiency.

Use Cases & Ecosystem
4 main use cases:
- Real-time analytics
- Data Warehousing
- Observability
- AI / ML Workloads

Company Building & Culture
- Fully remote from day one.
- Prioritized experienced, self-sufficient engineers over early-career hires.
- Built and launched GA version in less than a year; insane pace of innovation.

Innovation & Community
- Monthly release cadence.
- Hundreds of integrations and connectors.
- Strong open-source and commercial community.

Advice for Founders
- Focus on what matters most.
- Hire mature, independent thinkers.
- Move fast but maintain quality; ClickHouse Cloud achieved production-grade quality in record time.
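The "columnar storage = better compression" point above can be made concrete with a toy example. This is a rough sketch, not a description of ClickHouse internals: the sample rows are invented, and run-length encoding is used only as a simple stand-in for the codecs a real column store would apply. It shows why values stored column by column tend to compress better than the same values interleaved row by row.

```python
from itertools import groupby

# Invented sample data: (date, HTTP method, status code) per request.
rows = [
    ("2025-10-08", "GET", 200),
    ("2025-10-08", "GET", 200),
    ("2025-10-08", "GET", 200),
    ("2025-10-08", "POST", 200),
    ("2025-10-08", "POST", 500),
    ("2025-10-08", "POST", 500),
]


def to_columns(table):
    """Transpose row tuples into one list per column."""
    return [list(column) for column in zip(*table)]


def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


dates, methods, statuses = to_columns(rows)

# Row-oriented layout: values of different columns are interleaved.
flat = [value for row in rows for value in row]

row_runs = len(run_length_encode(flat))
column_runs = sum(len(run_length_encode(col)) for col in (dates, methods, statuses))
```

Here the interleaved layout has no adjacent repeats, so it needs 18 runs, while the three columns together need only 5. Similar values sitting next to each other is what makes column-wise storage compress well; actual savings depend on the data and on the codec used.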

#216 Consistency and Isolation: From Write Skew to Dirty Reads

Engineering Kiosk

Oct 7, 2025 67:40

Databases are the backbone of many applications, but how consistent is your data really? Whether it's a bank transfer, a sneaker purchase in an online shop, or your latest side project: supposedly "safe" data storage often hides complex pitfalls. How do transactions really work? And why can a poorly chosen isolation level, of all things, lead to dirty reads, non-repeatable reads, or even write skew?

In this episode we take you on a journey into the depths of consistency models. Wolfi is a former database systems researcher at the University of Innsbruck. With him we dig into practice and theory, from foreign keys and check constraints to Multi-Version Concurrency Control (MVCC). You'll learn what lies behind Serializable, Repeatable Read, Read Committed, and Read Uncommitted, and why tools like Jepsen keep uncovering new bugs even in "safe" systems.

By the end, you'll know why consistency, isolation levels, and transaction management should matter to you as a developer too.

Bonus: Dirty reads are like rumors: you hear them before they're true... but what if they never turn out to be true?

You can find our current advertising partners at https://engineeringkiosk.dev/partners

Quick feedback on the episode:

Why the Middle Layer of Your Agency Org Chart May Not Survive AI with Jennifer Bagley | Ep #841

Smart Agency Masterclass with Jason Swenk: Podcast for Digital Marketing Agencies

Oct 1, 2025 28:36

Would you like access to our advanced agency training for FREE? https://www.agencymastery360.com/training

Are you still thinking of AI as just “ChatGPT with a better prompt”? Or maybe you've played around with Zapier automations and thought, yeah, that's good enough. Today's featured guest knows that the agencies pulling ahead right now are building full-on AI agent networks that replace routine tasks, streamline data pipelines, and give their teams superpowers. She's re-engineering her agency around AI and will talk about where she finds top-tier talent and why you don't need to code to lead your agency into the future.

Jennifer Bagley is the CEO and founder of CI Web Group, a fully virtual digital marketing agency registered in 22 U.S. states with clients across the United States and Canada. A former corporate operator turned entrepreneur, Jennifer started in real estate and mortgage brokerage before leaning into the marketing work she built to support those businesses. Today she runs a modern, tech-forward agency that's rebuilt its stack around AI, centralized data, and agentic networks, all while carrying the scars and lessons of scaling, pivoting, and re-founding a business from the ground up.

In this episode, we'll discuss:
- Feeling trapped by the business.
- Hiring, firing, and the people reset.
- AI, reskilling, and the end of “middle” roles.
- What does this talent cost?

Sponsors and Resources
E2M Solutions: Today's episode of the Smart Agency Masterclass is sponsored by E2M Solutions, a web design and development agency that has provided white-label services for the past 10 years to agencies all over the world. Check out e2msolutions.com/smartagency and get 10% off for the first three months of service.

From Corporate Ladder to Accidental Agency Founder
Jennifer came from an operations background, a self-proclaimed black belt in Six Sigma and certified project manager. While building that corporate career, she made a promise to herself (“by 30 I'll be an entrepreneur”) and started the side hustle that became the main event. She started in real estate and mortgage brokering, where she had to learn marketing the hard way; not because she wanted to be a marketer, but because the survival of her businesses depended on it.

Initially, Jennifer didn't set out to build a scalable agency; she built a team to support her broker network. When the market collapsed in 2008, the same team that did marketing for agents suddenly had a market outside real estate. That “we'll just help this painter or HVAC company” phase is where the web group was born: small, service-focused, and useful to people in her network. She turned that accidental start into a business by solving real, pressing problems for paying clients, then leaned into it.

Trading Time for Freedom: The Hard Pivot
For the first five years, Jennifer describes the business as a “lifestyle” operation: profitable, maybe, but trapping her time. She was trading billable hours for income and was reaching her limit when she hired a coach who forced a reckoning: if entrepreneurship isn't buying you time, money, and freedom, what's the point? So she made the brutal choice of cutting consulting contracts and burning the bridge to the “safety” of hourly work, and effectively gave herself a mulligan.

This is the classic founder pivot: you have to choose between growth that keeps you doing the work and growth that scales the business without you. Jennifer's reset wasn't pretty. She lost everything, and for a while she and her son lived in an office, but it bought her the permission to build something salable, not just sustainable. Agency owners who feel trapped in delivery need to remember that sometimes you have to give up short-term revenue to create long-term value.

Feeling Trapped by the Agency and Becoming a CEO
In those first five years, Jennifer continued to run a business that started as supply chain consulting and eventually turned into sales supply chain consulting. This change meant the business was now a good lead generator for the agency, but it also meant Jennifer was essentially selling her image and her time. Until she ran out of time.

Once she felt trapped by the business, Jennifer hired a business coach who helped her change the model from “selling Jennifer with marketing on the side” to an actual sustainable business. She had to go back to the basics and remember that she, like every entrepreneur, started the business with the idea of having more time, money, and freedom. It took losing everything, but Jennifer knew she didn't want a lifestyle business; she wanted a sellable business. The antidote was delegation plus systems. If you want growth and a future exit, you need to own those CEO responsibilities and be comfortable with letting go of the day-to-day.

Hiring, Firing, and Resetting the Team
Jennifer's talent strategy has evolved with each stage of growth. Her early hires were the classic “friends, family, fools” bootstrap crew; later she invested in developers, content teams, project managers, and over time, more strategic hires like CFOs, a chief of staff, BI teams, and AI engineers. Each five-year arc brought a new set of needs and a new level of sophistication in hiring.

Now, she divides her time between promoting her agency's work in podcasts and content and thinking of ways to navigate her business in these volatile and exciting times. Her most recent addition was a technology and transformation team that is revisiting all of the agency's processes, investments, and infrastructure. As a result, she has downsized from more than 300 W2 employees and refocused the team.

The takeaway for agency owners: be honest about whether your people are builders or maintainers, and hire accordingly. The workforce you need for growth is not the same as the workforce you need for stable operations.

Building AI Agent Networks with Centralized Data
Jennifer's agency shifted from WordPress to Webflow and built agentic networks: hundreds of AI agents that crawl competitors, do strategy homework, and automate tasks that humans used to do. More importantly, they rebuilt infrastructure into a hub-and-spoke model with a centralized min.io data layer and ETL pipelines feeding analytics and BI.

Two big lessons here. One: invest in your tech stack deliberately so you're not a Frankenstein of five different platforms that don't talk to each other. Two: design your data architecture so your people (and your AI agents) have a single source of truth. That's how you get from fire-fighting in six dashboards to proactive, predictive signals that tell you when a client engagement needs attention.

AI, Reskilling, and Shrinking Middle Roles
Jennifer draws a hard line: the agency now tends to hire either very seasoned client-facing leaders or AI engineers; the middle is shrinking. With agentic networks giving junior staff “superpowers,” the agency can afford fewer mid-level “lever pullers.” At this level there's no room for slow execution or elementary work. That's a cultural and ethical challenge, both for hiring and for workforce development.

For agency owners, this raises practical HR questions: do you reskill your people, or replace them? Jennifer suggests building agent-driven systems that augment humans, and being brutally honest about who can grow into that future. It's also a call to action for how we prepare the next generation: schools won't teach this; companies will need to.

Playing with AI Platforms: Why Leaders Need to Just Know Enough to Be Dangerous
Jennifer started like a lot of agency owners dipping into AI, playing around on tools like n8n, Make.com, Relevance, and LangChain. Her dev team laughed, calling her an “elementary school kid on a tricycle,” but here's the point: she didn't need to master the tech. She needed to know enough to point her team in the right direction.

Instead of obsessing over code, she framed the problem differently: “Here's what I don't want a human doing anymore. Can you make that happen?” That mindset shift is key for agency owners. You don't need to be a full-stack AI engineer to lead an agency into the future; you just need to clearly define outcomes and invest in people who can deliver them.

Find Real AI Talent in Unlikely Places
This is where most agencies get stuck. You're not going to find your next AI architect on Upwork. Jennifer leaned on her network, starting with her cousin Chris, a hardcore developer who initially thought AI platforms were “rookie business.” Once Chris realized the power of agentic networks to scale his expertise, he became the backbone of CI Web Group's transformation.

Now, she hunts talent in unconventional places: hackathons, LinkedIn, and especially YouTube. Forget the flashy “10x growth hack” videos; she looks for nerds with four views, geeking out about orchestrators and ETL pipelines. Those are the builders who care about solving real problems, not just building hype. Her tip: if you find one, reach out immediately. They don't want sales, they just want to build.

Designing AI Agents Like an Agency Org Chart
Jennifer compares AI agents to a company org chart. You don't hire one person to do everything; that's a recipe for burnout. Same thing with AI. Each agent should focus tightly on a single task, with checks, auditors, and orchestrators overseeing the system.

The payoff was massive efficiency gains. Instead of six different platforms that don't talk, her agency built a centralized hub with min.io, ClickHouse, and AI layers on top. That's how you go from patchwork automation to true predictive intelligence.

The Real Cost of AI Talent
If you're wondering how much this all costs, the answer is… a lot. On the high end, seasoned AI engineers can run you a quarter million in salary. On the low end, Jennifer tests new hires on project-based sprints, maybe $6K for a 10-hour challenge. The point isn't to cut costs; it's to prove quickly who can deliver and who can't.

Her recruiting process is brutal but effective: give candidates a project and a tight deadline, and see how they perform. If they stall, they're out. If they screen-share fast and solve problems live, they're in. No fluff, no endless interviews.

Do You Want to Transform Your Agency from a Liability to an Asset?
Looking to dig deeper into your agency's potential? Check out our Agency Blueprint. Designed for agency owners like you, our Agency Blueprint helps you uncover growth opportunities, tackle obstacles, and craft a customized blueprint for your agency's success.

#214 Data from Spotify & Co: Architecture of a Scalable API Data Pipeline

Engineering Kiosk

Sep 23, 2025 59:13

How would you build ... Open Podcasts? Architecture and design discussion, round two.

Monolith or microservices? Python or Go? Who actually dreams at night of the perfect ETL stack? As a software developer you know the situation: data from dozens of sources, capricious APIs, security concerns, and the wish for a scalable, clean architecture. Question after question, and plenty of possible paths. Which one is "the right one"?

That is exactly the scenario we tackle: with "Open Podcast", Wolfi has built a real project that brings together analytics data from platforms like Spotify, Apple, and others. Want to know how to crack distributed APIs, harmonize data, secure backups, and avoid keeping your credentials as an Excel sheet on your desktop? Join us for our architecture deep dive! Andy is interviewed and challenged step by step on how he, as an engineer, would solve this problem creatively, from API strategy and message queues to security and scaling. Along the way you'll learn everything important about the advantages of open source, databases (PostgreSQL, Clickhouse), backups, monitoring, and DevOps. All of it garnished with learnings from real-world practice.

You can find our current advertising partners at https://engineeringkiosk.dev/partners

Quick feedback on the episode:

Berlin Buzzwords 2025 Conference Interviews

DataTalks.Club

Sep 12, 2025 67:42

At Berlin Buzzwords, industry voices highlighted how search is evolving with AI and LLMs.

- Kacper Łukawski (Qdrant) stressed hybrid search (semantic + keyword) as core for RAG systems and promoted efficient embedding models for smaller-scale use.
- Manish Gill (ClickHouse) discussed auto-scaling OLAP databases on Kubernetes, combining infrastructure and database knowledge.
- André Charton (Kleinanzeigen) reflected on scaling search for millions of classifieds, moving from Solr/Elasticsearch toward vector search, while returning to a hands-on technical role.
- Filip Makraduli (Superlinked) introduced a vector-first framework that fuses multiple encoders into one representation for nuanced e-commerce and recommendation search.
- Brian Goldin (Voyager Search) emphasized spatial context in retrieval, combining geospatial data with AI enrichment to add the “where” to search.
- Atita Arora (Voyager Search) highlighted geospatial AI models, the renewed importance of retrieval in RAG, and the cautious but promising rise of AI agents.

Together, their perspectives show a common thread: search is regaining center stage in AI, with scaling, hybridization, multimodality, and domain-specific enrichment shaping the next generation of retrieval systems.

Kacper Łukawski
Senior Developer Advocate at Qdrant, he educates users on vector and hybrid search. He highlighted Qdrant's support for dense and sparse vectors, the role of search with LLMs, and his interest in cost-effective models like static embeddings for smaller companies and edge apps. Connect: https://www.linkedin.com/in/kacperlukawski/

Manish Gill
Engineering Manager at ClickHouse, he spoke about running ClickHouse on Kubernetes, tackling auto-scaling and stateful sets. His team focuses on making ClickHouse scale automatically in the cloud. He credited its speed to careful engineering and reflected on the shift from IC to manager. Connect: https://www.linkedin.com/in/manishgill/

André Charton
Head of Search at Kleinanzeigen, he discussed shaping the company's search tech, moving from Solr to Elasticsearch and now vector search with Vespa. Kleinanzeigen handles 60M items, 1M new listings daily, and 50k requests/sec. André explained his career shift back to hands-on engineering. Connect: https://www.linkedin.com/in/andrecharton/

Filip Makraduli
Founding ML DevRel engineer at Superlinked, an open-source framework for AI search and recommendations. Its vector-first approach fuses multiple encoders (text, images, structured fields) into composite vectors for single-shot retrieval. His Berlin Buzzwords demo showed e-commerce search with natural-language queries and filters. Connect: https://www.linkedin.com/in/filipmakraduli/

Brian Goldin
Founder and CEO of Voyager Search, which began with geospatial search and expanded into documents and metadata enrichment. Voyager indexes spatial data and enriches pipelines with NLP, OCR, and AI models to detect entities like oil spills or windmills. He stressed adding spatial context (“the where”) as critical for search and highlighted Voyager's 12 years of enterprise experience. Connect: https://www.linkedin.com/in/brian-goldin-04170a1/

Atita Arora
Director of AI at Voyager Search, with nearly 20 years in retrieval systems, now focused on geospatial AI for Earth observation data. At Berlin Buzzwords she hosted sessions, attended talks on Lucene, GPUs, and Solr, and emphasized retrieval quality in RAG systems. She is cautiously optimistic about AI agents and values the event as both learning hub and professional reunion. Connect: https://www.linkedin.com/in/atitaarora/

When not to use Postgres

Postgres FM

Sep 5, 2025 46:17

Nik and Michael discuss when not to use Postgres, specifically use cases where it still makes sense to store data in another system.

Here are some links to things they mentioned:
Just use Postgres (blog post by Ethan McCue) https://mccue.dev/pages/8-16-24-just-use-postgres
Just Use Postgres for Everything (blog post by Stephan Schmidt) https://www.amazingcto.com/postgres-for-everything
Real-time analytics episode https://postgres.fm/episodes/real-time-analytics
Crunchy Data Joins Snowflake https://www.crunchydata.com/blog/crunchy-data-joins-snowflake
Two sizes fit most: PostgreSQL and Clickhouse (blog post by Sid Sijbrandij) https://about.gitlab.com/blog/two-sizes-fit-most-postgresql-and-clickhouse
pg_duckdb episode https://postgres.fm/episodes/pg_duckdb
Cloudberry https://github.com/apache/cloudberry
Time-series considerations episode https://postgres.fm/episodes/time-series-considerations
Queues in Postgres episode https://postgres.fm/episodes/queues-in-postgres
Large Objects https://www.postgresql.org/docs/current/largeobjects.html
PGlite https://pglite.dev
ParadeDB https://www.paradedb.com
ZomboDB https://github.com/zombodb/zombodb
turbopuffer https://turbopuffer.com
HNSW vs. DiskANN (blog post by Haziqa Sajid) https://www.tigerdata.com/learn/hnsw-vs-diskann
SPANN: Highly-efficient Billion-scale Approximate Nearest Neighbor Search (paper) https://www.microsoft.com/en-us/research/wp-content/uploads/2021/11/SPANN_finalversion1.pdf
Amazon S3 Vectors https://aws.amazon.com/s3/features/vectors
Iterative Index Scans added to pgvector in 0.8.0 https://github.com/pgvector/pgvector/issues/678
S3 FDW from Supabase https://github.com/supabase/wrappers/tree/main/wrappers/src/fdw/s3_fdw

~~~

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

~~~

Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai

With credit to:
Jessie Draws for the elephant artwork


ClickStack: ClickHouse's New Observability Stack Unveiled - OpenObservability Talks S6E03

OpenObservability Talks

Aug 30, 2025 59:11

The ClickHouse open source project has gained interest in the observability community, thanks to its outstanding performance benchmarks. Now ClickHouse is doubling down on observability with the release of ClickStack, a new open source observability stack that bundles ClickHouse, OpenTelemetry, and the HyperDX frontend. I invited Mike Shi, the co-founder of HyperDX and co-creator of ClickStack, to tell us all about this new project. Mike is Head of Observability at ClickHouse, and brings prior observability experience with Elasticsearch and more.

You can read the recap post: https://medium.com/p/73f129a179a3/

Show Notes:
00:00 episode and guest intro
04:38 taking the open source path as an entrepreneur
10:51 the HyperDX observability user experience
16:08 challenges in implementing observability directly on ClickHouse
20:03 intro to ClickStack and incorporating OpenTelemetry
32:35 balancing simplicity and flexibility
36:15 SQL vs. Lucene query languages
39:06 performance, cardinality and the new JSON type
52:14 use cases in production by OpenAI, Anthropic, Tesla and more
55:38 episode outro

Resources:
HyperDX https://github.com/hyperdxio/hyperdx
ClickStack https://clickhouse.com/docs/use-cases/observability/clickstack
Shopify's Journey to Planet-Scale Observability: https://medium.com/p/9c0b299a04dd
ClickHouse: Breaking the Speed Limit for Observability and Analytics https://medium.com/p/2004160b2f5e
New JSON data type for ClickHouse: https://clickhouse.com/blog/a-new-powerful-json-data-type-for-clickhouse

Socials:
BlueSky: https://bsky.app/profile/openobservability.bsky.social
Twitter: https://twitter.com/OpenObserv
LinkedIn: https://www.linkedin.com/company/openobservability/
YouTube: https://www.youtube.com/@openobservabilitytalks

Dotan Horovits
============
Twitter: @horovits
LinkedIn: www.linkedin.com/in/horovits
Mastodon: @horovits@fosstodon
BlueSky: @horovits.bsky.social

Mike Shi
=======
Twitter: https://x.com/MikeShi42
LinkedIn: https://www.linkedin.com/in/mikeshi42
BlueSky: https://bsky.app/profile/mikeshi42.bsky.social

OpenObservability Talks episodes are released monthly, on the last Thursday of each month, and are available for listening on your favorite podcast app and on YouTube.


Nebius: The Nvidia-Backed AI Stock You've Probably Never Heard Of

The Finimize Podcast

Jul 30, 2025 14:38

AI is driving a gold rush in compute power. And Nebius is quietly selling the picks and shovels.

In today's episode, Finimize Analyst Russell Burns joins the pod to walk us through why this under-the-radar AI infrastructure firm might be one of the most overlooked growth stories out there. It's lean, it's scaling fast, it's tight with Nvidia, and its stake in ClickHouse alone could be worth a chunk of its market cap.


180: István Mészáros: Merging web and product analytics on top of the warehouse with a zero-copy architecture

Humans of Martech

Jul 29, 2025 59:15

What's up everyone, today we have the pleasure of sitting down with István Mészáros, Founder and CEO of Mitzu.io.

(00:00) - Intro
(01:00) - In This Episode
(03:39) - How Warehouse Native Analytics Works
(06:54) - BI vs Analytics vs Measurement vs Attribution
(09:26) - Merging Web and Product Analytics With a Zero-Copy Architecture
(14:53) - Feature or New Category? What Warehouse Native Really Means For Marketers
(23:23) - How Decoupling Storage and Compute Lowers Analytics Costs
(29:11) - How Composable CDPs Work with Lean Data Teams
(34:32) - How Seat-Based Pricing Works in Warehouse Native Analytics
(40:00) - What a Data Warehouse Does That Your CRM Never Will
(42:12) - How AI-Assisted SQL Generation Works Without Breaking Trust
(50:55) - How Warehouse Native Analytics Works
(52:58) - How To Navigate Founder Burnout While Raising Kids

Summary: István built a warehouse-native analytics layer that lets teams define metrics once, query them directly, and skip the messy syncs across five tools trying to guess what “active user” means. Instead of fighting over numbers, teams walk through SQL together, clean up logic, and move faster. One customer dropped their bill from $500K to $1K just by switching to seat-based pricing. István shares how AI helps, but only if you still understand the data underneath. This conversation shows what happens when marketing, product, and data finally work off the same source without second-guessing every report.

About István

István is the Founder and CEO of Mitzu.io, a warehouse-native product analytics platform built for modern data stacks like Snowflake, Databricks, BigQuery, Redshift, Athena, Postgres, ClickHouse, and Trino. Before launching Mitzu.io in 2023, he spent over a decade leading high-scale data engineering efforts at companies like Shapr3D and Skyscanner. At Shapr3D, he defined the long-term data strategy and built self-serve analytics infrastructure. At Skyscanner, he progressed from building backend systems serving millions of users to leading data engineering and analytics teams. Earlier in his career, he developed real-time diagnostic and control systems for the Large Hadron Collider at CERN.

How Warehouse Native Analytics Works

Marketing tools like Mixpanel, Amplitude, and GA4 create their own versions of your customer. Each one captures data slightly differently, labels users in its own format, and forces you to guess how their identity stitching works. The warehouse-native model removes this overhead by putting all customer data into a central location before anything else happens. That means your data warehouse becomes the only source of truth, not just another system to reconcile.

István explained the difference in blunt terms. “The data you're using is owned by you,” he said. That includes behavioral events, transactional logs, support tickets, email interactions, and product usage data. When everything lands in one place first (BigQuery, Redshift, Snowflake, Databricks) you get to define the logic. No more retrofitting vendor tools to work with messy exports or waiting for their UI to catch up with your question.

In smaller teams, especially B2C startups, the benefits hit early. Without a shared warehouse, you get five tools trying to guess what an active user means. With a warehouse-native setup, you define that metric once and reuse it everywhere. You can query it in SQL, schedule your campaigns off it, and sync it with downstream tools like Customer.io or Braze. That way you can work faster, align across functions, and stop arguing about whose numbers are right.

“You do most of the work in the warehouse for all the things you want to do in marketing,” István said. “That includes measurement, attribution, segmentation, everything starts from that central point.”

Centralizing your stack also changes how your data team operates. Instead of reacting to reporting issues or chasing down inconsistent UTM strings, they build shared models the whole org can trust. Marketing ops gets reliable metrics, product teams get context, and leadership gets reports that actually match what customers are doing. Nobody wins when your attribution logic lives in a fragile dashboard that breaks every other week.

Key takeaway: Warehouse native analytics gives you full control over customer data by letting you define core metrics once in your warehouse and reuse them everywhere else. That way you can avoid double-counting, reduce tool drift, and build a stable foundation that aligns marketing, product, and data teams. Store first, define once, activate wherever you want.

BI vs Analytics vs Measurement vs Attribution

Business intelligence means static dashboards. Not flexible. Not exploratory. Just there, like laminated truth. István described it as the place where the data expert's word becomes law. The dashboards are already built, the metrics are already defined, and any changes require a help ticket. BI exists to make sure everyone sees the same numbers, even if nobody knows exactly how they were calculated.

Analytics lives one level below that, and it behaves very differently. It is messy, curious, and closer to the raw data. Analytics splits into two tracks: the version done by data professionals who build robust models with SQL and dbt, and the version done by non-technical teams poking around in self-serve tools. Those non-technical users rarely want to define warehouse logic from scratch. They want fast answers from big datasets without calling in reinforcements.

“We used to call what we did self-service BI, because the word analytics didn't resonate,” István said. “But everyone was using it for product and marketing analytics. So we changed the copy.”

The difference between analytics and BI has nothing to do with what the tool looks like. It has everything to do with who gets to use it and how. If only one person controls the dashboard, that is BI. If your whole team can dig into campaign performance, break down cohorts, and explore feature usage trends without waiting for data engineering, that is analytics. Attribution, ML, and forecasting live on top of both layers. They depend on the raw data underneath, and they are only useful if the definitions below them hold up.

Language often lags behind how tools are actually used. István saw this firsthand. The product stayed the same, but the positioning changed. People used Mitzu for product analytics and marketing performance, so that became the headline. Not because it was a trend, but because that is what users were doing anyway.

Key takeaway: BI centralizes truth through fixed dashboards, while analytics creates motion by giving more people access to raw data. When teams treat BI as the source of agreement and analytics as the source of discovery, they stop fighting over metrics and start asking better questions. That way you can maintain trusted dashboards for executive reporting and still empower teams to explore data without filing tickets or waiting days for answers.

Merging Web and Product Analytics With a Zero-Copy Architecture

Most teams trying to replace GA4 end up layering more tools onto the same mess. They drop in Amplitude or Mixpanel for product analytics, keep something else for marketing attribution, and sync everything into a CDP that now needs babysitting. Eventually, they start building one-off pipelines just to feed the same events into six different systems, all chasing slightly different answers to the same question.

István sees this fragmentation as a byproduct of treating product and marketing analytics as separate functions. In categorie...
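The "define once, reuse everywhere" idea from the warehouse-native discussion above can be sketched in a few lines. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the warehouse, and the `events` table, the `active_users` view, and the rule used for "active" (any event on or after 2025-07-01) are hypothetical names and choices, not anything Mitzu or the episode specifies.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for a real warehouse
# (BigQuery, Snowflake, ClickHouse, ...). Table and view names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (user_id TEXT, event_name TEXT, event_date TEXT);
INSERT INTO events VALUES
  ('u1', 'page_view', '2025-07-01'),
  ('u1', 'purchase',  '2025-07-02'),
  ('u2', 'page_view', '2025-07-02'),
  ('u3', 'page_view', '2025-05-15');

-- The single shared definition of "active user" (hypothetical rule:
-- at least one event on or after 2025-07-01).
CREATE VIEW active_users AS
SELECT DISTINCT user_id FROM events WHERE event_date >= '2025-07-01';
""")


def active_user_count(conn):
    """Reporting use: count users according to the shared definition."""
    return conn.execute("SELECT COUNT(*) FROM active_users").fetchone()[0]


def active_purchasers(conn):
    """Segmentation use: reuse the same definition instead of re-deriving it."""
    rows = conn.execute("""
        SELECT DISTINCT e.user_id
        FROM events e
        JOIN active_users a ON a.user_id = e.user_id
        WHERE e.event_name = 'purchase'
        ORDER BY e.user_id
    """).fetchall()
    return [row[0] for row in rows]


print(active_user_count(conn))
print(active_purchasers(conn))
```

Both functions read from the same view, so changing what "active" means is a one-line change in one place rather than a reconciliation exercise across tools.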

E177: RunReveal's Anti SIEM SIEM Platform (With AI That Actually Works!)

Open Source Startup Podcast

Jul 8, 2025 43:33

Alan Braithwaite is Co-Founder & CTO of RunReveal, the security data platform with real-time monitoring, built-in detections, and AI-powered investigations. Today, they manage and analyze security logs for teams at Harvey, ClickHouse, Cloudflare, and Temporal. RunReveal has multiple open source projects including event stream processing library kawa and query language pql. RunReveal has raised from investors including Costanoa, Modern Technical Fund, and Runtime Ventures.

In this episode, we dig into:
* Why today's modern security teams are rethinking data management
* The benefits of building RunReveal on ClickHouse
* How they worked with early believers / customers like Temporal
* Their open source strategy and building trust with the community through open sourcing components like their event processing library
* Their MCP server and enabling security teams to use AI to automate investigations (including the launch of their new remote MCP server)

ai co founders platform cto temporal actually works cloudflare mcp siem clickhouse

Linux Undercover - Episode 505

DevZen Podcast

Jul 4, 2025 97:35

In this episode: vibe coding and moving to Linux, agents, and Bloom filters in ClickHouse.
[00:01:18] What we learned this week
* Using Bloom filter indexes for real-time text search in ClickHouse
* PSI - Pressure Stall Information - The Linux Kernel documentation
[00:30:46] Sasha switched to Linux
[00:51:41] Project Vend: Can Claude run a small shop? (And…

linux linux kernel clickhouse

EP228 SIEM in 2025: Still Hard? Reimagining Detection at Cloud Scale and with More Pipelines

Cloud Security Podcast by Google

Jun 2, 2025 27:09

Guest: Alan Braithwaite, Co-founder and CTO @ RunReveal

Topics:
* SIEM is hard, and many vendors have discovered this over the years. You need to get storage, security and integration complexity just right. You also need to be better than incumbents. How would you approach this now?
* Decoupled SIEM vs SIEM/EDR/XDR combo. These point in the opposite directions, which side do you think will win?
* In a world where data volumes are exploding, especially in cloud environments, you're building a SIEM with ClickHouse as its backend, focusing on both parsed and raw logs. What's the core advantage of this approach, and how does it address the limitations of traditional SIEMs in handling scale?
* Cribl, Bindplane and “security pipeline vendors” are all the rage. Won't it be logical to just include this into a modern SIEM?
* You're envisioning a 'Pipeline QL' that compiles to SQL, enabling 'detection in SQL.' This sounds like a significant shift, and perhaps not to the better? (Anton is horrified, for once) How does this approach affect detection engineering?
* With Sigma HQ support out-of-the-box, and the ability to convert SPL to Sigma, you're clearly aiming for interoperability. How crucial is this approach in your vision, and how do you see it benefiting the security community?
* What is SIEM in 2025 and beyond? What's the endgame for security telemetry data? Is this truly SIEM 3.0, 4.0 or whatever-oh?

Resources:
* EP197 SIEM (Decoupled or Not), and Security Data Lakes: A Google SecOps Perspective
* EP123 The Good, the Bad, and the Epic of Threat Detection at Scale with Panther
* EP190 Unraveling the Security Data Fabric: Need, Benefits, and Futures
* “20 Years of SIEM: Celebrating My Dubious Anniversary” blog
* “RSA 2025: AI's Promise vs. Security's Past - A Reality Check” blog
* tl;dr security newsletter
* Introducing a RunReveal Model Context Protocol Server!
* MCP: Building Your SecOps AI Ecosystem
* AI Runbooks for Google SecOps: Security Operations with Model Context Protocol
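To make the "pipeline language that compiles to SQL" topic above concrete, here is a toy sketch. Everything in it is an assumption for illustration: the stage names (`source`, `where`, `project`, `limit`), the `auth_logs` table, and the detection rule are hypothetical, and this is not the syntax or implementation of RunReveal's pql or of any real product. SQLite stands in for the backing store.

```python
import sqlite3


def compile_pipeline(stages):
    """Compile a list of (op, arg) stages into a single SQL SELECT string.

    Hypothetical mini-language for illustration only. Conditions are pasted
    into the SQL verbatim, so a real compiler would need proper parsing and
    parameter binding to avoid injection.
    """
    table, columns, conditions, limit = None, "*", [], None
    for op, arg in stages:
        if op == "source":
            table = arg
        elif op == "where":
            conditions.append(arg)
        elif op == "project":
            columns = ", ".join(arg)
        elif op == "limit":
            limit = int(arg)
        else:
            raise ValueError(f"unknown stage: {op}")
    if table is None:
        raise ValueError("pipeline needs a source stage")
    sql = f"SELECT {columns} FROM {table}"
    if conditions:
        sql += " WHERE " + " AND ".join(f"({c})" for c in conditions)
    if limit is not None:
        sql += f" LIMIT {limit}"
    return sql


# A hypothetical detection: repeated failed logins.
detection = [
    ("source", "auth_logs"),
    ("where", "outcome = 'failure'"),
    ("where", "attempts >= 5"),
    ("project", ["user_name", "src_ip"]),
    ("limit", 10),
]
sql = compile_pipeline(detection)
print(sql)

# Run the compiled SQL against a tiny in-memory table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE auth_logs (user_name TEXT, src_ip TEXT, outcome TEXT, attempts INTEGER);
INSERT INTO auth_logs VALUES
  ('alice', '10.0.0.1', 'failure', 7),
  ('bob',   '10.0.0.2', 'success', 1),
  ('carol', '10.0.0.3', 'failure', 2);
""")
hits = conn.execute(sql).fetchall()
print(hits)
```

The design point is that the detection itself ends up as plain SQL, which the database can run directly; the pipeline form is only a more readable front end.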

ai benefits security epic scale cloud futures won anton reimagining detection sigma pipelines sql rsa siem spl threat detection siems clickhouse

The PHP Podcast: 2025.05.29

php[podcast] episodes from php[architect]

May 30, 2025 64:54

This week on the PHP Podcast, Eric, John, and special guest Scott Keck-Warren talk about PHP Tek 2025 Wrap Up, NativePHP, Ethics in Web Development, and more…

Links from the show:
* Security Starts With Developer Enablement: Lessons From PHP TEK 2025
* The Bucket | Post PHP[TEK] Reflections
* Slightly Caffeinated | PHPTek, AI workshop, and ClickHouse […]

ai ethics wrap up web development clickhouse

ClickHouse: Breaking the Speed Limit for Observability and Analytics - OpenObservability Talks S5E12

OpenObservability Talks

May 27, 2025 58:27

The ClickHouse® project is a rising star in observability and analytics, challenging performance conventions with its breakneck speed. This open source OLAP column store, originally developed at Yandex to power their web analytics platform at massive scale, has quickly evolved into one of the hottest open source observability data stores around. Its published performance benchmarks have been the topic of conversation, outperforming many legacy databases and setting a new bar for fast queries over large volumes of data.

Our guest for this episode is Robert Hodges, CEO of Altinity, the second largest contributor to the ClickHouse project. With over 30 years of experience in databases, Robert brings deep insights into how ClickHouse is challenging legacy databases at scale. We'll also explore Altinity's just-launched groundbreaking open source project, Project Antalya, which extends ClickHouse with Apache Iceberg shared storage, unlocking dramatic improvements in both performance and cost efficiency. Think 90% reductions in storage costs and 10 to 100x faster queries, all without requiring any changes to your existing applications.

The episode was live-streamed on 20 May 2025 and the video is available at https://www.youtube.com/watch?v=VeyTL2JlWp0

You can read the recap post: https://medium.com/p/2004160b2f5e/

OpenObservability Talks episodes are released monthly, on the last Thursday of each month and are available for listening on your favorite podcast app and on YouTube.

We live-stream the episodes on Twitch and YouTube Live - tune in to see us live, and chime in with your comments and questions on the live chat.
https://www.youtube.com/@openobservabilitytalks
https://www.twitch.tv/openobservability

Show Notes:
00:00 - Intro
01:38 - ClickHouse elevator pitch
02:46 - guest intro
04:48 - ClickHouse under the hood
08:15 - SQL and the database evolution path
11:20 - the return of SQL
16:13 - design for speed
17:14 - use cases for ClickHouse
19:18 - ClickHouse ecosystem
22:22 - ClickHouse on Kubernetes
31:45 - know how ClickHouse works inside to get the most out of it
38:59 - ClickHouse for Observability
46:58 - Project Antalya
55:03 - Kubernetes 1.33 release
55:32 - OpenSearch 3.0 release
56:01 - New Permissive License for ML Models Announced by the Linux Foundation
57:08 - Outro

Resources:
ClickHouse on GitHub: https://github.com/ClickHouse/ClickHouse
Shopify's Journey to Planet-Scale Observability: https://medium.com/p/9c0b299a04dd
Project Antalya: https://altinity.com/blog/getting-started-with-altinitys-project-antalya
https://cmtops.dev/posts/building-observability-with-clickhouse/
Kubernetes 1.33 release highlights: https://www.linkedin.com/feed/update/urn:li:activity:7321054742174924800/
New Permissive License for Machine Learning Models Announced by the Linux Foundation: https://www.linkedin.com/feed/update/urn:li:share:7331046183244611584
Opensearch 3.0 major release: https://www.linkedin.com/posts/horovits_opensearch-activity-7325834736008880128-kCqr

Socials:
Twitter: https://twitter.com/OpenObserv
YouTube: https://www.youtube.com/@openobservabilitytalks

Dotan Horovits
============
X (Twitter): @horovits
LinkedIn: www.linkedin.com/in/horovits
Mastodon: @horovits@fosstodon
BlueSky: @horovits.bsky.social

Robert Hodges
=============
LinkedIn: https://www.linkedin.com/in/berkeleybob2105/

ceo twitch analytics github sql kubernetes speed limits yandex observability linux foundation olap clickhouse apache iceberg

Rethinking Workplace Connection in a Remote World with Marina Farthouat

HRchat Podcast

May 27, 2025 18:49

In this episode, Bill Banham talks with Marina Farthouat, the new Vice President, People at Oyster - the global employment platform that enables companies to hire, pay, and care for distributed teams. Marina brings a refreshing perspective to the HRchat Show about transforming workplace norms through remote and distributed teams. Drawing from her diverse background spanning investment banking to startups, Marina shares why her most connected workplace experiences have consistently been in remote organizations.

"Talent doesn't have a nationality," Marina asserts, challenging traditional location-based hiring approaches. She makes a compelling case for distributed teams as both a talent strategy and a resilience measure. When recent blackouts hit Spain and Portugal, Oyster's globally distributed workforce demonstrated exactly this kind of operational continuity, shifting activities seamlessly to unaffected regions.

What makes Marina's perspective particularly valuable is her holistic view of remote work benefits. Beyond the usual flexibility talking points, she highlights how remote arrangements positively impact families, especially children who no longer lose parents to long commutes. She dismantles the myth that remote workers feel disconnected, explaining how digital platforms actually create more egalitarian access to leadership and information than traditional office environments where proximity matters.

The conversation tackles economic concerns head-on, addressing fears that global hiring simply shifts jobs to lower-cost regions. Marina offers a more nuanced view: global talent acquisition isn't about replacement but expansion and resilience. She emphasizes that startups particularly benefit from access to diverse talent pools while managing burn rates effectively.

By the way, if you enjoy this conversation and want to learn more about Marina's team and some of the people challenges they tackle, check out episode 338 with Oyster Co-founder Jack Mardack.

Marina leads the company's people strategy with a focus on building a human-centric, inclusive, and sustainable culture across a global workforce.

She brings a wealth of experience leading People functions in remote organizations, most recently at ClickHouse and Elastic. Marina has developed deep expertise in employee engagement and organizational development, as well as in building strong cultures and scaling global com

leadership vice president spain drawing jobs talent portugal workplace remote rethinking recruitment oyster feedspot elastic remote workers employer brand hr podcast clickhouse bill banham listen score

244: Postgres to ClickHouse: Simplifying the Modern Data Stack with Aaron Katz & Sai Krishna Srirampur

The Data Stack Show

May 20, 2025 34:51

Highlights from this week's conversation include:
* Background of ClickHouse (1:14)
* PostgreSQL Data Replication Tool (3:19)
* Emerging Technologies Observations (7:25)
* Observability and Market Dynamics (11:26)
* Product Development Challenges (12:39)
* Challenges with PostgreSQL Performance (15:30)
* Philosophy of Open Source (18:01)
* Open Source Advantages (22:56)
* Simplified Stack Vision (24:48)
* End-to-End Use Cases (28:13)
* Migration Strategies (30:21)
* Final Thoughts and Takeaways (33:29)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

challenges philosophy takeaways final thoughts simplifying open source krishna cdp observability postgres clickhouse modern data stack aaron katz rudderstack

The PRQL: Data Migration Made Easy: Postgres, ClickHouse, and the Future of Analytics with Aaron Katz and Sai Krishna Srirampur

The Data Stack Show

May 19, 2025 5:47


analytics krishna cdp postgres data migration clickhouse aaron katz rudderstack

Shopify's Journey to Planet-Scale Observability - OpenObservability Talks S5E09

OpenObservability Talks

Feb 27, 2025 60:24

Shopify operates at massive scale, running thousands of services and processing billions of events per second. To tackle the challenges of observability at this scale, they built Observe, an in-house observability stack that makes use of open-source tools and specifications. In fact, they replaced an older vendor-based system in an awe-inspiring migration project. But why build their own stack? Which open source tools did they use? How did they shape the user experience to their needs?

Joining us to unpack Shopify's journey is Elijah McPherson, an engineering leader with deep expertise in observability and distributed systems. Elijah led the complete rebuild of Shopify's observability stack and now also oversees jobs, caching, search, and ClickHouse infrastructure. Tune in to hear firsthand insights from one of the most innovative purpose-built observability implementations in production today!

The episode was live-streamed on 11 February 2025 and the video is available at https://www.youtube.com/watch?v=rBfTjlXKJW0

OpenObservability Talks episodes are released monthly, on the last Thursday of each month and are available for listening on your favorite podcast app and on YouTube.

We live-stream the episodes on Twitch and YouTube Live - tune in to see us live, and chime in with your comments and questions on the live chat.
https://www.youtube.com/@openobservabilitytalks
https://www.twitch.tv/openobservability

Show Notes:
00:46 - Episode and guest intro
03:43 - Why rebuild the observability stack in house
05:47 - Cost and vendor lock-in
07:09 - Tailoring observability for the organizational processes
10:27 - How to build a team to build in-house observability
13:37 - The importance of product sense in internal platforms
18:05 - The functionality of Shopify's observability platform
25:15 - The Open Source stack used at Shopify observability
29:50 - Extending open source Grafana to Shopify's needs
36:23 - Adopting open standards
42:26 - observability into business health
45:16 - how to run a migration project for a live production platform
53:15 - final tips and best practices
56:41 - which organizations should develop in-house observability

Resources:
Episode: Scaling Platform Engineering: Shopify's Blueprint: https://medium.com/p/f18e97140681
Shopify Observe - lectures: https://www.linkedin.com/posts/elijahmcpherson_observe-activity-7258195493657223168-mOGS/

Socials:
Twitter: https://twitter.com/OpenObserv
YouTube: https://www.youtube.com/@openobservabilitytalks

Dotan Horovits
============
Twitter: @horovits
LinkedIn: www.linkedin.com/in/horovits
Mastodon: @horovits@fosstodon
BlueSky: @horovits.bsky.social

Elijah McPherson
===============
Twitter: https://twitter.com/ElijahMcPherson
LinkedIn: https://www.linkedin.com/in/elijahmcpherson/

cost twitch planet scale blueprint shopify adopting open source observe extending tailoring observability grafana clickhouse

Programmers Quickie

Feb 16, 2025 13:21

big data metrics devops grafana clickhouse cardinality

Agent Engineering with Pydantic + Graphs — with Samuel Colvin

Latent Space: The AI Engineer Podcast - CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Feb 6, 2025 64:04

Did you know that adding a simple Code Interpreter took o3 from 9.2% to 32% on FrontierMath? The Latent Space crew is hosting a hack night Feb 11th in San Francisco focused on CodeGen use cases, co-hosted with E2B and Edge AGI; watch E2B's new workshop and RSVP here!

We're happy to announce that today's guest Samuel Colvin will be teaching his very first Pydantic AI workshop at the newly announced AI Engineer NYC Workshops day on Feb 22! 25 tickets left.

If you're a Python developer, it's very likely that you've heard of Pydantic. Every month, it's downloaded >300,000,000 times, making it one of the top 25 PyPI packages. OpenAI uses it in its SDK for structured outputs, it's at the core of FastAPI, and if you've followed our AI Engineer Summit conference, Jason Liu of Instructor has given two great talks about it: “Pydantic is all you need” and “Pydantic is STILL all you need”. Now, Samuel Colvin has raised $17M from Sequoia to turn Pydantic from an open source project to a full stack AI engineer platform with Logfire, their observability platform, and PydanticAI, their new agent framework.

Logfire: bringing OTEL to AI

OpenTelemetry recently merged Semantic Conventions for LLM workloads, which provide standard definitions to track performance, like gen_ai.server.time_per_output_token. In Sam's view at least 80% of new apps being built today have some sort of LLM usage in them, and just like web observability platforms got replaced by cloud-first ones in the 2010s, Logfire wants to do the same for AI-first apps. If you're interested in the technical details, Logfire migrated away from ClickHouse to DataFusion for their backend. We spent some time on the importance of picking open source tools you understand and that you can actually contribute to upstream, rather than the more popular ones; listen in ~43:19 for that part.

Agents are the killer app for graphs

Pydantic AI is their attempt at taking a lot of the learnings that LangChain and the other early LLM frameworks had, and putting Python best practices into it. At an API level, it's very similar to the other libraries: you can call LLMs, create agents, do function calling, do evals, etc.

They define an “Agent” as a container with a system prompt, tools, structured result, and an LLM. Under the hood, each Agent is now a graph of function calls that can orchestrate multi-step LLM interactions. You can start simple, then move toward fully dynamic graph-based control flow if needed.

“We were compelled enough by graphs once we got them right that our agent implementation [...] is now actually a graph under the hood.”

Why Graphs?
* More natural for complex or multi-step AI workflows.
* Easy to visualize and debug with mermaid diagrams.
* Potential for distributed runs, or “waiting days” between steps in certain flows.

In parallel, you see folks like Emil Eifrem of Neo4j talk about GraphRAG as another place where graphs fit really well in the AI stack, so it might be time for more people to take them seriously.

Full Video Episode

Like and subscribe!

Chapters
* 00:00:00 Introductions
* 00:00:24 Origins of Pydantic
* 00:05:28 Pydantic's AI moment
* 00:08:05 Why build a new agents framework?
* 00:10:17 Overview of Pydantic AI
* 00:12:33 Becoming a believer in graphs
* 00:24:02 God Model vs Compound AI Systems
* 00:28:13 Why not build an LLM gateway?
* 00:31:39 Programmatic testing vs live evals
* 00:35:51 Using OpenTelemetry for AI traces
* 00:43:19 Why they don't use Clickhouse
* 00:48:34 Competing in the observability space
* 00:50:41 Licensing decisions for Pydantic and LogFire
* 00:51:48 Building Pydantic.run
* 00:55:24 Marimo and the future of Jupyter notebooks
* 00:57:44 London's AI scene

Show Notes
* Sam Colvin
* Pydantic
* Pydantic AI
* Logfire
* Pydantic.run
* Zod
* E2B
* Arize
* Langsmith
* Marimo
* Prefect
* GLA (Google Generative Language API)
* OpenTelemetry
* Jason Liu
* Sebastian Ramirez
* Bogomil Balkansky
* Hood Chatham
* Jeremy Howard
* Andrew Lamb

Transcript

Alessio [00:00:03]: Hey, everyone. Welcome to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.

Swyx [00:00:12]: Good morning. And today we're very excited to have Sam Colvin join us from Pydantic AI. Welcome. Sam, I heard that Pydantic is all we need. Is that true?

Samuel [00:00:24]: I would say you might need Pydantic AI and Logfire as well, but it gets you a long way, that's for sure.

Swyx [00:00:29]: Pydantic almost basically needs no introduction. It's almost 300 million downloads in December. And obviously, in the previous podcasts and discussions we've had with Jason Liu, he's been a big fan and promoter of Pydantic and AI.

Samuel [00:00:45]: Yeah, it's weird because obviously I didn't create Pydantic originally for uses in AI, it predates LLMs. But it's like we've been lucky that it's been picked up by that community and used so widely.

Swyx [00:00:58]: Actually, maybe we'll hear it right from you: what is Pydantic, and maybe a little bit of the origin story?

Samuel [00:01:04]: The best name for it, which is not quite right, is a validation library. And we get some tension around that name because it doesn't just do validation, it will do coercion by default. We now have strict mode, so you can disable that coercion. But by default, if you say you want an integer field and you pass in the string "123", it will convert it to the integer 123, and do a bunch of other sensible conversions. And as you can imagine, the semantics around exactly when you convert and when you don't are complicated, but because of that, it's more than just validation.
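The coercion-versus-strict distinction Samuel describes above can be illustrated with a small standard-library sketch. To be clear about the assumptions: this is not Pydantic's implementation or API. The `validate` helper, the `ValidationError` class, and the `User` model are hypothetical stand-ins written for this example, and they only handle simple hints such as `int` and `str`; Pydantic's real conversion rules are considerably richer.

```python
from dataclasses import dataclass
from typing import get_type_hints


class ValidationError(ValueError):
    """Raised when a value cannot be accepted for a field (hypothetical helper)."""


def validate(cls, data, strict=False):
    """Build `cls` from a dict, using the class's type hints as the schema.

    Lax mode (the default) tries to coerce a value by calling the hinted type;
    strict mode only accepts values that already have exactly that type.
    Sketch only: handles plain types like int, float and str.
    """
    values = {}
    for name, expected in get_type_hints(cls).items():
        if name not in data:
            raise ValidationError(f"{name}: field required")
        raw = data[name]
        if type(raw) is expected:
            values[name] = raw
        elif strict:
            raise ValidationError(
                f"{name}: expected {expected.__name__}, got {type(raw).__name__}"
            )
        else:
            try:
                values[name] = expected(raw)  # e.g. int("123") -> 123
            except (TypeError, ValueError) as exc:
                raise ValidationError(f"{name}: {exc}") from exc
    return cls(**values)


@dataclass
class User:
    id: int
    name: str


# Lax mode: the string "123" is coerced to the integer 123.
print(validate(User, {"id": "123", "name": "Sam"}))

# Strict mode: the same input is rejected instead of converted.
try:
    validate(User, {"id": "123", "name": "Sam"}, strict=True)
except ValidationError as exc:
    print("strict mode rejected:", exc)
```

The type hints on `User` are the only schema definition here, which is the design choice the transcript goes on to discuss.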
Back in 2017, when I first started it, the different thing it was doing was using type hints to define your schema. That was controversial at the time. It was genuinely disapproved of by some people. I think the success of Pydantic and libraries like FastAPI that build on top of it means that today that's no longer controversial in Python. And indeed, lots of other people have copied that route, but yeah, it's a data validation library. It uses type hints for the for the most part and obviously does all the other stuff you want, like serialization on top of that. But yeah, that's the core.Alessio [00:02:06]: Do you have any fun stories on how JSON schemas ended up being kind of like the structure output standard for LLMs? And were you involved in any of these discussions? Because I know OpenAI was, you know, one of the early adopters. So did they reach out to you? Was there kind of like a structure output console in open source that people were talking about or was it just a random?Samuel [00:02:26]: No, very much not. So I originally. Didn't implement JSON schema inside Pydantic and then Sebastian, Sebastian Ramirez, FastAPI came along and like the first I ever heard of him was over a weekend. I got like 50 emails from him or 50 like emails as he was committing to Pydantic, adding JSON schema long pre version one. So the reason it was added was for OpenAPI, which is obviously closely akin to JSON schema. And then, yeah, I don't know why it was JSON that got picked up and used by OpenAI. It was obviously very convenient for us. That's because it meant that not only can you do the validation, but because Pydantic will generate you the JSON schema, it will it kind of can be one source of source of truth for structured outputs and tools.Swyx [00:03:09]: Before we dive in further on the on the AI side of things, something I'm mildly curious about, obviously, there's Zod in JavaScript land. 
Every now and then there is a new sort of in vogue validation library that that takes over for quite a few years and then maybe like some something else comes along. Is Pydantic? Is it done like the core Pydantic?Samuel [00:03:30]: I've just come off a call where we were redesigning some of the internal bits. There will be a v3 at some point, which will not break people's code half as much as v2 as in v2 was the was the massive rewrite into Rust, but also fixing all the stuff that was broken back from like version zero point something that we didn't fix in v1 because it was a side project. We have plans to move some of the basically store the data in Rust types after validation. Not completely. So we're still working to design the Pythonic version of it, in order for it to be able to convert into Python types. So then if you were doing like validation and then serialization, you would never have to go via a Python type we reckon that can give us somewhere between three and five times another three to five times speed up. That's probably the biggest thing. Also, like changing how easy it is to basically extend Pydantic and define how particular types, like for example, NumPy arrays are validated and serialized. But there's also stuff going on. And for example, Jitter, the JSON library in Rust that does the JSON parsing, has SIMD implementation at the moment only for AMD64. So we can add that. We need to go and add SIMD for other instruction sets. So there's a bunch more we can do on performance. I don't think we're going to go and revolutionize Pydantic, but it's going to continue to get faster, continue, hopefully, to allow people to do more advanced things. We might add a binary format like CBOR for serialization for when you'll just want to put the data into a database and probably load it again from Pydantic. 
So there are some things that will come along, but for the most part, it should just get faster and cleaner.

Alessio [00:05:04]: From a focus perspective, I guess, as a founder too, how did you think about the AI interest rising? And then how do you kind of prioritize, okay, this is worth going into more, and we'll talk about Pydantic AI and all of that. What was maybe your early experience with LLMs, and when did you figure out, okay, this is something we should take seriously and focus more resources on it?

Samuel [00:05:28]: I'll answer that, but I'll answer what I think is a kind of parallel question, which is that Pydantic's weird, because Pydantic existed, obviously, before I was starting a company. I was working on it in my spare time, and then at the beginning of 2022, I started working on the rewrite in Rust. And I worked on it full-time for a year and a half, and then once we started the company, people came and joined. And it was a weird project, because that would never get signed off inside a startup. Like, we're going to go off and three engineers are going to work full-on for a year in Python and Rust, writing like 30,000 lines of Rust, just to release a free, open-source Python library. The result of that has been excellent for us as a company, right? As in, it's made us remain entirely relevant. And it's like, Pydantic is not just used in the SDKs of all of the AI libraries, but, I can't say which one, but one of the big foundational model companies, when they upgraded from Pydantic v1 to v2, their number one internal metric of performance is time to first token, and that went down by 20%. So you think about all of the actual AI going on inside, and yet at least 20% of the CPU, or at least the latency inside requests, was actually Pydantic, which shows how widely it's used. So we've benefited from doing that work, although it would never have made financial sense in most companies. 
In answer to your question about how do we prioritize AI, I mean, the honest truth is we've spent a lot of the last year and a half building good general purpose observability inside Logfire and making Pydantic good for general purpose use cases. And the AI has kind of come to us. Like, not that we want to get away from it, but the appetite, both in Pydantic and in Logfire, to go and build with AI is enormous, because it kind of makes sense, right? Like, if you're starting a new greenfield project in Python today, what's the chance that you're using GenAI? 80%, let's say, globally. Obviously it's like a hundred percent in California, but even worldwide, it's probably 80%. And so everyone needs that stuff. And there's so much yet to be figured out, so much space to do things better in the ecosystem, in a way that, like, to go and implement a database that's better than Postgres is a Sisyphean task. Whereas building tools that are better for GenAI than some of the stuff that's about now is not very difficult. Putting the actual models themselves to one side.

Alessio [00:07:40]: And then at the same time, you released Pydantic AI recently, which is, you know, an agent framework. Early on, I would say everybody, like LangChain, had Pydantic as kind of first-class support; a lot of these frameworks were trying to use you to be better. What was the decision behind "we should do our own framework"? Were there any design decisions that you disagree with, any workloads that you think people didn't support well?

Samuel [00:08:05]: It wasn't so much like design and workflow, although I think there were some things we've done differently. I think looking in general at the ecosystem of agent frameworks, the engineering quality is far below that of the rest of the Python ecosystem. 
There's a bunch of stuff that we have learned how to do over the last 20 years of building Python libraries and writing Python code that seems to be abandoned by people when they build agent frameworks. Now I can kind of respect that, particularly in the very first agent frameworks, like LangChain, where they were literally figuring out how to go and do this stuff. It's completely understandable that you would basically skip some stuff.

Samuel [00:08:42]: I'm shocked by the quality of some of the agent frameworks that have come out recently from well-respected names, which just seems to be opportunism, and I have little time for that. But the early ones, I think they were just figuring out how to do stuff, and just as lots of people have learned from Pydantic, we were able to learn a bit from them. I think the gap we saw, and the thing we were frustrated by, was the production readiness. And that means things like type checking, even if type checking makes it hard. Like Pydantic AI, I will put my hand up now and say it has a lot of generics, and it's probably easier to use it if you've written a bit of Rust and you really understand generics. We're not claiming that that makes it the easiest thing to use in all cases; we think it makes it good for production applications in big systems, where type checking is a no-brainer in Python. But there is also a bunch of stuff we've learned from maintaining Pydantic over the years that we've gone and done. So every single example in Pydantic AI's documentation is run in Python as part of tests, and every single print output within an example is checked during tests. So it will always be up to date. 
And then a bunch of things that, like I say, are standard best practice within the rest of the Python ecosystem, but are, surprisingly, not followed by some AI libraries, like coverage, linting, type checking, et cetera, et cetera, where I think these are no-brainers, but weirdly they're not followed by some of the other libraries.

Alessio [00:10:04]: And can you just give an overview of the framework itself? I think there's kind of like the LLM calling frameworks, there are the multi-agent frameworks, there's the workflow frameworks. Like, what does Pydantic AI do?

Samuel [00:10:17]: I glaze over a bit when I hear all of the different sorts of frameworks, and I will tell you, when I built Pydantic, when I built Logfire and when I built Pydantic AI, my methodology is not to go and research and review all of the other things. I kind of work out what I want and I go and build it, and then feedback comes and we adjust. So the fundamental building block of Pydantic AI is agents. The exact definition of agents and how you want to define them is obviously ambiguous, and our things are probably sort of agentlets, not that we would want to go and rename them to agentlets, but the point is you probably build them together to build something that most people will call an agent. So an agent in our case has, you know, things like a prompt, like a system prompt, and some tools and a structured return type if you want it. That covers the vast majority of cases. There are situations where you want to go further, and the most complex workflows where you want graphs, and I resisted graphs for quite a while. I was sort of of the opinion you didn't need them and you could use standard Python flow control to do all of that stuff. I had a few arguments with people, but I basically came around to, yeah, I can totally see why graphs are useful. 
But then we have the problem that by default, they're not type safe, because if you have like an add edge method where you give the names of two different edges, there's no type checking, right? Not all the graph libraries are AI specific. So there's a graph library called [inaudible], and it does like a basic runtime type checking, ironically using Pydantic to try and make up for the fact that fundamentally graphs are not type safe. Well, I like Pydantic, but that's not a real solution, to have to go and run the code to see if it's safe. There's a reason that static type checking is so powerful. And so, from a lot of iteration, we eventually came up with a system of using, normally, data classes to define nodes, where you return the next node you want to call, and where we're able to go and introspect the return type of a node to basically build the graph. And so the graph is, yeah, inherently type safe. And once we got that right, I'm incredibly excited about graphs. I think there's like masses of use cases for them, both in gen AI and other development, but also software's all going to have to interact with gen AI, right? It's going to be like web. There's no longer a web department in a company; it's just that all the developers are building for web, building with databases. The same is going to be true for gen AI.

Alessio [00:12:33]: Yeah. I see in your docs you call an agent a container that contains a system prompt, function tools, a structured result type, a dependency type, a model, and then model settings. Are the graphs, in your mind, different agents? Are they different prompts for the same agent? What are the structures in your mind?

Samuel [00:12:52]: So we were compelled enough by graphs once we got them right, that we actually merged the PR this morning. 
That means our agent implementation, without changing its API at all, is now actually a graph under the hood, as it is built using our graph library. So graphs are basically a lower level tool that allows you to build these complex workflows. Our agents are technically one of the many graphs you could go and build. And we just happened to build that one for you because it's a very common one. But obviously there are cases where you need more complex workflows, where the current agent assumptions don't work. And that's where you can then go and use graphs to build more complex things.

Swyx [00:13:29]: You said you were cynical about graphs. What changed your mind specifically?

Samuel [00:13:33]: I guess people kept giving me examples of things that they wanted to use graphs for. And my "yeah, but you could do that in standard flow control in Python" became a less and less compelling argument to me, because I've maintained those systems that end up with spaghetti code. And I could see the appeal of this structured way of defining the workflow of my code. And it's really neat that just from your code, just from your type hints, you can get out a mermaid diagram that defines exactly what can go and happen.

Swyx [00:14:00]: Right. Yeah. You do have a very neat implementation of sort of inferring the graph from type hints, I guess is what I would call it. I think the question always is... I have gone back and forth. I used to work at Temporal, where we would actually spend a lot of time complaining about graph-based workflow solutions like AWS Step Functions. And we would actually say that we were better because you could use normal control flow that you already knew and worked with. Yours, I guess, is like a little bit of a nice compromise. Like, it looks like normal Pythonic code. But you just have to keep in mind what the type hints actually mean. 
And that's what we do with the quote unquote magic that the graph construction does.

Samuel [00:14:42]: Yeah, exactly. And if you look at the internal logic of actually running a graph, it's incredibly simple. It's basically call a node, get a node back, call that node, get a node back, call that node. If you get an end, you're done. We will add in soon support for, well, basically storage, so that you can store the state between each node that's run. And then the idea is you can then distribute the graph and run it across computers. And also, I mean, the other bit that's really valuable is across time. Because it's all very well if you look at lots of the graph examples that, like, Claude will give you. If it gives you an example, it gives you this lovely enormous mermaid chart of the workflow, for example, managing returns if you're an e-commerce company. But what you realize is some of those lines are literally one function calling another function. And some of those lines are "wait six days for the customer to print their piece of paper and put it in the post". And if you're writing your demo project or your proof of concept, that's fine, because you can just say, and now we call this function. But when you're building in real life, that doesn't work. And now how do we manage that concept, to basically be able to start somewhere else in our code? Well, this graph implementation makes it incredibly easy, because you just pass the node that is the start point for carrying on the graph, and it continues to run. 
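To make the "call a node, get a node back" loop and the return-type introspection concrete, here is a minimal standard-library sketch. It is not Pydantic AI's graph API: every class and function name below (`End`, `StartReturn`, `successors`, `run_graph`, and so on) is hypothetical, and the returns workflow is just the e-commerce example from the conversation reduced to three nodes.

```python
from dataclasses import dataclass
from typing import Union, get_args, get_origin, get_type_hints


@dataclass
class End:
    """Terminal marker: when a node returns this, the run is over."""
    result: str


@dataclass
class IssueRefund:
    order_id: str

    def run(self) -> End:
        return End(f"refunded {self.order_id}")


@dataclass
class WaitForParcel:
    order_id: str
    arrived: bool = False

    def run(self) -> Union[IssueRefund, End]:
        # In real life this is the "wait six days" step; here it is a flag.
        if self.arrived:
            return IssueRefund(self.order_id)
        return End(f"still waiting for {self.order_id}")


@dataclass
class StartReturn:
    order_id: str

    def run(self) -> WaitForParcel:
        return WaitForParcel(self.order_id)


def successors(node_cls):
    """Read a node's outgoing edges from the return annotation of `run`."""
    hint = get_type_hints(node_cls.run)["return"]
    options = get_args(hint) if get_origin(hint) is Union else (hint,)
    return [cls.__name__ for cls in options]


def run_graph(node, max_steps=10):
    """Call a node, get a node back, and repeat until an End comes back."""
    history = []
    for _ in range(max_steps):
        history.append(type(node).__name__)
        node = node.run()
        if isinstance(node, End):
            return node.result, history
    raise RuntimeError("step budget exceeded")


edges = successors(WaitForParcel)
first_result, first_history = run_graph(StartReturn("A1"))
# Later (conceptually, days later) the run is resumed from the waiting node.
resumed_result, resumed_history = run_graph(WaitForParcel("A1", arrived=True))
```

Because the edges come from ordinary return annotations, a static type checker can flag a node that returns something it did not declare, and resuming is nothing more than starting the loop from a different node. The `max_steps` guard is one simple way for the caller to break out of a run that has gone too far.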
So it's things like that where I was like, yeah, I can just imagine how things I've done in the past would be fundamentally easier to understand if we had done them with graphs.

Swyx [00:16:07]: You say imagine, but like right now, can Pydantic AI actually resume, you know, six days later, like you said, or is this just like a theoretical thing we can do someday?

Samuel [00:16:16]: I think it's basically Q&A. So there's an AI that's asking the user a question, and effectively you then call the CLI again to continue the conversation. And it basically instantiates the node and calls the graph with that node again. Now, we don't have the logic yet for effectively storing state in the database between individual nodes; that we're going to add soon. But the rest of it is basically there.

Swyx [00:16:37]: It does make me think that not only are you competing with LangChain now, and obviously Instructor, but now you're going into sort of the more orchestrated things like Airflow, Prefect, Dagster, those guys.

Samuel [00:16:52]: Yeah, I mean, we're good friends with the Prefect guys, and Temporal have the same investors as us. And I'm sure that my investor Bogomil would not be too happy if I was like, oh, yeah, by the way, as well as trying to take on Datadog, we're also going off and trying to take on Temporal and everyone else doing that. Obviously, we're not doing all of the infrastructure of deploying that, right, yet, at least. We're, you know, we're just building a Python library. And what's crazy about our graph implementation is, sure, there's a bit of magic in introspecting the return type, you know, extracting things from unions, stuff like that. But the actual calls, as I say, is literally call a function and get back a thing and call that. It's incredibly simple and therefore easy to maintain. The question is, how useful is it? Well, I don't know yet. I think we have to go and find out. 
We've had a slew of people joining our Slack over the last few days and saying, tell me how good Pydantic AI is. How good is Pydantic AI versus LangChain? And I refuse to answer. That's your job to go and find that out. Not mine. We built a thing. I'm compelled by it, but I'm obviously biased. The ecosystem will work out what the useful tools are.

Swyx [00:17:52]: Bogomil was my board member when I was at Temporal. And I think, just generally, also having been a workflow engine investor and participant in this space, it's a big space. Like, everyone needs different functions. I think the one thing that I would say is that with yours, you know, as a library, you don't have that much control over the infrastructure. I do like the idea that each new agent, or whatever unit of work, whatever you call that, should spin up in this sort of isolated boundary. Whereas with yours, I think everything runs in the same process. But you ideally want it to sort of spin out its own little container of things.

Samuel [00:18:30]: I agree with you a hundred percent. And we will. It would work now, right, as in, in theory, as long as you can serialize the calls to the next node, all of the different containers basically just have to have the same code. I mean, I'm super excited about Cloudflare Workers running Python and being able to install dependencies. And if Cloudflare could only give me my invitation to the private beta of that, we would be exploring that right now, because I'm super excited about that as a compute level for some of this stuff, where, exactly what you're saying, basically, you can run everything as an individual worker function and distribute it. And it's resilient to failure, et cetera, et cetera.

Swyx [00:19:08]: And it spins up like a thousand instances simultaneously. You know, you want it to be sort of truly serverless at once. 
Actually, I know we have some Cloudflare friends who are listening, so hopefully they'll get you to the front of the line.

Samuel [00:19:19]: I was in Cloudflare's office last week shouting at them about other things that frustrate me. I have a love-hate relationship with Cloudflare. Their tech is awesome. But because I use it the whole time, I then get frustrated. So, yeah, I'm sure I will get there soon.

Swyx [00:19:32]: As a side tangent on Cloudflare: is Python fully supported? I actually wasn't fully aware of what the status of that thing is.

Samuel [00:19:39]: Yeah. So Pyodide, which is Python running inside the browser in Emscripten, is supported now by Cloudflare. They're having some struggles working out how to manage, ironically, dependencies that have binaries, in particular, Pydantic. Because with these workers, where you can have thousands of them on a given metal machine, you basically want to be able to have shared memory for all the different Pydantic installations, effectively. That's the thing they're working out. But Hood, who's my friend, who is the primary maintainer of Pyodide, works for Cloudflare. And that's basically what he's doing, working out how to get Python running on Cloudflare's network.

Swyx [00:20:19]: I mean, the nice thing is that your binary is really written in Rust, right? Which also compiles to WebAssembly. So maybe there's a way that you'd build... You have just a different build of Pydantic, and that ships with whatever your distro for Cloudflare Workers is.

Samuel [00:20:36]: Yes, that's exactly what... So Pyodide has builds for Pydantic Core and for things like NumPy and basically all of the popular binary libraries. And you're doing exactly that, right? You're using Rust to compile to WebAssembly, and then you're calling that shared library from Python. 
And it's unbelievably complicated, but it works.

Swyx [00:20:57]: Okay. Staying on graphs a little bit more, and then I wanted to go to some of the other features that you have in Pydantic AI. I see in your docs there are sort of four levels of agents. There's single agents, there's agent delegation, programmatic agent handoff, which seems to be what OpenAI Swarm would be like. And then the last one, graph-based control flow. Would you say that those are sort of the mental hierarchy of how these things go?

Samuel [00:21:21]: Yeah, roughly.

Swyx [00:21:22]: Okay. You had some expression around OpenAI Swarm.

Samuel [00:21:25]: Well, indeed, OpenAI have got in touch with me and basically, maybe I'm not supposed to say this, but basically said that Pydantic AI looks like what Swarm would become if it was production ready. So, yeah. I mean, which makes sense. In fact, it was specifically asking, how can we give people the same feeling that they were getting from Swarm, that led us to go and implement graphs. Because my "just call the next agent with Python code" was not a satisfactory answer to people. So it was like, okay, we've got to go and have a better answer for that. That's what led us to get to graphs.

Swyx [00:21:56]: I mean, it's a minimal viable graph in some sense. What are the shapes of graphs that people should know? So the way that I would phrase this is, I think Anthropic did a very good public service, and also a kind of surprisingly influential blog post, I would say, when they wrote Building Effective Agents. We actually have the authors coming to speak at my conference in New York, which I think you're giving a workshop at.

Samuel [00:22:24]: I'm trying to work it out. But yes, I think so.

Swyx [00:22:26]: Tell me if you're not. 
Yeah, I mean, that was the first, I think, authoritative view of what kinds of graphs exist in agents, and let's give each of them a name so that everyone is on the same page. So I'm just kind of curious if you have community names or top five patterns of graphs.

Samuel [00:22:44]: I don't have top five patterns of graphs. I would love to see what people are building with them. But it's only been a couple of weeks. And of course, the point is that, because they're relatively unopinionated about what you can go and do with them, you can go and do lots of things with them, but they don't have the structure to have specific names as much as perhaps some other systems do. I think what our agents are, which has a name and I can't remember what it is, is basically this system of decide what tool to call, go back to the center, decide what tool to call, go back to the center, and then exit. That's one form of graph, which, as I say, our agents are effectively one implementation of, which is why under the hood they are now using graphs. And it'll be interesting to see over the next few years whether we end up with these predefined graph names or graph structures, or whether it's just like, yep, I built a graph, or whether graphs just turn out not to match people's mental image of what they want and die away. We'll see.

Swyx [00:23:38]: I think there is always appeal. Every developer eventually gets graph religion and goes, oh, yeah, everything's a graph. And then they probably over-rotate and go too far into graphs. And then they have to learn a whole bunch of DSLs. And then they're like, actually, I didn't need that. I need this. And they scale back a little bit.

Samuel [00:23:55]: I'm at the beginning of that process. I'm currently a graph maximalist, although I haven't actually put any into production yet. 
But yeah.

Swyx [00:24:02]: This has a lot of philosophical connections with other work coming out of UC Berkeley on compound AI systems. I don't know if you know of it or care. This is the Gartner world of things, where they need some kind of industry terminology to sell it to enterprises. I don't know if you know about any of that.

Samuel [00:24:24]: I haven't. I probably should, because I should probably get better at selling to enterprises. But no, I don't. Not right now.

Swyx [00:24:29]: Really the argument is that instead of putting everything in one model, you have more control and maybe more observability if you break everything out into composing little models and chaining them together. And obviously, then you need an orchestration framework to do that.

Samuel [00:24:47]: Yeah. And it makes complete sense. And one of the things we've seen with agents is they work well when they work well. But when they don't, even if you have the observability through Logfire so that you can see what was going on, if you don't have a nice hook point to say, hang on, this has all gone wrong, you have a relatively blunt instrument of basically erroring when you exceed some kind of limit. What you need to be able to do is effectively iterate through these runs, so that you can have your own control flow where you're like, OK, we've gone too far. And that's where one of the neat things about our graph implementation is that you can basically call next in a loop rather than just running the full graph. And therefore, you have this opportunity to break out of it. But yeah, basically, it's the same point, which is, if you have too big a unit of work, to some extent, whether or not it involves gen AI, but obviously it's particularly problematic in gen AI, you only find out afterwards, when you've spent quite a lot of time and/or money, that it's gone off and done the wrong thing.

Swyx [00:25:39]: I'll drop one thing on this. 
We're not going to resolve this here, but I'll drop this and then we can move on to the next thing. This is the common way that we developers talk about this. And then the machine learning researchers look at us and laugh and say, that's cute. And then they just train a bigger model and they wipe us out in the next training run. So I think there's a certain amount of "we are fighting the bitter lesson" here. We're fighting AGI. And, you know, when AGI arrives, this will all go away. Obviously, on Latent Space, we don't really discuss that, because I think AGI is kind of this hand-wavy concept that isn't super relevant. But I think we have to respect that. For example, you could do a chain of thought with graphs, and you could manually orchestrate a nice little graph that does like reflect, think about if you need more inference time compute, you know, that's the hot term now, and then think again and, you know, scale that up. Or you could train Strawberry and DeepSeek R1. Right.

Samuel [00:26:32]: I saw someone saying recently, oh, they were really optimistic about agents because models are getting faster exponentially. And it took a certain amount of self-control not to point out that it wasn't exponential. But my main point was, if models are getting faster as quickly as you say they are, then we don't need agents and we don't really need any of these abstraction layers. We can just give our model, you know, access to the Internet, cross our fingers and hope for the best. Agents, agent frameworks, graphs, all of this stuff is basically making up for the fact that right now the models are not that clever. In the same way that if you're running a customer service business and you have loads of people sitting answering telephones, the less well trained they are, the less that you trust them, the more that you need to give them a script to go through. 
Whereas, you know, if you're running a bank and you have lots of customer service people who you don't trust that much, then you tell them exactly what to say. If you're doing high net worth banking, you just employ people who you think are going to be charming to other rich people and set them off to go and have coffee with people. Right. And the same is true of models. The more intelligent they are, the less we need to structure what they go and do and constrain the routes which they take.

Swyx [00:27:42]: Yeah. Agree with that. So I'm happy to move on. So the other parts of Pydantic AI that are worth commenting on, and this is like my last rant, I promise. So obviously, every framework needs to do its sort of model adapter layer, which is, oh, you can easily swap from OpenAI to Claude to Groq. You also have Google GLA, which I didn't really know about until I saw this in your docs, which is Generative Language API. I assume that's AI Studio?

Samuel [00:28:13]: Yes. Google don't have good names for it. So Vertex is very clear. That seems to be the API that some of the things use, although it returns 503 about 20% of the time. So... Vertex? No. Vertex, fine. But the... Oh. GLA. Yeah.

Swyx [00:28:28]: I agree with that.

Samuel [00:28:29]: So we have, again, another example of where I think we go the extra mile in terms of engineering: we run on every commit, at least every commit to main, tests against the live models. Not lots of tests, but a handful of them. And we had a point last week where, yeah, GLA is a little bit better, but GLA was failing every single run. One of their tests would fail. And I think we might even have commented out that one at the moment. So all of the models fail more often than you might expect, but that one seems to be particularly likely to fail. 
But Vertex is the same API, but much more reliable.

Swyx [00:29:01]: My rant here is that, you know, versions of this appear in LangChain, and every single framework has to have its own little version of that. I would put to you, and, you know, this can be agree to disagree, that this is not needed in Pydantic AI. I would much rather you adopt a layer like LiteLLM or, what's the other one in JavaScript, Portkey. And that's their job. They focus on that one thing and they normalize APIs for you. All new models are automatically added and you don't have to duplicate this inside of your framework. So for example, if I wanted to use DeepSeek, I'm out of luck because Pydantic AI doesn't have DeepSeek yet.

Samuel [00:29:38]: Yeah, it does.

Swyx [00:29:39]: Oh, it does. Okay. I'm sorry. But you know what I mean? Should this live in your code, or should it live in a layer that's kind of your API gateway, that's a defined piece of infrastructure that people have?

Samuel [00:29:49]: And I think if a company who are well known, who are respected by everyone, had come along and done this at the right time, maybe we should have done it a year and a half ago, and said, we're going to be the universal AI layer, that would have been a credible thing to do. I've heard varying reports of LiteLLM, is the truth. And it didn't seem to have exactly the type safety that we needed. Also, as I understand it, and again, I haven't looked into it in great detail, part of their business model is proxying the request through their own system to do the generalization. That would be an enormous put-off to an awful lot of people. Honestly, the truth is I don't think it is that much work unifying the models. I get where you're coming from. I kind of see your point. I think the truth is that everyone is centralizing around OpenAI's API; that is the one to do. So DeepSeek supports that. Groq with a Q supports that. Ollama also does it. 
I mean, if there is that library right now, it's more or less the OpenAI SDK. And it's very high quality. It's well type checked. It uses Pydantic. So I'm biased. But I mean, I think it's pretty well respected anyway.

Swyx [00:30:57]: There's different ways to do this. Because also, it's not just about normalizing the APIs. You have to do secret management and all that stuff.

Samuel [00:31:05]: Yeah. And there's also Vertex and Bedrock, which to one extent or another, effectively, host multiple models, but they don't unify the API. But they do unify the auth, as I understand it. Although we're halfway through doing Bedrock, so I don't know about it that well. But they're kind of weird hybrids, because they support multiple models, but, like I say, the auth is centralized.

Swyx [00:31:28]: Yeah, I'm surprised they don't unify the API. That seems like something that I would do. You know, we can discuss all this all day. There's a lot of APIs. I agree.

Samuel [00:31:36]: It would be nice if there was a universal one that we didn't have to go and build.

Alessio [00:31:39]: And I guess the other side of, you know, routing models and picking models is evals. How do you actually figure out which one you should be using? First of all, you have very good support for mocking in unit tests, which is something that a lot of other frameworks don't do. So, you know, my favorite Ruby library is VCR, because it just lets me store the HTTP requests and replay them. That part I'll kind of skip. I think you have this test model, where, just through Python, you try and figure out what the model might respond without actually calling the model. And then you have the function model, where people can kind of customize outputs. Any other fun stories maybe from there? Or is it just what you see is what you get, so to speak?

Samuel [00:32:18]: On those two, I think what you see is what you get. 
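As a rough illustration of the mocking idea Alessio describes, swapping the real model for a deterministic stand-in during unit tests, here is a plain-Python sketch. It deliberately does not use Pydantic AI's own test utilities; `ModelFn`, `triage`, and `make_fake_model` are hypothetical names chosen for this example.

```python
from typing import Callable, List

# In this sketch a "model" is any callable from prompt text to reply text.
ModelFn = Callable[[str], str]


def triage(ticket: str, model: ModelFn) -> str:
    """Ask the model for a one-word priority and normalise its answer."""
    reply = model(f"Reply with one word, 'high' or 'low', for this ticket: {ticket}")
    word = reply.strip().lower()
    return word if word in {"high", "low"} else "unknown"


def make_fake_model(calls: List[str]) -> ModelFn:
    """Build a deterministic stand-in that records prompts instead of calling an API."""
    def fake(prompt: str) -> str:
        calls.append(prompt)
        # Custom logic plays the role of the model's output.
        return " HIGH\n" if "outage" in prompt else "not sure"
    return fake


seen: List[str] = []
fake_model = make_fake_model(seen)
first = triage("Full outage in eu-west", fake_model)
second = triage("Typo on the pricing page", fake_model)
```

Because the application code only depends on the callable, the same `triage` function can be exercised in tests without network access, cost, or flaky responses, and the recorded prompts can be asserted on as well.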
On the evals, I think watch this space. I think it's something that, again, I was somewhat cynical about for some time. I still have my cynicism about some of it. Well, it's unfortunate that so many different things are called evals. It would be nice if we could agree what they are and what they're not. But look, I think it's a really important space. I think it's something that we're going to be working on soon, both in Pydantic AI and in LogFire, to try and support better, because it's an unsolved problem.

Alessio [00:32:45]: Yeah, you do say in your doc that anyone who claims to know for sure exactly how your eval should be defined can safely be ignored.

Samuel [00:32:52]: We'll delete that sentence when we tell people how to do their evals.

Alessio [00:32:56]: Exactly. I was like, we need a snapshot of this today. And so let's talk about evals. So there's kind of like the vibe evals, which is what you do when you're building, right? Because you cannot really test it that many times to get statistical significance. And then there's the production eval. So you also have LogFire, which is kind of like your observability product, which I tried before. It's very nice. What are some of the learnings you've had from building an observability tool for LLMs? And as people think about evals, what are the right things to measure? What is the right number of samples that you need to actually start making decisions?

Samuel [00:33:33]: I'm not the best person to answer that, is the truth. So I'm not going to come in here and tell you that I think I know the answer on the exact number. I mean, we can do some back-of-the-envelope statistics calculations to work out that having 30 probably gets you most of the statistical value of having 200 for, you know, by definition, 15% of the work. But the exact, like, how many examples do you need?
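One way to read the back-of-the-envelope estimate above is through the standard error of a measured pass rate, which shrinks with the square root of the sample size. The sketch below assumes each eval case is an independent pass/fail trial with a true pass rate of 0.7 (an arbitrary illustrative value, not a figure from the conversation) and compares 30 cases with 200.

```python
import math


def pass_rate_standard_error(p: float, n: int) -> float:
    """Standard error of an observed pass rate over n independent pass/fail cases."""
    return math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    p = 0.7  # assumed true pass rate, purely illustrative
    for n in (30, 200):
        se = pass_rate_standard_error(p, n)
        # A rough 95% interval is the observed rate plus or minus 1.96 * se.
        print(f"n={n:3d}  standard error={se:.3f}  95% half-width={1.96 * se:.3f}")
    print(f"share of the work: {30 / 200:.0%}")
```

Under these assumptions, going from 30 to 200 cases multiplies the work by about 6.7 while narrowing the interval by a factor of sqrt(200/30), roughly 2.6, which is one way to see the diminishing returns.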
For example, that's a much harder question to answer because it's, you know, deep within how models operate. In terms of LogFire, one of the reasons we built LogFire the way we have, where we allow you to write SQL directly against your data and we're trying to build the powerful fundamentals of observability, is precisely because we know we don't know the answers. And so allowing people to go and innovate on how they're going to consume that stuff and how they're going to process it is, we think, valuable. Because even if we come along and offer you an evals framework on top of LogFire, it won't be right in all regards. And we want people to be able to go and innovate, and being able to write their own SQL connected to the API, and effectively query the data like it's a database, allows people to innovate on that stuff. And that's what allows us to do it as well. I mean, we do a bunch of testing of what's possible by basically writing SQL directly against LogFire as any user could. I think the other really interesting bit that's going on in observability is OpenTelemetry centralizing around semantic attributes for GenAI. So it's a relatively new project. A lot of it's still being added at the moment. But basically the idea is that they unify how both SDKs and agent frameworks send observability data to any OpenTelemetry endpoint. And so, again, having that unification allows us to go and basically compare different libraries and compare different models much better. That stuff's in a very early stage of development. One of the things we're going to be working on pretty soon is, I suspect, that Pydantic AI will be the first agent framework that implements those semantic attributes properly. Because, again, we control it and we can say this is important for observability, whereas most of the other agent frameworks are not maintained by people who are trying to do observability.
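As an illustration of querying observability data "like it's a database", the sketch below uses Python's built-in sqlite3 with a made-up `spans` table. The table name, columns, and rows are hypothetical and are not LogFire's actual schema; the point is only that an ordinary SQL aggregate can answer a question such as "which model is slowest on average?".

```python
import sqlite3

# Hypothetical span records: (span name, model, duration in milliseconds).
ROWS = [
    ("chat", "model-a", 820.0),
    ("chat", "model-a", 1180.0),
    ("chat", "model-b", 400.0),
    ("chat", "model-b", 600.0),
    ("db.query", None, 12.0),
]


def average_duration_by_model(rows):
    """Return [(model, avg_ms, calls)] for spans that have a model, slowest first."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE spans (name TEXT, model TEXT, duration_ms REAL)")
        conn.executemany("INSERT INTO spans VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            """
            SELECT model, AVG(duration_ms) AS avg_ms, COUNT(*) AS calls
            FROM spans
            WHERE model IS NOT NULL
            GROUP BY model
            ORDER BY avg_ms DESC
            """
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for model, avg_ms, calls in average_duration_by_model(ROWS):
        print(model, avg_ms, calls)
```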
With the exception of LangChain, where they have the observability platform, but they chose not to go down the OpenTelemetry route. So they're plowing their own furrow. And, you know, they're even further away from standardization.

Alessio [00:35:51]: Can you maybe just give a quick overview of how OTel ties into the AI workflows? There's kind of the question of what is, you know, a trace, and is a span like an LLM call? Is it the agent? Is it the broader thing you're tracking? How should people think about it?

Samuel [00:36:06]: Yeah, so they have a PR that I think may have now been merged from someone at IBM talking about remote agents and trying to support this concept of remote agents within GenAI. I'm not particularly compelled by that because I don't think that's by any means the common use case. But I suppose it's fine for it to be there. The majority of the stuff in OTel is basically defining how you would instrument a given call to an LLM. So basically the actual LLM call, what data you would send to your telemetry provider, how you would structure that. Apart from this slightly odd stuff on remote agents, most of the agent-level consideration is not yet decided, effectively. And so there's a bit of ambiguity. Obviously, what's good about OTel is you can in the end send whatever attributes you like. But yeah, there's quite a lot of churn in that space and exactly how we store the data. I think that one of the most interesting things, though, is that if you think about observability traditionally, sure, everyone would say our observability data is very important, we must keep it safe. But actually, companies work very hard to basically not have anything that sensitive in their observability data. So if you're a doctor in a hospital and you search for a drug for an STI, the SQL might be sent to the observability provider. But none of the parameters would.
It wouldn't have the patient number or their name or the drug. With GenAI, that distinction doesn't exist because it's all just messed up in the text. If you have that same patient asking an LLM what drug they should take or how to stop smoking, you can't extract the PII and not send it to the observability platform. So the sensitivity of the data that's going to end up in observability platforms is going to be basically a different order of magnitude to what you would normally send to Datadog. Of course, you can make a mistake and send someone's password or their card number to Datadog. But that would be seen as a mistake. Whereas in GenAI, a lot of data is going to be sent. And I think that's why companies like LangSmith are trying hard to offer observability on prem, because there's a bunch of companies who are happy for Datadog to be cloud hosted, but want self-hosting for this observability stuff with GenAI.

Alessio [00:38:09]: And are you doing any of that today? Because I know in each of the spans you have the number of tokens, you have the context, you're just storing everything. And then you're going to offer kind of like self-hosting for the platform, basically. Yeah. Yeah.

Samuel [00:38:23]: So we have scrubbing roughly equivalent to what the other observability platforms have. So, you know, if we see password as the key, we won't send the value. But like I said, that doesn't really work in GenAI. So we're accepting we're going to have to store a lot of data, and then we'll offer self-hosting for those people who can afford it and who need it.

Alessio [00:38:42]: And then this is, I think, the first time that most of the workload's performance depends on a third party. You know, if you're looking at Datadog data, usually it's your app that is driving the latency and the memory usage and all of that.
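The limitation Samuel describes can be sketched in a few lines. The function below is a hypothetical key-based scrubber (not LogFire's implementation, and the key pattern is an assumption): it redacts values whose key looks sensitive, which works for structured attributes but leaves free-text prompts untouched.

```python
import re

# Assumed pattern for sensitive-looking keys, for illustration only.
SENSITIVE_KEY = re.compile(r"password|secret|card_number", re.IGNORECASE)


def scrub(attributes: dict) -> dict:
    """Redact values whose key name looks sensitive; leave everything else as is."""
    return {
        key: "[REDACTED]" if SENSITIVE_KEY.search(key) else value
        for key, value in attributes.items()
    }


if __name__ == "__main__":
    structured = {"user_id": 42, "password": "hunter2"}
    print(scrub(structured))

    # The sensitive detail sits inside the prompt text, under a harmless key,
    # so key-based scrubbing cannot catch it.
    genai = {"prompt": "I am patient 1234. What drug should I take?"}
    print(scrub(genai))
```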
Here you're going to have spans that maybe take a long time to perform because the GLA API is not working or because OpenAI is kind of overwhelmed. Do you do anything there, since the provider is almost the same across customers? You know, are you trying to surface these things for people and say, hey, this was a very slow span, but actually all customers using OpenAI right now are seeing the same thing, so maybe don't worry about it?

Samuel [00:39:20]: Not yet. We do a few things that people don't generally do in OTel. So we send information at the beginning of a trace, sorry, at the beginning of a span, as well as when it finishes. By default, OTel only sends you data when the span finishes. So if you think about a request which might take like 20 seconds, even if some of the intermediate spans finished earlier, you can't basically place them on the page until you get the top-level span. And so if you're using standard OTel, you can't show anything until those requests are finished. When those requests are taking a few hundred milliseconds, it doesn't really matter. But when you're doing GenAI calls or when you're running a batch job that might take 30 minutes, that latency of not being able to see the span is crippling to understanding your application. And so we do a bunch of slightly complex stuff to basically send data about a span as it starts, which is closely related. Yeah.

Alessio [00:40:09]: Any thoughts on all the other people trying to build on top of OpenTelemetry in different languages, too? There's the OpenLLMetry project, which doesn't really roll off the tongue. But how do you see the future of these kinds of tools? Is everybody going to have to build? Why does everybody want to build?
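A rough sketch of the span-start idea, using a toy in-memory exporter rather than the real OpenTelemetry SDK: an exporter that only reports on span end hears about a child before its parent, while a start-aware one also emits a record when each span opens, so a UI could draw a long-running parent before it finishes. All class and field names here are hypothetical.

```python
import time
from contextlib import contextmanager


class ToyExporter:
    """Collects records in memory; stands in for a telemetry backend."""

    def __init__(self, send_on_start: bool):
        self.send_on_start = send_on_start
        self.records = []

    @contextmanager
    def span(self, name: str):
        start = time.monotonic()
        if self.send_on_start:
            # Emit a provisional record so the span is visible while it runs.
            self.records.append({"name": name, "event": "start"})
        try:
            yield
        finally:
            self.records.append(
                {"name": name, "event": "end", "duration_s": time.monotonic() - start}
            )


def run(exporter: ToyExporter) -> list:
    """Run a parent span with one child; return the event order the backend sees."""
    with exporter.span("request"):
        with exporter.span("llm_call"):
            pass
    return [(record["name"], record["event"]) for record in exporter.records]


if __name__ == "__main__":
    print(run(ToyExporter(send_on_start=False)))
    print(run(ToyExporter(send_on_start=True)))
```

In the end-only case the backend first learns about `request` after everything inside it has completed, which is the delay described above for requests that take seconds or minutes.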
They want to build their own open source observability thing to then sell?

Samuel [00:40:29]: I mean, we are not going off and trying to instrument the likes of the OpenAI SDK with the new semantic attributes, because at some point that's going to happen and it's going to live inside OTel, and we might help with it. But we're a tiny team. We don't have time to go and do all of that work. So OpenLLMetry is an interesting project. But I suspect eventually most of that instrumentation of the big SDKs will live, like I say, inside the main OpenTelemetry repo, I suppose. What happens to the agent frameworks, and what data you basically need at the framework level to get the context, is kind of unclear. I don't think we know the answer yet. But I mean, I guess this is kind of semi-public, because I was on the OpenTelemetry call last week talking about GenAI. And there was someone from Arize talking about the challenges they have trying to get OpenTelemetry data out of LangChain, where it's not natively implemented. And obviously they're having quite a tough time. And I was realizing, I hadn't really realized this before, how lucky we are to primarily be talking about our own agent framework, where we have the control, rather than trying to go and instrument other people's.

Swyx [00:41:36]: Sorry, I actually didn't know about this semantic conventions thing. It looks like, yeah, it's merged into main OTel. What should people know about this? I had never heard of it before.

Samuel [00:41:45]: Yeah, I think it looks like a great start. I think there are some unknowns around how you send the messages that go back and forth, which is kind of the most important part. It's the most important thing of all. And that has moved out of attributes and into OTel events. OTel events in turn are moving from being on a span to being their own top-level API where you send data. So there's a bunch of churn still going on.
I'm impressed by how fast the OTel community is moving on this project. I guess they, like everyone else, get that this is important, and it's something that people are crying out to get instrumentation of. So I'm kind of pleasantly surprised at how fast they're moving, but it makes sense.

Swyx [00:42:25]: I'm just kind of browsing through the specification. I can already see that this basically bakes in whatever the previous paradigm was. So now they have gen_ai.usage.prompt_tokens and gen_ai.usage.completion_tokens. And obviously now we have reasoning tokens as well. And then only one form of sampling, which is top-p. You're basically baking in or sort of reifying things that you think are important today, but it's not a super foolproof way of doing this for the future. Yeah.

Samuel [00:42:54]: I mean, that's what's neat about OTel: you can always go and send another attribute and that's fine. It's just there are a bunch that are agreed on. But I would say, you know, to come back to your previous point about whether or not we should be relying on one centralized abstraction layer, this stuff is moving so fast that if you start relying on someone else's standard, you risk basically falling behind because you're relying on someone else to keep things up to date.

Swyx [00:43:14]: Or you fall behind because you've got other things going on.

Samuel [00:43:17]: Yeah, yeah. That's fair. That's fair.

Swyx [00:43:19]: Any other observations just about building LogFire, actually? Let's just talk about this. So you announced LogFire. I was kind of only familiar with LogFire because of your Series A announcement. I actually thought you were making a separate company. I remember some amount of confusion with you when that came out. So to be clear, it's Pydantic LogFire, and the company is one company that has kind of two products, an open source thing and an observability thing, correct? Yeah. I was just kind of curious, like any learnings building LogFire?
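To illustrate the attribute discussion above, the sketch below represents one LLM call's span attributes as a plain dictionary. The two usage keys are the ones named in the conversation; the `gen_ai.request.top_p` key and the `app.reasoning_tokens` key are assumptions for illustration, and the latter is a made-up custom attribute showing how an extra value could sit alongside the agreed ones. The `llm_call_attributes` helper is hypothetical.

```python
from typing import Optional


def llm_call_attributes(
    prompt_tokens: int,
    completion_tokens: int,
    top_p: float,
    reasoning_tokens: Optional[int] = None,
) -> dict:
    """Build a flat attribute dictionary for a single LLM call span."""
    attrs = {
        "gen_ai.usage.prompt_tokens": prompt_tokens,
        "gen_ai.usage.completion_tokens": completion_tokens,
        "gen_ai.request.top_p": top_p,  # assumed key name
    }
    if reasoning_tokens is not None:
        # Hypothetical custom attribute, not part of any agreed convention.
        attrs["app.reasoning_tokens"] = reasoning_tokens
    return attrs


if __name__ == "__main__":
    print(llm_call_attributes(120, 48, 0.9))
    print(llm_call_attributes(120, 48, 0.9, reasoning_tokens=512))
```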
So classic question is, do you use ClickHouse? Is this like the standard persistence layer? Any learnings doing that?

Samuel [00:43:54]: We don't use ClickHouse. We started building our database with ClickHouse, moved off ClickHouse onto Timescale, which is a Postgres extension to do analytical databases. Wow. And then moved off Timescale onto DataFusion. And we're basically now building, it's DataFusion, but it's kind of our own database. Bogomil is not entirely happy that we went through three databases before we chose one. I'll say that. But we've got to the right one in the end. I think we could have realized that Timescale wasn't right. I think ClickHouse. They both taught us a lot and we're in a great place now. But yeah, it's been a real journey on the database in particular.

Swyx [00:44:28]: Okay. So, you know, as a database nerd, I have to like double click on this, right? So ClickHouse is supposed to be the ideal backend for anything like this. And then moving from ClickHouse to Timescale is another counterintuitive move that I didn't expect because, you know, Timescale is like an extension on top of Postgres. Not super meant for high volume logging. But yeah, tell us about those decisions.

Samuel [00:44:50]: So at the time, ClickHouse did not have good support for JSON. I was speaking to someone yesterday and said ClickHouse doesn't have good support for JSON and got roundly stepped on because apparently it does now. So they've obviously gone and built their proper JSON support. But back when we were trying to use it, I guess a year ago or a bit more than a year ago, everything had to be a map, and maps are a pain to try and do like looking up JSON type data. And obviously all these attributes, everything you're talking about there in terms of the GenAI stuff, you can choose to make them top-level columns if you want. But the simplest thing is just to put them all into a big JSON pile. And that was a problem with ClickHouse.
Also, ClickHouse had some really ugly edge cases. Like by default, or at least until I complained about it a lot, ClickHouse thought that two nanoseconds was longer than one second because they compared intervals just by the number, not the unit. And I complained about that a lot. And then they caused it to raise an error and just say you have to have the same unit. Then I complained a bit more. And I think, as I understand it, now they convert between units. But stuff like that, when a lot of what you're doing is comparing the duration of spans, was really painful. Also things like you can't subtract two datetimes to get an interval. You have to use the date sub function. But the fundamental thing is, because we want our end users to write SQL, the quality of the SQL, how easy it is to write, matters way more to us than if you're building a platform on top where your developers are going to write the SQL. And once it's written and it's working, you don't mind too much. So I think that's one of the fundamental differences. The other problem that I have with ClickHouse, and in fact Timescale, is the ultimate architecture, the Snowflake architecture of binary data in object store queried with some kind of cache from nearby. They both have it, but it's closed source and you only get it if you go and use their hosted versions. And so even if we had got through all the problems with Timescale or ClickHouse, we would end up, you know, they would want to be taking their 80% margin. And then we would be wanting to take, that would basically leave us less space for margin. Whereas DataFusion is properly open source. All of that same tooling is open source. And for us as a team of people with a lot of Rust expertise, DataFusion, which is implemented in Rust, we can literally dive into it and go and change it.
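The interval pitfall described above is easy to reproduce outside any database. This sketch is plain Python, not ClickHouse code: it models an interval as a (value, unit) pair, shows how comparing only the numbers gets "2 nanoseconds versus 1 second" wrong, and how normalizing to a common unit fixes it. The `Interval` class, the helper functions, and the unit table are illustrative assumptions.

```python
from dataclasses import dataclass

# Conversion table to a common base unit (nanoseconds).
NANOS_PER_UNIT = {
    "nanosecond": 1,
    "millisecond": 1_000_000,
    "second": 1_000_000_000,
}


@dataclass(frozen=True)
class Interval:
    value: int
    unit: str

    def to_nanos(self) -> int:
        """Convert the interval to nanoseconds so different units are comparable."""
        return self.value * NANOS_PER_UNIT[self.unit]


def naive_longer(a: Interval, b: Interval) -> bool:
    """Buggy comparison: looks only at the number and ignores the unit."""
    return a.value > b.value


def normalized_longer(a: Interval, b: Interval) -> bool:
    """Correct comparison: converts both sides to the same unit first."""
    return a.to_nanos() > b.to_nanos()


if __name__ == "__main__":
    two_ns = Interval(2, "nanosecond")
    one_s = Interval(1, "second")
    print("naive:", naive_longer(two_ns, one_s))
    print("normalized:", normalized_longer(two_ns, one_s))
```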
So, for example, I found that there were some slowdowns in DataFusion's string comparison kernel for doing string contains. And it's just Rust code. And I could go and rewrite the string comparison kernel to be faster. Or, for example, DataFusion, when we started using it, didn't have JSON support. Obviously, as I've said, it's something we can do. It's something we needed. I was able to go and implement that in a weekend using our JSON parser that we built for Pydantic Core. So it's the fact that DataFusion is, for us, the perfect mixture of a toolbox to build a database with, not a database. And we can go and implement stuff on top of it in a way that you couldn't easily if you were trying to do that in Postgres or in ClickHouse. I mean, ClickHouse would be easier because it's C++, relatively modern C++. But as a team of people who are not C++ experts, that's much scarier than DataFusion for us.

Swyx [00:47:47]: Yeah, that's a beautiful rant.

Alessio [00:47:49]: That's funny. Most people don't think they have agency on these projects. They're kind of like, oh, I should use this or I should use that. They're not really like, what should I pick so that I contribute the most back to it? But I think you obviously have an open source first mindset. So that makes a lot of sense.

Samuel [00:48:05]: I think if we were a better startup, faster moving and just headlong determined to get in front of customers as fast as possible, we should have just started with ClickHouse. I hope that long term we're in a better place for having worked with DataFusion. We're quite engaged now with the DataFusion community. Andrew Lamb, who maintains DataFusion, is an advisor to us. We're in a really good place now. But yeah, it's definitely slowed us down relative to just building on ClickHouse and moving as fast as we can.

Swyx [00:48:34]: OK, we're about to zoom out and do pydantic.run and all the other stuff.
But, you know, my last question on LogFire is really, at some point you run out of sort of community goodwill, just because like, oh, I use Pydantic. I love Pydantic. I'm going to use LogFire. OK, then you start entering the territory of the Datadogs, the Sentrys and the Honeycombs. Yeah. So where are you going to really spike here? What's the differentiator here?

Samuel [00:48:59]: I wasn't writing code in 2001, but I'm assuming that there were people talking about web observability, and then web observability stopped being a thing, not because the web stopped being a thing, but because all observability had to do web. If you were talking to people in 2010 or 2012, they would have talked about cloud observability. Now that's not a term because all observability is cloud first. The same is going to happen to GenAI. And so whether or not you're trying to compete with Datadog or with Arize and LangSmith, you've got to do general purpose observability with first-class support for AI. And as far as I know, we're the only people really trying to do that. I mean, I think Datadog is starting in that direction. And to be honest, I think Datadog is a much scarier company to compete with than the AI-specific observability platforms. Because in my opinion, and I've also heard this from lots of customers, AI-specific observability where you don't see everything else going on in your app is not actually that useful. Our hope is that we can build the first general purpose observability platform with first-class support for AI, and that we have this open source heritage of putting developer experience first that other companies haven't done. For all I'm a fan of Datadog and what they've done, if you search "Datadog logging Python" and you just try, as a non-observability expert, to get something up and running with Datadog and Python, it's not trivial, right? That's something Sentry have done amazingly well.
But there's enormous space in most of observability to do DX better.

Alessio [00:50:27]: Since you mentioned Sentry, I'm curious how you thought about licensing and all of that. Obviously, you're MIT licensed. You don't have any rolling license like Sentry has, where you can only use, as open source, the one-year-old version of it. Was that a hard decision?

Samuel [00:50:41]: So to be clear, LogFire is closed source. So Pydantic and Pydantic AI are MIT licensed and properly open source. And then LogFire for now is completely closed source. And in fact, the struggles that Sentry have had with licensing, and the weird pushback the community gives when they take something that's closed source and make it source available, just meant that we avoided that whole subject matter. I think the other way to look at it is, in terms of either headcount or revenue or dollars in the bank, the amount of open source we do as a company, we've got to be up there with the most prolific open source companies, like I say, per head. And so we didn't feel like we were morally obligated to make LogFire open source. We have Pydantic. Pydantic is a foundational library in Python. That and now Pydantic AI are our contribution to open source. And then LogFire is openly for profit, right? As in, we're not claiming otherwise. We're not sort of trying to walk a line of "it's open source, but really we want to make it hard to deploy, so you probably want to pay us." We're trying to be straight that it's to pay for. We could change that at some point in the future, but it's not an immediate plan.

Alessio [00:51:48]: All right. So the first one: I saw this new, I don't know if it's like a product you're building, the pydantic.run, which is a Python browser sandbox. What was the inspiration behind that? We talk a lot about code interpreter for LLMs. I'm an investor in a company called E2B, which is a code sandbox as a service for remote execution. Yeah.
What's the pydantic.run story?

Samuel [00:52:09]: So pydantic.run is again completely open source. I have no interest in making it into a product. We just needed a sandbox to be able to demo LogFire in particular, but also Pydantic AI. So it doesn't have it yet, but I'm going to add basically a proxy to OpenAI and the other models so that you can run Pydantic AI in the browser. See how it works. Tweak the prompt, et cetera, et cetera. And we'll have some kind of limit per day of what you can spend on it or like what the spend is. The other thing we wanted to b

New SLAP & FLOP Attacks, OCSP Fades Away, DeepSeek's ClickHouse, OAuth 2.0 Security - ASW #316

Paul's Security Weekly TV

Feb 4, 2025 34:47

Speculative data flow attacks demonstrated against Apple chips with SLAP and FLOP, the design and implementation choices that led to OCSP's demise, an appsec angle on AI, updating the threat model and recommendations for implementing OAuth 2.0, and more! Show Notes: https://securityweekly.com/asw-316

New SLAP & FLOP Attacks, OCSP Fades Away, DeepSeek's ClickHouse, OAuth 2.0 Security - ASW #316

Application Security Weekly (Video)

Feb 4, 2025 34:47

DeepSeek Security Failure: Cyber Security Today, Friday, January 31, 2025

Cyber Security Today

Jan 31, 2025 9:20

Cybersecurity Today: DeepSeek AI's Data Breach, New API Threats, & Operation Talent

In this episode of 'Cybersecurity Today,' host Jim Love delves into the recent security lapse by DeepSeek AI, highlighting the exposure of sensitive data through an open ClickHouse database. Learn about the growing threat posed by APIs as the primary attack vector in cybersecurity, with findings from Wallarm's 2025 API Threat Stat Report. Additionally, discover the impact of international law enforcement's Operation Talent on dismantling major cybercrime forums, and be informed about a new browser attack technique, 'browser sync jacking,' which poses risks to millions of users. Stay tuned for a comprehensive overview of the latest in cybersecurity.

00:00 Major Security Concerns with DeepSeek AI Databases
03:13 The Rise of API Cyber Attacks
05:23 Global Crackdown on Cybercrime Forums
07:04 New Browser Attack Technique Discovered
08:54 Conclusion and Upcoming Discussions

090 | Better Stack – Juraj Masár, CEO & Co-Founder

SCRIPTease

Jan 15, 2025 88:39

Although both founders of the monitoring and debugging platform Better Stack come from Slovakia, their Prague-based startup is regarded by many experts as one of the most promising Czech companies of all. They have already raised two huge investments totaling $28,600,000 and are aiming much higher still

Tanya Bragin - Clickhouse, Open Source vs Commercial, and More

The Joe Reis Show

Nov 20, 2024 56:43

Tanya Bragin and I have a wide-ranging chat about the tension between open source and commercial products, ClickHouse, aligning marketing and product, and how she manages her time.

Bootstrapping SaaS, JS Monitoring, & Web Performance | Todd Gardner - Frontend Masters Podcast Ep.21

The Frontend Masters Podcast

Nov 12, 2024 56:29

In this episode of The Frontend Masters Podcast, Todd Gardner, co-founder of TrackJS and RequestMetrics, discusses his journey from consultant to entrepreneur. He shares insights on bootstrapping SaaS products, competing against VC-backed companies, and the importance of charging customers for your product or service early. Todd delves into technical aspects of his products' stacks, including the use of .NET Core, Clickhouse, and HTMX. He offers advice on public speaking, teaching, and maintaining healthy co-founder relationships. The conversation covers web performance optimization, JavaScript error monitoring, and the challenges of balancing product development with marketing efforts. Todd also reflects on his career philosophy of continuous learning and adaptation in the fast-paced tech industry. Marc has captured his advice on startups in this article, originally an email to Todd in 2014: https://marcgrabanski.com/articles/your-advice-startups/ Check out Todd's Frontend Masters courses here: https://frontendmasters.com/teachers/todd-gardner/ Frontend Masters Online: Twitter: https://twitter.com/FrontendMasters LinkedIn: https://www.linkedin.com/company/frontend-masters/ Facebook: https://www.facebook.com/FrontendMasters Instagram: https://instagram.com/FrontendMasters About Us: Advance your skills with in-depth, modern front-end engineering courses — our 150+ high-quality courses and 18 curated learning paths will guide you from mid-level to senior developer! https://frontendmasters.com/?utm_source=spotify&utm_medium=home_link&utm_campaign=podcastepisode21

Announcing the ClickHouse Podcast

Data Chaos

Oct 15, 2024 2:18

After two successful seasons of the Data Chaos podcast, we've decided to take a pause on any new episodes. We've absolutely loved all of the conversations we've had with our guests but ultimately have decided to focus our efforts on more technically nuanced content. That new content will be centered on ClickHouse. It's no secret that my company, Propel (which sponsors Data Chaos), is built using this lightning-fast columnar database. We created Propel as the easiest way to run ClickHouse at any scale and learned a ton along the way. We want to share these learnings with the community through our blogs and the new podcast. Whether you're a seasoned ClickHouse admin, a data enthusiast, or just curious about high-performance analytics, this podcast covers everything from basic concepts to advanced use cases. It features industry experts, real-world success stories, and best practices to help you get the most out of ClickHouse in your data stack. We hope you give it a listen!

Friendly Competition within the ClickHouse Ecosystem with Robert Hodges

The Business of Open Source

Sep 18, 2024 42:51

This week on The Business of Open Source, I spoke with Robert Hodges, CEO of Altinity. This is a great example of an open source company that is built on top of an open source project, ClickHouse, that they did not create and still do not have direct control over. Altinity has created and maintains other open source projects in the ClickHouse ecosystem as well, but

So many things to unpack with this episode, but a couple I want to call attention to in particular.

The origin story of how Altinity's founder discovered ClickHouse (he did not create it!). I love how Robert specifies that Alexander Zaitsev, one of the Altinity co-founders, discovered ClickHouse because he wasn't happy with how the database he was using scaled, and by the way, it had nothing to do with how much the database cost. Great example of an open source project winning because it provided superior value, not because it was/is free.

Making product strategy decisions based on who the ideal user and the ideal customer is. Robert talked about how Altinity didn't contribute a particular high-security feature back to open source ClickHouse because, while it's something that very security-conscious organizations would want, for open source users who don't have major security and compliance requirements it would be confusing and create a worse user experience.

Working in friendly competition with the 10 or so other companies that are building around ClickHouse, and how this is one of the unique things about working around open source.

How Altinity's customers tend to value the four freedoms of open source.

Are you a leader of an open source company and you're struggling to prioritize your product roadmap in a way that reinforces your differentiated value… reach out. I help companies figure out the differentiated value of their project and product, where to put the line between the two, and how to use that information to prioritize your roadmap, build a sales narrative and communicate with your market.

ceo business competition friendly real world ecosystem open source hodges founder stories clickhouse

E145: Bootstrapping an Open Source Monitoring Platform

Open Source Startup Podcast

Play Episode Listen Later Aug 15, 2024 38:46

Aliaksandr Valialkin and Roman Khavronenko are Co-Founders of VictoriaMetrics, the open source time series database and monitoring platform built alongside their open source project, also called VictoriaMetrics. In this episode, we discuss the limitations of Prometheus and how ClickHouse inspired the founders to build VictoriaMetrics, how open source helped them attract their early users and gain momentum, the importance of simplicity and saying no to feature requests that would complicate the product, their approach to an Open Core model, their unique view on funding, why bootstrapping has been an advantage for them, and more!

co founders platform monitoring open source prometheus bootstrapping clickhouse open core

Improve Data Quality Through Engineering Rigor And Business Engagement With Synq

Data Engineering Podcast

Play Episode Listen Later Jun 30, 2024 59:48

Summary This episode features an insightful conversation with Petr Janda, the CEO and founder of Synq. Petr shares his journey from being an engineer to founding Synq, emphasizing the importance of treating data systems with the same rigor as engineering systems. He discusses the challenges and solutions in data reliability, including the need for transparency and ownership in data systems. Synq's platform helps data teams manage incidents, understand data dependencies, and ensure data quality by providing insights and automation capabilities. Petr emphasizes the need for a holistic approach to data reliability, integrating data systems into broader business processes. He highlights the role of data teams in modern organizations and how Synq is empowering them to achieve this. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for, with complete support for all table formats including Apache Iceberg, Hive, and Delta Lake. Trusted by teams of all sizes, including Comcast and Doordash. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst (https://www.dataengineeringpodcast.com/starburst) and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino. Your host is Tobias Macey and today I'm interviewing Petr Janda about Synq, a data reliability platform focused on leveling up data teams by supporting a culture of engineering rigor Interview Introduction How did you get involved in the area of data management? Can you describe what Synq is and the story behind it? 
Data observability/reliability is a category that grew rapidly over the past ~5 years and has several vendors focused on different elements of the problem. What are the capabilities that you saw as lacking in the ecosystem which you are looking to address? Operational/infrastructure engineers have spent the past decade honing their approach to incident management and uptime commitments. How do those concepts map to the responsibilities and workflows of data teams? Tooling only plays a small part in SLAs and incident management. How does Synq help to support the cultural transformation that is necessary? What does an on-call rotation for a data engineer/data platform engineer look like as compared with an application-focused team? How does the focus on data assets/data products shift your approach to observability as compared to a table/pipeline centric approach? With the focus on sharing ownership beyond the boundaries on the data team there is a strong correlation with data governance principles. How do you see organizations incorporating Synq into their approach to data governance/compliance? Can you describe how Synq is designed/implemented? How have the scope and goals of the product changed since you first started working on it? For a team who is onboarding onto Synq, what are the steps required to get it integrated into their technology stack and workflows? What are the types of incidents/errors that you are able to identify and alert on? What does a typical incident/error resolution process look like with Synq? What are the most interesting, innovative, or unexpected ways that you have seen Synq used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Synq? When is Synq the wrong choice? What do you have planned for the future of Synq? 
Contact Info LinkedIn (https://www.linkedin.com/in/petr-janda/?originalSubdomain=dk) Substack (https://substack.com/@petrjanda) Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning. Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com) with your story. Links Synq (https://www.synq.io/) Incident Management (https://www.pagerduty.com/resources/learn/what-is-incident-management/) SLA == Service Level Agreement (https://en.wikipedia.org/wiki/Service-level_agreement) Data Governance (https://en.wikipedia.org/wiki/Data_governance) Podcast Episode (https://www.dataengineeringpodcast.com/nicola-askham-practical-data-governance-episode-428) PagerDuty (https://www.pagerduty.com/) OpsGenie (https://www.atlassian.com/software/opsgenie) Clickhouse (https://clickhouse.com/) Podcast Episode (https://www.dataengineeringpodcast.com/clickhouse-data-warehouse-episode-88/) dbt (https://www.getdbt.com/) Podcast Episode (https://www.dataengineeringpodcast.com/dbt-data-analytics-episode-81/) SQLMesh (https://sqlmesh.readthedocs.io/en/stable/) Podcast Episode (https://www.dataengineeringpodcast.com/sqlmesh-open-source-dataops-episode-380) The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra 
(http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)

How ClickHouse powers Netflix, Uber and Spotify's Analytics | Aaron Katz, CEO of ClickHouse

The MAD Podcast with Matt Turck

Play Episode Listen Later Jun 13, 2024 49:07

In this episode, we sat down with Aaron Katz, the CEO of ClickHouse, a company that grew from an open-source analytical database into a highly successful cloud service used by Spotify, Netflix, Disney, and many more. Aaron Katz provides insights into the challenges of turning an open-source project into a thriving business, ClickHouse's go-to-market strategy, the role of technical support in pre-sales, and the strategic decision to avoid traditional SDR and CSM roles.
CLICKHOUSE: Website - https://clickhouse.com/ | Twitter - https://x.com/clickhousedb
Aaron Katz: LinkedIn - https://www.linkedin.com/in/aaron-katz-5762094 | Twitter - https://x.com/ceo_clickhouse
FIRSTMARK: Website - https://firstmark.com | Twitter - https://twitter.com/FirstMarkCap
Matt Turck (Managing Director): LinkedIn - https://www.linkedin.com/in/turck/ | Twitter - https://twitter.com/mattturck
(00:00) Intro
(00:56) What is ClickHouse?
(04:28) What are the use cases for ClickHouse?
(06:17) Reducing the latency: why the world shifts to real-time
(09:05) How did ClickHouse evolve from an open-source to a cloud product?
(15:01) "Open source is the future of software"
(17:27) Self-hosted deployments
(18:45) ClickHouse's roadmap
(20:51) Is there a real-time data stack?
(22:25) ClickHouse partners in data ingestion
(24:32) Who are ClickHouse's main competitors?
(27:35) ClickHouse's sales process
(36:44) Are partnerships a good go-to-market strategy?
(37:44) When is the right time for startups to start partnering?
(38:22) Aaron's story of becoming the CEO
(43:50) Team and culture when working on two continents
(46:15) What's next?

ceo spotify netflix disney team open uber powers analytics reducing sdr csm clickhouse aaron katz

S2E9: Sai Srirampur, PeerDB

Hacking Postgres

Play Episode Listen Later May 30, 2024 41:37

Sai Srirampur is the co-founder of PeerDB and a veteran Postgres Solutions Engineer with experience at Citus Data and Microsoft. He has been at the forefront of optimizing and scaling Postgres for large data workloads and is now spearheading innovation in data movement and replication with PeerDB. In this episode, we'll discuss the challenges of tuning massive Postgres systems, real-time data streaming solutions, and PeerDB's vision for the future.
In this episode we explore:
Leaving Microsoft and starting Sai's own company, PeerDB
Developing PeerDB's data movement and replication tool
Why Postgres might not be suitable for everything at a large scale
Having team members be database experts first before being software developers
The growing customer demand for ClickHouse and Snowflake
Links mentioned:
PeerDB
Sai Srirampur on X (@saisrirampur)
Sai Srirampur on LinkedIn

microsoft extensions sai software development postgres clickhouse

Mike Volpi, Partner at Index Ventures

Behind The Tech with Kevin Scott

Play Episode Listen Later May 14, 2024 71:44

Mike Volpi is a longtime venture capitalist who joined Index Ventures in 2009 to establish the firm's San Francisco office and North American operations. Before that, he was Chief Strategy Officer at Cisco, overseeing a run of acquisitions still studied today as a model for technology merger strategy. Mike invests primarily in artificial intelligence, infrastructure, and open-source companies, and currently serves on the boards of multiple companies including Aurora, ClickHouse, Cockroach Labs, Cohere, Confluent, Covariant.ai, Kong, Scale, Sonos, and Wealthfront. In this episode, Kevin and Mike discuss Mike's early childhood, how he got interested in the study of engineering, and his career experiences, including what led to his long career at Cisco, his current Partner position at Index, and his board experiences with multiple companies.
Mike Volpi | Index Ventures
Kevin Scott
Behind the Tech with Kevin Scott
Discover and follow other Microsoft podcasts.

761: Cloudflare Analytics Engine, Workers + more with Ben Vinegar

Syntax - Tasty Web Development Treats

Play Episode Listen Later Apr 26, 2024 52:02

Scott and Wes dive into Cloudflare's Analytics Engine and Workers with special guest Ben Vinegar, Syntax's General Manager. Tune in as they explore ClickHouse, data tracking, infrastructure costs, and transitioning from software products to managing a podcast.
Show Notes
00:00 Welcome to Syntax!
01:17 Who is Ben Vinegar? Episode 434 with Ben.
02:21 Brought to you by Sentry.io.
04:00 Cloudflare analytics engine. Counterscale.dev. Episode 634 with Armin.
09:08 What is ClickHouse?
11:01 Can ClickHouse be used for things outside of analytics tracking?
13:46 What kind of events are you able to track?
15:00 How do you assign values to track? Counterscale Schema.
18:40 Data type limitations.
19:55 The troubles with sampling data.
23:57 Sample intervals.
24:24 Pricing for these services.
25:34 How it actually runs.
27:31 Infrastructure costs and pricing models.
30:19 Running production apps in Cloudflare.
31:49 Cloudflare and HonoJS.
32:47 One year with Sentry and Ben's role with Syntax. Episode 600 with David.
39:33 How does it feel going from a software project to a media project? Syntax Team.
43:00 How do you sell Syntax to Sentry?
48:37 Sick Picks & Shameless Plugs
Sick Picks
Ben: Randy's YouTube, Boom.
Shameless Plugs
Ben: Counterscale.dev
Hit us up on Socials!
Syntax: X Instagram Tiktok LinkedIn Threads
Wes: X Instagram Tiktok LinkedIn Threads
Scott: X Instagram Tiktok LinkedIn Threads
Randy: X Instagram YouTube Threads

running data boom general managers workers pricing infrastructure analytics engine socials armin cloudflare vinegar sentry syntax clickhouse

Ep. 568 w/ Griffin Parry CEO/Founder at m3ter

Building The Future Show - Radio / TV / Podcast

Play Episode Listen Later Apr 23, 2024 46:33 Transcription Available

m3ter is revolutionizing usage-based billing for B2B software scale-ups by automating complex bill calculations and integrations, enabling companies to accelerate billing cycles, minimize revenue loss, and foster product innovation. Our platform ensures accurate, trustworthy billing processes, empowering businesses to focus on growth and customer success. Trusted by industry leaders like ClickHouse, Onfido, and Sift, m3ter delivers reliability, scalability, and insights, streamlining pricing and billing operations across Finance, Product, Engineering, and Sales & Customer Success teams. m3ter makes it easy to generate accurate bills you and your customers can trust. https://www.m3ter.com

founders technology entrepreneur sales finance startups product investors engineering b2b trusted customer success parry ceo founder sift onfido clickhouse

Establish A Single Source Of Truth For Your Data Consumers With A Semantic Layer

Data Engineering Podcast

Play Episode Listen Later Apr 7, 2024 56:23

Summary Maintaining a single source of truth for your data is the biggest challenge in data engineering. Different roles and tasks in the business need their own ways to access and analyze the data in the organization. In order to enable this use case, while maintaining a single point of access, the semantic layer has evolved as a technological solution to the problem. In this episode Artyom Keydunov, creator of Cube, discusses the evolution and applications of the semantic layer as a component of your data platform, and how Cube provides speed and cost optimization for your data consumers. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management This episode is brought to you by Datafold – a testing automation platform for data engineers that prevents data quality issues from entering every part of your data workflow, from migration to dbt deployment. Datafold has recently launched data replication testing, providing ongoing validation for source-to-target replication. Leverage Datafold's fast cross-database data diffing and Monitoring to test your replication pipelines automatically and continuously. Validate consistency between source and target at any scale, and receive alerts about any discrepancies. Learn more about Datafold by visiting dataengineeringpodcast.com/datafold (https://www.dataengineeringpodcast.com/datafold). Dagster offers a new approach to building and running data platforms and data pipelines. It is an open-source, cloud-native orchestrator for the whole development lifecycle, with integrated lineage and observability, a declarative programming model, and best-in-class testability. Your team can get up and running in minutes thanks to Dagster Cloud, an enterprise-class hosted solution that offers serverless and hybrid deployments, enhanced security, and on-demand ephemeral test deployments. 
Go to dataengineeringpodcast.com/dagster (https://www.dataengineeringpodcast.com/dagster) today to get started. Your first 30 days are free! Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs ranging from AI to data applications to complete analytics. Trusted by teams of all sizes, including Comcast and Doordash, Starburst is a data lake analytics platform that delivers the adaptability and flexibility a lakehouse ecosystem promises. And Starburst does all of this on an open architecture with first-class support for Apache Iceberg, Delta Lake and Hudi, so you always maintain ownership of your data. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst (https://www.dataengineeringpodcast.com/starburst) and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino. Your host is Tobias Macey and today I'm interviewing Artyom Keydunov about the role of the semantic layer in your data platform Interview Introduction How did you get involved in the area of data management? Can you start by outlining the technical elements of what it means to have a "semantic layer"? In the past couple of years there was a rapid hype cycle around the "metrics layer" and "headless BI", which has largely faded. Can you give your assessment of the current state of the industry around the adoption/implementation of these concepts? What are the benefits of having a discrete service that offers the business metrics/semantic mappings as opposed to implementing those concepts as part of a more general system? (e.g. dbt, BI, warehouse marts, etc.) At what point does it become necessary/beneficial for a team to adopt such a service? 
What are the challenges involved in retrofitting a semantic layer into a production data system? (evolution of requirements/usage patterns; technical complexities/performance and cost optimization) What are the most interesting, innovative, or unexpected ways that you have seen Cube used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Cube? When is Cube/a semantic layer the wrong choice? What do you have planned for the future of Cube? Contact Info LinkedIn (https://www.linkedin.com/in/keydunov/) keydunov (https://github.com/keydunov) on GitHub Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning. Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com) with your story. 
Links Cube (https://cube.dev/) Semantic Layer (https://en.wikipedia.org/wiki/Semantic_layer) Business Objects (https://en.wikipedia.org/wiki/BusinessObjects) Tableau (https://www.tableau.com/) Looker (https://cloud.google.com/looker/?hl=en) Podcast Episode (https://www.dataengineeringpodcast.com/looker-with-daniel-mintz-episode-55/) Mode (https://mode.com/) Thoughtspot (https://www.thoughtspot.com/) LightDash (https://www.lightdash.com/) Podcast Episode (https://www.dataengineeringpodcast.com/lightdash-exploratory-business-intelligence-episode-232/) Embedded Analytics (https://en.wikipedia.org/wiki/Embedded_analytics) Dimensional Modeling (https://en.wikipedia.org/wiki/Dimensional_modeling) Clickhouse (https://clickhouse.com/) Podcast Episode (https://www.dataengineeringpodcast.com/clickhouse-data-warehouse-episode-88/) Druid (https://druid.apache.org/) BigQuery (https://cloud.google.com/bigquery?hl=en) Starburst (https://www.starburst.io/) Pinot (https://pinot.apache.org/) Snowflake (https://www.snowflake.com/en/) Podcast Episode (https://www.dataengineeringpodcast.com/snowflakedb-cloud-data-warehouse-episode-110/) Arrow Datafusion (https://arrow.apache.org/datafusion/) Metabase (https://www.metabase.com/) Podcast Episode (https://www.dataengineeringpodcast.com/metabase-with-sameer-al-sakran-episode-29) Superset (https://superset.apache.org/) Alation (https://www.alation.com/) Collibra (https://www.collibra.com/) Podcast Episode (https://www.dataengineeringpodcast.com/collibra-enterprise-data-governance-episode-188) Atlan (https://atlan.com/) Podcast Episode (https://www.dataengineeringpodcast.com/atlan-data-team-collaboration-episode-179) The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)

Find Out About The Technology Behind The Latest PFAD In Analytical Database Development

Data Engineering Podcast

Play Episode Listen Later Feb 25, 2024 56:00

Summary Building a database engine requires a substantial amount of engineering effort and time investment. Over the decades of research and development into building these software systems there are a number of common components that are shared across implementations. When Paul Dix decided to re-write the InfluxDB engine he found the Apache Arrow ecosystem ready and waiting with useful building blocks to accelerate the process. In this episode he explains how he used the combination of Apache Arrow, Flight, Datafusion, and Parquet to lay the foundation of the newest version of his time-series database. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Dagster offers a new approach to building and running data platforms and data pipelines. It is an open-source, cloud-native orchestrator for the whole development lifecycle, with integrated lineage and observability, a declarative programming model, and best-in-class testability. Your team can get up and running in minutes thanks to Dagster Cloud, an enterprise-class hosted solution that offers serverless and hybrid deployments, enhanced security, and on-demand ephemeral test deployments. Go to dataengineeringpodcast.com/dagster (https://www.dataengineeringpodcast.com/dagster) today to get started. Your first 30 days are free! Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs ranging from AI to data applications to complete analytics. Trusted by teams of all sizes, including Comcast and Doordash, Starburst is a data lake analytics platform that delivers the adaptability and flexibility a lakehouse ecosystem promises. 
And Starburst does all of this on an open architecture with first-class support for Apache Iceberg, Delta Lake and Hudi, so you always maintain ownership of your data. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst (https://www.dataengineeringpodcast.com/starburst) and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino. Join us at the top event for the global data community, Data Council Austin. From March 26th-28th 2024, we'll play host to hundreds of attendees, 100 top speakers and dozens of startups that are advancing data science, engineering and AI. Data Council attendees are amazing founders, data scientists, lead engineers, CTOs, heads of data, investors and community organizers who are all working together to build the future of data and sharing their insights and learnings through deeply technical talks. As a listener to the Data Engineering Podcast you can get a special discount off regular priced and late bird tickets by using the promo code dataengpod20. Don't miss out on our only event this year! Visit dataengineeringpodcast.com/data-council (https://www.dataengineeringpodcast.com/data-council) and use code dataengpod20 to register today! Your host is Tobias Macey and today I'm interviewing Paul Dix about his investment in the Apache Arrow ecosystem and how it led him to create the latest PFAD in database design Interview Introduction How did you get involved in the area of data management? Can you start by describing the FDAP stack and how the components combine to provide a foundational architecture for database engines? This was the core of your recent re-write of the InfluxDB engine. What were the design goals and constraints that led you to this architecture? Each of the architectural components is well engineered for its particular scope. What is the engineering work that is involved in building a cohesive platform from those components? 
One of the major benefits of using open source components is the network effect of ecosystem integrations. That can also be a risk when the community vision for the project doesn't align with your own goals. How have you worked to mitigate that risk in your specific platform? Can you describe the operational/architectural aspects of building a full data engine on top of the FDAP stack? What are the elements of the overall product/user experience that you had to build to create a cohesive platform? What are some of the other tools/technologies that can benefit from some or all of the pieces of the FDAP stack? What are the pieces of the Arrow ecosystem that are still immature or need further investment from the community? What are the most interesting, innovative, or unexpected ways that you have seen parts or all of the FDAP stack used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on/with the FDAP stack? When is the FDAP stack the wrong choice? What do you have planned for the future of the InfluxDB IOx engine and the FDAP stack? Contact Info LinkedIn (https://www.linkedin.com/in/pauldix/) pauldix (https://github.com/pauldix) on GitHub Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning. Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! 
Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com) with your story. Links FDAP Stack Blog Post (https://www.influxdata.com/blog/flight-datafusion-arrow-parquet-fdap-architecture-influxdb/) Apache Arrow (https://arrow.apache.org/) DataFusion (https://arrow.apache.org/datafusion/) Arrow Flight (https://arrow.apache.org/docs/format/Flight.html) Apache Parquet (https://parquet.apache.org/) InfluxDB (https://www.influxdata.com/products/influxdb/) Influx Data (https://www.influxdata.com/) Podcast Episode (https://www.dataengineeringpodcast.com/influxdb-timeseries-data-platform-episode-199) Rust Language (https://www.rust-lang.org/) DuckDB (https://duckdb.org/) ClickHouse (https://clickhouse.com/) Voltron Data (https://voltrondata.com/) Podcast Episode (https://www.dataengineeringpodcast.com/voltron-data-apache-arrow-episode-346/) Velox (https://github.com/facebookincubator/velox) Iceberg (https://iceberg.apache.org/) Podcast Episode (https://www.dataengineeringpodcast.com/iceberg-with-ryan-blue-episode-52/) Trino (https://trino.io/) ODBC == Open DataBase Connectivity (https://en.wikipedia.org/wiki/Open_Database_Connectivity) GeoParquet (https://github.com/opengeospatial/geoparquet) ORC == Optimized Row Columnar (https://orc.apache.org/) Avro (https://avro.apache.org/) Protocol Buffers (https://protobuf.dev/) gRPC (https://grpc.io/) The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)

ai technology data flight arrow trusted doordash python databases comcast iceberg hug sql analytical pfad ctos starburst parquet trino grpc avro hudi influxdb clickhouse duckdb apache arrow apache iceberg velox paul dix freak fandango orchestra protocol buffers database development

Using Trino And Iceberg As The Foundation Of Your Data Lakehouse

Data Engineering Podcast

Play Episode Listen Later Feb 18, 2024 58:46

Summary A data lakehouse is intended to combine the benefits of data lakes (cost effective, scalable storage and compute) and data warehouses (user friendly SQL interface). Multiple open source projects and vendors have been working together to make this vision a reality. In this episode Dain Sundstrom, CTO of Starburst, explains how the combination of the Trino query engine and the Iceberg table format offer the ease of use and execution speed of data warehouses with the infinite storage and scalability of data lakes. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Dagster offers a new approach to building and running data platforms and data pipelines. It is an open-source, cloud-native orchestrator for the whole development lifecycle, with integrated lineage and observability, a declarative programming model, and best-in-class testability. Your team can get up and running in minutes thanks to Dagster Cloud, an enterprise-class hosted solution that offers serverless and hybrid deployments, enhanced security, and on-demand ephemeral test deployments. Go to dataengineeringpodcast.com/dagster (https://www.dataengineeringpodcast.com/dagster) today to get started. Your first 30 days are free! Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs ranging from AI to data applications to complete analytics. Trusted by teams of all sizes, including Comcast and Doordash, Starburst is a data lake analytics platform that delivers the adaptability and flexibility a lakehouse ecosystem promises. And Starburst does all of this on an open architecture with first-class support for Apache Iceberg, Delta Lake and Hudi, so you always maintain ownership of your data. Want to see Starburst in action? 
Go to dataengineeringpodcast.com/starburst (https://www.dataengineeringpodcast.com/starburst) and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino. Join in with the event for the global data community, Data Council Austin. From March 26th-28th 2024, they'll play host to hundreds of attendees, 100 top speakers, and dozens of startups that are advancing data science, engineering and AI. Data Council attendees are amazing founders, data scientists, lead engineers, CTOs, heads of data, investors and community organizers who are all working together to build the future of data. As a listener to the Data Engineering Podcast you can get a special discount of 20% off your ticket by using the promo code dataengpod20. Don't miss out on their only event this year! Visit: dataengineeringpodcast.com/data-council (https://www.dataengineeringpodcast.com/data-council) today. Your host is Tobias Macey and today I'm interviewing Dain Sundstrom about building a data lakehouse with Trino and Iceberg Interview Introduction How did you get involved in the area of data management? To start, can you share your definition of what constitutes a "Data Lakehouse"? What are the technical/architectural/UX challenges that have hindered the progression of lakehouses? What are the notable advancements in recent months/years that make them a more viable platform choice? There are multiple tools and vendors that have adopted the "data lakehouse" terminology. What are the benefits offered by the combination of Trino and Iceberg? What are the key points of comparison for that combination in relation to other possible selections? What are the pain points that are still prevalent in lakehouse architectures as compared to warehouse or vertically integrated systems? What progress is being made (within or across the ecosystem) to address those sharp edges? 
For someone who is interested in building a data lakehouse with Trino and Iceberg, how does that influence their selection of other platform elements? What are the differences in terms of pipeline design/access and usage patterns when using a Trino/Iceberg lakehouse as compared to other popular warehouse/lakehouse structures? What are the most interesting, innovative, or unexpected ways that you have seen Trino lakehouses used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on the data lakehouse ecosystem? When is a lakehouse the wrong choice? What do you have planned for the future of Trino/Starburst? Contact Info LinkedIn (https://www.linkedin.com/in/dainsundstrom/) dain (https://github.com/dain) on GitHub Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning. Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com) with your story. 
Links Trino (https://trino.io/) Starburst (https://www.starburst.io/) Presto (https://prestodb.io/) JBoss (https://en.wikipedia.org/wiki/JBoss_Enterprise_Application_Platform) Java EE (https://www.oracle.com/java/technologies/java-ee-glance.html) HDFS (https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html) S3 (https://aws.amazon.com/s3/) GCS == Google Cloud Storage (https://cloud.google.com/storage?hl=en) Hive (https://hive.apache.org/) Hive ACID (https://cwiki.apache.org/confluence/display/hive/hive+transactions) Apache Ranger (https://ranger.apache.org/) OPA == Open Policy Agent (https://www.openpolicyagent.org/) Oso (https://www.osohq.com/) AWS Lakeformation (https://aws.amazon.com/lake-formation/) Tabular (https://tabular.io/) Iceberg (https://iceberg.apache.org/) Podcast Episode (https://www.dataengineeringpodcast.com/iceberg-with-ryan-blue-episode-52/) Delta Lake (https://delta.io/) Podcast Episode (https://www.dataengineeringpodcast.com/delta-lake-data-lake-episode-85/) Debezium (https://debezium.io/) Podcast Episode (https://www.dataengineeringpodcast.com/debezium-change-data-capture-episode-114) Materialized View (https://en.wikipedia.org/wiki/Materialized_view) Clickhouse (https://clickhouse.com/) Druid (https://druid.apache.org/) Hudi (https://hudi.apache.org/) Podcast Episode (https://www.dataengineeringpodcast.com/hudi-streaming-data-lake-episode-209) The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)

Data Dynamo: Journey into the World of Databases

You Say Data, I Say Dayta

Play Episode Listen Later Feb 6, 2024 30:56

Robert Hodges is CEO of Altinity, which helps enterprises build real-time analytics using ClickHouse. He has been working on databases as well as the applications that use them since the early 1980s. The journey included Sybase and Oracle in the 1990s, MySQL and PostgreSQL in the early 2000s, followed by analytic databases starting in 2010. Robert's technical interests include distributed systems, open source software, and Kubernetes. He is stoked to be part of the next wave of technology based on open source analytic databases. It's a privilege to help users discover creative new uses for data and technology!


Tackling Real Time Streaming Data With SQL Using RisingWave

Data Engineering Podcast

Play Episode Listen Later Feb 4, 2024 56:55

Summary Stream processing systems have long been built with a code-first design, adding SQL as a layer on top of the existing framework. RisingWave is a database engine that was created specifically for stream processing, with S3 as the storage layer. In this episode Yingjun Wu explains how it is architected to power analytical workflows on continuous data flows, and the challenges of making it responsive and scalable. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs ranging from AI to data applications to complete analytics. Trusted by teams of all sizes, including Comcast and Doordash, Starburst is a data lake analytics platform that delivers the adaptability and flexibility a lakehouse ecosystem promises. And Starburst does all of this on an open architecture with first-class support for Apache Iceberg, Delta Lake and Hudi, so you always maintain ownership of your data. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst (https://www.dataengineeringpodcast.com/starburst) and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino. Dagster offers a new approach to building and running data platforms and data pipelines. It is an open-source, cloud-native orchestrator for the whole development lifecycle, with integrated lineage and observability, a declarative programming model, and best-in-class testability. Your team can get up and running in minutes thanks to Dagster Cloud, an enterprise-class hosted solution that offers serverless and hybrid deployments, enhanced security, and on-demand ephemeral test deployments. 
Go to dataengineeringpodcast.com/dagster (https://www.dataengineeringpodcast.com/dagster) today to get started. Your first 30 days are free! Your host is Tobias Macey and today I'm interviewing Yingjun Wu about the RisingWave database and the intricacies of building a stream processing engine on S3 Interview Introduction How did you get involved in the area of data management? Can you describe what RisingWave is and the story behind it? There are numerous stream processing engines, near-real-time database engines, streaming SQL systems, etc. What is the specific niche that RisingWave addresses? What are some of the platforms/architectures that teams are replacing with RisingWave? What are some of the unique capabilities/use cases that RisingWave provides over other offerings in the current ecosystem? Can you describe how RisingWave is architected and implemented? How have the design and goals/scope changed since you first started working on it? What are the core design philosophies that you rely on to prioritize the ongoing development of the project? What are the most complex engineering challenges that you have had to address in the creation of RisingWave? Can you describe a typical workflow for teams that are building on top of RisingWave? What are the user/developer experience elements that you have prioritized most highly? What are the situations where RisingWave can/should be a system of record vs. a point-in-time view of data in transit, with a data warehouse/lakehouse as the longitudinal storage and query engine? What are the most interesting, innovative, or unexpected ways that you have seen RisingWave used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on RisingWave? When is RisingWave the wrong choice? What do you have planned for the future of RisingWave? 
Contact Info yingjunwu (https://github.com/yingjunwu) on GitHub Personal Website (https://yingjunwu.github.io/) LinkedIn (https://www.linkedin.com/in/yingjun-wu-4b584536/) Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning. Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com)) with your story. Links RisingWave (https://risingwave.com/) AWS Redshift (https://aws.amazon.com/redshift/) Flink (https://flink.apache.org/) Podcast Episode (https://www.dataengineeringpodcast.com/apache-flink-with-fabian-hueske-episode-57) Clickhouse (https://clickhouse.com/) Podcast Episode (https://www.dataengineeringpodcast.com/clickhouse-data-warehouse-episode-88/) Druid (https://druid.apache.org/) Materialize (https://materialize.com/) Spark (https://spark.apache.org/) Trino (https://trino.io/) Snowflake (https://www.snowflake.com/en/) Kafka (https://kafka.apache.org/) Iceberg (https://iceberg.apache.org/) Podcast Episode (https://www.dataengineeringpodcast.com/iceberg-with-ryan-blue-episode-52/) Hudi (https://hudi.apache.org/) Podcast Episode (https://www.dataengineeringpodcast.com/hudi-streaming-data-lake-episode-209) Postgres (https://www.postgresql.org/) Debezium (https://debezium.io/) Podcast Episode (https://www.dataengineeringpodcast.com/debezium-change-data-capture-episode-114) The intro and outro music is from 
The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)

20VC: Did Figma Kill M&A Markets in 2024, The Three Biggest Mistakes Made in Growth Investing, The Three Requirements Companies Need to Go Public in 2024 with Ed Sim and Jamin Ball

The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch

Play Episode Listen Later Jan 10, 2024 66:39

Jamin Ball is a Partner @ Altimeter Capital where he sits on the board of Airbyte, Clickhouse, dbt Labs, Prisma, Tabular. Jamin has also led investments in Deel, MotherDuck, Personio and Starburst. Prior to Altimeter, Jamin spent 5 years at Redpoint where he led investments in Workato, Monte Carlo, Cityblock Health, Root Insurance. Ed Sim is one of the best seed round investors in venture as the Founder and Managing Partner @ Boldstart, Ed focuses specifically on developer, infra and SaaS at pre-seed and seed round. Over the last decade, Ed has backed some of the best including Snyk, BigID, Kustomer, Front and Superhuman. In Today's Episode We Discuss: 1. How to Invest Successfully in 2024: What are the three biggest mistakes growth investors can make in 2024? Why should founders not start a platform company? What were Jamin and Ed's biggest mistakes from the ZIRP era? How does Jamin justify paying an $8BN price for Hopin? What were his lessons? 2. The M&A Markets in 2024: Did Figma kill the M&A markets for 2024? What should we expect in M&A? Why will private companies buying private companies be a massive segment in 2024? What are Ed and Jamin's biggest tips to founders considering selling their company in 2024? 3. When Will IPOs Come Back: What will be the catalyst to the opening of the IPO markets? Will Stripe and Databricks go public in 2024? What others should we expect? What are the three requirements for a company to go public in 2024? 4. Firesales: Investors Need Cashback: Why does Ed believe now is the time in the cycle where late-stage investors want cash back to distribute back to their LPs or to recycle? What should we expect to see in terms of acqui-hires and firesales? What are the different incentives when comparing founders vs early stage VCs vs late stage VCs when it comes to acquisitions?

Designing Data Transfer Systems That Scale

Data Engineering Podcast

Play Episode Listen Later Dec 4, 2023 63:57

Summary The first step of data pipelines is to move the data to a place where you can process and prepare it for its eventual purpose. Data transfer systems are a critical component of data enablement, and building them to support large volumes of information is a complex endeavor. Andrei Tserakhau has dedicated his career to this problem, and in this episode he shares the lessons that he has learned and the work he is doing on his most recent data transfer system at DoubleCloud. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack (https://www.dataengineeringpodcast.com/rudderstack) You shouldn't have to throw away the database to build with fast-changing data. You should be able to keep the familiarity of SQL and the proven architecture of cloud warehouses, but swap the decades-old batch computation model for an efficient incremental engine to get complex queries that are always up-to-date. With Materialize, you can! It's the only true SQL streaming database built from the ground up to meet the needs of modern data products. Whether it's real-time dashboarding and analytics, personalization and segmentation or automation and alerting, Materialize gives you the ability to work with fresh, correct, and scalable results, all in a familiar SQL interface. Go to dataengineeringpodcast.com/materialize (https://www.dataengineeringpodcast.com/materialize) today to get 2 weeks free! 
This episode is brought to you by Datafold – a testing automation platform for data engineers that finds data quality issues for every part of your data workflow, from migration to deployment. Datafold has recently launched a 3-in-1 product experience to support accelerated data migrations. With Datafold, you can seamlessly plan, translate, and validate data across systems, massively accelerating your migration project. Datafold leverages cross-database diffing to compare tables across environments in seconds, column-level lineage for smarter migration planning, and a SQL translator to make moving your SQL scripts easier. Learn more about Datafold by visiting dataengineeringpodcast.com/datafold (https://www.dataengineeringpodcast.com/datafold) today! Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs ranging from AI to data applications to complete analytics. Trusted by teams of all sizes, including Comcast and Doordash, Starburst is a data lake analytics platform that delivers the adaptability and flexibility a lakehouse ecosystem promises. And Starburst does all of this on an open architecture with first-class support for Apache Iceberg, Delta Lake and Hudi, so you always maintain ownership of your data. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst (https://www.dataengineeringpodcast.com/starburst) and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino. Your host is Tobias Macey and today I'm interviewing Andrei Tserakhau about operationalizing high bandwidth and low-latency change-data capture Interview Introduction How did you get involved in the area of data management? Your most recent project involves operationalizing a generalized data transfer service. 
What was the original problem that you were trying to solve? What were the shortcomings of other options in the ecosystem that led you to building a new system? What was the design of your initial solution to the problem? What are the sharp edges that you had to deal with to operate and use that initial implementation? What were the limitations of the system as you started to scale it? Can you describe the current architecture of your data transfer platform? What are the capabilities and constraints that you are optimizing for? As you move beyond the initial use case that started you down this path, what are the complexities involved in generalizing to add new functionality or integrate with additional platforms? What are the most interesting, innovative, or unexpected ways that you have seen your data transfer service used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on the data transfer system? When is DoubleCloud Data Transfer the wrong choice? What do you have planned for the future of DoubleCloud Data Transfer? Contact Info LinkedIn (https://www.linkedin.com/in/andrei-tserakhau/) Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning. Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com)) with your story. 
To help other people find the show please leave a review on Apple Podcasts (https://podcasts.apple.com/us/podcast/data-engineering-podcast/id1193040557) and tell your friends and co-workers Links DoubleCloud (https://double.cloud/) Kafka (https://kafka.apache.org/) MapReduce (https://en.wikipedia.org/wiki/MapReduce) Change Data Capture (https://en.wikipedia.org/wiki/Change_data_capture) Clickhouse (https://clickhouse.com/) Podcast Episode (https://www.dataengineeringpodcast.com/clickhouse-data-warehouse-episode-88/) Iceberg (https://iceberg.apache.org/) Podcast Episode (https://www.dataengineeringpodcast.com/iceberg-with-ryan-blue-episode-52/) Delta Lake (https://delta.io/) Podcast Episode (https://www.dataengineeringpodcast.com/delta-lake-data-lake-episode-85/) dbt (https://www.getdbt.com/) OpenMetadata (https://open-metadata.org/) Podcast Episode (https://www.dataengineeringpodcast.com/openmetadata-universal-metadata-layer-episode-237/) The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/) Speaker - Andrei Tserakhau, DoubleCloud Tech Lead. He has over 10 years of IT engineering experience and for the last 4 years was working on distributed systems with a focus on data delivery systems.

Companion databases

Postgres FM

Play Episode Listen Later Nov 17, 2023 45:58

Nikolay and Michael discuss companion databases: when and why you might want to add another database management system to your stack (or not), and some specifics for analytics, timeseries, search, and vectors. Here are some links to things they mentioned: Heap were using Postgres + Citus for analytics as of 2022 https://www.heap.io/blog/juggling-state-machines-incident-response-and-data-soup-a-glimpse-into-heaps-engineering-culture Heap recently moved their core analytics to SingleStore (we only spotted this after recording


Shining Some Light In The Black Box Of PostgreSQL Performance

Data Engineering Podcast

Play Episode Listen Later Nov 6, 2023 54:51

Summary Databases are the core of most applications, but they are often treated as inscrutable black boxes. When an application is slow, there is a good probability that the database needs some attention. In this episode Lukas Fittl shares some hard-won wisdom about the causes and solution of many performance bottlenecks and the work that he is doing to shine some light on PostgreSQL to make it easier to understand how to keep it running smoothly. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack (https://www.dataengineeringpodcast.com/rudderstack) You shouldn't have to throw away the database to build with fast-changing data. You should be able to keep the familiarity of SQL and the proven architecture of cloud warehouses, but swap the decades-old batch computation model for an efficient incremental engine to get complex queries that are always up-to-date. With Materialize, you can! It's the only true SQL streaming database built from the ground up to meet the needs of modern data products. Whether it's real-time dashboarding and analytics, personalization and segmentation or automation and alerting, Materialize gives you the ability to work with fresh, correct, and scalable results — all in a familiar SQL interface. Go to dataengineeringpodcast.com/materialize (https://www.dataengineeringpodcast.com/materialize) today to get 2 weeks free! Data lakes are notoriously complex. 
For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs ranging from AI to data applications to complete analytics. Trusted by teams of all sizes, including Comcast and Doordash, Starburst is a data lake analytics platform that delivers the adaptability and flexibility a lakehouse ecosystem promises. And Starburst does all of this on an open architecture with first-class support for Apache Iceberg, Delta Lake and Hudi, so you always maintain ownership of your data. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst (https://www.dataengineeringpodcast.com/starburst) and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino. This episode is brought to you by Datafold – a testing automation platform for data engineers that finds data quality issues before the code and data are deployed to production. Datafold leverages data-diffing to compare production and development environments and column-level lineage to show you the exact impact of every code change on data, metrics, and BI tools, keeping your team productive and stakeholders happy. Datafold integrates with dbt, the modern data stack, and seamlessly plugs in your data CI for team-wide and automated testing. If you are migrating to a modern data stack, Datafold can also help you automate data and code validation to speed up the migration. Learn more about Datafold by visiting dataengineeringpodcast.com/datafold (https://www.dataengineeringpodcast.com/datafold) Your host is Tobias Macey and today I'm interviewing Lukas Fittl about optimizing your database performance and tips for tuning Postgres Interview Introduction How did you get involved in the area of data management? What are the different ways that database performance problems impact the business? 
What are the most common contributors to performance issues? What are the useful signals that indicate performance challenges in the database? For a given symptom, what are the steps that you recommend for determining the proximate cause? What are the potential negative impacts to be aware of when tuning the configuration of your database? How does the database engine influence the methods used to identify and resolve performance challenges? Most of the database engines that are in common use today have been around for decades. How have the lessons learned from running these systems over the years influenced the ways to think about designing new engines or evolving the ones we have today? What are the most interesting, innovative, or unexpected ways that you have seen to address database performance? What are the most interesting, unexpected, or challenging lessons that you have learned while working on databases? What are your goals for the future of database engines? Contact Info LinkedIn (https://www.linkedin.com/in/lfittl/) @LukasFittl (https://twitter.com/LukasFittl) on Twitter Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning. Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com)) with your story. 
To help other people find the show please leave a review on Apple Podcasts (https://podcasts.apple.com/us/podcast/data-engineering-podcast/id1193040557) and tell your friends and co-workers Links PGAnalyze (https://pganalyze.com/) Citus Data (https://www.citusdata.com/) Podcast Episode (https://www.dataengineeringpodcast.com/citus-data-with-ozgun-erdogan-and-craig-kerstiens-episode-13/) ORM == Object Relational Mapper (https://en.wikipedia.org/wiki/Object%E2%80%93relational_mapping) N+1 Query (https://docs.sentry.io/product/issues/issue-details/performance-issues/n-one-queries/) Autovacuum (https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM) Write-ahead Log (https://en.wikipedia.org/wiki/Write-ahead_logging) pg_stat_io (https://pgpedia.info/p/pg_stat_io.html) random_page_cost (https://postgresqlco.nf/doc/en/param/random_page_cost/) pgvector (https://github.com/pgvector/pgvector) Vector Database (https://en.wikipedia.org/wiki/Vector_database) Ottertune (https://ottertune.com/) Podcast Episode (https://www.dataengineeringpodcast.com/ottertune-database-performance-optimization-episode-197/) Citus Extension (https://github.com/citusdata/citus) Hydra (https://github.com/hydradatabase/hydra) Clickhouse (https://clickhouse.tech/) Podcast Episode (https://www.dataengineeringpodcast.com/clickhouse-data-warehouse-episode-88/) MyISAM (https://en.wikipedia.org/wiki/MyISAM) MyRocks (http://myrocks.io/) InnoDB (https://en.wikipedia.org/wiki/InnoDB) Great Expectations (https://greatexpectations.io/) Podcast Episode (https://www.dataengineeringpodcast.com/great-expectations-data-contracts-episode-352) OpenTelemetry (https://opentelemetry.io/) The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)

Surveying The Market Of Database Products

Data Engineering Podcast

Play Episode Listen Later Oct 30, 2023 47:12

Summary Databases are the core of most applications, whether transactional or analytical. In recent years the selection of database products has exploded, making the critical decision of which engine(s) to use even more difficult. In this episode Tanya Bragin shares her experiences as a product manager for two major vendors and the lessons that she has learned about how teams should approach the process of tool selection. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack (https://www.dataengineeringpodcast.com/rudderstack) You shouldn't have to throw away the database to build with fast-changing data. You should be able to keep the familiarity of SQL and the proven architecture of cloud warehouses, but swap the decades-old batch computation model for an efficient incremental engine to get complex queries that are always up-to-date. With Materialize, you can! It's the only true SQL streaming database built from the ground up to meet the needs of modern data products. Whether it's real-time dashboarding and analytics, personalization and segmentation or automation and alerting, Materialize gives you the ability to work with fresh, correct, and scalable results — all in a familiar SQL interface. Go to dataengineeringpodcast.com/materialize (https://www.dataengineeringpodcast.com/materialize) today to get 2 weeks free! 
This episode is brought to you by Datafold – a testing automation platform for data engineers that finds data quality issues before the code and data are deployed to production. Datafold leverages data-diffing to compare production and development environments and column-level lineage to show you the exact impact of every code change on data, metrics, and BI tools, keeping your team productive and stakeholders happy. Datafold integrates with dbt, the modern data stack, and seamlessly plugs in your data CI for team-wide and automated testing. If you are migrating to a modern data stack, Datafold can also help you automate data and code validation to speed up the migration. Learn more about Datafold by visiting dataengineeringpodcast.com/datafold (https://www.dataengineeringpodcast.com/datafold) Data projects are notoriously complex. With multiple stakeholders to manage across varying backgrounds and toolchains even simple reports can become unwieldy to maintain. Miro is your single pane of glass where everyone can discover, track, and collaborate on your organization's data. I especially like the ability to combine your technical diagrams with data documentation and dependency mapping, allowing your data engineers and data consumers to communicate seamlessly about your projects. Find simplicity in your most complex projects with Miro. Your first three Miro boards are free when you sign up today at dataengineeringpodcast.com/miro (https://www.dataengineeringpodcast.com/miro). That's three free boards at dataengineeringpodcast.com/miro (https://www.dataengineeringpodcast.com/miro). Your host is Tobias Macey and today I'm interviewing Tanya Bragin about her views on the database products market Interview Introduction How did you get involved in the area of data management? What are the aspects of the database market that keep you interested as a VP of product? How have your experiences at Elastic informed your current work at Clickhouse? 
What are the main product categories for databases today? What are the industry trends that have the most impact on the development and growth of different product categories? Which categories do you see growing the fastest? When a team is selecting a database technology for a given task, what are the types of questions that they should be asking? Transactional engines like Postgres, SQL Server, Oracle, etc. were long used as analytical databases as well. What is driving the broad adoption of columnar stores as a separate environment from transactional systems? What are the inefficiencies/complexities that this introduces? How can the database engine used for analytical systems work more closely with the transactional systems? When building analytical systems there are numerous moving parts with intricate dependencies. What is the role of the database in simplifying observability of these applications? What are the most interesting, innovative, or unexpected ways that you have seen Clickhouse used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on database products? What are your predictions for the future of the database market? Contact Info LinkedIn (https://www.linkedin.com/in/tbragin/) Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning. Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! 
Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com) with your story. To help other people find the show please leave a review on Apple Podcasts (https://podcasts.apple.com/us/podcast/data-engineering-podcast/id1193040557) and tell your friends and co-workers Links Clickhouse (https://clickhouse.com/) Podcast Episode (https://www.dataengineeringpodcast.com/clickhouse-data-warehouse-episode-88/) Elastic (https://www.elastic.co/) OLAP (https://en.wikipedia.org/wiki/Online_analytical_processing) OLTP (https://en.wikipedia.org/wiki/Online_transaction_processing) Graph Database (https://en.wikipedia.org/wiki/Graph_database) Vector Database (https://en.wikipedia.org/wiki/Vector_database) Trino (https://trino.io/) Presto (https://prestodb.io/) Foreign data wrapper (https://wiki.postgresql.org/wiki/Foreign_data_wrappers) dbt (https://www.getdbt.com/) Podcast Episode (https://www.dataengineeringpodcast.com/dbt-data-analytics-episode-81/) OpenTelemetry (https://opentelemetry.io/) Iceberg (https://iceberg.apache.org/) Podcast Episode (https://www.dataengineeringpodcast.com/tabular-iceberg-lakehouse-tables-episode-363) Parquet (https://parquet.apache.org/) The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)

497: Axiom with Seif Lotfy

Giant Robots Smashing Into Other Giant Robots

Oct 19, 2023 39:13

Victoria is joined by guest co-host Joe Ferris, CTO at thoughtbot, and Seif Lotfy, the CTO and Co-Founder of Axiom. Seif discusses the journey, challenges, and strategies behind his data analytics and observability platform. Seif, who has a background in robotics and was a 2008 Sony AIBO robotic soccer world champion, shares that Axiom pivoted from being a Datadog competitor to focusing on logs and event data. The company even built its own logs database to provide a cost-effective solution for large-scale analytics. Seif is driven by his passion for his team and the invaluable feedback from the community, emphasizing that sales validate the effectiveness of a product. The conversation also delves into Axiom's shift in focus towards developers to address their need for better and more affordable observability tools. On the business front, Seif reveals the company's challenges in scaling across multiple domains without compromising its core offerings. He discusses the importance of internal values like moving with urgency and high velocity to guide the company's future. Furthermore, he touches on the challenges and strategies of open-sourcing projects and advises avoiding platforms like Reddit and Hacker News to maintain focus. Axiom (https://axiom.co/) Follow Axiom on LinkedIn (https://www.linkedin.com/company/axiomhq/), X (https://twitter.com/AxiomFM), GitHub (https://github.com/axiomhq), or Discord (https://discord.com/invite/axiom-co). Follow Seif Lotfy on LinkedIn (https://www.linkedin.com/in/seiflotfy/) or X (https://twitter.com/seiflotfy). Visit his website at seif.codes (https://seif.codes/). Follow thoughtbot on X (https://twitter.com/thoughtbot) or LinkedIn (https://www.linkedin.com/company/150727/). Become a Sponsor (https://thoughtbot.com/sponsorship) of Giant Robots! Transcript: VICTORIA: This is the Giant Robots Smashing Into Other Giant Robots Podcast, where we explore the design, development, and business of great products. 
I'm your host, Victoria Guido, and with me today is Seif Lotfy, CTO and Co-Founder of Axiom, the best home for your event data. Seif, thank you for joining me. SEIF: Hey, everybody. Thanks for having me. This is awesome. I love the name of the podcast, given that I used to compete in robotics. VICTORIA: What? All right, we're going to have to talk about that. And I also want to introduce a guest co-host today. Since we're talking about cloud, and observability, and data, I invited Joe Ferris, thoughtbot CTO and Director of Development of our platform engineering team, Mission Control. Welcome, Joe. How are you? JOE: Good, thanks. Good to be back again. VICTORIA: Okay. I am excited to talk to you all about observability. But I need to go back to Seif's comment on competing with robots. Can you tell me a little bit more about what robots you've built in the past? SEIF: I didn't build robots; I used to program them. Remember the Sony AIBOs, where Sony made these dog robots? And we would make them compete. There was an international competition where we made them play soccer, and they had to be completely autonomous. They only communicate via Bluetooth or via wireless protocols. And you only have the camera as your sensor as well as...a chest sensor throws the ball near you, and then yeah, you make them play football against each other, four versus four with a goalkeeper and everything. Just look it up: RoboCup AIBO. Look it up on YouTube. And I...2008 world champion with the German team. VICTORIA: That sounds incredible. What kind of crowds are you drawing out for a robot soccer match? Is that a lot of people involved with that? SEIF: You would be surprised how big the RoboCup competition is. It's ridiculous. VICTORIA: I want to go. I'm ready. I want to, like, I'll look it up and find out when the next one is. SEIF: No more Sony robots but other robots. Now, there's two-legged robots. 
So, they make them play as two-legged robots, much slower than four-legged robots, but works. VICTORIA: Wait. So, the robots you were playing soccer with had four legs they were running around on? SEIF: Yeah, they were dogs [laughter]. VICTORIA: That's awesome. SEIF: We all get the same robot. It's just a competition on software, right? On a software level. And some other competitions within the RoboCup actually use...you build your own robot and stuff like that. But this one was...it's called the Standard League, where we all have a robot, and we have to program it. JOE: And the standard robot was a dog. SEIF: Yeah, I think back then...we're talking...it's been a long time. I think it started in 2001 or something. I think the competition started in 2001 or 2002. And I competed from 2006 to 2008. Robots back then were just, you know, simple. VICTORIA: Robots today are way too complicated [laughs]. SEIF: Even AI is more complicated. VICTORIA: That's right. Yeah, everything has gotten a lot more complicated [laughs]. I'm so curious how you went from being a world-champion robot dog soccer player [laughs] programmer [laughs] to where you are today with Axiom. Can you tell me a little bit more about your journey? SEIF: The journey is interesting because it came from open source. I used to do open source on the side a lot–part of the GNOME Project. That's where I met Neil and the rest of my team, Mikkel Kamstrup, the whole crowd, basically. We worked on GNOME. We worked on Ubuntu. Like, most of them were working professionally on it. I was working for another company, but we worked on the same project. We ended up at Xamarin, which was bought by Microsoft. And then we ended up doing Axiom. But we've been around each other professionally since 2009, most of us. It's like a little family. But how we ended up exactly in observability, I think it's just trying to fix pain points in my life. VICTORIA: Yeah, I was reading through the docs on Axiom. 
And there's an interesting point you make about organizations having to choose between how much data they have and how much they want to spend on it. So, maybe you can tell me a little bit more about that pain point and what you really found in the early stages that you wanted to solve. SEIF: So, the early stages of what we wanted to solve we were mainly dealing with...so, the early, early stage, we were actually trying to be a Datadog competitor, where we were going to be self-hosted. Eventually, we focused on logs because we found out that's what was a big problem for most people, just event data, not just metric but generally event data, so logs, traces, et cetera. We built out our own logs database completely from scratch. And one of the things we stumbled upon was, basically, you have three things when it comes to logging, which is low cost, low latency, and large scale. That's what everybody wants. But you can't get all three of them; you can only get two of them. And we opted...like, we chose large scale and low cost. And when it comes to latency, we say it should be just fast enough, right? And that's where we focused on, and this is how we started building it. And with that, this is how we managed to stand out by just having way lower cost than anybody else in the industry and dealing with large scale. VICTORIA: That's really interesting. And how did you approach making the ingestion pipeline for massive amounts of data more efficient? SEIF: Just make it as coordination-free as possible, right? And get rid of Kafka because Kafka just, you know, drains your...it's where you throw in money. Like maintaining Kafka...it's like back then Elasticsearch, right? Elasticsearch was the biggest part of your infrastructure that would cost money. Now, it's also Kafka. So, we found a way to have our own internal way of queueing things without having to rely on Kafka. As I said, we wrote everything from scratch to make it work. 
Like, every now and then, I think that we can spin this out of the company and make it a new product. But now, eyes on the prize, right? JOE: It's interesting to hear that somebody who spent so much time in the open-source community ended up rolling their own solution to so many problems. Do you feel like you had some lessons learned from open source that led you to reject solutions like Kafka, or how did that journey go? SEIF: I don't think I'm rejecting Kafka. The problem is how Kafka is built, right? Kafka is still...you have to set up all these servers. They have to communicate, et cetera, etcetera. They didn't build it in a way where it's stateless, and that's what we're trying to go to. We're trying to make things as stateless as possible. So, Kafka was never built for the cloud-native era. And you can't really rely on SQS or something like that because it won't deal with this high throughput. So, that's why I said, like, we will sacrifice some latency, but at least the cost is low. So, if messages show after half a second or a second, I'm good. It doesn't have to be real-time for me. So, I had to write a couple of these things. But also, it doesn't mean that we reject open source. Like, we actually do like open source. We open-source a couple of libraries. We contribute back to open source, right? We needed a solution back then for that problem, and we couldn't find any. And maybe one day, open source will have, right? JOE: Yeah. I was going to ask if you considered open-sourcing any of your high latency, high throughput solutions. SEIF: Not high latency. You make it sound bad. JOE: [laughs] SEIF: You make it sound bad. It's, like, fast enough, right? I'm not going to compete on milliseconds because, also, I'm competing with ClickHouse. I don't want to compete with ClickHouse. ClickHouse is low latency and large scale, right? But then the cost is, you know, off the charts a bit sometimes. I'm going the other route. Like, you know, it's fast enough. 
Like, how, you know, if it's under two, three seconds, everybody's happy, right? If the results come within two, three seconds, everybody is happy. If you're going to build a real-time trading system on top of it, I'll strongly advise against that. But if you're building, you know, you're looking at dashboards, you're more in the observability field, yeah, we're good. VICTORIA: Yeah, I'm curious what you found, like, which customer personas that market really resonated with. Like, is there a particular, like, industry type where you're noticing they really want to lower their cost, and they're okay with this just fast enough latency? SEIF: Honestly, with the current recession, everybody is okay with giving up some of the speed to reduce the money because I think it's not linear reduction. It's more exponential reduction at this point, right? You give up a second, and you're saving 30%. You give up two seconds, all of a sudden, you're saving 80%. So, I'd say in the beginning, everybody thought they need everything to be very, very fast. And now they're realizing, you know, with limitations you have around your budget and spending, you're like, okay, I'm okay with the speed. And, again, we're not slow. I'm just saying people realize they don't need everything under a second. They're okay with waiting for two seconds. VICTORIA: That totally resonates with me. And I'm curious if you can add maybe a non-technical or a real-life example of, like, how this impacts the operations of a company or organization, like, if you can give us, like, a business-y example of how this impacts how people work. SEIF: I don't know how, like, how do people work on that? Nothing changed, really. They're still doing the, like...really nothing because...and that aspect is you run a query, and, again, as I said, you're not getting the result in a second. You're just waiting two seconds or three seconds, and it's there. So, nothing really changed. I think people can wait three seconds. 
And we're still like–when I say this, we're still faster than most others. We're just not as fast as people who are trying to compete on a millisecond level. VICTORIA: Yeah, that's okay. Maybe I'll take it back even, like, a step further, right? Like, our audience is really sometimes just founders who almost have no formal technical training or background. So, when we talk about observability, sometimes people who work in DevOps and operations all understand it and kind of know why it's important [laughs] and what we're talking about. So, maybe you could, like, go back to -- SEIF: Oh, if you're asking about new types of people who've been using it -- VICTORIA: Yeah. Like, if you're going to explain to, like, a non-technical founder, like, why your product is important, or, like, how people in their organization might use it, what would you say? SEIF: Oh, okay, if you put it like that. It's more of if you have data, timestamp data, and you want to run analytics on top of it, so that could be transactions, that could be web vitals, rather than count every time somebody visits, you have a timestamp. So, you can count, like, how many visitors visited the website and what, you know, all these kinds of things. That's where you want to use something like Axiom. That's outside the DevOps space, of course. And in DevOps space, there's so many other things you use Axiom for, but that's outside the DevOps space. And we actually...we implemented a zero-config integration with Vercel that kind of went viral. And we were, for a while, the number one enterprise Vercel integration because so many people were using it. So, Vercel users are usually not necessarily writing the most complex backends, but a lot of things are happening on the front-end side of things. And we would be giving them dashboards, automated dashboards about, you know, latencies, and how long a request took, and how long the response took, and the content type, and the status codes, et cetera, et cetera. 
And there's a huge user base around that. VICTORIA: I like that. And it's something, for me, you know, as a managing director of our platform engineering team, I want to talk more to founders about. It's great that you put this product and this app out into the world. But how do you know that people are actually using it? How do you know that people, like, maybe, are they all quitting after the first day and not coming back to your app? Or maybe, like, the page isn't loading or, like, it's not working as they expected it to. And, like, if you don't have anything observing what users are doing in your app, then it's going to be hard to show that you're getting any traction and know where you need to go in and make corrections and adjust. SEIF: We have two ways of doing this. Right now, internally, we use our own tools to see, like, who is sending us data. We have a deployment that's monitoring production deployment. And we're just, you know, seeing how people are using it, how much data they're sending every day, who stopped sending data, who spiked in sending data sets, et cetera. But we're using Mixpanel, and Dominic, our Head of Product, implemented a couple of key metrics to that for that specifically. So, we know, like, what's the average time until somebody starts going from building its own queries with the builder to writing APL, or how long it takes them from, you know, running two queries to five queries. And, you know, we just start measuring these things now. And it's been going...we've been growing healthy around that. So, we tend to measure user interaction, but also, we tend to measure how much data is being sent. Because let's keep in mind, usually, people go in and check for things if there's a problem. So, if there's no problem, the user won't interact with us much unless there's a notification that kicks off. We also just check, like, how much data is being sent to us the whole time. VICTORIA: That makes sense. 
Like, you can't just rely on, like, well, if it was broken, they would write a [chuckles], like, a question or something. So, how do you get those metrics and that data around their interactions? So, that's really interesting. So, I wonder if we can go back and talk about, you know, we already mentioned a little bit about, like, the early days of Axiom and how you got started. Was there anything that you found in the early discovery process that was surprising and made you pivot strategy? SEIF: A couple of things. Basically, people don't really care about the tech as much as they care [inaudible 12:51] and the packaging, so that's something that we had to learn. And number two, continuous feedback. Continuous feedback changed the way we worked completely, right? And, you know, after that, we had a Slack channel, then we opened a Discord channel. And, like, this continuous feedback coming in just helps with iterating, helps us with prioritizing, et cetera. And that changed the way we actually developed product. VICTORIA: You use Slack and Discord? SEIF: No. No Slack anymore. We had a community Slack. We had a community [inaudible 13:19] Slack. Now, there's no community Slack. We only have a community Discord. And the community Slack is...sorry, internally, we use Slack, but there's a community Discord for the community. JOE: But how do you keep that staffed? Is it, like, everybody is in the Discord during working hours? Is it somebody's job to watch out for community questions? SEIF: I think everybody gets involved now just...and you can see it. If you go on our Discord, you will just see it. Just everyone just gets involved. I think just people are passionate about what they're doing. At least most people are involved on Discord, right? Because there's, like, Discord the help sections, and people are just asking questions and other people answering. 
And now, we reached a point where people in the community start answering the questions for other people in the community. So, that's how we see it's starting to become a healthy community, et cetera. But that is one of my favorite things: when I see somebody from the community answering somebody else, that's a highlight for me. Actually, we hired somebody from that community because they were so active. JOE: Yeah, I think one of the biggest signs that a product is healthy is when there's a healthy ecosystem building up around it. SEIF: Yeah, and Discord reminds me of the old days of open source, like IRC, just with memes now. But because all of us come from the old IRC days, being on Discord and chatting around, et cetera, et cetera, just gives us this momentum back, gave us this momentum back, whereas Slack always felt a bit too businessy to me. JOE: Slack is like IRC with emoji. Discord is IRC with memes. SEIF: I would say Slack reminds me somehow of MSN Messenger, right? JOE: I feel like there's a huge slam on MSN Messenger here. SEIF: [laughs] What do you guys use internally, Slack or? I think you're using Slack, right? Or Teams. Don't tell me you're using Teams. JOE: No, we're using Slack. SEIF: Okay, good, because I sh*t talk. Like, there is this, I'll sh*t talk here–when I start talking about Teams, so...I remember that one thing Google did once, and that failed miserably. JOE: Google still has, like, seven active chat products. SEIF: Like, I think every department or every, like, group of engineers just uses one of them internally. I'm not sure. Never got to that point. But hey, who am I to judge? VICTORIA: I just feel like I end up using all of them, and then I'm just rotating between different tabs all day long. You maybe talked me into using Discord. I feel like I've been resisting it, but you got me with the memes. SEIF: Yeah, it's definitely worth it. It's more entertaining. More noise, but more entertaining. 
You feel it's alive, whereas Slack is...also because there's no, like, history is forever. So, you always go back, and you're like, oh my God, what the hell is this? VICTORIA: Yeah, I have, like, all of them. I'll do anything. SEIF: They should be using Axiom in the background. Just send data to Axiom; we can keep your chat history. VICTORIA: Yeah, maybe. I'm so curious because, you know, you mentioned something about how you realized that it didn't matter really how cool the tech was if the product packaging wasn't also appealing to people. Because you seem really excited about what you've built. So, I'm curious, so just tell us a little bit more about how you went about trying to, like, promote this thing you built. Or was, like, the continuous feedback really early on, or how did that all kind of come together? SEIF: The continuous feedback helped us with performance, but actually getting people to sign up and pay money it started early on. But with Vercel, it kind of skyrocketed, right? And that's mostly because we went with the whole zero-config approach where it's just literally two clicks. And all of a sudden, Vercel is sending your data to Axiom, and that's it. We will create [inaudible 16:33]. And we worked very closely with Vercel to do this, to make this happen, which was awesome. Like, yeah, hats off to them. They were fantastic. And just two clicks, three clicks away, and all of a sudden, we created Axiom organization for you, the data set for you. And then we're sending it...and the data from Vercel is being forwarded to it. I think that packaging was so simple that it made people try it out quickly. And then, the experience of actually using Axiom was sticky, so they continued using it. And then the price was so low because we give 500 gigs for free, right? You send us 500 gigs a month of logs for free, and we don't care. And you can start off here with one terabyte for 25 bucks. So, people just start signing up. 
Now, before that, it was five terabytes a month for $99, and then we changed the plan. But yeah, it was cheap enough, so people just start sending us more and more and more data eventually. They weren't thinking...we changed the way people start thinking of “what am I going to send to Axiom” or “what am I going to send to my logs provider or log storage?” To how much more can I send? And I think that's what we wanted to reach. We wanted people to think, how much more can I send? JOE: You mentioned latency and cost. I'm curious about...the other big challenge we've seen with observability platforms, including logs, is cardinality of labels. Was there anything you had to sacrifice upfront in terms of cardinality to manage either cost or volume? SEIF: No, not really. Because the way we designed it was that we should be able to deal with high cardinality from scratch, right? I mean, there's open-source ways of doing, like, if you look at how, like, a column store, if you look at a column store and every dimension is its own column, it's just that becomes, like, you can limit on the amount of columns you're creating, but you should never limit on the amount of different values in a column could be. So, if you're having something like stat tags, right? Let's say hosting, like, hostname should be a column, but then the different hostnames you have, we never limit that. So, the cardinality on a value is something that is unlimited for us, and we don't really see it in cost. It doesn't really hit us on cost. It reflects a bit on compression if you get into technical details of that because, you know, high cardinality means a lot of different data. So, compression is harder, but it's not repetitive. 
But then if you look at, you know, oh, I want to send a lot of different types of fields, not values with fields, so you have hostname, and latency, and whatnot, et cetera, et cetera, yeah, that's where limitation starts because then they have...it's like you're going to a wide range of...and a wider dimension. But even that, we, yeah, we can deal with thousands at this point. And we realize, like, most people will not need more than three or four [thousand]. It's like a Postgres table. You don't need more than 3,000 to 4,000 columns; else, you know, you're doing a lot. JOE: I think it's actually pretty compelling in terms of cost, though. Like, that's one of the things we've had to be most careful about in terms of containing cost for metrics and logs is, a lot of providers will...they'll either charge you based on the number of unique metric combinations or the performance suffers greatly. Like, we've used a lot of Prometheus-based solutions. And so, when we're working with developers, even though they don't need more than, you know, a few dozen metric combinations most of the time, it's hard for people to think of what they need upfront. It's much easier after you deploy it to be able to query your data and slice it retroactively based on what you're seeing. SEIF: That's the detail. When you say we're using Prometheus, a lot of the metrics tools out there, just like Prometheus, are using the Gorilla data structure. And the Gorilla data structure was never designed to deal with high cardinality labels. So, basically, to put it in a simple way, every combination of tags you send for metrics is its own file on disk. That's, like, the very simple way of explaining this. And then, when you're trying to search through everything, right? And you have a lot of these combinations. I actually have to get all these files from this conversion back together, you know, and then they're chunked, et cetera. So, it's a problem. 
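[Illustration, not part of the recording: a minimal Python sketch of the distinction Seif draws above between the cardinality of a single column and the number of distinct tag combinations that a per-series metrics store has to track. The sample events, field names, and helper functions are hypothetical and do not reflect Axiom's or Prometheus's actual internals.]

```python
from collections import defaultdict

# Hypothetical sample events; the field names are illustrative only.
events = [
    {"hostname": "web-1", "status": 200, "region": "eu"},
    {"hostname": "web-2", "status": 200, "region": "eu"},
    {"hostname": "web-1", "status": 500, "region": "us"},
    {"hostname": "web-3", "status": 200, "region": "us"},
]


def column_cardinality(rows):
    """Count distinct values per field, i.e. the cardinality of each column."""
    values = defaultdict(set)
    for row in rows:
        for field, value in row.items():
            values[field].add(value)
    return {field: len(seen) for field, seen in values.items()}


def series_count(rows, labels):
    """Count distinct label combinations; a per-series store keeps one series for each."""
    return len({tuple(row[label] for label in labels) for row in rows})


# Adding a unique timestamp as a label makes every event its own series.
stamped = [dict(row, ts=i) for i, row in enumerate(events)]

print(column_cardinality(events))
print(series_count(events, ["hostname", "status", "region"]))
print(series_count(stamped, ["hostname", "ts"]))
```

[In this sketch a column-oriented layout only ever holds three columns, however many hostnames appear, while the per-combination count grows with every new mix of tag values, which is the growth the speakers describe as the problem.]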
Generally, how metrics are doing it...most metrics products are using it, even VictoriaMetrics, et cetera. What they're doing is they're using either the Prometheus TSDB data structure, which is based on Gorilla. Influx was doing the same thing. They pivoted to using more and more like the ones we use, and Honeycomb uses, right? So, we might not be as fast on metrics side as these highly optimized. But then when it comes to high [inaudible 20:49], once we start dealing with high cardinality, we will be faster than those solutions. And that's on a very technical level. JOE: That's pretty cool. I realize we're getting pretty technical here. Maybe it's worth defining cardinality for the audience. SEIF: Defining cardinality to the...I mean, we just did that, right? JOE: What do you think, Victoria? Do you know what cardinality is now? [laughs] VICTORIA: All right. Now I'm like, do I know? I was like, I think I know what it means. Cardinality is, like, let's say you have a piece of data like an event or a transaction. SEIF: It's like the distinct count on a property that gives you the cardinality of a property. VICTORIA: Right. It's like how many pieces of information you have about that one event, basically, yeah. JOE: But with some traditional metrics stores, it's easy to make mistakes. For example, you could have unbounded cardinality by including response time as one of the labels -- SEIF: Tags. JOE: And then it's just going to -- SEIF: Oh, no, no. Let me give you a better one. I put in timestamp at some point in my life. JOE: Yeah, I feel like everybody has done that one. [laughter] SEIF: I've put a system timestamp at some point in my life. There was the actual timestamp, and there was a system timestamp that I would put because I wanted to know when the...because I couldn't control the timestamp, and the only timestamp I had was a system timestamp. 
I would always add the actual timestamp of when that event actually happened into a metric, and yeah, that did not scale. MID-ROLL AD: Are you an entrepreneur or start-up founder looking to gain confidence in the way forward for your idea? At thoughtbot, we know you're tight on time and investment, which is why we've created targeted 1-hour remote workshops to help you develop a concrete plan for your product's next steps. Over four interactive sessions, we work with you on research, product design sprint, critical path, and presentation prep so that you and your team are better equipped with the skills and knowledge for success. Find out how we can help you move the needle at tbot.io/entrepreneurs. VICTORIA: Yeah. I wonder if you could maybe share, like, a story about when it's gone wrong, and you've suddenly charged a lot of money [laughs] just to get information about what's happening in the system. Any, like, personal experiences with observability that kind of informed what you did with Axiom? SEIF: Oof, I have a very bad one, like, a very, very bad one. I used to work for a company. We had to deploy Elasticsearch on Windows Servers, and it was US-East-1. So, just a combination of Elasticsearch back in 2013, 2014 together with Azure and Windows Server was not a good idea. So, you see where this is going, right? JOE: I see where it's going. SEIF: Eventually, we had, like, we get all these problems because we used Elasticsearch and Kibana as our, you know, observability platform to measure everything around the product we were building. And funny enough, it cost us more than actually maintaining the infrastructure of the product. But not just that, it also kept me up longer because most of the downtimes I would get were not because of the product going down. It's because my Elasticsearch cluster started going down, and there's reasons for that. 
Because back then, Microsoft Azure thought that it's okay for any VM to lose connection with the rest of the VMs for 30 seconds per day. And then, all of a sudden, you have Elasticsearch with a split-brain problem. And there was a phase where I started getting alerted so much that back then, my partner threatened to leave me. So I bought a...what I think was a shock bracelet or a shock collar via Bluetooth, and I connected it to phone for any notification. And I bought that off Alibaba, by the way. And I would charge it at night, put it on my wrist, and go to sleep. And then, when alert happens, it will fully discharge the battery on me every time. JOE: Okay, I have to admit, I did not see where that was going. SEIF: Yeah, did that for a while; definitely did not save my relationship either. But eventually, that was the point where, you know, we started looking into other observability tools like Datadog, et cetera, et cetera, et cetera. And that's where the actual journey began, where we moved away from Elasticsearch and Kibana to look for something, okay, that we don't have to maintain ourselves and we can use, et cetera. So, it's not about the costs as much; it was just pain. VICTORIA: Yeah, pain is a real pain point, actual physical [chuckles] and emotional pain point [laughter]. What, like, motivates you to keep going with Axiom and to keep, like, the wind in your sails to keep working on it? SEIF: There's a couple of things. I love working with my team. So, honestly, I just wake up, and I compliment my team. I just love working with them. They're a lot of fun to work with. And they challenge me, and I challenge them back. And I upset them a lot. And they can't upset me, but I upset them. But I love working with them, and I love working with that team. And the other thing is getting, like, having this constant feedback from customers just makes you want to do more and, you know, close sales, et cetera. 
It's interesting, like, how I'm a very technical person, and I'm more interested in sales because sales means your product works, the product, the technical parts, et cetera. Because if technically it's not working, you can't build a product on top of it. And if you're not selling it, then what's the point? You only sell when the product is good, more or less, unless you're Oracle. VICTORIA: I had someone ask me about Oracle recently, actually. They're like, "Are you considering going back to it?" And I'm maybe a little allergic to it from having a federal consulting background [laughs]. But maybe they'll come back around. I don't know. We'll see. SEIF: Did you sell your soul back then? VICTORIA: You know, I feel like I just grew up in a place where that's what everyone did was all. SEIF: It was Oracle, IBM, or HP back in the day. VICTORIA: Yeah. Well, basically, when you're working on applications that were built in, like, the '80s, Oracle was, like, this hot, new database technology [laughs] that they just got five years ago. So, that's just, yeah, interesting. SEIF: Although, from a database perspective, they did a lot of the innovations. A lot of first innovations could have come from Oracle. From a technical perspective, they're ridiculous. I'm not sure from a product perspective how good they are. But I know their sales team is so big, so huge. They don't care about the product anymore. They can still sell. VICTORIA: I think, you know, everything in tech is cyclical. So, you know, if they have the right strategy and they're making some interesting changes over there, there's always a chance [laughs]. Certain use cases, I mean, I think that's the interesting point about working in technology is that you know, every company is a tech company. And so, there's just a lot of different types of people, personas, and use cases for different types of products. So, I wonder, you know, you kind of mentioned earlier that, like, everyone is interested in Axiom. 
But, you know, I don't know, are you narrowing the market? Or, like, how are you trying to kind of focus your messaging and your sales for Axiom? SEIF: I'm trying to focus on developers. So, we're really trying to focus on developers because the experience around observability is crap. It's stupid expensive. Sorry for being straightforward, right? And that's what we're trying to change. And we're targeting developers mainly. We want developers to like us. And we'll find all these different types of developers who are using it, and that's the interesting thing. And because of them, we start adding more and more features, like, you know, we added tracing, and now that enables, like, billions of events pushed through for, you know, again, for almost no money, again, $25 a month for a terabyte of data. And we're doing this with metrics next. And that's just to address the developers who have been giving us feedback and the market demand. I will sum it up, again, like, the experience is crap, and it's stupid expensive. I think that's the [inaudible 28:07] of observability is just that's how I would sum it up. VICTORIA: If you could go back in time and talk to yourself when you were still a developer, now that you're CTO, what advice would you give yourself? JOE: Besides avoiding shock collars. VICTORIA: [laughs] Yes. SEIF: Get people's feedback quickly so you know you're on the right track. I think that's very, very, very, very important. Don't just work in the dark, or don't go too long into stealth mode because, eventually, people catch up. Also, ship when you're 80% ready because 100% is too late. I think it's the same thing here. JOE: Ship often and early. SEIF: Yeah, even if it's not fully ready, it's still feedback. VICTORIA: Ship often and early and talk to people [laughs]. 
Just, do you feel like, as a developer, did you have the skills you needed to be able to get the most out of those feedback and out of those conversations you were having with people around your product? SEIF: I still don't think I'm good enough. You're just constantly learning, right? I just accepted I'm part of a team, and I have my contributions. But as an individual, I still don't think I know enough. I think there's more I need to learn at this point. VICTORIA: I wonder, what questions do you have for me or Joe? SEIF: How did you start your podcast, and why the name? VICTORIA: Oh, man, I hope I can answer. So, the podcast was started...I think it's, like, we're actually about to be at our 500th Episode. So, I've only been a host for the last year. Maybe Joe even knows more than I do. But what I recall is that one person at thoughtbot thought it would be a great idea to start a podcast, and then they did it. And it seems like the whole company is obsessed with robots. I'm not really sure where that came from. There used to be a tiny robot in the office, is what I remember. And people started using that as, like, the mascot. And then, yeah, that's it, that's the whole thing. SEIF: Was the robot doing anything useful or just being cute? JOE: It was just cute, and it's hard to make a robot cute. SEIF: Was it a real robot, or was it like a -- JOE: No, there was, at one point, a toy robot. The name...I actually forget the origin–origin of the name, but the name Giant Robots comes from our blog. So, we named the podcast the same as the blog: Giant Robots Smashing Into Other Giant Robots. SEIF: Yes, it's called transformers. VICTORIA: Yeah, I like it. It's, I mean, now I feel like -- SEIF: [laughs] VICTORIA: We got to get more, like, robot dogs involved [laughs] in the podcast. SEIF: Like, I wanted to add one thing when we talked about, you know, what gets me going. And I want to mention that I have a six-month-old son now. 
He definitely adds a lot of motivation for me to wake up in the morning and work. But he also makes me wake up regardless if I want to or not. VICTORIA: Yeah, you said you had invented an alarm clock that never turns off. Never snoozes [laughs]. SEIF: Yes, absolutely. VICTORIA: I have the same thing, but it's my dog. But he does snooze, actually. He'll just, like, get tired and go back to sleep [laughs]. SEIF: Oh, I have a question. Do dogs have a Tamagotchi phase? Because, like, my son, the first three months was like a Tamagotchi. It was easy to read him. VICTORIA: Oh yeah, uh-huh. SEIF: Noisy but easy. VICTORIA: Yes, yes. SEIF: Now, it's just like, yeah, I don't know, like, the last month he has opinions at six months. I think it's because I raised him in Europe. I should take him back to the Middle East [laughs]. No opinions. VICTORIA: No, dogs totally have, like, a communication style, you know, I pretty much know what he, I mean, I can read his mind, obviously [laughs]. SEIF: Sure, but that's when they grow a bit. But what when they were very...when the dog was very young? VICTORIA: Yeah, they, I mean, they also learn, like, your stuff, too. So, they, like, learn how to get you to do stuff or, like, I know she'll feed me if I'm sitting here [laughs]. SEIF: And how much is one dog year, seven years? VICTORIA: Seven years. SEIF: Seven years? VICTORIA: Yeah, seven years? SEIF: Yeah. So, basically, in one year, like, three months, he's already...in one month, he's, you know, seven months old. He's like, yeah. VICTORIA: Yeah. In a year, they're, like, teenagers. And then, in two years, they're, like, full adults. SEIF: Yeah. So, the first month is basically going through the first six months of a human being. So yeah, you pass...the first two days or three days are the Tamagotchi phase that I'm talking about. 
VICTORIA: [chuckles] I read this book, and it was, like, to understand dogs, it's like, they're just like humans that are trying to, like, maximize the number of positive experiences that they have. So, like, if you think about that framing around all your interactions about, like, maybe you're trying to get your son to do something, you can be like, okay, how do I, like, I don't know, train him that good things happen when he does the things I want him to do? [laughs] That's kind of maybe manipulative but effective. So, you're not learning baby sign language? You're just, like, going off facial expressions? SEIF: I started. I know how Mama looks like. I know how Dada looks like. I know how more looks like, slowly. And he already does this thing that I know that when he's uncomfortable, he starts opening and closing his hands. And when he's completely uncomfortable and basically that he needs to go sleep, he starts pulling his own hair. VICTORIA: [laughs] I do the same thing [laughs]. SEIF: You pull your own hair when you go to sleep? I don't have that. I don't have hair. VICTORIA: I think I do start, like, touching my head though, yeah [inaudible 33:04]. SEIF: Azure took the last bit of hair I had! Went away with Azure, Elasticsearch, and the shock collar. VICTORIA: [laughs] SEIF: I have none of them left. Absolutely nothing. I should sue Elasticsearch for this shit. VICTORIA: [laughs] Let me know how that goes. Maybe there's more people who could join your lawsuit, you know, with a class action. SEIF: [laughs] Yeah. Well, one thing I wanted to also just highlight is, right now, one of the things that also makes the company move forward is we realized that in a single domain, we proved ourselves very valuable to specific companies, right? So, that was a big, big thing, milestone for us. And now we're trying to move into a handful of domains and see which one of those work out the best for us. Does that make sense? VICTORIA: Yeah. 
And I'm curious: what are the biggest challenges or hurdles that you associate with that? SEIF: At this point, you don't want just feedback. You want constructive criticism. Like, you want to work with people who will criticize the applic...and you iterate with them based on this criticism, right? They're just not happy about you and trying to create design partners. So, for us, it was very important to have these small design partners who can work with us to actually prove ourselves as valuable in a single domain. Right now, we need to find a way to scale this across several domains. And how do you do that without sacrificing? Like, how do you open into other domains without sacrificing the original domain you came from? So, there's a lot of things [inaudible 34:28]. And we are in the middle of this. Honestly, I Forrest Gumped my way through half of this, right? Like, I didn't know what I was doing. I had ideas. I think it's more of luck at this point. And I had luck. No, we did work. We did work a lot. We did sleepless nights and everything. But I think, in the last three years, we became more mature and started thinking more about product. And as I said, like, our CEO, Neil, and Dominic, our head of product, are putting everything behind being a product-led organization, not just a tech-led organization. VICTORIA: That's super interesting. I love to hear that that's the way you're thinking about it. JOE: I was just curious what other domains you're looking at pushing into if you can say. SEIF: So, we are going to start moving into ETL a bit more. We're trying to see how we can fit in specific ML scenarios. I can't say more about the other, though. JOE: Do you think you'll take the same approaches in terms of value proposition, like, low cost, good enough latency? SEIF: Yes, that's definitely one thing. But there's also...so, this is the values we're bringing to the customer. But also, now, our internal values are different. 
Now it's more of move with urgency and high velocity, as we said before, right? Think big, work small. The values in terms of values we're going to take to the customers it's the same ones. And maybe we'll add some more, but it's still going to be low-cost and large-scale. And, internally, we're just becoming more, excuse my French, agile. I hate that word so much. Should be good with Scrum. VICTORIA: It's painful, but everyone knows what you're talking about [laughs], you know, like -- SEIF: See, I have opinions here about Scrum. I think Scrum should be only used in terms of iceScrum [inaudible 36:04], or something like that. VICTORIA: Oh no [laughter]. Well, it's a Rugby term, right? Like, that's where it should probably stay. SEIF: I did not know it's a rugby term. VICTORIA: Yeah, so it should stay there, but -- SEIF: Yes [laughs]. VICTORIA: Yeah, I think it's interesting. Yeah, I like the being flexible. I like the just, like, continuous feedback and how you all have set up to, like, talk with your customers. Because you mentioned earlier that, like, you might open source some of your projects. And I'm just curious, like, what goes into that decision for you when you're going to do that? Like, what makes you think this project would be good for open source or when you think, actually, we need to, like, keep it? SEIF: So, we open source libraries, right? We actually do that already. And some other big organizations use our libraries; even our competitors use our libraries, that we do. The whole product itself or at least a big part of the product, like database, I'm not sure we're going to open source that, at least not anytime soon. And if we open source, it's going to be at a point where the value-add it brings is nothing compared to how well our product is, right? So, if we can replace whatever's at the back with...the storage engine we have in the back with something else and the product doesn't get affected, that's when we open source it. 
VICTORIA: That's interesting. That makes sense to me. But yeah, thank you for clarifying that. I just wanted to make sure to circle back. Since you have this big history in open source, yeah, I'm curious if you see... SEIF: Burning me out? VICTORIA: Burning you out, yeah [laughter]. Oh, that's a good question. Yeah, like, because, you know, we're about to be in October here. Do you have any advice or strategies as a maintainer for not getting burned out during the next couple of weeks besides, like, hide in a cave and without internet access [laughs]? SEIF: Stay away from Reddit and Hacker News. That's my goal for October now because I'm always afraid of getting too attached to an idea, or too motivated, or excited by an idea that I drift away from what I am actually supposed to be doing. VICTORIA: Last question is, is there anything else you would like to promote? SEIF: Yeah, check out our website; I think it's at axiom.co. Check it out. Sign up. And comment on Discord and talk to me. I don't bite, sometimes grumpy, but that's just because of lack of sleep in the morning. But, you know, around midday, I'm good. And if you're ever in Berlin and you want to hang out, I'm more than willing to hang out. VICTORIA: Whoo, that's awesome. Yeah, Berlin is great. I was there a couple of years ago but no plans to go back anytime soon, but maybe I'll keep that in mind. You can subscribe to the show and find notes along with a complete transcript for this episode at giantrobots.fm. If you have questions or comments, email us at hosts@giantrobots.fm. And you could find me on Twitter @victori_ousg. And this podcast is brought to you by thoughtbot and produced and edited by Mandy Moore. Thanks for listening. See you next time. Did you know thoughtbot has a referral program? If you introduce us to someone looking for a design or development partner, we will compensate you if they decide to work with us. More info on our website at tbot.io/referral. 
Or you can email us at referrals@thoughtbot.com with any questions. Special Guests: Joe Ferris and Seif Lotfy.

Making an Affordable Event Data Solution with Seif Lotfy

Screaming in the Cloud

Oct 19, 2023 27:49

Seif Lotfy, Co-Founder and CTO at Axiom, joins Corey on Screaming in the Cloud to discuss how and why Axiom has taken a low-cost approach to event data. Seif describes the events that led to him helping co-found a company, and explains why the team wrote all their code from scratch. Corey and Seif discuss their views on AWS pricing, and Seif shares his views on why AWS doesn't have to compete on price. Seif also reveals some of the exciting new products and features that Axiom is currently working on.

About Seif

Seif is the bubbly Co-founder and CTO of Axiom, where he has helped build the next generation of logging, tracing, and metrics. His background is at Xamarin and Deutsche Telekom, and he is the kind of deep technical nerd that geeks out on white papers about emerging technology and then goes to see what he can build.

Links Referenced:

Axiom: https://axiom.co/
Twitter: https://twitter.com/seiflotfy

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted guest episode is brought to us by my friends, and soon to be yours, over at Axiom. Today I'm talking with Seif Lotfy, who's the co-founder and CTO of Axiom. Seif, how are you?

Seif: Hey, Corey, I am very good, thank you. It's pretty late here, but it's worth it. I'm excited to be on this interview. How are you today?

Corey: I'm not dead yet. It's weird, I see you at a bunch of different conferences, and I keep forgetting that you do in fact live half a world away. Is the entire company based in Europe? And where are you folks? Where do you start and where do you stop geographically? Let's start there.
We over—everyone dives right into product. No, no, no. I want to know where in the world people sit because apparently, that's the most important thing about a company in 2023.

Seif: Unless you ask Zoom because they're undoing whatever they did. We're from New Zealand, all the way to San Francisco, and everything in between. So, we have people in Egypt and Nigeria, all around Europe, all around the US… and UK, if you don't consider it Europe anymore.

Corey: Yeah, it really depends. There's a lot of unfortunate naming that needs to get changed in the wake of that.

Seif: [laugh].

Corey: But enough about geopolitics. Let's talk about industry politics. I've been a fan of Axiom for a while and I was somewhat surprised to realize how long it had been around because I only heard about you folks a couple of years back. What is it you folks do? Because I know how I think about what you're up to, but you've also gone through some messaging iteration, and it is a near certainty that I am behind the times.

Seif: Well, at this point, we just define ourselves as the best home for event data. So, Axiom is the best home for event data. We try to deal with everything that is event-based, so time-series. So, we can talk metrics, logs, traces, et cetera. And right now predominantly serving engineering and security.

And we're trying to be—or we are—the first cloud-native time-series platform to provide streaming search, reporting, and monitoring capabilities. And we're built from the ground up, by the way. Like, we didn't actually—we're not using Parquet [unintelligible 00:02:36] thing. We're completely everything from the ground up.

Corey: When I first started talking to you folks a few years back, there were two points to me that really stood out, and I know at least one of them still holds true. The first is that at the time, you were primarily talking about log data. Just send all your logs over to Axiom. The end.
And that was a simple message that was simple enough that I could understand it, frankly.

Because back when I was slinging servers around and you know breaking half of them, logs were effectively how we kept track of what was going on, where. These days, it feels like everything has been repainted with a very broad brush called observability, and the takeaway from most company pitches has been, you must be smarter than you are to understand what it is that we're up to. And in some cases, you scratch below the surface and realize it no, they have no idea what they're talking about either and they're really hoping you don't call them on that.

Seif: It's packaging.

Corey: Yeah. It is packaging and that's important.

Seif: It's literally packaging. If you look at it, traces and logs, these are events. There's a timestamp and just data with it. It's a timestamp and data with it, right? Even metrics is all the way to that point.

And a good example, now everybody's jumping on [OTel 00:03:46]. For me, OTel is nothing else, but a different structure for time series, for different types of time series, and that can be used differently, right? Or at least not used differently but you can leverage it differently.

Corey: And the other thing that you did that was interesting and is a lot, I think, more sustainable as far as [moats 00:04:04] go, rather than things that can be changed on a billboard or whatnot, is your economic position. And your pricing has changed around somewhat, but I ran a number of analyses on your cost that you were passing on to customers and my takeaway was that it was a little bit more expensive to store data for logs in Axiom than it was to store it in S3, but not by much. And it just blew away the price point of everything else focused around logs, including AWS; you're paying 50 cents a gigabyte to ingest CloudWatch logs data over there.
Other companies are charging multiples of that and Cisco recently bought Splunk for $28 billion because it was cheaper than paying their annual Splunk bill. How did you get to that price point? Is it just a matter of everyone else being greedy or have you done something different?

Seif: We looked at it from the perspective of… so there's the three L's of logging. I forgot the name of the person at Netflix who talked about that, but basically, it's low costs, low latency, large scale, right? And you will never be able to fulfill all three of them. And we decided to work on low costs and large scale. And in terms of low latency, we won't be low as others like ClickHouse, but we are low enough. Like, we're fast enough.

The idea is to be fast enough because in most cases, I don't want to compete on milliseconds. I think if the user can see his data in two seconds, he's happy. Or three seconds, he's happy. I'm not going to be, like, one to two seconds and make the cost exponentially higher because I'm one second faster than the other. And that's, I think, that the way we approached this from day one.

And from day one, we also started utilizing the idea of existence of Open—Object Storage, we have our own compressions, our own encodings, et cetera, from day one, too, so and we still stick to that. That's why we never converted to other existing things like Parquet. Also because we are a Schema-On-Read, which Parquet doesn't allow you really to do. But other than that, it's… from day one, we wanted to save costs by also making coordination free.
So, ingest has to be coordination free, right, because then we don't run a shitty Kafka, like, honestly a lot—a lot of the [logs 00:06:19] companies who running a Kafka in front of it, the Kafka tax reflects in what they—the bill that you're paying for them.

Corey: What I found fun about your pricing model is it gets to a point that for any reasonable workload, how much to log or what to log or sample or keep everything is no longer an investment decision; it's just go ahead and handle it. And that was originally what you wound up building out. Increasingly, it seems like you're not just the place to send all the logs to, which to be honest, I was excited enough about that. That was replacing one of the projects I did a couple of times myself, which is building highly available, fault-tolerant, rsyslog clusters in data centers. Okay, great, you've gotten that unlocked, the economics are great, I don't have to worry about that anymore.

And then you started adding interesting things on top of it, analyzing things, replaying events that happen to other players, et cetera, et cetera, it almost feels like you're not just a storage depot, but you also can forward certain things on under a variety of different rules or guises and format them as whatever on the other side is expecting them to be. So, there's a story about integrating with other observability vendors, for example, and only sending the stuff that's germane and relevant to them since everyone loves to charge by ingest.

Seif: Yeah. So, we did this one thing called endpoints, the number one. Endpoints was a beginning where we said, “Let's let people send us data using whatever API they like using, let's say Elasticsearch, Datadog, Honeycomb, Loki, whatever, and we will just take that data and multiplex it back to them.” So, that's how part of it started.
This allows us to see, like, how—allows customers to see how we compared to others, but then we took it a bit further and now, it's still in closed invite-only, but we have Pipelines—codenamed Pipelines—which allows you to send data to us and we will keep it as a source of truth, then we will, given specific rules, we can then ship it anywhere to a different destination, right, and this allows you just to, on the fly, send specific filter things out to, I don't know, a different vendor or even to S3 or you could send it to Splunk. But at the same time, you can—because we have all your data, you can go back in the past, if the incident happens and replay that completely into a different product.

Corey: I would say that there's a definite approach to observability, from the perspective of every company tends to visualize stuff a little bit differently. And one of the promises of OTel that I'm seeing that as it grows is the idea of oh, I can send different parts of what I'm seeing off to different providers. But the instrumentation story for OTel is still very much emerging. Logs are kind of eternal and the only real change we've seen to logs over the past decade or so has been instead of just being plain text and their positional parameters would define what was what—if it's in this column, it's an IP address and if it's in this column, it's a return code, and that just wound up being ridiculous—now you see them having schemas; they are structured in a variety of different ways. Which, okay, it's a little harder to wind up just cat'ing a file together and piping it to grep, but there are trade-offs that make it worth it, in my experience.

This is one of those transitional products that not only is great once you get to where you're going, from my playing with it, but also it meets you where you already are to get started because everything you've got is emitting logs somewhere, whether you know it or not.

Seif: Yes. And that's why we picked up on OTel, right?
Like, one of the first things, we now support… we have an OTel endpoint natively bec—or as a first-class citizen because we wanted to build this experience around OTel in general. Whether we like it or not, and there's more reasons to like it, OTel is a standard that's going to stay and it's going to move us forward. I think of OTel as will have the same effect if not bigger as [unintelligible 00:10:11] back of the day, but now it just went away from metrics, just went to metrics, logs, and traces.

Traces is, for me, very interesting because I think OTel is the first one to push it in a standard way. There were several attempts to make standardized [logs 00:10:25], but I think traces was something that OTel really pushed into a proper standard that we can follow. It annoys me that everybody uses a different bits and pieces of it and adds something to it, but I think it's also because it's not that mature yet, so people are trying to figure out how to deliver the best experience and package it in a way that it's actually interesting for a user.

Corey: What I have found is that there's a lot that's in this space that is just simply noise. Whenever I spend a protracted time period working on basically anything and I'm still confused by the way people talk about that thing, months or years later, I'm starting to get the realization that maybe I'm not the problem here. And I'm not—I don't mean this to be insulting, but one of the things I've loved about you folks is I've always understood what you're saying. Now, you can hear that as, “Oh, you mean we talk like simpletons?” No, it means what you're talking about resonates with at least a subset of the people who have the problem you solve. That's not nothing.

Seif: Yes. We've tried really hard because one of the things we've tried to do is actually bring observability to people who are not always busy or it's not part of their day to day.
So, we try to bring into Vercel developers, right, with doing a Vercel integration. And all of a sudden, now they have their logs, and they have a metrics, and they have some traces. So, all of a sudden, they're doing the observability work. Or they have actual observability, for their Vercel based, [unintelligible 00:11:54]-based product.

And we try to meet the people where they are, so we try to—instead of actually telling people, “You should send us data.”—I mean, that's what they do now—we try to find, okay, what product are you using and how can we grab data from there and send it to us to make your life easier? You see that we did that with Vercel, we did that with Cloudflare. AWS, we have extensions, Lambda extensions, et cetera, but we're doing it for more things. For Netlify, it's a one-click integration, too, and that's what we're trying to do to actually make the experience and the journey easier.

Corey: I want to change gears a little bit because something that we spent a fair bit of time talking about—it's why we became friends, I would think anyway—is that we have a shared appreciation for several things. One of which, at most notable to anyone around us is whenever we hang out, we greet each other effusively and then immediately begin complaining about costs of cloud services. What is your take on the way that clouds charge for things? And I know it's a bit of a leading question, but it's core and foundational to how you think about Axiom, as well as how you serve customers.

Seif: They're ripping us off. I'm sorry [laugh]. They just—the amount of money they make, like, it's crazy. I would love to know what margins they have. That's a big question I've always had. I'm like, what are the margins they have at AWS right now?

Corey: Across the board, it's something around 30 to 40%, last time I looked at it.

Seif: That's a lot, too.

Corey: Well, that's also across the board of everything, to be clear.
It is very clear that some services are subsidized by other services. As it should be. If you start charging me per IAM call, we're done.

Seif: And also, I mean, the machine learning stuff. Like, they won't be doing that much on top of it right now, right, [else nobody 00:13:32] will be using it.

Corey: But data transfer? Yeah, there's a significant upcharge on that. But I hear you. I would moderate it a bit. I don't think that I would say that it's necessarily an intentional ripoff. My problem with most cloud services that they offer is not usually that they're too expensive—though there are exceptions to that—but rather that the dimensions are unpredictable in advance. So, you run something for a while and see what it costs. From where I sit, if a customer uses your service and then at the end of usage is surprised by how much it cost them, you've kind of screwed up.

Seif: Look, if they can make egress free—like, you saw how Cloudflare just did the egress of R2 free? Because I am still stuck with AWS because let's face it, for me, it is still my favorite cloud, right? Cloudflare is my next favorite because of all the features they're trying to develop and the pace they're picking, the pace they're trying to catch up with. But again, one of the biggest things I liked is R2, and R2 egress is free. Now, that's interesting, right?

But I never saw anything coming back from S3 from AWS on S3 for that, like you know. I think Amazon is so comfortable because from a product perspective, they're simple, they have the tools, et cetera. And the UI is not the flashiest one, but you know what you're doing, right? The CLI is not the flashiest one, but you know what you're doing. It is so cool that they don't really need to compete with others yet.

And I think they're still dominantly the biggest cloud out there. I think you know more than me about that, but [unintelligible 00:14:57], like, I think they are the biggest one right now in terms of data volume.
Like, how many customers are using them, and even in terms of profiles of people using them, it's very, so much. I know, like, a lot of the Microsoft Azure people who are using it, are using it because they come from enterprise that have been always Microsoft… very Microsoft friendly. And eventually, Microsoft also came in Europe in all these different weird ways. But I feel sometimes ripped off by AWS because I see Cloudflare trying to reduce the prices and AWS just looking, like, “Yeah, you're not a threat to us so we'll keep our prices as they are.”

Corey: I have it on good authority from folks who know that there are reasons behind the economic structures of both of those companies based—in terms of the primary direction the traffic flows and the rest. But across the board, they've done such a poor job of articulating this that, frankly, I think the confusion is on them to clear up, not us.

Seif: True. True. And the reason I picked R2 and S3 to compare there and not look at Workers and Lambdas because I look at it as R2 is S3 compatible from an API perspective, right? So, they're giving me something that I already use. Everything else I'm using, I'm using inside Amazon, so it's in a VPC, but just the idea. Let me dream. Let me dream that S3 egress will be free at some point.

Corey: I can dream.

Seif: That's like Christmas. It's better than Christmas.

Corey: What I'm surprised about is how reasonable your pricing is in turn. You wind up charging on the basis of ingest, which is basically the only thing that really makes sense for how your company is structured. But it's predictable in advance, the free tier is, what, 500 gigs a month of ingestion, and before people think, “Oh, that doesn't sound like a lot,” I encourage you to just go back and think how much data that really is in the context of logs for any toy project.
Like, “Well, our production environment spits out way more than that.” Yes, and by the word production that you just used, you probably shouldn't be using a free trial of anything as your critical path observability tooling. Become a customer, not a user. I'm a big believer in that philosophy, personally. For all of my toy projects that are ridiculous, this is ample.

Seif: People always tend to overestimate how much logs they're going to be sending. Like so, there's one thing. What you said is right: people who already have something going on, they already know how much logs they'll be sending around. But then eventually they're sending too much, and that's why we're back here and they're talking to us. Like, “We want to try your tool, but you know, we'll be sending more than that.” So, if you don't like our pricing, go find something else because I think we are the cheapest out there right now. We're the competitive, the cheapest out there right now.

Corey: If there is one that is less expensive, I'm unaware of it.

Seif: [laugh].

Corey: And I've been looking, let's be clear. That's not just me saying, “Well, nothing has skittered across my desk.” No, no, no, I pay attention to this space.

Seif: Hey, where's—Corey, we're friends. Loyalty.

Corey: Exactly.

Seif: If you find something, you tell me.

Corey: Oh, if I find something, I'll tell everyone.

Seif: No, no, no, you tell me first and you tell me in a nice way so I can reduce the prices on my site [laugh].

Corey: This is how we start a price war, industry-wide, and I would love to see it.

Seif: [laugh]. But there's enough channels that we share at this point across different Slacks and messaging apps that you should be able to ping me if you find one. Also, get me the name of the CEO and the CTO while you're at it.

Corey: And where they live. Yes, yes, of course. The dire implications will be awesome.

Seif: That was you, not me.
That was your suggestion.

Corey: Exactly.

Seif: I will not—[laugh].

Corey: Before we turn into a bit of an old thud and blunder, let's talk about something else that I'm curious about here. You've been working on Axiom for something like seven years now. You come from a world of databases and events and the like. Why start a company in the model of Axiom? Even back then, when I looked around, my big problem with the entire observability space could never have been described as, “You know what we need? More companies that do exactly this.” What was it that you saw that made you say, “Yeah, we're going to start a company because that sounds easy.”

Seif: So, I'll be very clear. Like, I'm not going to, like, sugarcoat this. We kind of got in a position where it [forced counterweighted 00:19:10]. And [laugh] by that I mean, we came from a company where we were dealing with logs. Like, we actually wrote an event crash analytics tool for a company, but then we ended up wanting to use stuff like Datadog, but we didn't have the budget for that because Datadog was killing us.

So, we ended up hosting our own Elasticsearch. And Elasticsearch, it costs us more to maintain our Elasticsearch cluster for the logs than to actually maintain our own little infrastructure for the crash events when we were getting, like, 1 billion crashes a month at this point. So eventually, we just—that was the first burn. And then you had alert fatigue and then you had consolidating events and timestamps and whatnot.
The whole thing just seemed very messy.

So, we started off after some company got sold, we started off by saying, “Okay, let's go work on a new self-hosted version of the [unintelligible 00:20:05] where we do metrics and logs.” And then that didn't go as well as we thought it would, but we ended up—because from day one, we were working on cloud na—because we d—we cloud ho—we were self-hosted, so we wanted to keep costs low, we were working on and making it stateless and work against object store. And this is kind of how we started. We realized, oh, our cost, we can host this and make it scale, and won't cost us that much.

So, we did that. And that started gaining more attention. But the reason we started this was we wanted to start a self-hosted version of Datadog that is not costly, and we ended up doing a Software as a Service. I mean, you can still come and self-host it, but you'll have to pay money for it, like, proper money for that. But we do a SaaS version of this and instead of trying to be a self-hosted Datadog, we are now trying to compete—or we are competing with Datadog.

Corey: Is the technology that you've built this on top of actually that different from everything else out there, or is this effectively what you see in a lot of places: “Oh, yeah, we're just going to manage Elasticsearch for you because that's annoying.” Do you have anything that distinguishes you from, I guess, the rest of the field?

Seif: Yeah. So, very just bluntly, like, I think Scuba was the first thing that started standing out, and then Honeycomb came into the scene and they start building something based on Scuba, the [unintelligible 00:21:23] principles of Scuba. Then one of the authors of actual Scuba reached out to me when I told him I'm trying to build something, and he gave me some ideas, and I start building that. And from day one, I said, “Okay, everything in S3. All queries have to be serverless.”

So, all the queries run on functions. There's no real disks.
It's just all on S3 right now. And the biggest issue—achievement we got to lower our cost was to get rid of Kafka, and have—let's say, in behind the scenes we have our own coordination-free mechanism, but the idea is not to actually have to use Kafka at all and thus reduce the costs incredibly. In terms of technology, no, we don't use Elasticsearch.

We wrote everything from the ground up, from scratch, even the query language. Like, we have our own query language that's based—modeled after Kusto—KQL by Microsoft—so everything we have is built absolutely from the ground up. And no Elastic. I'm not using Elastic anymore. Elastic is a horror for me. Absolutely horror.

Corey: People love the API, but no, I've never met anyone who likes managing Elasticsearch or OpenSearch, or whatever we're calling your particular flavor of it. It is a colossal pain, it is subject to significant trade-offs, regardless of how you work with it, and Amazon's managed offering doesn't make it better; it makes it worse in a bunch of ways.

Seif: And the green status of Elasticsearch is a myth. You'll only see it once: the first time you start that cluster, that's when the Elasticsearch cluster is green. After that, it's just orange, or red. And you know what? I'm happy when it's orange. Elasticsearch kept me up for so long. And we had actually a very interesting situation where we had Elasticsearch running on Azure, on Windows machines, and I would have server [unintelligible 00:23:10]. And I'd have to log in and every day—you remember, what's it called—RP… RP Something. What was it called?

Corey: RDP? Remote Desktop Protocol, or something else?

Seif: Yeah, yeah. Where you have to log in, like, you actually have visual thing, and you have to go in and—

Corey: Yep.

Seif: And visually go in and say, “Please don't restart.” Every day, I'd have to do that. Please don't restart, please don't restart.
And also a lot of weird issues, and also at that point, Azure would decide to disconnect the pod, wanted to try to bring in a new pod, and all these weird things were happening back then. So, eventually, end up with a [unintelligible 00:23:39] decision. I'm talking 2013, '14, so it was back in the day when Elasticsearch was very young. And so, that was just a bad start for me.

Corey: I will say that Azure is the most cost-effective cloud because their security is so clown shoes, you can just run whatever you want in someone else's account and it's free to you. Problem solved.

Seif: Don't tell people how we save costs, okay?

Corey: [laugh]. I love that.

Seif: [laugh]. Don't tell people how we do this. Like, Corey, come on [laugh], you're exposing me here. Let me tell you one thing, though. Elasticsearch is the reason I literally use a shock collar or a shock bracelet on myself every time it went down—which was almost every day, instead of having PagerDuty, like, ring my phone.

And, you know, I'd wake up and my partner back then would wake up. I bought a Bluetooth collar off of Alibaba that would tase me every time I'd get a notification, regardless of the notification. So, some things are false alarm, but I got tased for at least two, three weeks before I gave up. Every night I'd wake up, like, to a full discharge.

Corey: I would never hook myself up to a shocker tied to outages, even if I owned a company. There are pleasant ways to wake up, unpleasant ways to wake up, and even worse. So, you're getting shocked for some—so someone else can wind up effectively driving the future of the business. You're, more or less, the monkey that gets shocked awake to go ahead and fix the thing that just broke.

Seif: [laugh]. Well, the fix to that was moving from Azure to AWS without telling anybody. That got us in a lot of trouble.
Again, that wasn't my company.

Corey: They didn't notice that you did this, or it caused a lot of trouble because suddenly nothing worked where they thought it would work?

Seif: They—no, no, everything worked fine on AWS. That's how my love story began. But they didn't notice for, like, six months.

Corey: That's kind of amazing.

Seif: [laugh]. That was specta—we rewrote everything from C# to Node.js and moved everything away from Elasticsearch, started using Redshift, Redis and a—you name it. We went AWS all the way and they didn't even notice. We took the budget from another department to start filling that in.

But we cut the costs from $100,000 down to, like, 40, and then eventually down to $30,000 a month.

Corey: More than a little wild.

Seif: Oh, God, yeah. Good times, good times. Next time, just ask me to tell you the full story about this. I can't go into details on this podcast. I'll get in a lot—I think I'll get in trouble. I didn't sign anything though.

Corey: Those are the best stories. But no, I hear you. I absolutely hear you. Seif, I really want to thank you for taking the time to speak with me. If people want to learn more, where should they go?

Seif: So, axiom.co—not dot com. Dot C-O. That's where they learn more about Axiom. And other than that, I think I have a Twitter somewhere. And if you know how to write my name, you'll—it's just one word and find me on Twitter.

Corey: We will put that all in the [show notes 00:26:33]. Thank you so much for taking the time to speak with me. I really appreciate it.

Seif: Dude, that was awesome. Thank you, man.

Corey: Seif Lotfy, co-founder and CTO of Axiom, who has brought this promoted guest episode our way. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud.
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment that one of these days, I will get around to aggregating in some horrifying custom homebrew logging system, probably built on top of rsyslog.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Latest podcast episodes about clickhouse

Pourquoi l'internet mondial était en panne le 18/11 ?

#231 - On décrypte avec Blef les news de 2025 : Sommets Snowflake et Databricks, ClickHouse, DuckDB, BigQuery

Episode 42 | Tanya Bragin | VP Product & Marketing @ Clickhouse

E182: The Rise of ClickHouse

#216 Konsistenz und Isolation: von Write Skew bis Dirty Reads

Why the Middle Layer of Your Agency Org Chart May Not Survive AI with Jennifer Bagley | Ep #841

#214 Daten aus Spotify & Co: Architektur einer skalierbaren API-Data-Pipeline

Berlin Buzzwords 2025 Conference Interviews

When not to use Postgres

ClickStack: ClickHouse's New Observability Stack Unveiled - OpenObservability Talks S6E03

Nebius: The Nvidia-Backed AI Stock You've Probably Never Heard Of

180: István Mészáros: Merging web and product analytics on top of the warehouse with a zero-copy architecture

E177: RunReveal's Anti SIEM SIEM Platform (With AI That Actually Works!)

Linux под прикрытием — Episode 505

EP228 SIEM in 2025: Still Hard? Reimagining Detection at Cloud Scale and with More Pipelines

The PHP Podcast: 2025.05.29

ClickHouse: Breaking the Speed Limit for Observability and Analytics - OpenObservability Talks S5E12

Rethinking Workplace Connection in a Remote World with Marina Farthouat

244: Postgres to ClickHouse: Simplifying the Modern Data Stack with Aaron Katz & Sai Krishna Srirampur

The PRQL: Data Migration Made Easy: Postgres, ClickHouse, and the Future of Analytics with Aaron Katz and Sai Krishna Srirampur

Shopify's Journey to Planet-Scale Observability - OpenObservability Talks S5E09

Agent Engineering with Pydantic + Graphs — with Samuel Colvin

New SLAP & FLOP Attacks, OCSP Fades Away, DeepSeek's ClickHouse, OAuth 2.0 Security - ASW #316

DeepSeek Security Failure: Cyber Security Today, Friday, January 31, 2025

090 | Better Stack – Juraj Masár, CEO & Co-Founder

Tanya Bragin - Clickhouse, Open Source vs Commercial, and More

Bootstrapping SaaS, JS Monitoring, & Web Performance | Todd Gardner - Frontend Masters Podcast Ep.21

Announcing the ClickHouse Podcast

Friendly Competition within the ClickHouse Ecosystem with Robert Hodges

E145: Bootstrapping an Open Source Monitoring Platform

Improve Data Quality Through Engineering Rigor And Business Engagement With Synq

How ClickHouse powers Netflix, Uber and Spotify's Analytics | Aaron Katz, CEO of ClickHouse

S2E9: Sai Srirampur, PeerDB

Mike Volpi, Partner at Index Ventures

761: Cloudflare Analytics Engine, Workers + more with Ben Vinegar

Ep. 568 w/ Griffin Parry CEO/Founder at m3ter

Establish A Single Source Of Truth For Your Data Consumers With A Semantic Layer

Find Out About The Technology Behind The Latest PFAD In Analytical Database Development

Using Trino And Iceberg As The Foundation Of Your Data Lakehouse

Data Dynamo: Journey into the World of Databases

Tackling Real Time Streaming Data With SQL Using RisingWave

20VC: Did Figma Kill M&A Markets in 2024, The Three Biggest Mistakes Made in Growth Investing, The Three Requirements Companies Need to Go Public in 2024 with Ed Sim and Jamin Ball

Designing Data Transfer Systems That Scale

Companion databases

Shining Some Light In The Black Box Of PostgreSQL Performance

Surveying The Market Of Database Products

497: Axiom with Seif Lotfy

Making an Affordable Event Data Solution with Seif Lotfy