Wondering how to build your own on-premises PaaS architecture without depending on the cloud? In this episode the Patoarchitekci analyze open-source alternatives to cloud services. Łukasz and Szymon discuss Kubernetes, Rancher, and the other key components of a self-hosted platform. The hosts break the construction of a PaaS platform down to its individual parts: from database operators and Redis caching, through MinIO object storage, to monitoring with Grafana. You'll learn when it makes sense to move workloads from the cloud to on-prem and how to avoid the typical pitfalls of building your own infrastructure. Want to take back control of your infrastructure and your costs? Listen to this episode and decide whether building your own PaaS is a good idea for your organization. Just remember that the MVP of the platform is only the beginning: the real challenges start once you have to maintain it! Now, no slacking off!
How to make backups on #android using #restic, #termux and #minio in a simple, secure, and encrypted way. A few days ago I told you I was considering replacing BorgBackup, the tool I use by default for backups and that I've talked about on several occasions. It's a tool I'm really satisfied with and that has saved me from more than one disaster, like the one I told you about in episode 173, titled "I ran an rm -rf, saved by Borg." However, I recently discovered Restic, which I covered in episode 677, titled "Don't lose your data. Foolproof backups with Restic and Minio," and I've spent a few weeks comparing one against the other. I'm so satisfied with the latter, with Restic, that I've decided to deploy it on other devices where until now I wasn't making backups at all: my Android devices. So in this episode I'll talk about backups on Android. More information and links in the episode notes.
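The workflow described in this episode (restic running inside Termux, pushing encrypted snapshots to a MinIO bucket over the S3 protocol) can be sketched as below. This is a minimal illustration, not the host's actual setup: the paths, bucket URL, and credentials are hypothetical placeholders, and the helper functions only assemble a real restic invocation rather than run it.

```python
import os

def restic_backup_command(source_dir, repo_url, excludes=()):
    """Build the restic command line for backing up a directory to an
    S3-compatible repository, such as a MinIO bucket."""
    cmd = ["restic", "-r", f"s3:{repo_url}", "backup", source_dir]
    for pattern in excludes:
        # restic skips anything matching an --exclude pattern
        cmd += ["--exclude", pattern]
    return cmd

def restic_env(access_key, secret_key, repo_password):
    """Environment variables restic reads: S3 credentials plus the
    password that encrypts the repository."""
    env = dict(os.environ)
    env.update({
        "AWS_ACCESS_KEY_ID": access_key,
        "AWS_SECRET_ACCESS_KEY": secret_key,
        "RESTIC_PASSWORD": repo_password,
    })
    return env

# Hypothetical Android/Termux example: back up the camera roll,
# skipping thumbnail caches.
cmd = restic_backup_command(
    "/sdcard/DCIM",
    "https://minio.example.com/android-backups",
    excludes=(".thumbnails",),
)
print(" ".join(cmd))
```

In practice you would pass `restic_env(...)` to `subprocess.run(cmd, env=...)` (or simply export the same variables in a Termux shell script) so that neither credentials nor the repository password appear on the command line.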
Recently VMware Data Services Manager 2.2 was released, so I had to invite my good friend Cormac Hogan to discuss all the enhancements we introduced to an already great product. Of course, we also discussed the Tech Preview for the Object Storage Service, which enables you to deploy MinIO at scale! Disclaimer: The thoughts and opinions shared in this podcast are our own/guest(s), and not necessarily those of Broadcom or VMware by Broadcom.
Looking for a secure, reliable system for your backups? Build yours with #restic, #resticprofile and #minio for foolproof #backups. I've been talking to you about backups for years. Specifically, in episode 173 I told you how I ran an rm -rf and was saved by Borg. I know it's not as attractive a topic as multimedia or flashy services, but it's essential. The problem is that you only remember backups when you really need them, and at that moment you may remember them for the wrong reasons: because at some point you decided not to make them, or you put it off for later (damned procrastination), or you simply never verified they were actually being made correctly. So having an effective, efficient backup system is truly essential and fundamental. In this episode I'll talk about Restic, an alternative to Borg that I'm testing and that will very likely become my default system in the coming weeks. More information and links in the episode notes.
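Part of "verifying backups are actually being made correctly" is pruning old snapshots on a schedule, which restic handles with `restic forget` and its retention flags (and which resticprofile can wrap in a profile). The sketch below only assembles that command line with illustrative retention values; it is an assumption about a typical policy, not the host's configuration.

```python
def restic_forget_command(keep_daily=7, keep_weekly=4, keep_monthly=12, prune=True):
    """Build a restic retention command: keep N daily, weekly, and
    monthly snapshots, optionally pruning unreferenced data afterwards."""
    cmd = ["restic", "forget",
           "--keep-daily", str(keep_daily),
           "--keep-weekly", str(keep_weekly),
           "--keep-monthly", str(keep_monthly)]
    if prune:
        # --prune actually deletes the data the forgotten snapshots held
        cmd.append("--prune")
    return cmd

cmd = restic_forget_command()
print(" ".join(cmd))
```

Running this regularly (a cron job, or a resticprofile `retention` section) keeps the MinIO bucket from growing without bound while still preserving a useful history.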
AGRIGENTO (ITALPRESS) - "I hope the climate will be constructive and positive, because we will be under the international spotlight." So said Giacomo Minio, president of the Agrigento Italian Capital of Culture 2025 Foundation, on the sidelines of the press conference presenting the new logo and communication campaign.
In this episode, Daniel Valdivia, an engineer from MinIO, discusses his participation at KubeCon and his work in Kubernetes integrations and AI initiatives. We discussed the significance of object storage standardization via the Open Platform for Enterprise AI (OPEA), emphasizing the flexibility and scalability of MinIO's offerings. Daniel highlights MinIO's contributions to open source projects like PyTorch and Spark and shares insights on new hardware technologies like PCIe Gen 5. Daniel also announces the launch of MinIO's new AI store, designed to empower enterprises to efficiently manage exascale infrastructure and AI pipelines. 00:00 Introduction 00:13 Meet Daniel Valdivia: Engineer at MinIO 00:24 The Importance of Kubernetes Integrations 00:43 Intel's Open Platform for Enterprise AI 00:58 MinIO's Unique Object Storage Solutions 01:56 Community Participation and Contributions 02:18 Ensuring Compatibility with AI Hardware 03:20 The Role of OPEA in Enterprise AI 05:56 Open Source Contributions and Challenges 09:12 Future of AI and Hardware Innovations 13:23 Big Announcement 14:40 Conclusion and Final Thoughts Guest: Daniel Valdivia is an engineer with MinIO where he focuses on Kubernetes, ML/AI and VMware. Prior to joining MinIO, Daniel was the Head of Machine Learning for Espressive. Daniel has held senior application development roles with ServiceNow, Oracle and Freescale. Daniel holds a Bachelor of Engineering from Tecnológico de Monterrey, Campus Guadalajara and Bachelor of Science in Computer Engineering from Instituto Tecnológico y de Estudios Superiores de Monterrey.
Today's guest is Ahmed Azam, Head of Infrastructure and Cloud Services at Northwestern Mutual. Ahmed joins Emerj Senior Editor Matthew DeMello to explore the organization's transformative journey in adopting cloud technology. With roots tracing back to 1857, Northwestern Mutual has continuously evolved, leveraging technological advancements to maintain a competitive edge. Ahmed shares insightful stories about the company's pioneering history, including its early adoption of mainframe computing and the more recent integration of cloud-based solutions. This episode is sponsored by MinIO. Find out more about sponsored content and how to engage with the Emerj audience at emerj.com/ad1.
Today on the Tech Bytes podcast we welcome back sponsor MinIO to talk about how AI is altering the data infrastructure landscape, and why organizations are looking to build AI infrastructure on-prem. We also dig into MinIO's AIStor, a software-only, distributed object store that offers simplicity, scalability, and performance for AI infrastructure and other high-performance...
Pay rates for IT security professionals are rising faster than inflation, but burnout and stress are growing faster. A survey of UK security professionals revealed the fast pace of modern security and the risk of unknown failure is causing skilled practitioners to leave the field. Would yet more pay fix the problem, or is there another way to address IT security staff retention? This and more on the Rundown. Time Stamps: 0:00 - Welcome to the Rundown 1:19 - Can AMD Top NVIDIA? 3:50 - Quantum AI Isn't a Thing 7:13 - MinIO Introduces AIStor 12:23 - Amazon Employee Details Exposed in MoveIt Breach 15:20 - Marslink is Further Away than Starlink 18:19 - AI is writing Google's Code 22:05 - Amazon Won't Go Nuclear 26:41 - Is IT Security Too Stressful for the Money? 35:45 - The Weeks Ahead 37:23 - Thanks for Watching Hosts: Tom Hollingsworth: https://www.linkedin.com/in/networkingnerd/ Alastair Cooke: https://www.linkedin.com/in/alastaircooke/ Follow Gestalt IT Website: https://www.GestaltIT.com/ Twitter: https://www.twitter.com/GestaltIT LinkedIn: https://www.linkedin.com/company/Gestalt-IT #Rundown, #AI, #AIStor, #CyberSecurity, #AWS, @NetworkingNerd, @DemitaasseNZ, @GestaltIT, @TechstrongTV, @TheGuturumGroup, @TechFieldDay, @AMD, @NVIDIA, @MinIO, @GoogleCloud, @Google, @AWSCloud,
Today's guest is Yonas Yohannes, CTO of FinTech and FIS at Oracle. An accomplished executive and author, Yonas joins us on today's podcast to explain the evolving role of endpoint storage for driving new AI capabilities at the edge. He breaks down AI's true value beyond the marketing hype, and its broader impact on infrastructure across industries, with a special focus on financial services. Throughout the episode, Yonas addresses the real challenges businesses face in adopting AI while ensuring transparency and avoiding regulatory risk. This episode is sponsored by MinIO. Learn how brands work with Emerj and other Emerj Media options at emerj.com/ad1.
Today's guest is Ylan Kazi, Chief Data and AI Officer at Blue Cross Blue Shield North Dakota. Ylan joins us on today's program to discuss the complexities faced by leaders in legacy industries, such as healthcare, as they navigate the balance between infrastructure investments in cloud technologies and end-point storage to meet business goals. Throughout the episode, Ylan shares insights on developing a robust business strategy for cloud migration, highlighting common pitfalls like cost overruns and outdated mindsets. This episode is sponsored by MinIO. Find out more about sponsored content and how to engage with the Emerj audience at emerj.com/ad1.
Mark Rostick is a Vice President & Senior Managing Director located in Raleigh, NC. He is a voting member of Intel Capital's investment committee. He joined Intel Capital in 1999. Mark also co-manages our Cloud domain investment activities and portfolio. He has deep investment experience in cloud applications, infrastructure hardware and software, as well as AI/ML. As a member of Intel Capital's Investment Committee, he is responsible for approving investments proposed by Intel Capital investors, as well as managing the group's personnel and operations. Mark currently serves as a director or observer on the boards of Beep, RunPod, Hypersonic, Immuta, Lilt, MinIO, Opaque Systems, Tetrate, and Verta. Prior to Intel, Mark worked as a practicing attorney and in banking. You can learn more about: How to invest in the top AI/ML companies How to build a successful career in corporate venture The evolving landscape of enterprise software investments #IntelCapital #VentureCapital #TechInvestment #CloudComputing #AI #ML ===================== YouTube: @GraceGongCEO Newsletter: @SmartVenture LinkedIn: @GraceGong TikTok: @GraceGongCEO IG: @GraceGongCEO Twitter: @GraceGongGG ===================== Join the SVP fam with your host Grace Gong. In each episode, we are going to have conversations with some of the top investors, superstar founders, as well as well-known tech executives in silicon valley. We will have a coffee chat with them to learn their ways of thinking and actionable tips on how to build or invest in a successful company.
Today's guest is Robert Wenier, Global Head of Cloud and Infrastructure at AstraZeneca. Robert joins us on the program to explore the complex decisions faced by leaders in legacy industries as they balance infrastructure investments between cloud technologies and end-point storage. How can they align these investments with their business goals while managing the competing forces of performance, risk, and cost? We break down the strategic considerations: ensuring technology delivers the required performance, carefully monitoring risks like security and capacity, and managing costs to create value and maintain margins. This episode is sponsored by MinIO. Learn how brands work with Emerj and other Emerj Media options at emerj.com/ad1.
Today's guest is Shardul Vikram, CTO and Head of Data & AI for SAP Industries and Customer Experience. Shardul joins Emerj Senior Editor Matthew DeMello on today's program to explore the evolving landscape of cloud adoption and storage solutions within the life sciences and financial services industries. A decade has passed since cloud technology burst onto the scene with great promises, yet today, not everything resides “on the cloud.” As the hype around new technologies like AI starts to cool, Shardul offers legacy and regulated industry leaders actionable insights on driving a balanced approach—leveraging both cloud and endpoint storage to achieve their unique goals. Today's episode is part of a special series sponsored by MinIO for a deep dive into the challenges and opportunities at the intersection of infrastructure investment, technology strategy, and competitive advantage in today's evolving landscape. Learn how brands work with Emerj and other Emerj Media options at emerj.com/ad1.
In this episode of Tech.Strong.Women., Jodi Ashley and Tracy Ragan are joined by Garima Kapoor, a remarkable founder with a PhD in finance and economics who transitioned from academia to entrepreneurship. She co-founded MinIO, an open source object storage company, alongside her engineer husband. Garima highlighted the challenges she faced as a woman founder, including gender discrimination in fundraising and business meetings. She underscored the need for greater female representation in leadership and the support of women-led companies. MinIO's success is attributed to its open source model and robust community engagement, with Garima emphasizing the importance of strong licenses and ecosystem contributions. She also discussed building an inclusive company culture at MinIO, prioritizing work-life balance, and making meaningful investments for long-term growth.
Mark Khavkin tells us that from the very beginning of his career journey—a 2008 role as an investment professional with a European private equity firm—he was able to gain experience in board strategy, investor relations, and entrepreneurial exploration. This foundation allowed him to read boardroom dynamics from very early on and prepared him to anticipate a variety of operational perspectives that would set the stage for his path forward. Transitioning to Silicon Valley, Khavkin joined eBay's corporate development team, where he learned to align acquisition opportunities with the strategic goals of business units and technology leaders—experience that deepened his understanding of operational management and strategic planning. A pivotal moment came when a former eBay divisional CFO who had served as a mentor invited Khavkin to join oDesk (later Upwork) as FP&A lead. This role allowed him to influence company culture and drive change from within the finance function. At Upwork, Khavkin tells us he sharpened his ability to integrate investor narratives with internal strategies, from marketing to product development. His ability to present a cohesive story from market opportunities to long-term strategy proved instrumental during the early milestones of Upwork's IPO journey. Throughout his career, Khavkin has come to pursue experiences that would require a unique blend of investment acumen, strategic insight, and leadership impact. His journey highlights the importance of understanding both investor perspectives and operational realities, while crafting a narrative that demonstrates insight into both.
Today on the Tech Bytes podcast we talk with Jonathan Symonds, Chief Marketing Officer at MinIO about MinIO’s object storage offering; a software-defined, Amazon S3-compatible object storage that offers high performance and scale for modern workloads and AI/ML. We discuss how MinIO helps customers across industries drive AI innovation and AI architectures, how object storage...
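"S3-compatible" in practice means that existing S3 tooling talks to MinIO simply by overriding the endpoint URL. The sketch below shows the shape of that configuration as you would hand it to an S3 client library (for example, boto3's `client("s3", **cfg)`); the endpoint and credentials are hypothetical placeholders, not a real deployment.

```python
def s3_client_config(endpoint, access_key, secret_key, secure=True):
    """Build the keyword arguments that point a generic S3 client at a
    self-hosted, S3-compatible endpoint such as a MinIO server."""
    scheme = "https" if secure else "http"
    return {
        # The only MinIO-specific part: aim the client away from AWS.
        "endpoint_url": f"{scheme}://{endpoint}",
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }

# Hypothetical on-prem endpoint on MinIO's default port.
cfg = s3_client_config("minio.internal:9000", "demo-key", "demo-secret", secure=False)
print(cfg["endpoint_url"])
```

Because the rest of the S3 API surface (buckets, objects, multipart uploads) is unchanged, application code written against Amazon S3 typically needs no modification beyond this endpoint swap.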
It's return guest season here at Latent Space! We last talked to Kanjun in October and Jonathan in May (and December post Databricks acquisition): Imbue and Databricks are back for a rare treat: a double-header interview talking about DBRX from Databricks and Imbue 70B, a new internal LLM that “outperforms GPT-4o” zero-shot on a range of reasoning and coding-related benchmarks and datasets, while using 7x less data than Llama 3 70B.

While Imbue, being an agents company rather than a model provider, are not releasing their models today, they are releasing almost everything else:
* Cleaned-up and extended versions of 11 of the most popular NLP reasoning benchmarks
* An entirely new code-focused reasoning benchmark
* A fine-tuned 70B model, built with Meta Llama 3, to identify ambiguity
* A new dataset of 450,000 human judgments about ambiguity
* Infrastructure scripts for bringing a cluster from bare metal to robust, high performance training
* Our cost-aware hyperparameter optimizer, CARBS, which automatically and systematically fine-tunes all hyperparameters to derive optimum performance for models of any size

As well as EXTREMELY detailed posts on the infrastructure needs, hyperparameter search, and clean versions of the sorry state of industry standard benchmarks. This means for the FIRST TIME (perhaps since Meta's OPT-175B in 2022?) you have this level of educational detail into the hardware and ML nitty gritty of training extremely large LLMs, and if you are in fact training LLMs of this scale you now have evals, optimizers, scripts, and human data/benchmarks you can use to move the industry forward together with Imbue.

We are busy running the sold-out AI Engineer World's Fair today, and so are unable to do our usual quality writeup; however, please enjoy our show notes and the excellent conversation!
Thanks also to Kanjun, Ashley, Tom and the rest of team Imbue for setting up this interview behind the scenes.

Video pod

Timestamps
* [00:00:00] Introduction and catch up with guests
* [00:01:55] Databricks' text to image model release
* [00:03:46] Details about the DBRX model
* [00:05:26] Imbue's infrastructure, evaluation, and hyperparameter optimizer releases
* [00:09:18] Challenges of training foundation models and getting infrastructure to work
* [00:12:03] Details of Imbue's cluster setup
* [00:18:53] Process of bringing machines online and common failures
* [00:22:52] Health checks and monitoring for the cluster
* [00:25:06] Typical timelines and team composition for setting up a cluster
* [00:27:24] Monitoring GPU utilization and performance
* [00:29:39] Open source tools and libraries used
* [00:32:33] Reproducibility and portability of cluster setup
* [00:35:57] Infrastructure changes needed for different model architectures
* [00:40:49] Imbue's focus on text-only models for coding and reasoning
* [00:42:26] CARBS hyperparameter tuner and cost-aware optimization
* [00:51:01] Emergence and CARBS
* [00:53:18] Evaluation datasets and reproducing them with high quality
* [00:58:40] Challenges of evaluating on more realistic tasks
* [01:06:01] Abstract reasoning benchmarks like ARC
* [01:10:13] Long context evaluation and needle-in-a-haystack tasks
* [01:13:50] Function calling and tool use evaluation
* [01:19:19] Imbue's future plans for coding and reasoning applications
* [01:20:14] Databricks' future plans for useful applications and upcoming blog posts

Transcript

SWYX [00:00:00]: Welcome to the Latent Space Podcast, another super special edition. Today, we have sort of like a two-header. Jonathan Frankle from Mosaic Databricks, or Databricks Mosaic, and Josh Albrecht from Imbue. Welcome.JOSH [00:00:12]: Hey, glad to be here.SWYX [00:00:14]: Thank you for having us. Hey, so both of you are kind of past guests.
Jonathan, you were actually one of the most popular episodes from last year talking about MPT7B. Remember the days when we trained large models and there was 7B?JONATHAN [00:00:30]: Yeah, back when reproducing LLAMA1-7B was considered a huge accomplishment for the field. Those are the good old days. I miss that.SWYX [00:00:38]: As the things have accelerated a lot. Actually, let's do a quick catch up and Josh, you can chime on in as well. So Databricks got acquired. I talked to you at New York.JONATHAN [00:00:45]: Mosaic got acquired, although sometimes it feels like Mosaic acquired Databricks because, you know, we're having a lot of fun being here. But, you know, yeah.SWYX [00:00:52]: Yeah. I mean, you are chief scientist now of Databricks.JONATHAN [00:00:55]: Chief AI scientist. Careful with the title. As much as I would love to understand how Spark works, I'm going to have to defer that to much smarter people than me.SWYX [00:01:03]: Got it. And I don't know about like what you would highlight so far as a post-acquisition, but the most recent news is that you guys released DBRX. Is that the thing that most people should be aware of?JONATHAN [00:01:13]: Actually, that's no longer the most recent news. Honestly, the most recent news, we announced this, but it was at our Data and AI Summit last week. So it was announced among like 100,000 other things, is that we finally released our text to image model, which has been a year in the making through a collaboration directly with Shutterstock. There was a lot of work put into finding a dataset that we were comfortable with working on and trying to build a model that honestly, I felt like I could trust and that others might be able to trust to put out in the world. So that model was released last week. It's unfortunately just available via API due to the fact that the data is quite sensitive and quite valuable. 
It's Shutterstock's entire business in a lot of ways, but I'm still really excited that there's now a model that is trained on a dataset where the provenance of every single image is known, and it's a damn good model. So I'm really proud of the team on that.SWYX [00:01:55]: Yeah, amazing. Josh, do you have any thoughts on image model questions?JOSH [00:01:59]: That is not my area of expertise, but I was excited to see the release of it last week as well, and very happy that you guys did a nice job on the data side of everything there. So that was cool to see.SWYX [00:02:09]: I think what's unusual is like, I think Shutterstock's doing multiple deals in multiple labs. So what is the Shutterstock model? Like, I guess, is this the house model for Shutterstock? Is this Databricks' version of the Shutterstock model? Like, what is this?JONATHAN [00:02:22]: The way that I would think about it is that Shutterstock is doing an amazing business in AI across the board. Their dataset is kind of widely known to be the best stock photos dataset in the world, the most comprehensive, the biggest. When you think about like, what dataset am I going to train a multimodal model on? You call Shutterstock. And I, at least I've heard in the news, like OpenAI, Google, Meta, Apple have all called Shutterstock and made those deals. So a lot of models have had Shutterstock data incorporated into them. But this is the only model I know of so far where it was, you know, exclusively and specifically trained just on the vanilla Shutterstock data. There was nothing else mixed in. We didn't go and scrape the web and find other data or combined datasets or anything like that. And so this is, in some sense, the house blend. But the other piece is that it's just a dataset where the provenance of every image is known in public. Where did the data come from? It is the Shutterstock collection. That's it. You know, nothing less, nothing more. 
And certainly being at Databricks, if I've learned one thing, I've learned about enterprise customers and what they want out of AI. And one of the things they ask for most is just, what can you tell me about the data the model was trained on? And here, especially for text to image models, where images are just tricky subject matter, there's been a lot of kind of legal conversation about images, especially. It's nice to just have something where I can point to it and say, you know, if you want to know where the images came from, these are what they are and this is how they got there.SWYX [00:03:36]: I will talk a little bit about Databricks because it's relevant to the rest of today's episode. So Databricks, sorry, I keep misspeaking. It's DBRX.JONATHAN [00:03:46]: DBRX, actually, there's been a pronunciation update. It is now D-B-Rex. So we have decided to add a dinosaur mascot because what model doesn't like a mascot? So literally, I wish I could pull it up. There is a little plush dinosaur that we had made. It's like the world's cutest dinosaur, but it is the official mascot of D-B-Rex. And there's a little dinosaur logo that, you know, you'll probably see around a little bit more because DBRX is a mouthful, but D-B-Rex, like, you know, it's just kind of...SWYX [00:04:13]: Rolls off the tongue. I love mascots. Like every company should have a mascot. And I think Hugging Face got it right. You need an emoji mascot because that's the minimal viable image.JONATHAN [00:04:21]: I probably shouldn't talk at all about, you know, Velociraptor, but, you know, that's a, maybe that's something we can talk about later in the summer. I'll just leave it at that.SWYX [00:04:28]: Okay. That's a hint to names. I feel like your names leak a lot of alpha. 
So just to quickly cover the headline details: DBRX is a Mixture of Experts model, that's fairly big, 132 billion total parameters, so 36 billion active on any input, pre-trained on 12 trillion tokens of text and code, and did really well on evals to the point where you had to dye your hair blue. That's my high level conclusion.JONATHAN [00:04:53]: Never make a bet with your team two weeks out from model launch, even when, you know, human eval is looking quite bad. Because if you set some bar, even if it's arbitrary and you think there's no way in hell they're going to hit it, apparently money doesn't motivate people anymore. Humiliating their boss motivates people. So Josh, you should really take a hint from this. You know, you cannot pay someone enough money to make up for you dyeing your hair blue.JOSH [00:05:15]: I'll keep that in mind for our next model.SWYX [00:05:17]: It works. So speaking of Imbue's next model, perhaps Josh, you want to actually just say hi to the general sort of latent space audience and talk about what we're releasing today. Yeah.JOSH [00:05:26]: I'm Josh, CTO of Imbue, and we're not releasing the model. We're not releasing the weights, but we are releasing a bunch of different things that should make it easier for other people to make their own models. So I think right now, training foundation models from scratch is like a very difficult, time-consuming, expensive, kind of risky endeavor, especially for smaller companies. And the things that we're releasing hopefully make that at least a little bit easier. So the things that we're releasing fall into kind of three different buckets. One is infrastructure and scripts for dealing with the kind of hardware and hardware failures and understanding how well is the actually lowest level of thing actually working so that you can actually do your training at all and at a reasonable speed without having to constantly restart, etc. So infrastructure and training scripts.
A second set of things is around the evaluation. So after you've trained it, like how well is this actually working and how do you know how well it's working? We're releasing a whole bunch of different data there, a new benchmark about code, reasoning, understanding, as well as our own private versions of 11 different open source benchmarks. So things like BoolQ or ANLI, where we've gone through and kind of cleaned up the data as much as possible by looking at all the ones that models get wrong or that are flagged for ambiguity and also our own kind of private reproductions of those where we've done like a kind of clean room black box, like, okay, this is what the data set is supposed to be. Here are some examples. Let's make our own version of this to make sure that there is no data contamination, etc. To make sure that we're actually, you know, not testing on train. And then I think a final thing that we're releasing there is around 450,000 human judgments about ambiguity and question quality, which we used in the process of cleaning these evaluations and we also hope will be helpful for other people training kind of similar models. And then the third thing is CARBS, our hyperparameter, our cost-aware hyperparameter optimizer, which was especially helpful for being able to experiment at much smaller scales and then scale those experiments up to the much larger scale kind of on the first try without having to retry it. You don't want to be training, you know, 10, 20 different 70B models. You really want to get these larger modelsSWYX [00:07:30]: right on the first try.JOSH [00:07:30]: And so the ability to kind of tune things very precisely and learn scaling laws, not just for, you know, the like data and flops, but also for learning rate and all the other hyperparameters and see like how should you scale these things up was extremely valuable to us as we were training the larger models. Yeah, that's a lot of stuff.SWYX [00:07:49]: Yeah, exactly.
So there's a bunch of stuffJOSH [00:07:50]: we'll have to go through all of it.JONATHAN [00:07:52]: Yeah, I just want to throw in how excited I am about this. This is the stuff that nobody ever talks about. That is the difference between success and failure in this stuff. Like, can you get your cluster to run? Can you get software on your cluster? Can you figure out what broke? Because fault tolerance is still not really built into any of the fundamental primitives of training models. And so if something breaks, you have to go figure out what broke, your job stops, you have to restart your job. It is a nightmare just to get to the point where anything can train on the cluster. A basic MPI hello world that has the GPUs talk to each other is hard enough, let alone actually training a model, let alone getting good performance out of the GPUs, let alone actually getting a model that converges to anything interesting. There's so many levels of things you have to accomplish. This is the kind of stuff that matters. I think to a point that Josh made earlier, before we got on here, there are plenty of weights out there. Nobody's released this.JOSH [00:08:46]: Yeah, that was part of the motivation actually is that there are lots of other things that are complimentary, but I have not seen nearly as much discussion about some of these other things that we think are pretty important. I mean, in some sense,SWYX [00:08:56]: I'm very excited to have Jonathan on because this is a little bit, you're a bread and butter with Mosaic. And I think you've released some part with Composer. And I think it's just really interesting to see like a different take, basically a full stack take that's kind of open source today.JONATHAN [00:09:18]: Yeah, it's really kind of, it's been an ordeal to figure this out. And every time something changes, whether it's a new GPU or even a new driver update, you get new creative errors and new things go wrong. 
And, you know, we've dealt with the weirdest things from, you know, our InfiniBand cables getting stolen from the data center twice, like in boxes before they arrived at the data center. Like, you know, Porch Pirate basically had stolen our InfiniBand cables back when those were hard to come by. To like, you know, weird recalls of switches to like the strangest stuff has happened. I have my favorite GPU failures I've seen, like ones where the GPU doesn't fail, it has a correctable memory issue and the memory correction causes the GPU to become a straggler and hold up the whole job. Like weird stuff happens and figuring out how to not just identify all of that, but then eventually productize it, is in some sense, the entire story of Mosaic and now Databricks in terms of our ML offering. Really, the thing we offer is we have gone through this suffering and figured out how to even productize that. It has been a pain in the butt.SWYX [00:10:20]: Yeah, it's a lot of work.JOSH [00:10:20]: I think my favorite failure was GPU is just giving wrong math. Like if they give errors, great, because you can see the errors, but if they just give you the wrong math back, not so fun.SWYX [00:10:30]: When did they give you wrong math?JOSH [00:10:32]: Like literally you could just, you know, add two things. For example, the numbers come back. They're not the numbers that they're supposed to be.JONATHAN [00:10:40]: I think it's important to say at this stage, just because like it, I think it goes without saying for Josh and I, but it's worth saying here, this isn't to say that like anything is wrong with us. It's not like NVIDIA did a bad job or, you know, Mellanox did a bad job or the like the server builder, the data center operator, the cloud provider, like the million other parties that are involved in building this. 
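Josh's favorite failure, GPUs silently returning wrong math, is only detectable by comparing against a known-good answer, since nothing raises an error. A toy golden-value check (pure-Python stand-in; on a real node the workload would be a large deterministic matmul on the GPU, and the golden digest would be recorded from a trusted machine):

```python
import hashlib

def golden_check(compute, golden_digest):
    """Run a fixed deterministic computation and compare against a known-good digest.

    A silently corrupting accelerator returns plausible-looking but wrong
    numbers, so the only defense is comparison with a golden result.
    """
    result = compute()
    digest = hashlib.sha256(repr(result).encode()).hexdigest()
    return digest == golden_digest

# Stand-in workload: sum of squares 1..10000, which equals 333383335000
def workload():
    acc = 0
    for i in range(1, 10_001):
        acc += i * i
    return acc

GOLDEN = hashlib.sha256(repr(333383335000).encode()).hexdigest()
ok = golden_check(workload, GOLDEN)  # a False here would mean the hardware lied
```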
We are running these insane chips that are huge and complicated and built on tiny transistors at insane frequencies with insane heat in data centers that for the most part, were not built remotely for this kind of power or heat and have been retrofitted for this. Like failures happen on a good day with normal CPUs. And this is not a good day and not a normal CPU for the most part. It's fun to joke about all the weird things we see. This is not to say anybody's done anything wrong. This is just kind of part and parcel of working on a massive cluster running at multiple megawatts of power at a time.SWYX [00:11:32]: It's crazy. Yeah.JONATHAN [00:11:33]: So optical cables, like all sorts, like everything.SWYX [00:11:37]: I'll take the opportunity to start going to the sort of infra piece. There's just like a description of the infra just to give people a sense of what we talk about when we talk about massive clusters. So I'm just going to read off the blog post here. This post is about one cluster that has 4,092 H100 GPUs spread across 511 computers. They use unified fabric manager nodes, which manage the InfiniBand network. And you talk a little bit about your networking. Is there anything unusual about this setup that you'll call out to people?JOSH [00:12:03]: Yeah, actually this particular cluster is a little bit non-standard. The normal, like vanilla setup for these large clusters as vanilla as it can be is what's normally like a 127 node cluster. So closer to like 1024 GPUs instead of 4,000. Here we have a larger cluster. As you start to get into the larger clusters, the networking becomes a little bit more custom. It's a little bit more, it's a little bit trickier. It's a little bit more difficult to get these things to all be able to talk to each other at the same speed. And so this has, in this particular case, this is a three tier network architecture instead of two tiers, kind of the normal one. So most of the clusters are a little bit smaller. 
As you get to even larger scales, then this becomes even much more complicated,SWYX [00:12:43]: much more expensive.JOSH [00:12:43]: So we chose this particular scale, kind of knowing our own workloads and kind of what we wanted to do. This was kind of the right size for us. But yeah, I think it's not exactly vanilla already. It's already getting into kind of the custom territory.SWYX [00:12:54]: So my understanding is that there, and is there any part of this that comes with the Voltage Park deal that you guys had? Is that part of the hardware that you got from the deal with them?JOSH [00:13:04]: Yeah, so we worked really closely with Voltage Park to set up all their clusters and infrastructure and everything and kind of decide even like what to order, how should the networking work? Like we were very involved in kind of the construction and bring up of this. And that's what this post is about, is about that process of like bringing up all these, there's like different clusters in different places of different scales. So in this particular post, we're talking about this one 4096 GPU, but there are other clusters that they have as well. And we were very closely involved with figuring out the exact architecture and kind of the trade-offs that go along with picking, you know, those exact components. You really don't want to like place the wrong order because it takes months to get it and it's very expensive. So yeah, we were happy to help out with that.JONATHAN [00:13:43]: And then your InfiniBand cables get stolen.SWYX [00:13:44]: Yeah, yeah, exactly.JOSH [00:13:47]: We wanted to make sure that we ended up with compute that would work for us and that would also work for their other customers. And so we kind of helped design something so that we would get exactly what we were looking for. 
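On why roughly 4,000 GPUs pushes you from two network tiers to three: a non-blocking fat-tree built from radix-k switches tops out at k²/2 end hosts with two tiers and k³/4 with three. Assuming 64-port switches (an assumption for illustration; the conversation doesn't name the switch radix), the arithmetic works out like this:

```python
def max_hosts(radix, tiers):
    """Maximum end hosts in a non-blocking fat-tree of switches with `radix` ports."""
    if tiers == 2:
        return radix * radix // 2   # each leaf gives half its ports to hosts
    if tiers == 3:
        return radix ** 3 // 4      # classic k-ary fat-tree host count
    raise ValueError("sketch only covers 2 and 3 tiers")

RADIX = 64                      # assumed switch port count
two_tier = max_hosts(RADIX, 2)  # 2048 ports -- not enough for ~4,000 GPUs
three_tier = max_hosts(RADIX, 3)  # 65536 ports -- plenty of headroom
```

Under this assumption a two-tier fabric caps out below the 4,092 GPUs in the post, which is exactly why the cluster needed the third tier.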
We knew that these kinds of details would be super important and that getting down to the level of the hardware and like having these good scripts and everything was going to be a core part of like actually getting this to work. I'm very glad that we did that. I don't think that most companies kind of take that full stack approach, but for us, it certainly paid off.SWYX [00:14:12]: Yeah, it's basically sort of built to spec. It's interesting that relationship because you usually, for the rest of us who don't operate at your scale, we take whatever we can get from cloud providers, but you are basically co-designing from the single machine up. And you described that a little bit. Do you want to take us through the process that you described here?JOSH [00:14:27]: Yeah, so for the actual, like the blog post and kind of bringing these machines online.SWYX [00:14:32]: Yeah.JOSH [00:14:32]: So yeah, I think the process, as we have it broken down in the blog post, there's kind of a few different layers. First is like getting the individual machines to work at all and then getting the machines to actually be able to talk to each other. So getting the InfiniBand networking to work and then getting to a point where, you know, not just the machines are working and they can talk to each other, but everything is actually working correctly. There's a big gap between like it's working at all to it's working perfectly correctly. And then after you have all this stuff working perfectly correctly, nice and healthy, then now you get into kind of the software data, like training issues. And then after that, you're still not done. Like now, even once you're training at full speed, things are going to fail over time. Things are going to change. There's going to be new, you know, firmware updates. 
Like how do you kind of deal with this change and flux over time without going crazySWYX [00:15:16]: and pulling your hair out,JOSH [00:15:16]: trying to like reproduce things or understand why there were regressions. And so there's a lot of work to kind of automate the infrastructure tooling as well. And kind of the first step, like bringing these things online in the first place, you know, you have hundreds of machines at this point. So you don't necessarily want to be like walking around with like a CD-ROM or a USB drive, like plugging it in with your keyboard, like hitting next, next, next on the OS install. That's not how this works. You do that for one machine. And then you use, we use this thing called Metal as a Service to bring up all the other machines. So it's a kind of server that can kind of install the operating system on these other machines. So most like when you're talking about these machines, like each machine is, you know, on the order of hundreds of thousands of dollars. So they usually come with a kind of out-of-band management interface as well. So they don't, they have their InfiniBand networking. They have their normal 100 gigabit per second Ethernet networking. These are like dual, redundant, et cetera. And then you also have this extra out-of-band management network. So you can log in and you can see like the boot screen or you can see the blue screen of death. You can like get in there and actually see what was wrong, which is pretty fun. And it makes it like possible to automate a lot of this work. So the beginning of that, and the blog post goes into much more detail about like exactly how we set these up and kind of the other errors that we ran into. When you're bringing these online, you'll definitely have failures. Even if they all worked in the factory, they get shipped, some parts come loose, something fails, something goes wrong. So when you're bringing them online, there'll be some that don't quite work for all sorts of reasons. 
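With hundreds of machines, even the "did it come up at all?" question has to be automated. A hypothetical fan-out poller along those lines; in practice `check` would SSH in or hit the out-of-band management interface, and here it is just a stub:

```python
from concurrent.futures import ThreadPoolExecutor

def poll_hosts(hosts, check, max_workers=64):
    """Fan a health check out across many machines and collect the failures.

    `check(host)` returns True if the node looks healthy.  ThreadPoolExecutor
    keeps the wall-clock time close to the slowest probe, not the sum.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = dict(zip(hosts, pool.map(check, hosts)))
    return [h for h, healthy in results.items() if not healthy]

# Stub check standing in for a real SSH/IPMI probe
hosts = [f"node{i:03d}" for i in range(500)]
bad = poll_hosts(hosts, lambda h: h != "node013")
```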
As you start to be working with machines at this scale, like if something happens one in a thousand times, you're like pretty likely to see it. And so you can get pretty rare, weird things, especially since we had fairly early builds and fairly early versions of this hardware. Like these are some of the like first machines that were ever produced, some of the first GPUs. So you've got some extra special things there. We definitely worked with Dell, for example, on making fixes in the firmware level to be like, okay, like this thing is wrong. Like we need to update this at the firmware to like actually fix this particular thing. So we worked pretty closely with Dell and Nvidia. Yeah, that's what I'm saying. Like this stuff gets complicated. And the thing is like, you know, taking a step back, the whole reason we're doing this, right, is that we knew that this was going to be complicated. There would be these kinds of failures. And if we're just using, you know, AWS or some other cloud provider, these errors are still gonna be there and you're gonna have no way to know and no way to debug this and no way to diagnose what's going wrong. And so we would much rather be able to like call up Dell and say, hey, this isn't working. And they're like, yep, okay, cool. Let's debug it together. Oh, I see. Yeah, cool. We'll ship a firmware update and actually fix this for you. That was a much better experience than like, great, just magically fails. I guess we restart and hope that that machine goes away. Like that's not a very good place to be. So yeah, that's kind of the first place is getting to a place where like GPU training is working on your single node machines. You can observe stuff. 
We have tons of tooling around like, you know, Prometheus and all sorts of other tools for understanding what's going on in these machines because you don't want to be like logging into each one and looking at the temperature or something; you really need to have tooling to collect all these metrics, et cetera. Unfortunately, all of the scripts that we have for this are like for this entire cluster and for all this infrastructure are a little bit like special purpose for our particular thing. So it's not that every script that we have, it's not that you can just like take this and plug this in. Even if we did open source all the tooling that we have, you'd still have to do like a lot of work to adapt it. What we are releasing is as many of the things that we can that are going to be useful for other people. You're still going to have to have some way of kind of managing these things, making your own like logging aggregators, et cetera, et cetera. So that's kind of bringing them up to the like, you know, the single nodes that are working. From there, it goes into, I'm happy to keep going if you want. Well, I just want to leave the opportunity for JohnSWYX [00:18:53]: to comment if there's anything that's different from how he runs things.JONATHAN [00:18:57]: Oh, I mean, all I'll say is I'll endorse this and say this s**t is hard. Like this is really, really hard. And, you know, special props to, you know, the folks at Imbue because they were building this from the ground up. You know, at Databricks and at Mosaic, we typically work with cloud providers because some of this stuff is just, there's too much to handle. It's complicated. There's a lot to deal with. And this doesn't even get into things like physical security, you know, securing power if you're the data center operator. Like this gets infinitely complicated and you have to abstract somewhere. 
Like, you know, and then you get to the folks who are literally building their own custom chips and like, good God.SWYX [00:19:36]: Like, oh my God, that's, you know,JONATHAN [00:19:38]: if you're one of those folks, you're having, you know, pour one out for the infra people at some of the AI chip startups who are having a really, really interesting time right now. But this stuff is really hard. And I don't think we talk about it much because there's so many other things that are hard. But the other hard things, I think everybody's becoming pretty familiar with at this point. This is something that I don't think there's ever really been a comprehensive discussion of, at least not that I've seen.SWYX [00:20:00]: Yeah, so my impression is that you guys, Mosaic, have your own software for sort of spinning up and down machines, just like Imbue had to build. But Imbue probably, it sounds like Imbue, you guys went fuller stack. I don't know how to describe it. Like Mosaic is not working with Dell on like their firmware.JONATHAN [00:20:21]: No, no, we're typically working with like, you know, pick your cloud provider on their Dell firmware or what have you. Like, it's kind of, I think one of the things, I don't know, Josh, you can correct me on this. It's kind of impossible if you're doing training to not go all the way through the entire stack, regardless of what happens. Like somehow I'm still chatting with cloud providers about power contracts, even though the whole point of dealing with the cloud provider is not to have to think about power contracts. Somehow I'm still asking them about which InfiniBand provider they used this time to see if this is part of the bad batch of cables I encountered on that cloud provider or what have you. Or like, we're still talking about a firmware update from pick your provider. You can't not do this. 
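On the metrics side Josh mentioned, Prometheus scrapes a plain-text format that is simple enough to emit yourself, which is part of why it shows up in stacks like these. A stdlib-only sketch of rendering it (illustrative only; a real deployment runs a node exporter and DCGM-style GPU exporters, and the metric names here are made up):

```python
def render_metrics(samples):
    """Render samples into the Prometheus text exposition format.

    `samples` is an iterable of (metric_name, labels_dict, value) tuples;
    output lines look like: name{label="value",...} <number>
    """
    lines = []
    for name, labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"

body = render_metrics([
    ("gpu_temperature_celsius", {"gpu": "0", "host": "node001"}, 64),
    ("gpu_temperature_celsius", {"gpu": "1", "host": "node001"}, 71),
])
# Serve `body` at /metrics over HTTP and Prometheus can scrape it.
```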
It's convenient that they have data center staff who are worrying about what to send back to which provider when, and they have people who can go and wait for the InfiniBand cables so they don't get stolen outside. But, you know, it's kind of, it's impossible not to really go full stack if you're thinking about the infrastructure at all. I don't know, Josh, correct me. No, I think that's right.JOSH [00:21:17]: That's what we expected from the beginning as well, is that we would inevitably have to get into the details here. And I'm glad that we kind of just planned for it. I think it made it a lot easier from our perspective to have direct control over this. Instead of having to go to the cloud provider that goes to the data center, that goes to the supplier, we could just go direct to NVIDIA or DellSWYX [00:21:37]: or the data center,JOSH [00:21:37]: whoever was responsible and be like, hey, this thing needs to change. And they're like, oh, okay. Yeah, that is our responsibility. Great, we can fix that. So it was just a lot easier for us to fix these bugs than if we had to go through an extra layer of email.SWYX [00:21:48]: Something we discussed in the pre-show was that you had a rule of thumb for your cluster of reliability. You say here in the post, by and large, you expect around 3% of your machines to break every week. So you're basically going to turn through all your machines in a year.JOSH [00:22:04]: As it says in the post. So that would be true if it was a uniform failure like that. But as it says in the post, it's usually these kind of problematic nodes. And to be clear, that is the number that we've heard from other people is like they're having about 3%. I don't think we're experiencing failure rates that are that high. 
I think ours is actually quite a bit lower than that, probably because we've taken the time to like dig into a large, maybe larger number than we should have of these failures and get to the root cause of it and be like, oh, okay, like that's exactly what's going wrong.SWYX [00:22:33]: How do we fix this?JOSH [00:22:33]: How do we prevent this from happening? How do we make automated checks for this so that if it does happen, it just goes back to whoever owns that particular part of the process and they can fix it immediately.SWYX [00:22:43]: And that's part of what you're also open sourcing, which is the health checks, right? You got the NIC health checks, GPU health check, disk space health check, Docker, dmesg. I don't know what that is.JOSH [00:22:52]: That one is just a lot of stuff.SWYX [00:22:54]: Yeah.JOSH [00:22:55]: That one is one where we realized that actually like when these machines boot, sometimes they wouldn't actually boot cleanly all the way. Or when they rebooted, they had problems that they didn't have when they were working before, which was kind of frustrating. Like usually if you restart your computer,SWYX [00:23:08]: it gets better.JOSH [00:23:08]: Here you restart. It did not get better.SWYX [00:23:10]: It got worse.JOSH [00:23:10]: That was very frustrating. So this health check looks at every particular line we've ever seen from the boot, like in dmesg, like every single log line that your computer emitsSWYX [00:23:21]: and says like,JOSH [00:23:21]: have we ever seen this before?SWYX [00:23:23]: Is this expected?JOSH [00:23:23]: Is this in the right order? Or is there something out of place? If there's anything out of place, let me say, okay, great. Like now it goes into this, like longer, more triage list of like, all right, great. Like, is this acceptable?SWYX [00:23:33]: Should we flag this?JOSH [00:23:33]: Like, should someone take a look at this? 
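The dmesg health check Josh describes, comparing every boot log line against lines seen on healthy boots and surfacing anything novel, can be sketched like this. The normalization rules and the sample log lines are my own assumptions for illustration, not Imbue's actual patterns:

```python
import re

# Things that legitimately vary run-to-run get normalized away before comparison
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")
HEXADDR = re.compile(r"0x[0-9a-f]+")

def normalize(line):
    """Strip boot timestamps and addresses so logically identical lines compare equal."""
    return HEXADDR.sub("0xX", TIMESTAMP.sub("", line.strip()))

def novel_lines(dmesg_output, known_good):
    """Return lines never seen on a healthy boot, for a human to triage."""
    known = {normalize(l) for l in known_good}
    return [l for l in dmesg_output if normalize(l) not in known]

healthy = [
    "[    0.000000] Linux version 5.15.0",
    "[    1.234567] pci 0000:00:01.0: enabled at 0x1000",
]
current = [
    "[    0.000000] Linux version 5.15.0",
    "[    1.240000] pci 0000:00:01.0: enabled at 0x2000",
    "[    2.000000] NVRM: Xid (PCI:0000:3b:00): 79, GPU has fallen off the bus.",
]
suspicious = novel_lines(current, healthy)  # only the Xid line survives normalization
```

The real check also cares about ordering and maintains a triage list of "seen but acceptable" lines; this sketch only covers the novelty part.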
So we're looking down at a very, very granular detail level, what's happening on these computers to make sure that nothing is out of place. And that's critical because without that, if you're running your training, as Jonathan said, and this thing is slow, like what are you supposed to do? Right?SWYX [00:23:49]: Like you really,JOSH [00:23:49]: you really want to be very certain that like all 4,000 of these GPUs are working like they're supposed to.SWYX [00:23:54]: We know that.JOSH [00:23:54]: And so if it's slow, it's because like we messed up the config or something else and not because of this earlier thing that's like really hard to detect in software later.JONATHAN [00:24:01]: Yeah. I think the, I'm just curious to ask,SWYX [00:24:03]: like, you know,JONATHAN [00:24:03]: suppose you were to set up another, let's say another H100 cluster and it were at a different data center. And instead of the vendor being Dell, it was super micro or what have you. How much of this would be repeatable? And how much of this would you have to redo? I, you know, I genuinely don't know.SWYX [00:24:18]: A decent amount.JOSH [00:24:19]: I think it would go a lot faster the second time. I think there's lots of learnings that we had. And also the blog post,SWYX [00:24:24]: you know, yes,JOSH [00:24:24]: we are releasing the health checks, releasing some scripts, but a lot of the valuable stuff is also in the blog post itself, in the details and kind of the, you know, the learnings that we've had and the sort of errors that we run into. We tried to as much as possible surface those to other peopleSWYX [00:24:36]: could learn from thoseJOSH [00:24:36]: and avoid the same mistakes or failures as well. But I think it would go a lot faster.SWYX [00:24:41]: Although, yes,JOSH [00:24:41]: there would certainly be some things that'd be a little bit different. 
I mean, there'd probably be different CPUsSWYX [00:24:46]: or whatever,JOSH [00:24:46]: but I think a lot of that stuff is less,SWYX [00:24:49]: it's less,JOSH [00:24:49]: that's the like, that's less variable. I think most of it would apply the second time around. Although I'm sure next timeSWYX [00:24:56]: we're building one,JOSH [00:24:56]: it'll probably be, you know, at a scale that's 10x as big with a different chip or something like this.SWYX [00:25:00]: And then who knows?JOSH [00:25:01]: Yeah, with ConnectX-8,JONATHAN [00:25:02]: that will have its own fun behavior and all that good stuff. Yeah.SWYX [00:25:06]: Perhaps there's something that people don't discuss about, and you don't even talk about this in the blog, but I always wonder is what is the timeline that's like kind of reasonable for this amount of work, at least the initial stages? And also what does the team composition look like for setting up a cluster, right? Like what are the mix of skills that you typically would require to get all this going?JOSH [00:25:27]: I'm, I can't really speak to typical. One thing I am very proud of is how much we accomplished with such a ridiculously small team. Like our infrastructure team is like, you know, fluctuates from week to week, depending on like how many things are on fire and how much we need to build. But it's like between like three and six people, like it's small. It's not like some huge team of like tons and tons of engineers. But those people are very, very good at what they do. And so that has allowed us to get a lot of mileage out of out of these things. I think it's not that we're building everything, right? It's not that three to six people build this whole thing. I definitely want to like, you know, say thanks very much to Dell and H5 and NVIDIA and the other people that have done a lot of the work, like to bring up this cluster, you know, with 4000 GPUs and three-tier networking architecture, you have 12,000 cables. 
So that's 24,000 things that need to be plugged in. Like that's just a lot of stuff to plug in, right? And you don't want to mess it up. Like each one needs to be done correctly. Like if it's a little bit loose, it doesn't really work.SWYX [00:26:23]: If you break it,JOSH [00:26:23]: you need to replace it. Like there's a lot of workSWYX [00:26:26]: that goes into this.JOSH [00:26:27]: Yeah.SWYX [00:26:28]: And then, you know,JOSH [00:26:28]: that's just like that's it. That's if you were to do everything right the first time.SWYX [00:26:32]: And if you didn'tJOSH [00:26:32]: have to fix anything. But inevitably, you know, you will have to replace something, which means like taking all the wires out, pulling the thing out, taking all the GPUs out, going and fixing some cable, putting it all back correctly, putting it back in, doing this every time. So there were a lot of people at Dell, NVIDIA and at H5 that all helped a ton with this stuff. I don't know the exact size of the Dell team. It also fluctuated over time.SWYX [00:26:55]: Yeah, excellent. And then, you know, you so you have all the hardware set up and now you're firing it up for a single node. There's a long description that you guys have about just like monitoring the MFU, right? And what each situation might be indicative of. One of the most interesting things to me that I saw from here is like, you know, if training immediately starts off at 60 to 80% MFU, something's wrong.SWYX [00:27:24]: But like, you know, like what what are like, you know, some anecdotes or, you know, notable scenarios here that you might you might call out as maybe counterintuitive or super interesting.JOSH [00:27:36]: There's just so many of them. I mean, one of them, which I think is probably pretty common, like common knowledge by this point. But like we did have a sort of likeSWYX [00:27:46]: which one was this exactly?JOSH [00:27:47]: I think for the MFU, like gradually getting worse over time. 
I think that one, when we saw that the first time we were like, what the heck is going on? Like, why does it get just like a little bit worse? This is so strange. Like, what is it getting lazy or tired or something? Like, is it heat? Like what's going on? And in this particular case, it was memory fragmentation. Because you have hundreds of machines, they're doing garbage collection slightly different times. And then they get slightly further apart and slightly more and more jittered until eventually they're all happening kind of at random times. And just like really messing up each one of your steps. So you just turn off garbage collection and call it a day, basically,SWYX [00:28:20]: to be honest.JOSH [00:28:20]: There's other things you can do if you want to be a little bit more sophisticated about it. But you can also just manuallyJONATHAN [00:28:25]: have it all garbage collect on some interval. Like that's what we've done. We just have a garbage collection callback that just runs. But I've seen the exact same thing.JOSH [00:28:33]: Yeah, yeah, exactly. So I thought that one was kind of funny. And we did trace that one down and look and we did find the actual call. Like, again, this goes to like having good tools. So we had really good tools where we could look at a bunch of like actual traces in C and be like, OK, cool. This is the thing that's taking a lot of time. Or like, you know, this is the thing that doesn't quite line up here. Like, oh, I guess it's garbage collection. OK, cool.SWYX [00:28:52]: Interesting.JOSH [00:28:52]: Yeah, let's just try taking it off.SWYX [00:28:54]: OK, great.JOSH [00:28:54]: That's what it was. Now we can fix it. So for each of them, like basically bugs are not hard if you have good tools. But if you don't have good tools, bugs can be very, very hard. So similarly for like heat, another thing that we saw was like, oh, you know, the CPU is getting throttled. 
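The garbage-collection fix both of them describe, disabling automatic collection and running it on a fixed step interval so that every rank pauses at the same time instead of jittering apart, is only a few lines in Python (a minimal sketch, not their actual callback):

```python
import gc

def gc_every_n_steps(step, interval=100):
    """Collect at a fixed step interval so all ranks pause together.

    With automatic GC left on, hundreds of workers collect at slightly
    different times, and the jitter compounds into stragglers on every step.
    """
    if step % interval == 0:
        gc.collect()

gc.disable()              # stop per-process automatic collection
for step in range(1, 1001):
    # train_step() would go here on a real run
    gc_every_n_steps(step)
gc.enable()
```

The synchronized pause costs the same on every rank, so it no longer turns one slow worker into a straggler for the whole cluster.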
OK, well, it's easy to see if you're monitoring the CPU throttling or monitoring the heat. If you're not monitoring that, it's really hard to know why it's just suddenly one of them is going slower. I noticed also in the pieceSWYX [00:29:17]: that you mentioned FSDP with ZeRO-3. Actually, we met, I went to ICLR and Guanhua from the DeepSpeed team was there presenting ZeRO++. I was wondering if you want to make any call outs to, you know, particular open source or open library or open whatever implementation teams that were super helpful in your process. I think we ended up actuallyJOSH [00:29:39]: pulling from a whole bunch of different ones to pull things in into our own particular pipeline. So we use things from NVIDIA's, you know, Megatron stuff. We use stuff from probably DeepSpeed. I think we pulled in a bunch of different pieces from a bunch of different places. So it was really nice to see all these working open source like examples. I think I really appreciate all the effort that has gone into actually tuning these things because you can tune them, but it's a lot of work to like tune this stuff and do all this stuff from scratch. It's really nice to have like a working example. I think those are probably the two biggest ones, DeepSpeed and Megatron alone, but there are probably other ones as well.SWYX [00:30:13]: Is there a particular thing in the ecosystem where you would call out as like, you know, there should be something here that is open source, but like it's not really, it's like everyone kind of builds it on their own. I want to say something with the file system because everyone talks about the file system eventually.JOSH [00:30:28]: The file system actually was,SWYX [00:30:30]: I mean, we did somethingJOSH [00:30:31]: kind of dumb there. 
Like we have our own sort of local mirror so that we can, you know, like a crappy version of S3SWYX [00:30:38]: that's local,JOSH [00:30:38]: but it's just a pretty simple script, right?SWYX [00:30:41]: Like I think we run likeJOSH [00:30:41]: a little web server that just like serves files and then, you know, it can upload themSWYX [00:30:45]: and download them.JOSH [00:30:45]: Okay, great. And part of the reason we did that is that our internet connectionSWYX [00:30:50]: in the beginningJOSH [00:30:50]: was not the like full speedSWYX [00:30:52]: one that we wouldJOSH [00:30:52]: eventually have. And so we are a little bit more kind of bottlenecked in terms of internet bandwidth. And so we had this. I think we looked at a bunch of services out there like Minio and some other ones, but a lot of these like come with a lot of extra overhead and maintenance. And since we already have so much infrastructureSWYX [00:31:09]: to deal with,JOSH [00:31:09]: we kind of didn't want to, you know, bring in a whole other like cloud provider, virtualize something, something.SWYX [00:31:14]: We just wanted something simple.JOSH [00:31:14]: So we went with that, which has been quite helpful. Like our toolsSWYX [00:31:19]: are usually quite simple.JOSH [00:31:19]: It's like Bash and Python and SSH and Docker. Like we'd like to keep things simple so that's easier to debug, like less layers of infrastructure, less layers of abstraction, make it a lot easier to work with. Like we don't use Kubernetes,SWYX [00:31:30]: for example,JOSH [00:31:30]: and we just directly launch these things. And it's just been much easier to debug this way. One tool actually that does come into mind that I will call out is Kraken from Uber. That was great. We love that tool. We were a little bit skeptical. What is it?SWYX [00:31:44]: I'm sorry. 
Yeah.JOSH [00:31:45]: So Kraken is this, yeah, it's a distributed like Docker registry, basically, that uses BitTorrent to like transfer things between the machines in a sort of nice optimal way. Like in the very beginning, the naive way is like you have this one Docker registry, which was outside of the cluster. So every time we change an image, you know, there's many gigabytes that each of the 500 machines needs to download.SWYX [00:32:07]: So that just takesJOSH [00:32:07]: a really long time. So what this thing does is like just one of them downloads it and then like they all sort of broadcast all the pieces to each other. And it was just like a really nice, fast way of getting these images down. And it was very robust.SWYX [00:32:19]: Like there's a lotJOSH [00:32:19]: going on under the hood, but I think it's a pretty cool tool that we haven't really had any bugs with it at all. Amazing.SWYX [00:32:26]: Yeah. I mean, that's all my questions, I guess, for the info piece. I don't know if, John, you had something that you were sort of burning to ask or.JONATHAN [00:32:33]: No, all I can say is just sameSWYX [00:32:36]: in a lot of places, like, you know, and they're done thatJONATHAN [00:32:38]: seeing this plus one. I think the one big difference, you know, perhaps in philosophies is we've tried to basically standardize on as much commodity stuff as possible, just because, you know, I think the reason I asked about trying to do thisSWYX [00:32:50]: on multiple differentJONATHAN [00:32:50]: pieces of infrastructure is like, I think we're running on like six or seven different clouds right now. And everybody has done something slightly different. And my gosh, the little differences add up as you know, you've seen. And so, you know,SWYX [00:33:04]: our philosophy has been like, whatever the hellJONATHAN [00:33:05]: we can standardize, please let's standardize it. 
Like vanilla off the shelf FSDP.SWYX [00:33:10]: And like, you know,JONATHAN [00:33:10]: we wrote our own data loader, but we've tried to make that as much of a standard as we can across our infrastructure and in Databricks, because things just start getting really complicatedSWYX [00:33:18]: or like we useJONATHAN [00:33:18]: Kubernetes extensively because it at least gives us a uniform set of APIs. Like that's our hardware abstraction layer to a certain extent for everything else. So it's just, you know, a difference in philosophy there. But otherwise, like, yeah, this stuff is really, really hard. And I feel like we take for granted how much of this, you know, is done for us when you go and you just query ChatGPT, for example. Like, oh my God, everything going on underneath that, you know, it's kind of a miracle that the machines boot up, let alone that you can like query a giant language model that's probably doing inference across multiple machines and was trained across thousands of machines. Like, you know, minor miracle.SWYX [00:33:54]: Yeah, it is an awesome amount of power that we invoke with a single API call that we take for granted these days. It's absurd. Yeah, I mean, like Kubernetes, like that point about Kubernetes, I will say as a former AWS employee, like it seems like it would be ideal for Imbue to at some point make it more abstracted or agnostic because you're going to want to, you know, replicate your setup. We do have our ownJOSH [00:34:19]: sort of replacement. It's just a much simpler version of Kubernetes. Kubernetes is really designed for running services, not for running experiments. Like that's not its like main architecture. And so for us, like we have everything that's like, cool, you're going to run an experiment. So you want it to run to completion, right?SWYX [00:34:34]: OK, great.JOSH [00:34:34]: Like the primitives are sort of built around a slightly different style.
And that makes it a lot easier, like just a lot simpler to fit that the nature of like these machines are going to disappear. They will need to be rebooted for infrastructure upgrades. They will like something will happen to the GPUs. Failure is like baked into this as like a core part of our infrastructure. So it's not that we don't have an abstraction. It's that it's a sort of simpler, more tailored abstraction for the particular work that we're doing.JONATHAN [00:34:58]: Yeah, I think it all depends on what your goals are. And like, I think the challenge in a lot of the deep learning stuff right now is that people are trying to like, people often build things that are more complicated than necessary to get the job done. And the complication is the enemy of everything. You know, don't use a fancier parallelism strategy than you have to. Don't use a fancier set of libraries than you have to.SWYX [00:35:18]: Don't do anythingJONATHAN [00:35:18]: that you don't have to do because it's hard enough as it is. Like, don't overcomplicateSWYX [00:35:23]: your own life.JONATHAN [00:35:23]: Don't try to bring in more tools or more fancy architecture tweaks if you absolutely don't have to.SWYX [00:35:29]: Like getting to the minimumJONATHAN [00:35:30]: necessary to get the job done. And it's really tempting to want to try to use everything. So like, I totally understand that one.SWYX [00:35:37]: I think the last piece I'll maybe call out is that I'm just going to weave this in just because I see the opportunity to do it. Are there any infrastructure shifts that need to be, that need to rise because of changing architecture? So I think, for example,SWYX [00:35:57]: you're announcing a dense model, a 70B dense model, whereas John just worked on DBRX and the image-to-text model, which presumably has different bottlenecks.JONATHAN [00:36:10]: That's correct for us. You know, we train both dense and mixture of expert models. 
The one we happened to, you know, kind of get permission to open source was a mixture of expert model. And those models are very demanding when it comes to network bandwidth, at least if you're training them in kind of FSDP ZeRO-3 style, where there's just a lot of parameters getting shuffled back and forth. And your ratio of kind of compute to amount of data that you have to shuffle back and forth becomes a lot worse because you're now, you know, you're only using a fraction of the parameters for every token instead of all the parameters. And so we had to really push the envelope on getting all the stuff to the right places on time. And so actually the networking part of DBRX was the single hardest thing, I think, of the entire process. Just getting MoE training working at scale across a big cluster. We still managed to, I think, do it all with commodity parts, which was very exciting. You know, we were using FSDP and we eventually used HSDP so that we could have HSDP as a version of FSDP where you have multiple smaller replicas and you're doing data parallel within those replicas. And that helped a lot with network latency issues that we were running into just because we were transmitting so much data, you know, for every single part of the process. I think it actually, like, it was instructive for how Google designs their hardware and software together personally. They're training, as far as I understand, using kind of a ZeRO-3 style of training and have been for a while. They also train mixture of expert models. TPUs have a very different network bandwidth to compute ratio. They have a lot more bandwidth just objectively. And TPUs per chip tend to be a little bit less compute intensive and have a little bit less memory. You know, it's just a different design choice. So the ratio of flops to bandwidth is very different.
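Jonathan's point about the compute-to-communication ratio getting worse for MoE can be put into rough numbers. Below is a minimal sketch, assuming a simple model of ~2 FLOPs per active parameter per token and DBRX-like parameter counts (132B total, 36B active); this is illustrative arithmetic only, not an actual cost model of FSDP traffic:

```python
def flops_per_byte_moved(total_params, active_params, bytes_per_param=2):
    """Rough forward-pass FLOPs per byte of parameters gathered, per token.

    In ZeRO-3 / FSDP-style sharded training, every parameter shard gets
    gathered over the network regardless of whether it is used, but an MoE
    layer only spends compute on its active parameters. (Illustrative model
    only; real traffic also involves gradients, precision, topology, etc.)
    """
    flops = 2 * active_params                      # ~2 FLOPs per active param per token
    bytes_moved = total_params * bytes_per_param   # all params gathered (e.g. bf16)
    return flops / bytes_moved

# Dense model: every parameter is active for every token
dense_ratio = flops_per_byte_moved(70e9, 70e9)
# MoE with DBRX-like counts: same gather traffic per parameter, far less compute
moe_ratio = flops_per_byte_moved(132e9, 36e9)
# moe_ratio < dense_ratio: the network works much harder per unit of compute
```

Under this toy model the MoE run gets roughly a quarter of the dense model's compute per byte shuffled, which is the "ratio becomes a lot worse" effect described above.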
And that means that it's much easier for Google to be able to pull offSWYX [00:37:54]: some of this stuff.JONATHAN [00:37:54]: They also have interesting, you know, Torus style network architecture or Torus style, like, literal network architectureSWYX [00:38:00]: is not like the model,JONATHAN [00:38:00]: but the network.SWYX [00:38:02]: Is this the sort of block attention? I forgot what you call it. So this is just more or the,JONATHAN [00:38:07]: yeah, this is more, not the ring attention, but these are the ring all reduces. Like you have three different dimensions of rings because they kind of put you in these three dimensional Toruses from what I understand. And so like, you know, Google's infrastructure in some sense is kind of, I wouldn't say built for this, but maybe the way that Google trains models is built for a slightly different bit of infrastructure they have. And it's kind of neat to think about that. You know, as one thing that I think NVIDIA announced for, you know, for, for both the GH200 and the GB200 is this hybrid networking where you'll have blocks of NVLink network chips. I think for the GB200, I think it's like groups of 72 GPUs will all have NVLink to each other. So higher bandwidth, then you'll have normal networking of some kind, InfiniBand or RoCE or what have you between these blocks. And that's kind of a, you know, it's a change due to the fact that, you know, it's hard to build really high bandwidth networks over very large groups, but it is now a blocked networking. And you have to think about how you architect your model and your parallelism differently. You also have to think about fault tolerance differently because it now matters where you lose a GPU, whereas it didn't before.
So, you know, it's, it's, it's just all really interesting and really fun speaking personally, but it's going to mean new nightmares when we all move to that generation and have to think about, you know, new versions of these problems.JOSH [00:39:20]: As you go up to larger scales, it gets quite different. Like right now, you know, if you're experiencing, let's say, for example, you experience a GPU failure every day, that's fine.SWYX [00:39:31]: Just restart.JOSH [00:39:31]: If you make your thing 24 times as big, now it's once an hour. Now it stops being quite as easy to just restart, right? So now you have to kind of break, like bake in this sort of redundancy that you didn't have before. So I think as you go up in scale, you end up running into like a lot of really interesting problems that also inform the, the actual like design. Yeah, I mean, as an orchestration guy,SWYX [00:39:52]: this is why I always emphasize like very cheap storage or very fast storage. So you can checkpoint more, but that's probably not the best solution for fast, you know, training.JONATHAN [00:40:05]: Which works fine when you're doing language and then you move to vision or video. And then, you know, you have multi petabyte datasetsSWYX [00:40:12]: and getting, you know,JONATHAN [00:40:13]: cheap, fast multi petabyte storage starts to bite. Like I've certainly encountered issues where the literal data center where my GPUs were did not have enough, you know, object store to fit the datasets that people wanted to bring into that data center from whichever users were, were trying to bring them in. And then you get to a wholeSWYX [00:40:31]: different world of hurtJONATHAN [00:40:31]: where you have to keep your data in a different region because the region is just out of storage. So things get fun really fast.SWYX [00:40:39]: Speaking of vision, Josh, actually, you know, Imbue is an agents company, but you're only, you're announcing a text-only model.
What, where does, where does the vision side come in?JOSH [00:40:49]: I think we've actually done a lot of work in the past and people can see kind of our blog posts about sort of self-supervised learning and some other kind of vision-related stuff in the past as well. So we're very familiar with, with that stuff. But I think our main focus right now is on kind of, as we say, coding and reasoning. And there, there's certainly a visual component to some problems. But, you know, it's not necessarily required for all problems. And actually we found that for most of the kind of like code writing and, and reasoning problems that we care about, the visual part isn't really a huge important part of it. Sometimes if you really need to, you can maybe describeSWYX [00:41:24]: the thing.JOSH [00:41:24]: There are other like, you know, multimodal models that you can use off the shelf to sort of plug in for those particular piecesSWYX [00:41:30]: that you need, right?JOSH [00:41:30]: Like if something is driving a browser or whatever, like you can sometimes get away with not having to have that baked into the original model. So our focus, you know, in a sense, we kind of do a lot across the stack. We're working on our own infrastructure and pre-training and RL and fine tuning and products and everything. But in another sense, we're very narrowly focused on the application side. So all of the stuff across the stack is kind of going toward a very particular purpose. And so that particular purpose right now doesn't really need vision. So we think that people are going to make all sorts of really cool image modelsSWYX [00:42:00]: like Jonathan, right?JOSH [00:42:00]: And all sorts of interesting multimodal models into the future. We'll let them go do that. That's great. We'll take advantage of that, partner with those people in the future.
And right now we're really focused on kind of the core reasoning and coding capabilities and aspects of the model.SWYX [00:42:14]: I wanted to go into carbs since that's kind of the next layer of the stack. We talked about carbs in the first episode with Kanjun because you've actually had a blog post about it like a couple of years ago. Maybe let's introduce it.JONATHAN [00:42:26]: Has that been a couple of years now?JOSH [00:42:28]: No, it must have been at least one year. Hopefully it's not multiple years.SWYX [00:42:32]: Sorry, I'm counting AI time. Yeah, yeah. Yeah, I was going to sayJONATHAN [00:42:35]: you're making me feel really old right now.SWYX [00:42:39]: I count everything before the generally intelligent rename as like, you know, prehistory. Yeah. And now sort of modernity, right? So I actually thought carbs was more about hyperparameter optimization in a sense of like sort of parameters, hyperparameter search. Whereas, you know, when you introduced it, especially in this blog post, it's more about scaling laws and predictability of like, are we sort of in the right ballpark before we scale things up? Maybe sort of recount the history of carbs.JOSH [00:43:10]: Yeah, so it really is a little bit of both. So carbs is, it's maybe a backronym, but it's for cost aware Pareto region Bayesian search. So this is about technically how it works, but carbs is like, you know, we like pastries and stuff.SWYX [00:43:26]: So great, why not? But the point is thatJOSH [00:43:29]: it's a cost aware hyperparameter tuner. So most hyperparameter tuners, you kind of say, OK, here's this objective function. I want you to make this number as big as possible or as small as possible, whichever direction you want to go. So yeah, just go make this number, you know, as small as possible.
OK, so it'll try a bunch of differentSWYX [00:43:46]: hyperparameters,JOSH [00:43:46]: a bunch of different configurationsSWYX [00:43:48]: to figure out, like,JOSH [00:43:48]: how do I tweak your network and architecture, et cetera, to get the kind of best performance I possibly can. That's usually saying, like, you know, almost all of these hyperparameter configurations are, let's say they're all going to use the same number of GPUs or the same number of nodes.SWYX [00:44:01]: So it's going to runJOSH [00:44:01]: for the same amount of time.SWYX [00:44:03]: So you can do that.JOSH [00:44:03]: You can get a number out and that's great. But what carbs does is it says,SWYX [00:44:07]: OK, actually,JOSH [00:44:07]: what if we relax that constraint? What if we say each of these different points, we're going to model how expensive it will be to sample this configuration. So if what if we train with just one one hundredth of the data? Like, how well can we do?SWYX [00:44:19]: What if we trainJOSH [00:44:19]: with one tenth of the data? What if we train with all the data? That way you can understand, like, as we get more and more data, as we spend more and more compute,SWYX [00:44:26]: as we make a biggerJOSH [00:44:26]: and bigger network, how does performance change with these things that change? Like how expensive it is to even explore this data point. So by doing that, we can see the scaling laws for not just, you know,SWYX [00:44:36]: the scaling lawsJOSH [00:44:36]: from like the, you know, Chinchilla paper, the scaling laws for all parameters. We can see how does how does the number of layers change with this? How does the, you know, the learning rate change? How do the like, you know, various types of regularization change? So you can see these nice scaling laws. And as you're going across costs, like how should this be changing as you're scaling up your model?
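The scaling-law view Josh just described — fitting how performance changes as you spend more compute, then checking whether the big run lands where the cheap runs predicted — can be illustrated with a much simpler tool than carbs: a least-squares power-law fit in log-log space. This is a minimal sketch with made-up numbers, not Imbue's actual data or method:

```python
import math

def fit_power_law(compute, loss):
    """Least-squares fit of log(loss) = log(a) - b*log(C), i.e. loss = a * C**-b."""
    xs = [math.log(c) for c in compute]
    ys = [math.log(l) for l in loss]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return math.exp(my - slope * mx), -slope  # (a, b)

def predict_loss(a, b, compute):
    """Extrapolate the fitted curve to a bigger (more expensive) run."""
    return a * compute ** (-b)

# Hypothetical cheap runs that happen to sit exactly on loss = 10 * C^-0.1
compute = [1e18, 1e19, 1e20]
loss = [10 * c ** (-0.1) for c in compute]
a, b = fit_power_law(compute, loss)
# Before paying for a 100x bigger run, compare its measured loss against
# predict_loss(a, b, 1e22); if they agree, the scaling trend is holding.
```

Carbs goes well beyond this (cost-aware Bayesian search over many hyperparameters at once), but the "is the big run on the curve?" check is the same idea.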
So that, coupled with the kind of metric that we chose, which is a very precise way of measuring performance, allowed us to really like hone in on parameters that worked really wellSWYX [00:45:05]: and understand, like,JOSH [00:45:05]: how do we want to scale those up, especially as we're changingSWYX [00:45:08]: things about the network?JOSH [00:45:08]: Like one of the things that we did is we used a custom tokenizer. As we change this tokenizer, changes a bunch of other things about the model. So how should we scale up this entirely new tokenizer? Like no one has ever made a model this large with this tokenizer before. And so how do we want toSWYX [00:45:22]: change all these things?JOSH [00:45:22]: Carbs kind of shows you, like, look, as you change these parameters, like these other ones are kind of dependent on this.SWYX [00:45:28]: Like this is the, these areJOSH [00:45:28]: the relationships between them. So you can better understand, like, OK, if I'm going to scale this up 10x or 100x, like, where do I want to be? I can only go so far. And so, you know, we did run, like, I think maybe it was like a 14b one or somethingSWYX [00:45:40]: like that to check.JOSH [00:45:41]: But and so we had a bunch of like 1b or 14b and then at 70b. I don't think we had a, I think we just did like one at 14b. So you can, we get to check that like, oh, is this on the curve? Like, is this where we expect? It was like right there. So then great, go on to the next one. Yeah, I mean, that makes a lot of sense.SWYX [00:45:56]: I wonder if, so one of the key questions, and correct me if I'm wrong, but like usually people do search or do their evals just based on loss. But you actually evaluate based on, you know, the sort of end state evals that people might expect, like HellaSwag and LAMBADA, whatever. What is the norm here? Is there a norm?JOSH [00:46:20]: Yeah, I don't know if there's a hundred percent.SWYX [00:46:21]: I don't know.
I only see loss on most people's reports.JOSH [00:46:25]: I think it's easy to, like, loss is very nice because it's very precise. It will tell you, like, very fine grained differences between like really small changes in your hyperparameters or network architecture. Whereas, especially at the smaller scales, if you're looking at like accuracy, it's very noisy. Like it might be zero or a hundred or like, you know, fluctuating by like 10 or 20 percentage points, which makes it really hard to tell, like, did that change actually mean anything? So our loss is sort of a combination of these two. Instead of saying, like, let's just look at perplexity, we say, let's look at perplexity on the tasks that we care about for multiple choice questions effectively.SWYX [00:47:00]: So we're saying like, yes,JOSH [00:47:00]: this is formulated as a multiple choice question, and we're going to look at the, like, you know, the loss of perplexity for this particular answer token. And that ends up being something that's like both targeted to what you actually care about and also very precise. The nice thing about this though is that it's independent of the data that you train on. One thing that's annoying about perplexity or about loss is that as you change your data set, this is really obnoxious because now it fundamentally changes your loss, right? And so you can't tell, like, how do I tweak my data set? But because we have this held out evaluation dataset
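The metric Josh describes — loss restricted to the answer tokens of a multiple-choice question, rather than raw accuracy or corpus-wide perplexity — can be sketched roughly as follows. This is a simplified, hypothetical version that scores one logit per choice; a real eval harness would sum log-probabilities over each answer's full token sequence:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def multiple_choice_loss(choice_logits, correct_index):
    """Cross-entropy over the candidate answers only: precise like a loss,
    targeted at the task like accuracy, and independent of the training set."""
    probs = softmax(choice_logits)
    return -math.log(probs[correct_index])

# Four answer choices; the model strongly prefers the correct one
loss_confident = multiple_choice_loss([4.0, 0.5, 0.2, 0.1], correct_index=0)
# A totally uninformative model scores -ln(1/4) = ln(4), about 1.386 nats
loss_uniform = multiple_choice_loss([0.0, 0.0, 0.0, 0.0], correct_index=0)
```

Unlike accuracy, this number moves smoothly as the model improves, which is what makes small-scale comparisons meaningful; unlike training loss, it is computed on a fixed held-out task, so it does not shift when the training data mix changes.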
Today's guest is Sheri Crawford, Director of Data Governance at Scotiabank. Sheri joins us on the program today to discuss the biggest challenges for data management teams to drive the systems and the infrastructure necessary to capitalize on new data-heavy emerging use cases in generative AI. Throughout the episode, Sheri gives business leaders in financial services and beyond actionable insights into balancing consumer needs with infrastructure changes in the digital transformation process. Today's episode is sponsored by MinIO. Learn how brands work with Emerj and other Emerj Media options at emerj.com/ad1.
Today's guest is Anand Babu Periasamy, Co-founder & Co-CEO of MinIO, Inc. MinIO is a software company that develops High-Performance Object Storage systems that are API compatible with the Amazon S3 cloud storage service. Anand joins us on today's podcast to discuss opportunities for IT and infrastructure leaders to scale AI across the enterprise. Throughout the episode, Anand explains at length what he sees as the critical ingredients for ensuring sustainable growth in infrastructure systems and the advantages of object storage regardless of industrial sector. This episode is sponsored by MinIO. Learn how brands work with Emerj and other Emerj Media options at emerj.com/ad1.
Join Corey Quinn and MinIO's co-founder and CEO, AB Periasamy, for a look into MinIO's strategic approach to integrating open-source contributions with its business objectives amidst the AI evolution. They discuss the effect of AI on data management, highlight the critical role of data replication, and advocate for the adoption of cloud-native architecture. Their conversation examines the insights of data replication, mentioning its pivotal role in ensuring efficient data management and storage. Overall, a recurring theme throughout the episode is the importance of simplifying technology to catalyze a broader understanding and utilization that can remain accessible and beneficial to all.Show Highlights: (00:00) - Intro(03:40) - MinIO's evolution and commitment to simplicity and scalability.(07:25) - The significance of data replication and object storage's versatility.(12:12) - Challenges and innovations in data backup and disaster recovery.(15:21) - Launch of MinIO's Enterprise Object Store and its comprehensive features.(20:50) - Balancing open-source contributions and commercial objectives.(30:32) - AI's growing influence on data storage strategies and MinIO's role.(34:33) - The shift towards software-defined data infrastructure driven by AI and cloud technologies.(39:40) - Resources and the future of tech (43:31) - Closing thoughts About A.B Periasamy:AB Periasamy is the CEO and co-founder of MinIO. One of the leading thinkers and technologists in the open source software movement, AB was a co-founder and CTO of GlusterFS which was acquired by RedHat in 2011. Following the acquisition, he served in the office of the CTO at RedHat prior to founding MinIO in late 2015. AB is an active angel investor and serves on the board of H2O.ai and the Free Software Foundation of India. 
He earned his BE in Computer Science and Engineering from Annamalai University.
Links:
MinIO: https://min.io/
Kubernetes: https://kubernetes.io/
AWS (Amazon Web Services): https://aws.amazon.com/
Twitter: @abperiasamy
Garima Kapoor, COO and co-founder of MinIO, joins me to share her journey from investor and advisor to co-founder of MinIO and the wealth of knowledge she's amassed along the way. In this episode, Garima explains how her experience in finance and belief in the power of open source helped MinIO to break into the data storage market. She also reviews the challenges she faced as a first-time founder and what others can learn from her mistakes and take away from some of their own. Since Garima started her journey with MinIO as CFO, she outlines that role for me and explains how she thinks a CFO should operate in an open source company. In reviewing mistakes she's seen from other founders, Garima states some principles that create the “foundation for any open source business.” - “You should always be very honest to your community. You should always be very transparent to the community”Highlights:Garima introduces herself and explains why she and her co-founders started MinIO (1:31)Garima describes how the MinIO founders honed in on a problem they wanted to solve (3:55)How the MinIO founders used open source to crack the market (6:37)What triggers a user to purchase a commercial license for the product (10:33)Garima explains why she and her cofounders were set on their open source strategy from day one (11:35)Garima explores the differences between being an investor and advisor for other companies and starting her own.
(13:25)Garima shares go-to-market advice for other founders (15:21)Garima outlines her strategy for building on small successes (18:38)Garima explains why she started as CFO for MinIO and breaks down the role a CFO can play in a new company (21:46)Why Garima thinks a CFO's role remains the same in an open source company as compared to a proprietary company (27:17)How to avoid competing with your open source product when you also have a commercial offering (34:06)Links:GarimaLinkedIn: https://www.linkedin.com/in/garimakap/Twitter: https://twitter.com/garimakapCompany: min.io
In this episode, Amir interviews Ugur Tigli, the CTO of MinIO, a high-performance object storage company. They discuss the infrastructure components of cloud storage, data protection, operating models, and costs and how they tie into AI workloads. Ugur explains that MinIO is an open-source, S3-compatible distributed object storage solution popular for its simplicity and ease of deployment. They also delve into why MinIO chose the open-source path and its benefits. Listen to the episode to learn more about cloud and AI workloads and the impact on cloud costs. Highlights [00:02:40] Dual licensing model. [00:04:15] Open source and security. [00:07:36] AI and data growth. [00:14:15] Complex data infrastructure evolution. [00:16:39] Object storage simplification. [00:20:19] AI and storage cost. [00:24:07] Integrating with external systems. Ugur Tigli is CTO at MinIO. In this current role, he oversees enterprise strategy and interfaces with MinIO's enterprise client base. He helps clients architect and deploy API-driven, cloud-native, and scalable enterprise-grade data infrastructure using MinIO. Ugur has almost two decades of experience building high-performance data infrastructure for global financial institutions. Before MinIO, he was a technology leader at Bank of America, serving as the Senior Vice President and Global Head of Hardware Engineering. He joined Bank of America through the acquisition of Merrill Lynch, where he was the Vice President for Storage Engineering. Ugur has a Bachelor of Science in Electrical Engineering from Lafayette College. https://www.linkedin.com/in/ugur-tigli-9a9323/ Thank you so much for checking out this episode of The Tech Trek, and we would appreciate it if you would take a minute to rate and review us on your favorite podcast player. Want to learn more about us? Head over at https://www.elevano.com Have questions or want to cover specific topics with our future guests? 
Please message me at https://www.linkedin.com/in/amirbormand (Amir Bormand)
IN THIS EPISODE...In this digital age, where the volume of data is growing exponentially, object storage has emerged as a fundamental technology, particularly well-suited for cloud computing and big data applications. It offers the advantages of easy scalability, durability, and accessibility, making it an integral part of modern data management solutions. Unlike traditional file systems, which organize data into hierarchical folders and directories, object storage takes a different approach.My guest today, Garima Kapoor, Ph.D., is the Co-Founder and Chief Operating Officer (COO) of MinIO, Inc., an industry-leading company that has pioneered a high-performance, S3-compatible object store. With a solid educational background and extensive experience, Garima has been instrumental in MinIO's remarkable journey. Under her strategic leadership, MinIO has emerged as a powerhouse in data storage, specializing in large-scale AI/ML, data lake, and database workloads. The innovative object store solution MinIO offers is designed to meet the demanding requirements of modern data-driven applications. It is characterized by its software-defined architecture, enabling seamless deployment on a wide range of environments, including cloud and on-premises infrastructure.------------Full show notes, links to resources mentioned, and other compelling episodes can be found at http://LeadYourGamePodcast.com. (Click the magnifying icon at the top right and type “Garima”)Love the show? Subscribe, rate, review, and share! ------------JUST FOR YOU: Increase your leadership acumen by identifying your personal Leadership Trigger. Take my free quiz and instantly receive your 5-page report. Need to up-level your workforce or execute strategic People initiatives?
https://shockinglydifferent.com/contact or tweet @KaranRhodes.-------------ABOUT GARIMA KAPOOR:Garima Kapoor is a prominent figure in the tech industry, known for her role as the Chief Operating Officer (COO) and co-founder of MinIO, a cutting-edge technology company. With a solid financial background, she initially served as the company's Chief Financial Officer (CFO) before taking on her current leadership position. Garima is not only a successful entrepreneur but also an active investor and advisor to emerging technology companies in the dynamic landscape of Silicon Valley.Her academic journey is equally impressive, holding a Doctor of Philosophy (Ph.D.) in Accounting and Finance from Nirma University, a Masters in Economics from Banasthali Vidyapith, and a Bachelor of Science (BS) degree in Economics from Delhi University. Garima's multifaceted expertise and leadership have played a pivotal role in shaping the success of MinIO and contributing to the advancement of technology in the digital era.------------WHAT TO LISTEN FOR:1. What does MinIO do, and how does it help organizations?2. What is object storage?3. What are the tips for building a successful startup?4. What is the role of fundraising and product development in the growth of a storage company?5. What is courageous agility, and how does it help to navigate unpredictable paths in leadership and...
A New variant of Chae$ malware is described. A "Smishing Triad" impersonates postal services. A MinIO storage exploit reported. Okta warns of attackers seeking senior admin privileges. LockBit compromises a UK security contractor. DDoS takes down a German financial regulator's site. Infamous Chisel as GRU combat support. Joe Carrigan on Meta uncovering a Chinese influence effort. Our guest is Connie Stack, CEO of Next DLP, discussing data breach notification procedure. And please -PLEASE- remember to change your default passwords. For links to all of today's stories check out our CyberWire daily news briefing: https://thecyberwire.com/newsletters/daily-briefing/12/169 Selected reading. Threat Profile: Chae$ 4 Malware (Morphisec) "Smishing Triad" Targeted USPS and US Citizens for Data Theft (Resecurity) 'Smishing Triad' Targeted USPS and US Citizens for Data Theft (Security Affairs) New Attack Vector In The Cloud: Attackers caught exploiting Object Storage Services (Security Joes) Hackers exploit MinIO storage system to breach corporate networks (BleepingComputer) Okta Warns of Social Engineering Attacks Targeting Super Administrator Privileges (The Hacker News) More Okta customers trapped in Scattered Spider's web (Register) Cross-Tenant Impersonation: Prevention and Detection (Okta Security) Breaking: UK MoD attacked by LockBit (Computing) German financial agency site disrupted by DDoS attack since Friday (BleepingComputer) LogicMonitor customers hacked in reported ransomware attacks (BleepingComputer) LogicMonitor customers hit by hackers, because of default passwords (TechCrunch) Learn more about your ad choices. Visit megaphone.fm/adchoices
New PDF MalDoc allows evasion of antivirus MinIO Storage system being used to compromise servers Okta warns of IT help desk attacks Thanks to today's episode sponsor, Comcast Data rules everything around us – but why are the people who need data the most unable to access it? What if you could boost the productivity of your security teams and their ability to collaborate by providing them access to the same shared and enriched data? You can. With DataBee™, from Comcast Technology Solutions. Learn how DataBee can help your organization make better informed decisions, quickly and cost-effectively. Visit https://comca.st/DataBee For the stories behind the headlines, head to CISOseries.com.
Anand Babu "AB" Periasamy is the cofounder and CEO of MinIO, a high performance object storage for AI that's built for large scale workloads. They have raised $126M in funding from the likes of General Catalyst, Softbank, Intel Capital, and Nexus Venture Partners. It's the world's fastest growing object storage company with more than 1 billion Docker pulls and more than 35K stars on GitHub. He's also an angel investor with investments in companies like H2O.ai, Isovalent, Starburst, Postman, and many more. He was previously the cofounder and CTO of Gluster, which got acquired by Red Hat. In this episode, we cover a range of topics including: - Why is storage important for AI workflows - What are the characteristics of a good data storage product - Repatriation of data from public cloud to on-prem - Running ML experiments in parallel - AI compute offerings from data infrastructure providers - Making data infrastructure faster and cheaper AB's favorite book: An Awesome Book! (Author: Dallas Clayton)--------Where to find Prateek Joshi: Newsletter: https://prateekjoshi.substack.com Website: https://prateekj.com LinkedIn: https://www.linkedin.com/in/prateek-joshi-91047b19 Twitter: https://twitter.com/prateekvjoshi
The Datanation Podcast - Podcast for Data Engineers, Analysts and Scientists
Alex Merced discusses what is Object Storage and the history of file systems. Join the community at datanation.click
China is reportedly blocking multiple mergers and is now investigating Micron for an unspecified cybersecurity concern. Delays include Intel's acquisition of Tower Semiconductor, MaxLinear's purchase of Silicon Motion, Broadcom's acquisition of VMware, and Microsoft's bid for Activision Blizzard, as well as the long delay in Cisco's acquisition of Acacia and the called-off purchase of NXP by Qualcomm, which we reported on previously. The latest move against Micron seems designed to antagonize rather than investigate any real cybersecurity flaws at the memory provider. What's going on here? Time Stamps: 0:00 - Welcome to the Rundown 1:00 - Cisco Pulls Out of Russia 2:38 - Sony Invests in Raspberry Pi 4:10 - Always Greener on the Google AI Side 6:28 - Jacob Ziv Dies at 91 8:03 - MinIO Fires Back at Weka 13:48 - China Plays Hardball with Western Companies 21:44 - The Weeks Ahead 22:25 - Thanks for Watching Follow our hosts on Social MediaTom Hollingsworth: https://www.twitter.com/NetworkingNerdStephen Foskett: https://www.twitter.com/SFoskett Max Mortillaro: https://www.twitter.com/MaxMortillaro Follow Gestalt ITWebsite: https://www.GestaltIT.com/Twitter: https://www.twitter.com/GestaltITLinkedIn: https://www.linkedin.com/company/1789 Tags: #Rundown, #RaspberryPi, #A100, #China, #Russia, #NFD31, #ArubaAtmosphere23, #NFDx, #MFD9, @Cisco @RaspberryPi_Org, @Google, @GoogleCloud, #AI, @NVIDIA, @MinIO, @WekaIO, #Storage
Hey, it's 5:05 on Friday, March 31st, 2023. From The Sourced Podcast Network in New York City, this is your host Pokie Huang. Stories in today's episode come from Kadi Grigg in Alexandria, Virginia, Edwin Kwan in Sydney, Australia, Olimpiu Pop in Transylvania, Romania, Katy Craig in San Diego, California, Marcel Brown in St. Louis, Missouri. Let's get to it.Latest Mass Ransomware Attack
X as Code expert Ned Bellavance rejoins the podcast to discuss the latest battle in open-source licensing between MinIO and WekaIO and how customers should think about open-source licensing. Show Notes: Ned's Pluralsight Courses: https://www.pluralsight.com/authors/edward-bellavance Blocks and Files article on MinIO/WekaIO: https://blocksandfiles.com/2023/03/26/we-object-minio-says-no-more-open-license-for-you-weka/
Computing pioneer and Intel co-founder Gordon Moore has died. His name is commonly used in reference to Moore's Law, which predicted that processors would grow exponentially more complex, but he was much more than this. Moore was a visionary, who guided Intel through the DRAM market in the early years and then led the transition of the company to lay the foundation for modern microcomputers. He was quiet and polite, unlike Robert Noyce and Andy Grove, but everyone trusted Moore's thoughtful and considered decisions. Moore learned from his mistakes, notably a foray into the digital watch market, and was able to lead while allowing others to have their own autonomy. In a way, Moore created Silicon Valley but was entirely unlike what it has become. We could surely use more leaders in the mold of Gordon Moore! Time Stamps: 0:00 - Welcome to the Rundown 0:34 - Toshiba Takeover Talks 3:10 - Biden Outlaws Feds Commercial Spyware 6:03 - MinIO and Weka Divided on Licensing Changes 14:09 - OVHcloud Owes for Data Damages 18:42 - Arm Wants an Arm and a Leg for Chip Licenses 24:52 - Gordon Moore Dies at 94 37:35 - The Weeks Ahead 38:34 - Thanks for Watching Follow our hosts on Social MediaTom Hollingsworth: https://www.twitter.com/NetworkingNerdStephen Foskett: https://www.twitter.com/SFoskett Follow Gestalt ITWebsite: https://www.GestaltIT.com/Twitter: https://www.twitter.com/GestaltITLinkedIn: https://www.linkedin.com/company/1789 Tags: #Rundown, #GordonMoore, #Spyware, #Pegasus, #Cloud, #ChipLicense, #NFD31, #NFDx, #ArubaAtmosphere, #Toshiba, @MinIO, @WekaIO, @OVHcloud, @OVHcloud_US, @Arm, @Intel
AB Periasamy, Co-Founder and CEO of MinIO, joins Corey on Screaming in the Cloud to discuss what it means to be truly open source and the current and future state of multi-cloud. AB explains how MinIO was born from the idea that the world was going to produce a massive amount of data, and what it's been like to see that come true and continue to be the future outlook. AB and Corey explore why some companies are hesitant to move to cloud, and AB describes why he feels the move is inevitable regardless of cost. AB also reveals how he has helped create a truly free open-source software, and how his partnership with Amazon has been beneficial. About ABAB Periasamy is the co-founder and CEO of MinIO, an open source provider of high performance, object storage software. In addition to this role, AB is an active investor and advisor to a wide range of technology companies, from H2O.ai and Manetu where he serves on the board to advisor or investor roles with Humio, Isovalent, Starburst, Yugabyte, Tetrate, Postman, Storj, Procurify, and Helpshift. Successful exits include Gitter.im (Gitlab), Treasure Data (ARM) and Fastor (SMART).AB co-founded Gluster in 2005 to commoditize scalable storage systems. As CTO, he was the primary architect and strategist for the development of the Gluster file system, a pioneer in software defined storage. After the company was acquired by Red Hat in 2011, AB joined Red Hat's Office of the CTO. Prior to Gluster, AB was CTO of California Digital Corporation, where his work led to scaling of the commodity cluster computing to supercomputing class performance. His work there resulted in the development of Lawrence Livermore Laboratory's “Thunder” code, which, at the time was the second fastest in the world. 
AB holds a Computer Science Engineering degree from Annamalai University, Tamil Nadu, India.AB is one of the leading proponents and thinkers on the subject of open source software - articulating the difference between the philosophy and business model. An active contributor to a number of open source projects, he is a board member of India's Free Software Foundation.Links Referenced: MinIO: https://min.io/ Twitter: https://twitter.com/abperiasamy LinkedIn: https://www.linkedin.com/in/abperiasamy/ Email: mailto:ab@min.io TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is sponsored in part by our friends at Chronosphere. When it costs more money and time to observe your environment than it does to build it, there's a problem. With Chronosphere, you can shape and transform observability data based on need, context and utility. Learn how to only store the useful data you need to see in order to reduce costs and improve performance at chronosphere.io/corey-quinn. That's chronosphere.io/corey-quinn. And my thanks to them for sponsoring my ridiculous nonsense. Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn, and I have taken a somewhat strong stance over the years on the relative merits of multi-cloud, and when it makes sense and when it doesn't. And it's time for me to start modifying some of those. To have that conversation and several others as well, with me today on this promoted guest episode is AB Periasamy, CEO and co-founder of MinIO. AB, it's great to have you back.AB: Yes, it's wonderful to be here again, Corey.Corey: So, one thing that I want to start with is defining terms. 
Because when we talk about multi-cloud, there are—to my mind at least—smart ways to do it and ways that are frankly ignorant. The thing that I've never quite seen is, it's greenfield, day one. Time to build something. Let's make sure we can build and deploy it to every cloud provider we might ever want to use.And that is usually not the right path. Whereas different workloads in different providers, that starts to make a lot more sense. When you do mergers and acquisitions, as big companies tend to do in lieu of doing anything interesting, it seems like they find it oh, we're suddenly in multiple cloud providers, should we move this acquisition to a new cloud? No. No, you should not.One of the challenges, of course, is that there's a lot of differentiation between the baseline offerings that cloud providers have. MinIO is interesting in that it starts and stops with an object store that is mostly S3 API compatible. Have I nailed the basic premise of what it is you folks do?AB: Yeah, it's basically an object store. Amazon S3 versus us, it's actually—that's the comparable, right? Amazon S3 is a hosted cloud storage as a service, but underneath the underlying technology is called object-store. MinIO is a software and it's also open-source and it's the software that you can deploy on the cloud, deploy on the edge, deploy anywhere, and both Amazon S3 and MinIO are exactly S3 API compatible. It's a drop-in replacement. You can write applications on MinIO and take it to AWS S3, and do the reverse. Amazon made S3 API a standard inside AWS, we made S3 API standard across the whole cloud, all the cloud edge, everywhere, rest of the world.Corey: I want to clarify two points because otherwise I know I'm going to get nibbled to death by ducks on the internet. 
When you say open-source, it is actually open-source; you're AGPL, not source available, or, “We've decided now we're going to change our model for licensing because oh, some people are using this without paying us money,” as so many companies seem to fall into that trap. You are actually open-source and no one reasonable is going to be able to disagree with that definition.The other pedantic part of it is when something says that it's S3 compatible on an API basis, like, the question is always does that include the weird bugs that we wish it wouldn't have, or some of the more esoteric stuff that seems to be a constant source of innovation? To be clear, I don't think that you need to be particularly compatible with those very corner and vertex cases. For me, it's always been the basic CRUD operations: can you store an object? Can you give it back to me? Can you delete the thing? And maybe an update, although generally object stores tend to be atomic. How far do you go down that path of being, I guess, a faithful implementation of what the S3 API does, and at which point you decide that something is just, honestly, lunacy and you feel no need to wind up supporting that?AB: Yeah, the unfortunate part of it is we have to be very, very deep. It only takes one API to break. And it's not even, like, one API we did not implement; one API under a particular circumstance, right? Like even if you see, like, AWS SDK is, right, Java SDK, different versions of Java SDK will interpret the same API differently. And AWS S3 is an API, it's not a standard.And Amazon has published the REST specifications, API specs, but they are more like religious text. You can interpret it in many ways. Amazon's own SDK has interpreted, like, this in several ways, right? The only way to get it right is, like, you have to have a massive ecosystem around your application. 
And if one thing breaks—today, if I commit a code and it introduced a regression, I will immediately hear from a whole bunch of community what I broke. There's no certification process here. There is no industry consortium to control the standard, but then there is an accepted standard. Like, if the application works, then it works. And one way to get it right is, like, Amazon SDKs, all of those language SDKs, to be cleaner, simpler, but applications can even use MinIO SDK to talk to Amazon and Amazon SDK to talk to MinIO. Now, there is a clear, cooperative model.And I actually have tremendous respect for Amazon engineers. They have only been kind and meaningful, like, reasonable partnership. Like, if our community reports a bug that Amazon rolled out a new update in one of the region and the S3 API broke, they will actually go fix it. They will never argue, “Why are you using MinIO SDK?” Their engineers, they do everything by reason. That's the reason why they gained credibility.Corey: I think, on some level, that we can trust that the API is not going to meaningfully shift, just because so much has been built on top of it over the last 15, almost 16 years now that even slight changes require massive coordination. I remember there was a little bit of a kerfuffle when they announced that they were going to be disabling the BitTorrent endpoint in S3 and it was no longer going to be supported in new regions, and eventually they were turning it off. There were still people pushing back on that. I'm still annoyed by some of the documentation around the API that says that it may not return a legitimate error code when it errors with certain XML interpretations. It's… it's kind of become very much its own thing.AB: [unintelligible 00:06:22] a problem, like, we have seen, like, even stupid errors similar to that, right? 
Like, HTTP headers are supposed to be case insensitive, but then there are some language SDKs will send us in certain type of casing and they expect the case to be—the response to be same way. And that's not HTTP standard. If we have to accept that bug and respond in the same way, then we are asking a whole bunch of community to go fix that application. And Amazon's problem are our problems too. We have to carry that baggage.But some places where we actually take a hard stance is, like, Amazon introduced that initially, the bucket policies, like access control list, then finally came IAM, then we actually, for us, like, the best way to teach the community is make best practices the standard. The only way to do it. We have been, like, educating them that we actually implemented ACLs, but we removed it. So, the customers will no longer use it. The scale at which we are growing, if I keep it, then I can never force them to remove.So, we have been pedantic about, like, how, like, certain things that if it's a good advice, force them to do it. That approach has paid off, but the problem is still quite real. Amazon also admits that S3 API is no longer simple, but at least it's not like POSIX, right? POSIX is a rich set of API, but doesn't do useful things that we need to do. So, Amazon's APIs are built on top of simple primitive foundations that got the storage architecture correct, and then doing sophisticated functionalities on top of the simple primitives, these atomic RESTful APIs, you can finally do it right and you can take it to great lengths and still not break the storage system.So, I'm not so concerned. I think it's time for both of us to slow down and then make sure that the ease of operation and adoption is the goal, then trying to create an API Bible.Corey: Well, one differentiation that you have that frankly I wish S3 would wind up implementing is this idea of bucket quotas. 
I would give a lot in certain circumstances to be able to say that this S3 bucket should be able to hold five gigabytes of storage and no more. Like, you could fix a lot of free tier problems, for example, by doing something like that. But there's also the problem that you'll see in data centers where, okay, we've now filled up whatever storage system we're using. We need to either expand it at significant cost and it's going to take a while or it's time to go and maybe delete some of the stuff we don't necessarily need to keep in perpetuity.There is no moment of reckoning in traditional S3 in that sense because, oh, you can just always add one more gigabyte at 2.3 or however many cents it happens to be, and you wind up with an unbounded growth problem that you're never really forced to wrestle with. Because it's infinite storage. They can add drives faster than you can fill them in most cases. So, it's it just feels like there's an economic story, if nothing else, just from a governance control and make sure this doesn't run away from me, and alert me before we get into the multi-petabyte style of storage for my Hello World WordPress website.AB: Mm-hm. Yeah, so I always thought that Amazon did not do this—it's not just Amazon, the cloud players, right—they did not do this because they want—is good for their business; they want all the customers' data, like unrestricted growth of data. Certainly it is beneficial for their business, but there is an operational challenge. When you set quota—this is why we grudgingly introduced this feature. We did not have quotas and we didn't want to because Amazon S3 API doesn't talk about quota, but the enterprise community wanted this so badly.And eventually we [unintelligible 00:09:54] it and we gave. But there is one issue to be aware of, right? 
The problem with quota is that you as an object storage administrator, you set a quota, let's say this bucket, this application, I don't see more than 20TB; I'm going to set 100TB quota. And then you forget it. And then you think in six months, they will reach 20TB. The reality is, in six months they reach 100TB.And then when nobody expected—everybody has forgotten that there was a quota in a certain place—suddenly applications start failing. And when it fails, it doesn't—even though the S3 API responds back saying that insufficient space, but then the application doesn't really pass that error all the way up. When applications fail, they fail in unpredictable ways. By the time the application developer realizes that it's actually the object storage that ran out of space, it's lost time and it's a downtime. So, as long as they have proper observability—because, I mean, I would also ask of observability that it can alert you that you are going to run out of space soon. If you have those systems in place, then go for quota. If not, I would agree with the S3 API standard that is not about cost. It's about operational, unexpected accidents.Corey: Yeah, on some level, we wound up having to deal with the exact same problem with disk volumes, where my default for most things was, at 70%, I want to start getting pings on it and at 90%, I want to be woken up for it. So, for small volumes, you wind up with a runaway log or whatnot, you have a chance to catch it and whatnot, and for the giant multi-petabyte things, okay, well, why would you alert at 70% on that? Well, because procurement takes a while when we're talking about buying that much disk for that much money. It was a roughly good baseline for these things. The problem, of course, is when you have none of that, and well it got full so oops-a-doozy.On some level, I wonder if there's a story around soft quotas that just scream at you, but let you keep adding to it. 
But that turns into implementation details, and you can build something like that on top of any existing object store if you don't need the hard limit aspect.AB: Actually, that is the right way to do. That's what I would recommend customers to do. Even though there is hard quota, I will tell, don't use it, but use soft quota. And the soft quota, instead of even soft quota, you monitor them. On the cloud, at least you have some kind of restriction that the more you use, the more you pay; eventually the month end bills, it shows up.On MinIO, when it's deployed on these large data centers, that it's unrestricted access, quickly you can use a lot of space, no one knows what data to delete, and no one will tell you what data to delete. The way to do this is there has to be some kind of accountability. The way to do it is—actually [unintelligible 00:12:27] have some chargeback mechanism based on the bucket growth. And the business units have to pay for it, right? That IT doesn't run for free, right? IT has to have a budget and it has to be sponsored by the applications team.And you measure, instead of setting a hard limit, you actually charge them that based on the usage of your bucket, you're going to pay for it. And this is an observability problem. And you can call it soft quotas, but what it has to do is trigger an alert in observability. It's an observability problem. But it actually is interesting to hear that as soft quotas, which makes a lot of sense.Corey: It's one of those problems that I think people only figure out after they've experienced it once. And then they look like wizards from the future who, “Oh, yeah, you're going to run into a quota storage problem.” Yeah, we all find that out because the first time we smack into something and live to regret it. Now, we can talk a lot about the nuances and implementation and low level detail of this stuff, but let's zoom out of it. What are you folks up to these days? 
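AB's framing of soft quotas as an observability concern, combined with Corey's 70%/90% disk-volume thresholds from earlier in the conversation, reduces to a simple threshold check. A pure-Python sketch (the function name and default thresholds are illustrative, not any MinIO API):

```python
def quota_status(used_bytes: int, soft_quota_bytes: int,
                 warn_pct: float = 70.0, page_pct: float = 90.0) -> str:
    """Classify bucket usage against a soft quota.

    A soft quota never blocks writes; it only drives alerts:
    warn early (start the cleanup or chargeback conversation),
    page late (someone gets woken up).
    """
    pct = 100.0 * used_bytes / soft_quota_bytes
    if pct >= page_pct:
        return "page"
    if pct >= warn_pct:
        return "warn"
    return "ok"

TB = 10**12
print(quota_status(50 * TB, 100 * TB))  # ok
print(quota_status(75 * TB, 100 * TB))  # warn
print(quota_status(95 * TB, 100 * TB))  # page
```

Feeding the "warn" and "page" states into whatever alerting pipeline is already in place is the whole mechanism; the chargeback idea AB mentions is just this same usage number multiplied by a price.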
What is the bigger picture that you're seeing of object storage and the ecosystem?AB: Yeah. So, when we started, right, our idea was that the world is going to produce incredible amount of data. In ten years from now, we are going to drown in data. We've been saying that today and it will be true. Every year, you say ten years from now and it will still be valid, right?That was the reason for us to play this game. And we saw that every one of these cloud players were incompatible with each other. It's like early Unix days, right? Like a bunch of operating systems, everything was incompatible and applications were beginning to adopt this new standard, but they were stuck. And then the cloud storage players, whatever they had, like, GCS can only run inside Google Cloud, S3 can only run inside AWS, and the cloud player's game was bring all the world's data into the cloud.And that actually requires enormous amount of bandwidth. And moving data into the cloud at that scale, if you look at the amount of data the world is producing, if the data is produced inside the cloud, it's a different game, but the data is produced everywhere else. MinIO's idea was that instead of introducing yet another API standard, Amazon got the architecture right and that's the right way to build large-scale infrastructure. If we stick to Amazon S3 API instead of introducing yet another standard, [unintelligible 00:14:40] API, and then go after the world's data. When we started in 2014 November—it's really 2015, we started, it was laughable. People thought that there won't be a need for MinIO because the whole world will basically go to AWS S3 and they will be the world's data store. Amazon is capable of doing that; the race is not over, right?Corey: And it still couldn't be done now. The thing is that they would need to fundamentally rethink their, frankly, usurious data egress charges. 
The problem is not that it's expensive to store data in AWS; it's that it's expensive to store data and then move it anywhere else for analysis or use on something else. So, there are entire classes of workload that people should not consider the big three cloud providers as the place where that data should live because you're never getting it back.AB: Spot on, right? Even if network is free, right, Amazon makes, like, okay, zero egress-ingress charge, the data we're talking about, like, most of MinIO deployments, they start at petabytes. Like, one to ten petabytes, a few at, like, 100 terabytes. For even if network is free, try moving a ten-petabyte infrastructure into the cloud. How are you going to move it?Even with FedEx and UPS giving you a lot of bandwidth in their trucks, it is not possible, right? I think the data will continue to be produced everywhere else. So, our bet was that we will be [unintelligible 00:15:56]—instead of you moving the data, you can run MinIO where there is data, and then the whole world will look like AWS's S3 compatible object store. We took a very different path. But now, when I say the same story that we started with day one, it is no longer laughable, right?People believe that yes, MinIO is there because our market footprint is now larger than Amazon S3. And as it goes to production, customers are now realizing it's basically growing inside a shadow IT and eventually businesses realize the bulk of their business-critical data is sitting on MinIO and that's how it's surfacing up. So now, what we are seeing, this year particularly, all of these customers are hugely concerned about cost optimization. And as part of the journey, there is also multi-cloud and hybrid-cloud initiatives. They want to make sure that their application can run on any cloud or the same software can run on their colos like Equinix, or, like, a bunch of, like, Digital Realty, anywhere.And MinIO's software, this is what we set out to do. 
MinIO can run anywhere inside the cloud, all the way to the edge, even on Raspberry Pi. It's now—whatever we started with has now become reality; the timing is perfect for us.Corey: One of the challenges I've always had with the idea of building an application with the idea to run it anywhere is you can make explicit technology choices around that, and for example, object store is a great example because most places you go now will or can have an object store available for your use. But there seem to be implementation details that get lost. And for example, even load balancers wind up being implemented in different ways with different scaling times and whatnot in various environments. And past a certain point, it's okay, we're just going to have to run it ourselves on top of HAProxy or Nginx, or something like it, running in containers themselves; you're reinventing the wheel. Where is that boundary between, we're going to build this in a way that we can run anywhere and the reality that I keep running into, which is we tried to do that but we implicitly without realizing it built in a lot of assumptions that everything would look just like this environment that we started off in.AB: The good part is that if you look at the S3 API, every request has the site name, the endpoint, bucket name, the path, and the object name. Every request is completely self-contained. It's literally an HTTP call away. And this means that whether your application is running on Android, iOS, inside a browser, JavaScript engine, anywhere across the world, they don't really care whether the bucket is served from EU or us-east or us-west. It doesn't matter at all, so it actually allows you by API, you can build a globally unified data infrastructure, some buckets here, some buckets there.That's actually not the problem. The problem comes when you have multiple clouds. 
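AB's point that every S3 request is self-contained can be made concrete: the request itself names the endpoint, the bucket, and the key, so nothing in the call shape is tied to one provider. A pure-stdlib sketch (the hostnames are made up):

```python
from urllib.parse import quote

def object_url(endpoint: str, bucket: str, key: str) -> str:
    """Build a path-style object URL: endpoint + bucket + key.

    Because the request carries all three, the same shape addresses
    AWS S3, MinIO, or any other S3-compatible store.
    """
    return f"{endpoint.rstrip('/')}/{bucket}/{quote(key)}"

# The same request shape, aimed at two different backends:
print(object_url("https://s3.us-east-1.amazonaws.com", "demo", "logs/2023/app.log"))
print(object_url("https://minio.example.internal:9000", "demo", "logs/2023/app.log"))
```

Authentication (SigV4 signing) is layered on top of this shape by the SDKs, which is why the unified-identity and policy questions AB raises next are the harder part.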
Different teams, like, part of M&A—like, even if you don't do M&A, different teams—no two data engineers would agree on the same software stack. Then they will all end up with different cloud players and some still running on old legacy environments.When you combine them, the problem is, like, let's take just the cloud, right? How do I even apply a policy, that access control policy, how do I establish unified identity? Because I want to know this application is the only one who is allowed to access this bucket. Can I have that same policy on Google Cloud or Azure, even though they are different teams? Like if that employee, that project, or that admin, if he or she leaves the job, how do I make sure that that's all protected?You want unified identity, you want unified access control policies. Where is the encryption key store? And then the load balancer itself, the load, its—load balancer is not the problem. But then unless you adopt S3 API as your standard, the definition of what a bucket is differs from Microsoft to Google to Amazon.Corey: Yeah, the idea of the PUTs and retrieving of actual data is one thing, but then you have: how do you manage the control plane layer of the object store and how do you rationalize that? What are the naming conventions? How do you address it? I even ran into something similar somewhat recently when I was doing an experiment with one of the Amazon Snowball edge devices to move some data into S3 on a lark. And the thing shows up and presents itself on the local network as an S3 endpoint, but none of their tooling can accept a different endpoint built into the configuration files; you have to explicitly use it as an environment variable or as a parameter on every invocation of something that talks to it, which is incredibly annoying.I would give a lot just to be able to say, oh, when you're talking in this profile, that's always going to be your S3 endpoint. Go. But no, of course not. 
Because that would make it easier to use something that wasn't them, so why would they ever be incentivized to bake that in?AB: Yeah. Snowball is an important element to move data, right? That's the UPS and FedEx way of moving data, but what I find customers doing is they actually use the tools that we built for MinIO because the Snowball appliance also looks like S3 API-compatible object store. And in fact, like, I've been told that, like, when you want to ship multiple Snowball appliances, they actually put MinIO to make it look like one unit because MinIO can erasure-code objects across multiple Snowball appliances. And the MC tool, unlike AWS CLI, which is really meant for developers, like low-level calls, MC gives you unique [scoring 00:21:08] tools, like ls, cp, rsync-like tools, and it's easy to move and copy and migrate data. Actually, that's how people deal with it.Corey: Oh, God. I hadn't even considered the problem of having a fleet of Snowball edges here that you're trying to do a mass data migration on, which is basically how you move petabyte-scale data, is a whole bunch of parallelism. But having to figure that out on a case-by-case basis would be nightmarish. That's right, there is no good way to wind up doing that natively.AB: Yeah. In fact, Western Digital and a few other players, too, now the Western Digital created a Snowball-like appliance and they put MinIO on it. And they are actually working with some system integrators to help customers move lots of data. But Snowball-like functionality is important and more and more customers need it.Corey: This episode is sponsored in part by Honeycomb. I'm not going to dance around the problem. Your. Engineers. Are. Burned. Out. They're tired from pagers waking them up at 2 am for something that could have waited until after their morning coffee. Ring Ring, Who's There? It's Nagios, the original call of duty! 
They're fed up with relying on two or three different “monitoring tools” that still require them to manually trudge through logs to decipher what might be wrong. Simply put, there's a better way. Observability tools like Honeycomb (and very little else because they do admittedly set the bar) show you the patterns and outliers of how users experience your code in complex and unpredictable environments so you can spend less time firefighting and more time innovating. It's great for your business, great for your engineers, and, most importantly, great for your customers. Try FREE today at honeycomb.io/screaminginthecloud. That's honeycomb.io/screaminginthecloud.Corey: Increasingly, it felt like, back in the on-prem days, that you'd have a file server somewhere that was either a SAN or it was going to be a NAS. The question was only whether it presented it to various things as a volume or as a file share. And then in cloud, the default storage mechanism, unquestionably, was object store. And now we're starting to see it come back again. So, it started to increasingly feel, in a lot of ways, like Cloud is no longer so much a place that is somewhere else, but instead much more of an operating model for how you wind up addressing things.I'm wondering when the generation of prosumer networking equipment, for example, is going to say, “Oh, and send these logs over to what object store?” Because right now, it's still write a file and SFTP it somewhere else, at least the good ones; some of the crap ones still want old unencrypted FTP, which is neither here nor there. But I feel like it's coming back around again. Like, when do even home users wind up instead of where do you save this file to having the cloud abstraction, which hopefully, you'll never have to deal with an S3-style endpoint, but that can underpin an awful lot of things. It feels like it's coming back and that's cloud is the de facto way of thinking about things. Is that what you're seeing? 
Does that align with your belief on this?

AB: I actually fundamentally believe, in the long run, that applications will go SaaS, right? Like, if you remember the days that you used to install QuickBooks and ACT and stuff in your data center—you used to run your own Exchange servers—those days are gone. I think these applications will become SaaS. But then the infrastructure building blocks for these SaaS products, whether they are on cloud or their own colo, I think in the long run it will be multi-cloud and colo all combined, and all of them will look alike.

But what I find from the customer's journey is that the Old World and the New World are incompatible. When they shifted from bare metal to virtualization, they didn't have to rewrite their application. But this time, it's a tectonic shift. Every single application, you have to rewrite. If you retrofit your application into the cloud, bad idea, right? It's going to cost you more, and I would rather not do it.

Even though cloud players are trying to make, like, the file and block services [unintelligible 00:24:01] and stuff, they make them available ten times more expensive than object, but it's just to [integrate 00:24:07] some legacy applications. It's still a bad idea to just move legacy applications there. But what I'm finding is that on cost, if you still run your infrastructure with an enterprise IT mindset, you're out of luck. It's going to be super expensive and you're going to be left out of modern infrastructure, because at that scale, it has to be treated as code. You have to run infrastructure with software engineers. And this cultural shift has to happen.

And that's why, in the long run, everyone will look like AWS—we always said that, and it's now becoming true. Like, Kubernetes and MinIO are basically leveling the ground everywhere. They're giving ECS- and S3-like infrastructure inside AWS or outside AWS, everywhere.
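[Editor's note: the practical upshot of "everything looks like S3, inside or outside AWS" is that application code stays the same and only the endpoint and credentials change. A minimal sketch, with entirely hypothetical endpoints and keys, of the kind of client configuration an S3-compatible SDK such as boto3 accepts:]

```python
def s3_client_config(endpoint_url, access_key, secret_key, region="us-east-1"):
    """Everything speaks the same S3 API; only the endpoint changes."""
    return {
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        "region_name": region,
    }

# Three targets, one application: AWS itself, a MinIO cluster in a colo,
# and a MinIO instance on a laptop. All values below are made up.
aws   = s3_client_config("https://s3.amazonaws.com", "AKIA-EXAMPLE", "secret")
colo  = s3_client_config("https://minio.colo.example.net:9000", "minio-user", "secret")
local = s3_client_config("http://localhost:9000", "minioadmin", "minioadmin")

# The code that reads and writes objects is identical in all three cases.
assert aws.keys() == colo.keys() == local.keys()
```

That interchangeability is the "leveling the ground" AB describes: the storage API is no longer what ties a workload to one provider.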
But what I find the challenging part is the cultural mindset. If they still have the old cultural mindset and they want to adopt cloud, it's not going to work.

You have to change the DNA, the culture, the mindset, everything. The best way to do it is go cloud-first. Adopt it, modernize your application, learn how to run and manage infrastructure, then ask the economics question—the unit economics. Then you will find the answers yourself.

Corey: On some level, that is the path forward. I feel like there's just a very long tail of systems that have been working and have been meeting the business objective. And, well, “we should go and refactor this because, I don't know, a couple of folks on a podcast said we should” isn't the most compelling business case for doing a lot of it. It feels like these things sort of sit there until there is more upside than just cost-cutting to changing the way these things are built and run. That's the reason that people have been talking about getting off of mainframes since the '90s in some companies, and the mainframe is very much still there. It is so ingrained in the way that they do business, they'd have to rethink a lot of the architectural things that have sprung up around it.

I'm not trying to shame anyone for the [laugh] state that their environment is in. I've never yet met a company that was super proud of its internal infrastructure. Everyone's always apologizing because it's a fire. But they think someone else has figured this out somewhere and it all runs perfectly. I don't think it exists.

AB: What I am finding is that if you are running it the enterprise IT style—you are the one telling the application developers, “Here you go, you have this many VMs, and then you have, like, a VMware license and, like, JBoss, like WebLogic, and a SQL Server license; now you go build your application”—you won't be able to do it.
Because application developers talk about Kafka and Redis and, like, Kubernetes; they don't speak the same language. And that's when these developers go to the cloud and finish their application—take it live from zero lines of code—before IT can even procure infrastructure and provision it for them. The change that has to happen is: how can you give what the developers want? Now that reverse journey is also starting. In the long run, everything will look alike, but what I'm finding is that if you're running enterprise IT infrastructure, traditional infrastructure, they are ashamed of talking about it.

But then you go to the cloud, and then at scale, some parts of it you want to move—now you really know why you want to move. For economic reasons—particularly, the data-intensive workloads become very expensive. And at that point, they go to a colo but leave the applications on the cloud. So, the multi-cloud model, I think, is inevitable. The expensive pieces—if you are looking at yourself as a hyperscaler and your data is growing, if your business focus is a data-centric business, parts of the data and data analytics, ML workloads will actually move out, if you're looking at unit economics. If all you are focused on is productivity, stick to the cloud and you're still better off.

Corey: I think that's a divide that gets lost sometimes. When people say, “Oh, we're going to move to the cloud to save money,” it's, “No, you're not.” At a five-year time horizon, I would be astonished if that juice were worth the squeeze in almost any scenario. The reason you go, therefore, is for a capability story, when it's right for you.

That also means that steady-state workloads that are well understood can often be run more economically in a place that is not the cloud. Everyone thinks for some reason that I tend to be “it's cloud or it's trash.”
No, I'm a big fan of doing things that are sensible, and cloud is not the right answer for every workload under the sun. Conversely, when someone says, “Oh, I'm building a new e-commerce store,” or whatnot, “and I've decided cloud is not for me,” it's, “Ehh, you sure about that?”

That sounds like you are smack-dab in the middle of the cloud use case. But all these things wind up acting as constraints and strategic objectives. And technology and single-vendor answers are rarely going to be a panacea the way that their sales teams say they will.

AB: Yeah. And I find, like, organizations that have SREs, DevOps, and software engineers running the infrastructure, they actually are ready to go multi-cloud or go to colo, because they exactly know—they have the container and Kubernetes microservices expertise. If you are still on a traditional SAN, NAS, and VM architecture: go to cloud, rewrite your application.

Corey: I think there's a misunderstanding in the ecosystem around what cloud repatriation actually looks like. Everyone claims it doesn't exist because there are basically no companies out there worth mentioning that are, “Yep, we've decided the cloud is terrible, we're taking everything out, and we are going to data centers. The end.” In practice, it's individual workloads that do not make sense in the cloud. Sometimes just the back-of-the-envelope analysis means it's not going to work out, other times during proofs of concept, and other times, as things have hit a certain point of scale, where an individual workload being pulled back makes an awful lot of sense. But everything else is probably going to stay in the cloud, and these companies don't want to wind up antagonizing the cloud providers by talking about it in public. But that model is very real.

AB: Absolutely.
Actually, what we are finding on the application side is that parts of their overall ecosystem, right, within the company, run on the cloud, but on the data side, some of the examples are in the range of 100 to 500 petabytes. The 500-petabyte customer actually started at 500 petabytes, and their plan is to go to exascale. And they are actually doing repatriation because, for them, their customers—it's consumer-facing and extremely price sensitive, and when you're consumer-facing, every dollar you spend counts. And if you don't do it at scale, it matters a lot, right? It will kill the business.

Particularly in the last two years, the cost part became an important element in their infrastructure; they knew exactly what they wanted. They are thinking of themselves as hyperscalers. They get commodity—the same hardware, right, just a server with a bunch of [unintelligible 00:30:35] and network—and put it in a colo, or even lease these boxes; they know what their demand is. Even at ten petabytes, the economics starts impacting. If you're processing it, on the data side, we have several customers now moving to colo from cloud, and this is the range we are talking about.

They don't talk about it publicly because sometimes, like, you don't want to be anti-cloud, but I think for them, they're also not anti-cloud. They don't want to leave the cloud. Completely leaving the cloud is a different story. That's not the case. Applications stay there. Data lakes, data infrastructure, object store in particular—those go to a colo.

Now, your applications from all the clouds can access this centralized—centralized meaning that one object store you run in a colo, and the colos themselves have worldwide data centers. So, you can keep the data infrastructure in a colo, but applications can run on any cloud; some of them, surprisingly, have a global customer base. And not all of them are cloud.
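[Editor's note: the "even at ten petabytes, the economics starts impacting" argument is a fixed-versus-variable cost comparison. A back-of-the-envelope sketch follows; every price below is a made-up placeholder, not real cloud or colo pricing, so the point is the shape of the curve, not the numbers.]

```python
def monthly_cost(petabytes, price_per_tb_month, fixed_per_month=0.0):
    """Monthly storage bill: variable $/TB-month plus any fixed overhead."""
    return petabytes * 1000 * price_per_tb_month + fixed_per_month

# Illustrative assumptions only: cloud is pure variable cost; running your
# own storage trades a lower per-TB rate for fixed costs (space, staff, network).
CLOUD_PER_TB = 21.0       # assumed $/TB-month, cloud object storage
COLO_PER_TB = 6.0         # assumed $/TB-month, self-run (hardware amortized)
COLO_FIXED = 40_000.0     # assumed fixed $/month for the colo footprint

for pb in (1, 10, 100):
    cloud = monthly_cost(pb, CLOUD_PER_TB)
    colo = monthly_cost(pb, COLO_PER_TB, COLO_FIXED)
    print(f"{pb:>4} PB: cloud ${cloud:>12,.0f}/mo  colo ${colo:>12,.0f}/mo")
```

Under these assumed numbers, the fixed overhead makes colo lose at one petabyte and win comfortably by ten, which is the crossover dynamic AB describes; with your own quotes, the crossover moves but the shape stays.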
Sometimes, for some applications, if you ask what type of edge devices or edge data centers they are running, they said it's a mix of everything. What really matters is not the infrastructure. Infrastructure, in the end, is CPU, network, and drives. It's a commodity. It's really the software stack: you want to make sure that it's containerized and easy to deploy and roll out updates; you have to learn the Facebook-Google style of running a SaaS business. That change is coming.

Corey: It's a matter of time and it's a matter of inevitability. Now, nothing ever stays the same. Everything always inherently changes in the full sweep of things, but I'm pretty happy with where I see the industry going these days. I want to start seeing a little bit less centralization around one or two big companies, but I am confident that we're starting to see an awareness of doing these things for the right reasons more broadly permeating.

AB: Right. Like, competition is always great for customers. They get to benefit from it. So, decentralization is a path to commoditizing the infrastructure. I think the bigger picture for me—what I'm particularly happy about—is that for a long time we carried industry baggage in the infrastructure space.

No one wants to change; no one wants to rewrite applications. As part of the equation, we carried the, like, POSIX baggage, like SAN and NAS. You can't even do [unintelligible 00:32:48] as a Service, NFS as a Service. It's too much baggage. All of that is getting thrown out. Like, the cloud players have helped the customers start with a clean slate. To me, that's the biggest advantage. And now that we have a clean slate, we can go on a whole new evolution of the stack, keeping it simpler, and everyone can benefit from this change.

Corey: Before we wind up calling this an episode, I do have one last question for you.
As I mentioned at the start, you're very much open-source, as in legitimate open-source, which means that anyone who wants to can grab an implementation and start running it. How do you, I guess, make peace with the fact that the majority of your user base is not paying you? And I guess, how do you get people to decide, “You know what? We like the cut of his jib. Let's give him some money.”

AB: Mm-hm. Yeah, if I looked at it that way, right—I have both the [unintelligible 00:33:38], right, on the open-source side as well as the business. But I don't see them to be conflicting. If I run as a charity, right—like, I take donations; if you love the product, here is the donation box—then that doesn't work at all, right?

I shouldn't take investor money, and I shouldn't have a team, because I have a job to pay their bills, too. But I actually find open-source to be incredibly beneficial. For me, it's about delivering value to the customer. If you pay me $5, I ought to make you feel $50 worth of value. The same software you would buy from a proprietary vendor—if I'm a customer and the software is equal in functionality, one proprietary and one open-source, I would actually prefer the open-source one and pay even more.

But why are customers really paying me now, and what's our view on open-source? I'm actually the free software guy. Free software and open-source are actually not exactly equal, right? We are the purists of the open-source community and we have strong views on what open-source means, right? That's why we call it free software. And free here means freedom, right? Free does not mean gratis, free of cost. It's actually about freedom, and I deeply care about it.

For me it's a philosophy and it's a way of life. That's why I don't believe in open core and other models—holding back, giving crippleware, is not open-source, right? I give you some freedom but not all—it breaks the spirit.
So, MinIO is a hundred percent open-source, but it's open-source for the open-source community. We did not take some community-developed code and then add commercial support on top.

We built the product, we believed in open-source, we still believe, and we will always believe. Because of that, we open-sourced our work. And it's open-source for the open-source community. And as you build applications—like, the AGPL license on the derivative works—they have to be compatible with the AGPL, because we are the creator. If you cannot open-source your application, your derivative works, you can buy a commercial license from us. We are the creator; we can give you a dual license. That's how the business model works.

That way, the open-source community completely benefits. And it's about software freedom. There are customers for whom open-source is a good thing, and they want to pay because it's open-source. There are some customers who want to pay because they can't open-source their application and derivative works, so they pay. It's a happy medium; that way I actually find open-source to be incredibly beneficial.

Open-source gave us trust, more than adoption rate. It's not just free to download and use. More than that, the customers that matter, the community that matters, can see the code and they can see everything we did. It's not “because I said so”—marketing and sales, you believe them, whatever they say. You download the product, experience it, and fall in love with it, and then when it becomes an important part of your business, that's when they engage with us, because they talk about license compatibility and data loss or a data breach; all that becomes important. I don't see open-source to be conflicting for business. It actually is incredibly helpful. And customers see that value in the end.

Corey: I really want to thank you for being so generous with your time.
If people want to learn more, where should they go?

AB: I was on Twitter, and now I think I'm spending more time on, maybe, LinkedIn. I think they can send me a request and then we can chat. And I'm always, like, spending time with other entrepreneurs, architects, and engineers, sharing what I learned, what I know, and learning from them. There is also a [community open channel 00:37:04]. And just send me a mail at ab@min.io; I'm always interested in talking to our user base.

Corey: And we will, of course, put links to that in the [show notes 00:37:12]. Thank you so much for your time. I appreciate it.

AB: It's wonderful to be here.

Corey: AB Periasamy, CEO and co-founder of MinIO. I'm Cloud Economist Corey Quinn and this has been a promoted guest episode of Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice that presumably will also include an angry, loud comment that we can access from anywhere because of shared APIs.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Matty Stratton, Director of Developer Relations at Aiven, joins Corey on Screaming in the Cloud for a friendly debate on whether or not company employees can still be considered community members. Corey says no, but opens up his position to the slings and arrows of Matty in an entertaining change of pace. Matty explains why he feels company employees can still be considered community members, and also explores how that should be done in a way that is transparent and helpful to everyone in the community. Matty and Corey also explore the benefits and drawbacks of talented community members becoming employees.

About Matty
Matty Stratton is the Director of Developer Relations at Aiven, a well-known member of the DevOps community, founder and co-host of the popular Arrested DevOps podcast, and a global organizer of the DevOpsDays set of conferences.

Matty has over 20 years of experience in IT operations and is a sought-after speaker internationally, presenting at Agile, DevOps, and cloud engineering focused events worldwide. Demonstrating his keen insight into the changing landscape of technology, he recently changed his license plate from DEVOPS to KUBECTL.

He lives in Chicago and has three awesome kids, whom he loves just a little bit more than he loves Diet Coke.

Links Referenced:
Aiven: https://aiven.io/
Twitter: https://twitter.com/mattstratton
Mastodon: hackyderm.io/@mattstratton
LinkedIn: https://www.linkedin.com/in/mattstratton/

Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize.
This is Screaming in the Cloud.

Corey: This episode is brought to us in part by our friends at Min.io.

With more than 1.1 billion Docker pulls—most of which were not due to an unfortunate loop mistake, like the kind I like to make—and more than 37 thousand GitHub stars (which are admittedly harder to get wrong), MinIO has become the industry standard alternative to S3. It runs everywhere: public clouds, private clouds, Kubernetes distributions, bare metal, Raspberry Pis, colocations—even in AWS Local Zones. The reason people like it comes down to its simplicity, scalability, enterprise features, and best-in-class throughput. Software-defined and capable of running on almost any hardware you can imagine—and some you probably can't—MinIO can handle everything you can throw at it, and AWS has imagined a lot of things, from data lakes to databases.

Don't take their word for it, though: check it out at www.min.io and see for yourself. That's www.min.io.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I am joined today by returning guest, my friend and yours, Matty Stratton, Director of Developer Relations at Aiven. Matty, it's been a hot second. How are you?

Matty: It has been a while, but it's been pretty good. We have to come back to something that just occurred to me when we think about the different things we've talked about. There was a point of contention about prior art of the Corey Quinn face in photos. I don't know if you saw that discourse; we may have to have a conversation. There may be some absent—

Corey: I did not see—

Matty: Okay.

Corey: —discourse, but I also would accept freely that I am not the first person to ever come up with the idea of opening my mouth and looking ridiculous for a photograph either.

Matty: That's fair, but the thing that I think was funny—and if you don't mind, I'll just go ahead and throw this out here—is that I didn't put two and two together.
So, I posted a picture on Twitter a week or so ago that was primarily to show off the fact—it was a picture of me in 1993, and the point was that my jeans were French-rolled and were pegged. But in the photo, I am doing kind of the Corey Quinn face, and so people said, “Oh, is this prior art?” And I said, you know what? I actually just remembered, and I've never thought about this before, but one of my friends in high school, for his senior year ID, took a picture—his picture looks like, you know, that kind of three-quarters turn with the mouth opening, going, “Ah,” you know?

And he loved that picture—number one, he loved that picture so much that this guy carried his senior year high school ID in his wallet until we were like 25, because it was his favorite picture of himself. But in every photo—and I saw this from looking through my yearbook—of my friend Jay when we were seniors, he's doing the Corey Quinn face. And he is anecdotally part of the DevOps community now a little bit, too, and I haven't pointed this out to him. But people were saying that, you know, mine was prior art on yours, and I said, “Actually, I was emulating yet someone else.”

Corey: I will tell you the actual story of how it started. It was at re:Invent, I want to say 2018 or so, and what happened was someone—they were a big fan of the newsletter—at sort of the start of re:Invent, they said, “Hey, can I get a selfie with you?” And I figured, sure, why not. And the problem I had is I've always looked bad in photographs. And okay, great: if I'm going to have a photo taken of me that's going to be ridiculous, why not, as a lark, go ahead and do this for fun during the course of re:Invent this year?

So, whenever someone asked for a selfie, I'd slap the big happy open-mouth smile on my face. And people thought, “Oh, my God, this is amazing.” And I don't know that it was necessarily worth that level of enthusiasm, but okay. I'll take it.
I'm not here to tell people they're wrong when they enjoy a joke that I'm putting out there.

And it just sort of stuck. And I think the peak of it, which I don't think I'm ever going to be able to beat, is that I actually managed to pull that expression on my driver's license.

Matty: Wow.

Corey: Yeah.

Matty: That's—

Corey: They don't have a sense of humor that they are aware of at the DMV.

Matty: No, they really don't. And having been to the San Francisco DMV and knowing how long it takes to get in there, like, that was a bit of a risk on your part, because if they decided to change their mind, you wouldn't be able to come back for another four months [laugh].

Corey: It amused me to do it, so why not? What else was I going to do? I brought my iPad with me, it has cellular on it, so I can just work remotely from there. It was either that or working in my home office again, and frankly, at the height of the pandemic, I could use the break.

Matty: Yes [laugh]. That's saying something, when the break you can use is going to the DMV.

Corey: Right.

Matty: That's a little bit of where we were at. I think, just real quick, thinking about that, there's a lot to be said for that kind of idea of making a face—whether it's silly or not—and having something common. Especially if you do a lot of photos, you don't have to think about, like, how do I look? You just know what you do. Because if you think about it, it's about cultivating your smile, cultivating your look for your photos, and just sort of having a way so you know what to do every time. I guess that's a, you know, maybe a model tip or something. I don't know. But you might be onto something.

Corey: I joke that my entire family motto is “never be the most uncomfortable person in the room.” And there's something to be said for it, where if you're going to present a certain way, make it your own. Find a way to at least stand out.
If nothing else, it's a bit different. Most people don't do that.

Remember, we all got made fun of—generally women, for some reason—back about 15 years ago or so for duck face, where in all the pictures you're making duck face. And well, there are reasons why that is a flattering way to present your face. But if there's one thing we love as a society, it's telling women they're doing something wrong.

Matty: Yeah.

Corey: So yeah, there's a whole bunch of ways you're supposed to take selfies or whatnot. Honestly, I'm in no way, shape, or form pretty enough or young enough to care about any of them. At this point, it's what I do when someone busts out a camera, and that's the end of it. Now, am I the only person to do this? Absolutely not. Do I take ownership of it? No. Someone else wants to do it, they need give no credit. The idea probably didn't come from me.

Matty: And to be fair, if I'm a little bit taking the mickey there or whatever about prior art, it was more that I thought it was funny, because I had not even—it was this thing where it was like, this is a good friend of mine, probably somebody I've been friends with longer than anyone in my whole life, and it was a core part [laugh] of his personality when we were 18 and 19, and I just never directly, like, made that connection. And then it happened to me and I went, “Oh, my God. Jason and Corey did the same thing.” [laugh]. It was—

Corey: No, it feels like parallel evolution.

Matty: Yeah, yeah. It was more me never having connected those dots. And again, you making that face for your DMV photo amused you; me talking about this for the last three minutes on a podcast amused me. So.

Corey: And let's also be realistic here. How many ways are there to hold your face during a selfie that are distinguishable and worthy of comment? Usually, it's like, okay, well, he has this weird sardonic half-smile with an eyebrow ar—no. His mouth was wide open.
We're gonna go with that.

Matty: You know, there's a little—I want to kind of—because I think there's actually quite a bit to the lesson from any of this. I think about—follow me here; maybe I'll get to the right place—like me and karaoke. No one would ever accuse me of being a talented singer, right? I'm not going to sing well in a way where people are going to be moved by my talent. So instead, I have to go a different direction. I have to go funny.

But what it boils down to is I only do karaoke well when it's a song where I can feel like I'm doing an impression of the singer. So, for example, the B-52s: I do a very good impression of Fred Schneider, so I can sing a B-52s song all day long. I actually do better with Pearl Jam than I should be able to with my terrible voice, because I'm doing an Eddie Vedder impression.

So, what I'm getting at is you're sort of taking this thing where you're saying, okay, to your point—and your words, not mine—[where 00:07:09] somebody would say, “The picture is not going to be of me looking like a Blue Steel runway model, so I might as well look goofy.” You know? Take it that way and be funny with it. And also, every time, it's the same way. So I think it's a matter of kind of owning the conversation, you know, and saying, how do you accentuate the thing that you can do? I don't know. There's something about DevOps somehow in there.

Corey: So, I am in that uncomfortable place right now between having finalized a blog post slash podcast that's going out two days from this recording—so it will go out before you and I have this discussion publicly, but it's also too late for me to change any of it—so I figured I will open myself up to the slings and arrows of you, more or less. And you haven't read this thing yet, which is even better, so you're now going to be angry about an imperfect representation of what I said in writing.
But the short version is this: if you work for a company as their employee, then you are no longer a part of that company's community, as it were. And yes, that's nuanced, and it's an overbroad statement, and there are a bunch of ways that you could poke holes in it, but I'm curious to get your take on the overall positioning of it.

Matty: So, at face value, I would vehemently disagree with that statement. And by that I mean that I have spent years of my life tilting at the opposite windmill, which is: just because you work at this company doesn't mean you do not participate in the community and should not consider yourself a part of the community, first and foremost. But, again, like everything else, it depends. It depends on a lot of things, and I hope we can kind of explore that a little bit, because just as much as I would take umbrage, if you will, with the statement that if you work at the company, you stop being part of the community, I would also have an issue with “you're just automatically part of the community,” right? Because these things take effort.

And I feel like I've been, as a devreloper, or whatever—Corey, how do you say it?

Corey: Yep. No, you're right on. Devreloper.

Matty: As a—or I would say, as a DevRel. Although people on Twitter are angry about using the word DevRel to describe a person—“I'm a DevRel.” “DevRel is a department.” It's the “DevOps engineer” thing again, except it's, like, actually wrong. But anyway, you kind of run into this—for example, and I'm going to not name names here—but, like, say, Twitter for Pets. The—what do you—by the way, Corey, what are you going to do now for your made-up company, now that Twitter is not fun for this anymore? You can't have Twitter for Pets anymore.

Corey: I know I'm going to have to come up with a new joke. I don't quite know what to do with myself.

Matty: This is really hard.
While we will pretend Twitter for Pets is still around a little bit, even though its API is getting shut down.

Corey: Exactly.

Matty: So okay, so we're over here at Twitter for Pets, Inc. And we've got our—

Corey: Twitter for Bees, because you know it'll at least have an APIary.

Matty: Yeah. Ha. We have our team of devrelopers and community managers and community engineers that work at Twitter for Pets, and we have all of our software engineers and different people. And now we're going to have a Twitter for Pets community something, right? We have our community, we have our area, our place that we interact, whether it's in person or virtual, whether it's an event, whether it's our Discord or Discourse or Slack or whatever [doodlee 00:10:33] thing we're doing these days. And a lot of times, all those engineers and people whose title does not have the word “community” in it are like, “Oh, good. Well, we have people that do that.”

So, number one, no. Now we have people whose priority it is; like, we have more intentionality. So, if I work on the community team, if I'm a dev advocate or something like that, my priority is communicating and advocating to and for that community. But it's a little bit of the, you know, Office Space thing: I take the requirements from the [unintelligible 00:11:07] to the people, I give them to the engineers. I've got people—so, like, you shouldn't have to have a go-between, right?
And there's actually quite a bit to this.

So, I think this sort of assumption that you're not part of it and you have no responsibility towards that community—first of all, you're missing a lot as a person, because that's just how you end up with people building a thing they don't understand.

Corey: Oh, I think you have tremendous responsibility to the community, but whether you're a part of it and having responsibility to it are not aligned in my mind.

Matty: So… maybe let's take a second. What do you mean by being a part of it?

Corey: Right. Where very often I'll see a certain, I don't know, very large cloud provider will have an open-source project. Great, so you go and look at the open-source project, and the only people with commit access are people who work at that company. That is an easy-to-make-fun-of example of this. Another is when the people who are in a community, talking about how they perceive things and putting out content about how they've interacted with various aspects of it, start to work there; you see areas where it starts to call its authenticity into question.

AWS is another great example of this. As someone in the community, I can talk about how I would build something on top of AWS, but then move this thing onto Fastly instead of CloudFront because CloudFront is terrible. If you work there, you're not going to be able to say the same thing. So, even if you're not being effusive with praise, there are certain guardrails and constraints that keep you from saying what you might otherwise, just based upon the sheer self-interest that comes from the fact that the company whose product or service you're talking about is also signing your paycheck and choosing to continue to do so.

Matty: And I think it's even less about that being where your paycheck is coming from. There's also just a gravitational pull towards those solutions, because that's just what you're spending your day with, right? You know—

Corey: Yeah.
And you also don't want to start and admit even to yourself, in some cases, that okay, this aspect of what our company does is terrible, so companies—people shouldn't use it. You want to sort of ignore that, on some level, psychologically because that dissonance becomes harmful.Matty: Yeah. And I think there's—so again, this is where things get nuanced and get to levels. Because if you have the right amount of psychological safety in your organization, the organization understands what it's about to that. Because even people whose job is to be a community person should be able to say, “Hey, this is my actual opinion on this. And it might be contrary to the go-to-market where that comes in.”But it's hard, especially when it gets filtered through multiple layers and now you've got a CEO who doesn't understand that nuance who goes, “Wait, why was Corey on some podcast saying that the Twitter for Pets API is not everything it could possibly be?” So, I do think—I will say this—I do think that organizations and leadership are understanding this more than they might have in the past, so we are maybe putting on ourselves this belief that we can't be as fully honest, but even if it's not about hiding the warts, even if it's just a matter of also, you're just like, hey, chances are—plus also to be quite frank, if I work at the company, I probably have access to way more shit than I would have to pay for or do whatever and I know the right way. But here's the trick, and I won't even say it's a dogfooding thing, but if you are not learning and thinking about things the way that your users do—and I will even say that that's where—it is the users, which are the community, that community or the people that use your product or are connected to it, they don't use it; they may be anecdotal—or not anecdotally, maybe tangentially connected. I will give an example. 
And there was a place I was working where it was very clear, like, we had a way to, you know, do open-source contributions back of a type of a provider plug-in, whatever you want to call it, and I worked at the company and I could barely figure out how to follow the instructions.Because it made a lot of sense to someone who built that software all day long and knew the build patterns, knew all that stuff. So, if you were an engineer at this company, “Well, yeah, of course. You just do this.” And anybody who puts the—connects the dots, this has gotten better—and this was understood relatively quickly as, “Oh, this is the problem. Let's fix it.” So, the thing is, the reason why I bring this up is because it's not something anybody does intentionally because you don't know what you don't know. And—Corey: Oh, I'm not accusing anyone of being a nefarious actor in any of this. I also wonder if part of this comes from your background as being heavily involved in the Chef community as a Chef employee and as part of the community around that, which is inherently focused on an open-source product that a company has been built around, whereas my primary interaction with community these days is the AWS community, where it doesn't matter whether you're large or small, you are not getting much, if anything, for free from AWS; you're all their customers and you don't really have input into how something gets built, beyond begging nicely.Matty: That's definitely true. And I think we saw that and there were things, when we look at, like, how community, kind of, evolved or just sort of happened at Chef and why we can't recreate it the same way is there was a certain inflection point of the industry and the burgeoning DevOps movement, and there wasn't—you know, so a lot of that was there. 
But one of the big problems, too, is, as Corey said, everybody—I shouldn't say every, but I've from the A—all the way up to AWS to your smaller startups will have this problem of where you end up hiring in—whether you want to or not—all of your champions and advocates and your really strong community members, and then that ends up happening. So, number one, that's going to happen. So frankly, if you don't push towards this idea, you're actually going to have people not want to come work because you should be able to be still the member that you were before.And the other thing is that at certain size, like, at the size of a hyperscaler, or, you know, a Microsoft—well, anybody—well, Microsoft's not a hyperscaler, but you know what I'm saying. Like, very, very large organization, your community folks are not necessarily the ones doing that hiring away. And as much as they might—you know, and again, I may be running the community champion program at Microsoft and see that you want—you know, but that Joe Schmo is getting hired over into engineering. Like, I'm not going to hire Joe because it hurts me, but I can't say you can't, you know? It's so this is a problem at the large size.And at the smaller size, when you're growing that community, it happens, too, because it's really exciting. When there's a place that you're part of that community, especially when there's a strong feel, like going to work for the mothership, so to speak is, like, awesome. So again, to give an example, I was a member of the Chef community, I was a user, a community person well before, you know, I went and, you know, had a paycheck coming out of that Seattle office. And it was, like, the coolest thing in the world to get a job offer from Ch—like, I was like, “Oh, my God. 
I get to actually go work there now.” Right?And when I was at Pulumi, there were quite a few people I could think of who I knew through the community who then got jobs at Pulumi and were so excited, and I imagine still excited, you know? I mean, that was awesome to do. So, it's hard because when you get really excited about a technology, then being able to say, “Wait, I can work on this all the time?” That sounds awesome, right? So like, you're going to have that happen.So, I think what you have to do is rather than prevent it from happening because number one, like, you don't want to actually prevent that from happening because those people will actually be really great additions to your organization in lots of ways. Also, you're not going to stop it from happening, right? I mean, it's also just a silly way to do it. All you're going to do is piss people off, and say, like, “Hey, you're not allowed to work here because we need you in the community.” Then they're going to be like, “Great. Well, guess what I'm not a part of anymore now, jerk?” Right? You know [laugh] I mean so—Corey: Exactly.Matty: Your [unintelligible 00:18:50] stops me. So, that doesn't work. But I think to your point, you talked about, like, okay, if you have what's ostensibly a community project, but all the maintainers are from one—are from your company, you know? Or so I'm going to point to an example of, we had—you know, this was at Pulumi, we had a Champions program called Puluminaries, and then there's something similar to like Vox Populi, but it was kind of the community that was not run by Pulumi Inc. in that case.Now, we helped fund it and helped get it started, but there were rules about the, you know, the membership of the leadership, steering committee or board or whatever it was called, there was a hard limit on the number of people that could be Pulumi employees who were on that board. 
And it actually, as I recall when I was leaving—I imagine this is not—[unintelligible 00:19:41] does sometimes have to adjust a couple of things because maybe those board members become employees and now you have to say, you can't do that anymore or we have to take someone down. But the goal was to actually, you know, basically have—you know, Pulumi Corp wanted to have a voice on that board because if for no other reason, they were funding it, but it was just one voice. It wasn't even a majority voice. And that's a hard sell in a lot of places too because you lose control over that.There's things I know with, uh—when I think about, like, running meetup communities, like, we might be—well I mean, this is not a big secret, I mean because it's been announced, but we're—you know, Aiven is helping bootstrap a bunch of data infrastructure meetups around the world. But they're not Aiven meetups. Now, we're starting them because they have to start, but pretty much our approach is, as soon as this is running and there's people, whether they work here, work with us or not, they can take it, right? Like, if that's go—you know? And being able to do that can be really hard because you have to relinquish the control of your community.And I think you don't have to relinquish a hundred percent of that control because you're helping facilitate it because if it doesn't already have its own thing—to make sure that things like code of conduct and funding of it, and there's things that come along with the okay, we as an organization, as a company that has dollars and euros is going to do stuff for this, but it's not ours. And that's the thing to remember is that your community does not belong to you, the company. You are there to facilitate it, you are there to empower it, you're there to force-multiply it, to help protect it. 
And yeah, you will probably slurp a whole bunch of value out of it, so this is not magnanimous, but if you want it to actually be a place it's going to work, it kind of has to be what it wants to be. But by the same token, you can't just sort of sit there and be like, “I'm going to wait for this community to grow up around me without anything”—you know.So, that's why you do have to start one if there is quote-unquote—maybe if there's no shape to one. But yeah, I think that's… it is different when it's something that feels a little—I don't even want to say that it's about being open-source. It's a little bit less about it being a SaaS or a service, or if it's something that you—I don't know.Corey: This episode is sponsored in part by Honeycomb. I'm not going to dance around the problem. Your. Engineers. Are. Burned. Out. They're tired from pagers waking them up at 2 am for something that could have waited until after their morning coffee. Ring Ring, Who's There? It's Nagios, the original call of duty! They're fed up with relying on two or three different “monitoring tools” that still require them to manually trudge through logs to decipher what might be wrong. Simply put, there's a better way. Observability tools like Honeycomb (and very little else because they do admittedly set the bar) show you the patterns and outliers of how users experience your code in complex and unpredictable environments so you can spend less time firefighting and more time innovating. It's great for your business, great for your engineers, and, most importantly, great for your customers. Try FREE today at honeycomb.io/screaminginthecloud. That's honeycomb.io/screaminginthecloud.Corey: Yeah, I think you're onto something here. I think another aspect where I found it to be annoying is when companies view their community as, let's hire them all. And I don't think it ever starts that way. 
I think that it starts as, well these are people who are super-passionate about this, and they have great ideas and they were great to work with. Could we hire them?And the answer is, “Oh, wait. You can give me money for this thing I've been doing basically for free? Yeah, sure, why not?” And that's great in the individual cases. The problem is, at some point, you start to see scenarios where it feels like, if not everyone, then a significant vocal majority of the community starts to work there.Matty: I think less often than you might think is it done strategically or on purpose. There have been exceptions to that. There's one really clear one where it feels like a certain company a few years ago, hired up all the usual suspects of the DevOps community. All of a sudden, you're like, oh, a dozen people all went to go work at this place all at once. And the fun thing is, I remember feeling a little bit—got my nose a little out of joint because I was not the hiring mana—like, I knew the people.I was like, “Well, why didn't you ask me?” And they said, “Actually, you are more important to us not working here.” Now, that might have just been a way to salve my dude-in-tech ego or not, but whether or not that was actually true for me or not, that is a thing where you say you know, your folks—but I do think that particular example of, like, okay, I'm this, that company, and I'm going to go hire up all the usual suspects, I think that's less common. I think a lot of times when you see communities hire up those people, it's not done on purpose and in fact, it's probably not something they actually wanted to do en masse that way. 
But it happens because people who are passionate about your product, it's like I said before, it actually seems pretty cool to go work on it as your main thing.But I can think of places I've been where we had, you know—again, same thing, we had a Pulumi—we had someone who was probably our strongest, loudest, most vocal community member, and you know, I really wanted to get this person to come join us and that was sort of one of the conversations. Nobody ever said, “We won't offer this person a job if they're great.” Like, that's the thing. I think that actually kind of would be shitty to be like, “You're a very qualified individual, but you're more important to me out in the community so I'm not going to make you a job offer.” But it was like, Ooh, that's the, you know—it'd be super cool to have this person but also, not that that should be part of our calculus of decision, but then you just say, what do you do to mitigate that?Because what I'm concerned about is people hearing this the wrong way and saying, “There's this very qualified individual who wants to come work on my team at my company, but they're also really important to our community and it will hurt our community if they come work here, so sorry, person, we're not going to give you an opportunity to have an awesome job.” Like, that's also thinking about the people involved, too. But I know having talked to folks at lots of these different large organizations that have this problem, generally, those community folks, especially at those places, they don't want this [laugh] happening. They get frustrated by it. So, I mean, I'll tell you, it's you know, the—AWS is one of them, right?They're very excited about a lot of the programs and cool people coming from community builders and stuff and Heroes, you know. 
On one hand, it's incredibly awesome to have a Hero come work at AWS, but it hurts, right, because now they're not external anymore.Corey: And you stop being a Hero in that case, as well.Matty: Yeah. You do, yeah.Corey: Of course, they also lose the status if they go to one of their major competitors. So like, let me get this straight. You can't be a Hero if you work for AWS or one of its competitors. And okay, how are there any Heroes left at all at some point? And the answer is, they bound it via size and a relatively small list of companies. But okay.Matty: So, thinking back to your point about saying, okay, so if you work at the company, you lose some authenticity, some impartiality, some, you know… I think, rather than just saying, “Well, you're not part”—because that also, honestly, my concern is that your blog post is now going to be ammunition for all the people who don't want to act as members of the community for the company they work for now. They're going to say, well, Corey told me I don't have to. So, like I said, I've been spending the last few years tilting at the opposite windmill, which is getting people that are not on the community team to take part in community summits and discourse and things like that, like, you know, for that's—so I think the thing is, rather than saying, “Well, you can't,” or, “You aren't,” it's like, “Well, what do you do to mitigate those things?”Corey: Yeah, it's a weird thing because taking AWS as the example that I've been beating up on a lot, the vast majority of their employees don't know the community exists in any meaningful sense. Which, no fault to them. The company has so many different things, no one keeps up with at all. But it's kind of nuts to realize that there are huge communities of people out there using a thing you have built and you do not know that those users exist and talk to each other in a particular watering hole. And you of course, as a result, have no presence there. 
I think that's the wrong direction, too. But—Matty: Mm-hm.Corey: Observing the community and being part of the community, I think there's a difference. Are you a biologist or are you a gorilla?Matty: Okay, but [sigh] I guess that's sort of the difference, too, which—and it's hard, it's very hard to not just observe. Because I think that actually even taking the mentality of, “I am here to be Jane Goodall, Dr. Jane Goodall, and observe you while I live amongst you, but I'm not going to actually”—although maybe I'm probably doing her a disservice—I'm remembering my Goodall is… she was actually more involved. May be a bad example.Corey: Yeah. So, that analogy does fall apart a little bit.Matty: It does fall apart a little bit—Corey: Yeah.Matty: But it's, you know, kind of: am I sitting there taking field notes or am I actually engaging with you? Because there is a difference. Even if your main reason for being there is just purely to—I mean, this is not the Prime Directive. It's not Star Trek, right? You're not going to like, hold—you don't need to hold—I mean, do you have to hold yourself aloof and say, “I don't participate in this conversation; I'm just here to take notes?”I think that's very non-genuine at that point. That's over-rotating the other way. But I think it's a matter of in those spaces—I think there's two things. I think you have to have a way to be identified as you are an employee because that's just disclosure.Corey: Oh, I'm not suggesting by any stretch of the imagination, people work somewhere but not admit that they work somewhere when talking about the company. That's called fraud.Matty: Right. No, no, and I don't think it's even—but I'm saying beyond just, if it's not, if you're a cop, you have to tell me, right?Corey: [laugh].Matty: It's like, it's not—if asked, I will tell you I work at AWS. It's like in that place, it should say, “I am an AWS em—” like, I should be badged that way, just so it's clear. I think that's actually helpful in two ways. 
It's also helpful because it says like, okay, maybe you have a connection you can get for me somehow. Like, you might actually have some different insight or a way to chase something that, you know, it's not necessarily just about disclosure; it's also helpful to know.But I think within those spaces, that disclosure—or not disclosure, but being an employee does not offer you any more authority. And part of that is just having to be very clear about how you're constructing that community, right? And that's sort of the way that I think about it is, like, when we did the Pulumi Community Summit about a year ago, right? It was an online, you know, thing we did, and the timing was such that we didn't have a whole lot of Pulumi engineers who were able to join, but when we—and it's hard to say we're going to sit in an open space together and everybody is the same here because people also—here's the difference. You say you want this authority? People will want that authority from the people that work at the company and they will always go to them and say, like, “Well, you should have this answer. Can you tell me about this? Can you do this?”So, it's actually hard in both cases to have that two-way conversation unless you set the rules of that space such as, “Okay, I work at Aiven, but when I'm in this space, short of code of conduct or whatever, if I have to be doing that thing, I have no more authority on this than anyone else.” I'm in this space the same way everyone else is. You can't let that be assumed.Corey: Oh, and big companies do. It's always someone else's… there's someone else's department. Like, at some level, it feels like when you work in one of those enormous orgs, your remit is six inches wide.Matty: Well, right. Right. So, I think it's like your authority exists only so far as it's helpful to somebody. If I'm in a space as an Aivener, I'm there just as Matty the person. 
But I will say I work at Aiven, so if you're like, “God, I wish that I knew who was the person to ask about this replication issue,” and then I can be like, “Aha, I actually have backchannel. Let me help you with that.” But if I can say, “You know what? This is what I think about Kafka and I think why this is whatever,” like, you can—my opinion carries just as much weight as anybody else's, so to speak. Or—Corey: Yeah. You know, it's also weird. Again, community is such a broad and diverse term, I find myself in scenarios where I will observe and talk to people inside AWS about things, but I never want to come across as gloating somehow, that oh, I know, internal people that talk to you about this and you don't. Like, that's never how I want to come across. And I also, I never see the full picture; it's impossible for me to, so I never make commitments on behalf of other people. That's a good way to get in trouble.Matty: It is. And I think in the case of, like, someone like you who's, you know, got the connections you have or whatever, it's less likely for that to be something that you would advertise for a couple of reasons. Like, nobody should be advertising to gloat, but also, part of my remit as a member of a community team is to actually help people. Like, you're doing it because you want to or because it serves you in a different way. Like, that is literally my job.So like, it shouldn't be, like—like, because same thing, if you offer up your connections, now you are taking on some work to do that. Someone who works at the company, like, yes, you should be taking on that work because this is what we do. We're already getting paid for it, you know, so to speak, so I think that's the—Corey: Yeah.Matty: —maybe a nuance, but—Corey: Every once in a while, I'll check my Twitter spam graveyard, [unintelligible 00:32:01] people asking me technical questions months ago about various things regarding AWS and whatnot. 
And that's all well and good; the problem I have with it is that I'm not a support vector. I don't represent the company or work for them. Now, if I worked there, I'd feel obligated to make sure this gets handed to the right person. And that's important.The other part of it, though, is okay, now that that's been done and handed off, like do I shepherd it through the process? Eh. I don't want people to get used to asking people in DMs because again, I consider myself to be a nice guy, but if I'm some nefarious jerk, then I could lead them down a very dark path where I suddenly have access to their accounts. And oh, yeah, go ahead and sign up for this thing and I'll take over their computer or convince them to pay me in iTunes gift cards or something like that. No, no, no. Have those conversations in public or through official channels, just because I don't, I don't think you want to wind up in that scenario.Matty: So, my concern as well, with sort of taking the tack of you are just an observer of the community, not a part of it is, that actually can reinforce some pretty bad behavior from an organization towards how they treat the community. One of the things that bothers me—if we're going to go on a different rant about devrelopers like myself—is I like to say that, you know, we pride ourselves as DevRels as being very empathetic and all this stuff, but very happy to shit all over people that work in sales or marketing, based on their job title, right? And I'm like, “Wow, that's great,” right? We're painting with this broad brush. Whereas in reality, we're not separate from them.And so, the thing is, when you treat your community as something separate from you, you are treating it as something separate from you. And then it becomes a lot easier also, to not treat them like people and treat them as just a bunch of numbers and treat them as something to have value extracted from rather than it—this is actually a bunch of humans, right? 
And if I'm part of that, then I'm in the same Dunbar number a little bit, right? I'm in the same monkey sphere as those people because me, I'm—whoever; I'm the CTO or whatever, but I'm part of this community, just like Joe Smith over there in Paducah, you know, who's just building things for the first time. We're all humans together, and it helps to not treat it as the sort of amorphous blob of value to be extracted.So, I think that's… I think all of the examples you've been giving and those are all valid concerns and things to watch out for, the broad brush if you're not part of the community if you work there, my concern is that that leads towards exacerbating already existing bad behavior. You don't have to convince most of the people that the community is separate from them. That's what I'm sort of getting at. I feel like in this work, we've been spending so much time to try to get people to realize they should be acting like part of their larger community—and also, Corey, I know you well enough to know that, you know, sensationalism to make a point [laugh] works to get somebody to join—Corey: I have my moments.Matty: Yeah, yeah, yeah. I mean, there's I think… I'll put it this way. I'm very interested to see the reaction, the response that comes out in, well now, for us a couple of days, for you the listener, a while ago [laugh] when that hits because I think it is a, I don't want to say it's controversial, but I think it's something that has a lot of, um… put it this way, anything that's simple and black and white is not good for discussion.Corey: It's nuanced. And I know that whenever I wrote in 1200 words is not going to be as nuanced of the conversation we just had, either, so I'm sure people will have opinions on it. That'd be fun. It'd be a good excuse for me to listen.Matty: Exactly [laugh]. 
And then we'll have to remember to go back and find—I'll have to do a little Twitter search for the dates.Corey: We'll have to do another discussion on this, if anything interesting comes out of it.Matty: Actually, that would be funny. That would be—we could do a little recap.Corey: It would. I want to thank you so much for being so generous with your time. Where can people find you if they want to learn more?Matty: Well, [sigh] for the moment, [sigh] who knows what will be the case when this comes out, but you can still find me on Twitter at @mattstratton. I'm also at hackie-derm dot io—sorry, hackyderm.io. I keep wanting to say hackie-derm, but hackyderm actually works better anyway and it's funnier. But [hackyderm.io/@mattstratton](https://hackyderm.io/@mattstratton) is my Mastodon. LinkedIn; I'm around there. I need to play more at that. You will—also again, I don't know when this is coming out, so I won't tell you—you won't find me out traveling as much as you might have before, but DevOpsDays Chicago is coming up August 9th and 10th in Chicago, so at the time of listening to this, I'm sure our program will have been posted. But please come and join us. It will be our ninth time of hosting a DevOpsDay Chicago. And I have decided I'm sticking around for ten, so next year will be my last DevOpsDay that I'm running. So, this is the penultimate. And we always know that the penultimate is the best.Corey: Absolutely. Thanks again for your time. It's appreciated. Matty Stratton, Director of Developer Relations at Aiven. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. 
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment talking about how I completely missed the whole point of this community and failing to disclose that you are in fact one of the producers of the show.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Looks like it's time to update your iPhones, Macs, and iPads again. Apple has released security fixes for all affected devices and they've acknowledged that it may have been actively exploited. Jon, is this a huge concern and should we be updating our devices immediately? Why is this happening more and more? This and more on the Rundown. Time Stamps: 0:00 | Welcome to the Rundown 0:36 | AWS Modular Data Center for U.S. Department of Defense 3:19 | EDB Adds Encryption for PostgreSQL 7:21 | Breaking Up with Internet Explorer on Valentine's Day 10:30 | SuSE Launches ATIP for Modernizing Comms Networks 15:04 | MinIO and Veeam Team Up for Backup 18:53 | Apple Says Zero Day Vulnerability for iPhones and Macs May Have Been Exploited 28:00 | The Weeks Ahead 30:07 | Thanks for Watching Follow our hosts on Social Media Tom Hollingsworth: https://www.twitter.com/NetworkingNerd Stephen Foskett: https://www.twitter.com/SFoskett Jon Myer: https://www.twitter.com/_JonMyer Follow Gestalt IT Website: https://www.GestaltIT.com/ Twitter: https://www.twitter.com/GestaltIT LinkedIn: https://www.linkedin.com/company/1789 Tags: @Apple @AWSCloud @Oracle @Microsoft @MinIO @Veeam #Storage #SuSE #CyberSecurity #AWS #InternetExplorer #EFD1 #TFD27
Write Admin tools from Day One, Differentiating between Data Security and Data Integrity, 45 year-old Unix tool is finally getting an upgrade, OpenBSD 7.2 on an ODROID-HC4, Dotfiles Management, and more NOTES This episode of BSDNow is brought to you by Tarsnap (https://www.tarsnap.com/bsdnow) and the BSDNow Patreon (https://www.patreon.com/bsdnow) Headlines Write Admin tools from Day One (https://milwaukeemaven.blogspot.com/2022/08/write-admin-tools-from-day-one.html) Differentiating between Data Security and Data Integrity (https://klarasystems.com/articles/openzfs-data-security-vs-integrity/) News Roundup This 45 year-old Unix tool is finally getting an upgrade (https://www.techradar.com/news/45-year-old-unix-tool-finally-gets-an-upgrade) Installing OpenBSD 7.2 on an ODROID-HC4 (https://www.tumfatig.net/2022/install-openbsd-odroid-hc4/) Dotfiles Management (https://mitxela.com/projects/dotfiles_management) Beastie Bits FreeBSD Journal - November/December 2022 - Observability and Metrics (https://freebsdfoundation.org/past-issues/observability-and-metrics/) HAMMER2 file system for NetBSD (https://github.com/kusumi/netbsd_hammer2) Running OpenBSD 7.2 on your laptop is really hard (not) (https://sohcahtoa.org.uk/openbsd.html) MinIO on OpenBSD 7.2: Install (https://dev.to/nabbisen/minio-on-openbsd-72-install-3b3h) WireGuard VPN on OpenBSD (https://www.adrianobarbosa.xyz/blog/openbsd-wireguard.html) A tool for glamorous shell scripts (https://github.com/charmbracelet/gum) Visualize your git commits with a heat map in the terminal (https://github.com/james-stoup/heatwave) Tarsnap This week's episode of BSDNow was sponsored by our friends at Tarsnap, the only secure online backup you can trust your data to. Even paranoids need backups. Send questions, comments, show ideas/topics, or stories you want mentioned on the show to feedback@bsdnow.tv (mailto:feedback@bsdnow.tv)
FreeBSD Q3 2022 status report, Leveraging MinIO and OpenZFS to avoid vendor lock in, FreeBSD on Firecracker platform, How Much Faster Is Making A Tar Archive Without Gzip, Postgres from packages on OpenBSD, Upgrading an NVMe zpool from 222G to 1TB drives, Don't use Reddit for Linux or BSD related questions, and more. NOTES This episode of BSDNow is brought to you by Tarsnap (https://www.tarsnap.com/bsdnow) and the BSDNow Patreon (https://www.patreon.com/bsdnow) Headlines FreeBSD Quarterly Status Report Third Quarter 2022 (https://www.freebsd.org/status/report-2022-07-2022-09/) Avoid Infrastructure Vendor Lock-in by leveraging MinIO and OpenZFS (https://klarasystems.com/articles/avoid-vendor-lock-in-with-minio-and-openzfs/) Announcing the FreeBSD/Firecracker platform (https://www.daemonology.net/blog/2022-10-18-FreeBSD-Firecracker.html) News Roundup How Much Faster Is Making A Tar Archive Without Gzip? (https://lowendbox.com/blog/how-much-faster-is-making-a-tar-archive-without-gzip/) PostgreSQL from packages on OpenBSD (https://www.dbi-services.com/blog/postgresql-from-packages-on-openbsd/) Upgrading an NVMe zpool from 222G to 1TB drives (https://dan.langille.org/2022/10/18/upgrading-an-nvme-zpool-from-222g-to-1tb-drives/) PSA: Don't use Reddit for Linux or BSD related questions (https://unixsheikh.com/articles/dont-use-reddit-for-linux-or-bsd-related-questions.html) Tarsnap This week's episode of BSDNow was sponsored by our friends at Tarsnap, the only secure online backup you can trust your data to. Even paranoids need backups. 
Feedback/Questions
Hinnerk - vnet jails (https://github.com/BSDNow/bsdnow.tv/blob/master/episodes/481/feedback/Hinnerk%20-%20vnet%20jails.md) Tom's response example: https://adventurist.me/posts/00304
Hugo - Apple M2 (https://github.com/BSDNow/bsdnow.tv/blob/master/episodes/481/feedback/Hugo%20-%20Apple%20M2.md)
kevin - emacs backspace (https://github.com/BSDNow/bsdnow.tv/blob/master/episodes/481/feedback/kevin%20-%20emacs%20backspace.md)

Send questions, comments, show ideas/topics, or stories you want mentioned on the show to feedback@bsdnow.tv (mailto:feedback@bsdnow.tv)
About Ashleigh
Ashleigh Early is a passionate advocate for salespeople and, through her consulting, coaching, and The Other Side of Sales, she is devoted to making B2B sales culture more inclusive so anyone can thrive. Over the past ten years Ashleigh has led, built, re-built, and consulted for 2 unicorns, 3 acquisitions, 1 abject failure, and every step in between. She is also the Head of Sales at the Duckbill Group! You can find Ashleigh on Twitter @AshleighatWork and more about the Other Side of Sales at Othersideofsales.com

Links:
Twitter: https://twitter.com/ashleighatwork
LinkedIn: https://www.linkedin.com/in/ashleighearly

Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Couchbase Capella Database-as-a-Service is flexible, full-featured, and fully managed, with built-in access via key-value, SQL, and full-text search. Flexible JSON documents aligned to your applications and workloads. Build faster with blazing fast in-memory performance and automated replication and scaling while reducing cost. Capella has the best price performance of any fully managed document database. Visit couchbase.com/screaminginthecloud to try Capella today for free and be up and running in three minutes with no credit card required.
Couchbase Capella: make your data sing.

Corey: Today's episode is brought to you in part by our friends at MinIO, the high-performance Kubernetes-native object store that's built for the multi-cloud, creating a consistent data storage layer for your public cloud instances, your private cloud instances, and even your edge instances, depending upon what the heck you're defining those as, which probably depends on where you work. Getting all of that unified is one of the greatest challenges facing developers and architects today. It requires S3 compatibility, enterprise-grade security and resiliency, the speed to run any workload, and the footprint to run anywhere, and that's exactly what MinIO offers. With superb read speeds in excess of 360 gigs and a 100-megabyte binary that doesn't eat all the data you've got on the system, it's exactly what you've been looking for. Check it out today at min.io/download, and see for yourself. That's min.io/download, and be sure to tell them that I sent you.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. My guest today does something that I, sort of, dabbled around the fringes of once upon a time, but then realized I wasn't particularly good at it and got the hell out of it and went screaming into clouds instead. Ashleigh Early is the Head of Sales here at The Duckbill Group. Ashleigh, thank you for joining me.

Ashleigh: Thanks for coming on and running, screaming from my chosen profession [laugh]. You're definitely not the only one.

Corey: Well, let's be clear here; there are two ways that can go because sure, I used to dabble around in sales when I was, basically, trying to figure out how to not starve to death. But I also used to run things; it was basically a small team. I was managing people and realized I was bad at that, too. So, really, that's, sort of, an open-ended direction. We can go either side and... But, let's go with sales. That seems like a more interesting way for this to play out.
So, you've been here for—what is it now—it feels like ages, but my awareness of the passing of time in the middle of a global panini is relatively not great.

Ashleigh: Yeah. I think we're at day—what is it—1,053 of March 2020? So, time is irrelevant; it's a construct; I don't know. But, technically, by the Gregorian Calendar, I think I'm at six months.

Corey: It's very odd to me, at least the way that I contextualized doing this. Back when I started what became The Duckbill Group, I was an independent consultant. It was, more or less, working with people I knew through my network who had a very specific, very expensive problem: The AWS bill is too high. And I figured, this is genius. It is the easiest possible sale in the world and one of the only scenarios where I can provably demonstrate ROI to a point where, "Bring me in; you will inherently save money."

And all of that is true, but one of the things I learned very quickly was that, even with the easiest sale of, "Hi. I'd like to sell you this bag of money," there is no such thing as an easy enterprise sale. There is nuance to it. There is a lot of difficulty to it. And I was left with the, I guess, driving question—after my first few months of playing this game—of, "How on earth does anyone make money in this space?"

The reason I persisted was, basically, a bunch of people did favors for me that they didn't owe me at all. It was, "Oh, great. I'll give them the price quote." And they're, like, "Oh, yeah." So, cool, they turned around and quoted that to their boss at triple the rate because, "Don't slit your own throat on this." They were right. And not for nothing, it turns out when you're selling advice, charging more for it makes it likelier to succeed as a project.

But, I had no idea what I was doing. And, like most engineers on Twitter, I look at something I don't understand deeply myself, and figure, "Oh.
Well, it's not engineering, therefore, it's easy." Yeah, it turns out that running a business is humbling across a whole bunch of different axes.

Ashleigh: I wouldn't even say it's running a business; it's working with humans. Working with humans is humbling. If you're working with a machine, or even something as simple as making a product, it's: follow a recipe. Follow the instructions. I do A, then B, then C, then D—unless you don't enjoy using the instructions because you don't enjoy using instructions—but you still follow a set general process, and you build a thing that comes out correctly.

The moment that process is: talk to this person, then Person A, then Person B, then Person C, then Person D, then back to Person A, then Person D, and then finally to Person E, everything goes to heck in a handbasket. That's what really makes it interesting. And for those of us who are of a certain disposition, we find that fascinating and enthralling. If you're of another disposition, that's hell on earth [laugh]. So, it's a very—yeah, it's a very interesting thing.

Corey: Back when I was independent and people tried to sell me things—and yeah, sometimes it worked—it was always interesting going through various intake funnels and the rest. And, like, "Well, what role do you hold in the organization? Do you influence the decision? Do you make the decision? How many people need to be involved in the rest?"

And I was looking around going, "How many people do you think fit in my home office here? Let's be serious." I mean, there are times I escalated to the Chihuahua because she's unpleasant and annoying and, basically, sometimes so are people. But that's a separate topic for later. But it became a very different story as the organizational distance between the people that needed to sign off on a sale increased.

Ashleigh: Mm-hm. Absolutely.
And you might have felt me squirm when you described those questions because one of my biggest pet peeves is when people take sales terminology and use it directly with clients. Just like if you're an engineer describing what you do, you're not going to go home and explain it to your dad in technical jargon; you're going to tell him broad strokes, and if he's interested, go deeper and deeper—technical, more technical.

I hate when salespeople use sales jargon, like, "What's your role in the organization? Are you the decision-maker?" Don't—mmm. There are better ways to deal with that. So, that's just a sign of poor training. It's not the sales rep's fault; it's his company's fault—their company's fault. But that's a different thing.

It's fascinating to me, kind of, watching this—what you said spoke of two things there. One is poor training, and two, a lack of awareness of the situation and a lack of just doing a little bit of pre-work. Like, you do five seconds of research on Corey Quinn, you can realize that the company is ten to 15 people, tops. So, it makes sense to ask a question around, "Hey, do you need anyone else to sign off before we can move forward with this project?"

That tells me if I need to get someone for technical, for budget, for whatever. But asking if you're a decision-maker, or if you're influencing, or if you're doing initial research—like, that's using sales terminology, not actually getting to the root of the problem, and it immediately makes it very clear you didn't do any actual research in advance, which is—in modern selling—not okay.

Corey: My business partner, Mike, has a CEO job title, and he'll get a whole bunch of cold outreach constantly, all day, every day. I conducted a two-week experiment where in front of my Chief Cloud Economist job title, I put 'CTO/' just to see what would happen, and sure enough, I started getting outreach left, right, up, down, and sideways.
Not just for things that a CTO figure might theoretically wind up needing to buy, but also job opportunities for a skill set that I haven't dusted off in a decade.

So, okay. Once you have something that hits the filters when people are searching for very specific titles, you wind up getting a lot more outreach. But if you create a job title that no one sensible would ever pick for themselves, suddenly a lot of that tends to go by the wayside. It shined a light on how frustratingly dreary a lot of the sales prospecting work really can be from—

Ashleigh: Oh, yeah.

Corey: —just from the side of someone who gets it. Now, I'm not exaggerating when I say that I did work in sales once upon a time. Not great at it, but one of the first white-collar-style jobs that I had was telemarketing, of all things. And I was spectacular at it because I was fortunate enough to be working on a co-branded affinity credit card that was great, and I had the opportunity to position it as a benefit of an existing membership or something else people already had. I was consistently top-ten out of 400 people on a shift, and it was great.

But it was also something that was very time-limited, and if you're having an off day, everything winds up crumbling. And, eventually, I drifted off and started doing different things. But I've never forgotten those days. And that's why it just grinds my gears, one, to see crappy sales stuff happening, and two, watching people on Twitter—particularly—taking various sales-prospect outreach for a drag. And it's—

Ashleigh: Oh, God. Yeah.

Corey: —you know, not everyone is swimming in the ocean of privilege that some of the rest of us are. And understand that you're just making yourself look like a jerk when you're talking to someone who is relatively early-career and didn't happen to google you deeply enough before sending you an email that you find insulting.
That bugs me a fair bit.

Ashleigh: And I think part of that is just a lack of humanity and understanding. Like, there's—I mean, I get it; I'm the first person to be jumping on Twitter and [unintelligible 00:08:41] when something goes down, or something's not working—I'm the first one to get angry and start complaining. Don't get me wrong. However, it's really easy to dehumanize something you don't see very often, or that you're not involved in directly. And I find it real interesting you mentioned you worked in telemarketing.

I lasted literally two weeks in telemarketing. I full-on rage-quit. It was a college job. I worked in my college donations center. I lasted two weeks, and I fully walked out on a shift. I was, like, "Screw this; I'm never doing anything like that ever again. I hate this."

But what I hated about it was the lack of connection. I was, like, I'm not just going to read some scripts and get yelled at for having too much banter. Like, I'm getting money; what do you care? I'm getting more money than other people. Maybe I'm not making as many calls, but I'm getting just as much, so why do you care how I do this?

But what really gets me is you have to remember—and I think a lot of people don't understand—how most large, modern sales organizations work. Really quickly giving you a very, very generic explanation: the way a lot of organizations work is they employ something called SDRs, or Sales Development Reps. That title can be permuted in a million different ways—there's ADRs, MDRs, BDRs, whatever—but basically, it's their job to do nothing but scour the internet, sometimes using actual scripts.

Sometimes they use LinkedIn; sometimes they purchase databases. So, for example, you might change your title on LinkedIn, but it's not changing in the database. Just trust me, Corey, they have you flagged as a CTO. Sorry.
What [crosstalk 00:10:16].

Corey: My personal favorite is when I get cold outreach asking me on the phone call about whether we have any needs for whatever it is they happen to be selling at—and then they name a company that I left in 2012. I don't know how often that database has been sold and resold and sold onwards yet again. And it's just, I work in tech. What do you think the odds are that I'm still in the same job I was in ten years ago? And I get that it happens, but at some point, it just becomes almost laughable.

Ashleigh: Yeah. When in doubt—I tell every sales team, kind of, every company team that I work with—do not use those vendors. Ninety percent of them are not very good; they're using old databases; they don't update. You're better off paying for a database that is subscription-based because then, literally, you've got an SLA on data quality, and you can flag and get things fixed. With the number one sales-data provider, I happen to know for a fact, I actually earned, I think, almost $10,000 in donations to a charity—what was this—this was 2015, because I went through and did a scrub of our CRM versus, I think, LinkedIn or something else, and I flagged everything that wasn't accurate and sent it back to them.

And they happened to have a promotion where, for every record you flagged as inaccurate because the person was no longer at the company, they would donate a buck to charity, and I think I sent them, like, 10,000 or something. [unintelligible 00:11:36] I was like, "None of these are accurate." And they're, like, you know? And they sent me this great email, like, "Thank you for telling us; we really appreciate it."

I didn't even know they were doing this promotion. They thought I'd been saving up for it.
And I was, like, "No, I just happened to run this analysis and thought you'd want to know." So, subscriptions—

Corey: You know, it turns out computers are really fast at things.

Ashleigh: Yeah, and I was very proud I figured out how to run a script. I was, like, "Yay. Look at me; I wrote a macro." This was very exciting. For the first—God, the first five or so years of my sales career, I consistently called myself a dumb salesperson because I was working in really super-technical products. I worked for Arista Networks, FireEye, Bromium, you know, PernixData. I was working in some pretty reasonably hard tech, and when I introduced myself, I'd always, kind of, talked down about my technical aptitude because I have a degree in political science and opera. These are not technical fields, and yet here I am every day, talking about, you know, tech [crosstalk 00:12:25].

Corey: Well, if the election doesn't pan out the way you want, why don't you sing about it? Why not? You can tie all these things together.

Ashleigh: You can. And, honestly, there have been several points—I've done whole other shows on how those two seemingly completely disparate things have actually been some of the greatest gifts to my career. And most notable, I think, is the fact that I have my degree in political science as a Bachelor of Science, which means I have a BS in BS, which is incredibly relevant to my career in a lot of different ways.

Corey: This episode is sponsored by our friends at Oracle Cloud. Counting the pennies, but still dreaming of deploying apps instead of "Hello, World" demos? Allow me to introduce you to Oracle's Always Free tier. It provides over 20 free services and infrastructure, networking, databases, observability, management, and security. And—let me be clear here—it's actually free. There's no surprise billing until you intentionally and proactively upgrade your account.
This means you can provision a virtual machine instance or spin up an autonomous database that manages itself, all while gaining the networking, load balancing, and storage resources that somehow never quite make it into most free tiers, needed to support the application that you want to build. With Always Free, you can do things like run small-scale applications or do proof-of-concept testing without spending a dime. You know that I always like to put asterisks next to the word free? This is actually free, no asterisk. Start now. Visit snark.cloud/oci-free; that's snark.cloud/oci-free.

Ashleigh: Yeah, so wrapping up, kind of, how modern sales organizations work: most companies employ people called SDRs, and they're typically people who have less than five years of sales experience. They, rightly or wrongly, tend to be people in their early-20s who have very little training. Most companies get SDRs on phones within a week, which means—

Corey: These are the people that are doing the cold outreach?

Ashleigh: —they've gotten maybe five or six hours of product training. Hmm? Sorry.

Corey: These are the people who are doing the cold outreach?

Ashleigh: These are the people who are doing the cold outreach. So, their whole job is just to get appointments for account execs. Account execs—again, tons of different names, but these are the closers. They'll run you through the sales cycle. They typically have between five and thirty years of experience.

But they're the ones—depending on how big your company is [unintelligible 00:13:35]—the bigger your company, typically, the more experience your sales rep's going to have in terms of managing more separate deal cycles. But what ends up happening is you end up with this SDR organization—and this is where I've spent most of my career: helping people build healthy sales-development organizations.
In terms of this churn-and-burn culture, you've got people coming in and basically flaming out because they go on Twitter or—heaven forbid—Reddit and get sales advice from these loud-mouthed, terrible people who are telling them to do things that didn't work ten years ago. They then go try it; they send it out, and then their prospects are suddenly blasting them on Twitter.

It's not that rep's fault that they got no training in the first place, they got no support, they just had to figure it out because that's the culture. It's the company's fault. And a lot of times, people don't—there was a big push against this last year, I think, within the sales community against other sales leaders doing it, but now it's starting to spread out. Like, I have no problem dragging someone for a really terrible email. Anonymize the company; anonymize the email. And, if you want to give feedback, give it to them directly. And you can also say, "I'm going to post this, but it's not coming back to you." And tell them, like—

Corey: Whenever I get outreach from—

Ashleigh: "Get out of that terrible company."

Corey: Yeah. Whenever I get outreach from AWS for a sales motion or for recruiting or whatnot, I always anonymize the heck out of the rep. It's funny to me because it's, "Don't you know who I am?" It is humorous, on some level. And it's clear that it is a numbers game, and they're trying to do a bunch of different things, but a cursory google of my name would show it. It's just amusing.

I want to be clear that whenever I do that, I don't think the rep has done anything wrong. They're doing exactly what they should. I just find it very funny that, "Wait, me? Work at an AWS? The bookstore?" It seems like it would be a—yeah. Yeah, the juxtaposition is just hilarious to me. They've done nothing wrong, and that's okay.
It's a hard racket.I remember—at least they have the benefit over my first enterprise sales job where I was selling tape drives into the AS/400 market, competing against IBM on price. That was in the days of “No one ever gets fired for buying IBMs.” So, yeah. The place you want to save money on is definitely the backup system that's going to save all of your systems. I made one sale in my time there—and apparently set a company record because it wasn't specifically aimed at the AS/400—and I did the math on that and realized, “Huh, I'd have to do two of these a month in order to beat the draw against commission structure that they had.”So, I said, “To hell with this,” and I quit. The CEO was very much a sales pro, and, “Well, you need to figure out whether you're a salesperson or not.” Even back then, I had an attitude problem, but it was, “Yeah, I think that—oh, I know that I am. It's just a question is am I going to be a salesperson here?” And the answer is, “No.” It [laugh]—Ashleigh: Yeah.Corey: It's a two-way street.Ashleigh: It is. And I say this all the time to people who—I work with a lot of salespeople now who are, like, “I don't think sales is for me. I don't know, I need [unintelligible 00:16:24]. The past three companies didn't work.” The answer isn't, “Is sales for you?”The answer is, “Are you selling the right thing at the right place?” And one of the things we've learned from the ‘Great Recession' and the ‘Great Reshuffling' in everything is there's no reason to stay at a terrible company, and there's no reason to stay at a company where you're not really passionate and understand what you're selling. I joked about, you know, I talked down about myself for the first bit of my career. Doesn't mean I didn't—like, I might not understand exactly how heuristics work, but I understand what heuristics are. Just don't ask me to design any of them.You know, like, you have to understand and you have to be really excited about it. And that's what modern sales is. 
And so, yes, you're going to get a ton of the outreach because that's how people—it still works. That's why we all still get Nigerian prince emails. Somebody, somewhere, still clicks those things, sadly. And that gets me really angry.

Corey: It's a pure numbers game.

Ashleigh: Exactly. Ninety percent of enterprise B2B sales is not that anymore. Even the companies that are using BDRs—which is most of them—are now moving to what's called 'account-based selling.' We're using hyper-personalized messaging. You're probably noticing videos are popping up more.

I'm a huge fan of video. I think it's a great way to force personalization. It's, like, "Hi, Corey. I see you. I'm talking to you. I've done my research. I know what you're doing at The Duckbill Group, and here's how I think we can help. If that's not the case, no worries. Let me know; I'll leave you alone." That's what selling should be.

Corey: I have yet to receive one of those, but I'm sure it'll happen now that I've mentioned that and put that out into the universe.

Ashleigh: Probably.

Corey: What always drove me nuts—and maybe this is unfair—is when I'm trying to use a product, probably something SaaS-based—and I see this a lot—where, first, if you aren't letting me self-serve and get off with the free tier and just start testing something, well, that's already a ding against you because usually I'm figuring this out at 2 o'clock in the morning when I can't sleep, and I want to work on something. I don't want to wait for a sales cycle and have to slow things down. Cool. But at some point, for sophisticated customers, you absolutely need to have a sales conversation. But, okay, great. Usually, I encounter this more with lead magnets or other things designed to get my contact info.

But what drives me up a wall is when they start demanding information that is very clearly trying to classify me in their sales funnel, on some level.
I'll give you my name, my company, and my work email address—although I would think that from my work email address, you could probably figure out where I work and the rest—but then there are other questions. How big is your company? What is your functional role within the company? And where are you geographically?Well, that's an interesting question. Why does that matter in 2022? Well, very often leads get circulated out to people based upon geography. And I get it, but it also frustrates me, just because I don't want to have to deal with classifying and sorting myself out for what is going to be a very brief conversation [laugh] with a salesperson. Because if the product works, great, I'm going to buy. If it doesn't work, I'm going to get frustrated and not want to hear from you forever.Which gets to my big question for you—and please don't take the question as anything other than the joking spirit in which it's intended—but why are so many salespeople profoundly annoying?Ashleigh: I would—uh, hmm.Corey: Sales processes is probably the better way to frame it because—Ashleigh: I was going to say, “Yeah, it's not the people; it's the process.” So—Corey: —it's not the individual's fault, as we've talked about it.Ashleigh: —yeah, I was going to say, I was, like, “Okay, I think it's less the people; more of the processes.” And processes that will make [crosstalk 00:19:37]—Corey: Yeah. It expresses itself as the same person showing up again and again. But that is not—Ashleigh: Totally.Corey: —their fault. That is the process by which they are being measured at as a part of their job. And it's unfair to blame them for that. But the expression is, “This person's annoying the hell out of me, what gives?”Ashleigh: “Oh, my gosh. Why does she keep [unintelligible 00:19:51] my inbox? Leave me alone. Just let me freaking test it.” I said, “I needed two weeks. Just let me have the two weeks to freaking test the thing. 
I will get back to you." [unintelligible 00:19:58] yeah, no, I know.

And even since moving into leadership several years ago, same thing. I'm like, "Okay, no." I've gotten to the point where I've had several conversations with salespeople. I'm like, "I know the game. I know what you're trying to do. I respect it. Leave me alone. I promise I will get back to you, just lea"—I have literally said this to people. And the weird thing is most salespeople respect that. We really respect the transparency on that.

Now, the trick is, what you're talking about with lead capture and stuff like this, again, comes down to company design; it comes down to companies who value the buyer experience and customer journey, and companies who don't. And this, I think, is actually driven more by—in my humble opinion—our slight over-reliance on venture capital, which is all about gathering as much data as possible, figuring out how to monetize it, and moving from there. In their mind, personal experience and emotion don't really factor into that equation very much, so you end up with these buyer journeys that are less about the buyer and more about getting them from click to purchase as efficiently as possible in terms of company resources, which includes salespeople's time. So, as to why you have to fill out all those things: that just, to me, reeks of a company that maybe doesn't really understand the client experience and probably is going to have a pretty, mmm, support program as well, which means the product had better be really freaking good for me to buy it.

Corey: To be clear, at The Duckbill Group, we do not have a two-in-the-morning, click-here-and-get-onboarded flow. Turns out that we have yet to really see the value in building a shopping cart system where you can buy, "One consulting, please," and call it good. We're not quite at the level of productizing our offering yet, and having conversations is a necessary part of what we do.
But that also aligns with our customer expectation, where there is not a general expectation in this industry that you can buy a full-on bespoke consulting engagement without talking to a human being. Honestly, if someone were trying to sell such a thing, I would be terrified.

Ashleigh: Yeah, run screaming. Good Lord. No, exactly. And that's one of the reasons I love working with this team and I love this problem: this isn't a quick, you know, download, install, and save—you know, save ten percent on your AWS bill by installing Duckbill Group. It ain't that simple. If it were that simple, like, AWS wouldn't have the market cap it does.

So, that's one of the things I love. I love really meaty problems that don't have clean answers, and specifically have answers that look slightly different for everybody. I love those sorts of problems. I've done the highly productized stuff: Click here, buy, get it on the free tier, and then it's all about up-sell, cross-sell as needed. Been there, done that; that's fun, and that's a whole different bucket of challenges. But what we're dealing with every single day on the consulting side of The Duckbill Group is far more nuanced and far more exciting because we're also seeing some truly incredible architecture designs. Like, companies who are really on the bleeding edge of what they're doing. And it's just really fun—

Corey: Cost and architecture are the same thing in the Cloud.

Ashleigh: —[crosstalk 00:22:59] that little—

Corey: It's a blast to see it.

Ashleigh: It's so much fun. It's, it's, it's... the world's best jigsaw puzzle because it covers, like, every single continent and all these different nuances, and you've got to think about 'ephemerality,' which is my new favorite word. So...

Corey: It's fun because you are building a sales team here, which opens up a few interesting avenues for me. For one, I don't have to manage and yell at individual salespeople in the same way.
For example, we talk about it being a process and not a person thing. We're launching some outbound sales work, and basically, having the person to talk to about that process—namely you—means that I don't need to be hovering over people's shoulders the way I felt that I once did, as far as what are we sending people? These passive-aggressive drip campaigns of, "Clearly, you don't mind lighting money on fire. If that changes, please let me know."

It's email eight in a sequence. It's... no. This stuff has an implicit 'Love, Mike and Corey' at the bottom of everything that comes out of this company, and it represents us in some respect. And let's be clear: we have a savvy, sophisticated, and more-attractive-than-the-average audience listening to all of these shows. And they'll eat me alive if we start doing stuff like that—

Ashleigh: Oh, yeah.

Corey: —not to mention that I find it not particularly respectful of their time and who they are. It doesn't work, so we have to be very conscious of that. The fact that I never had to explain that concept in any depth to you made bringing you in one of the easiest decisions we've ever made.

Ashleigh: Well, I think it helped—I think in one of my interviews I went off on the 'alligator email,' which is this infamous email we've all gotten, which is basically, like, you know, "Hi. I haven't heard from you yet, so I want to know which one of these three scenarios has happened to you. One, you're not interested in my product but didn't have the balls to email me and say that you're not interested. Two, you're no longer in this position, in which case, you're not going to read this email anyway. Or three, you're being chased by an alligator, and I should call animal control because you need help." This email was—

Corey: He, he, he, hilarious.

Ashleigh: Ugh. And there's variations of it. And I've seen variations of it that are very well done and are on brand and work with the company.
I've seen variations that could be legitimately, I think, great humor. And that's great.Humor in emails and humor in sales is fantastic. I have to shout out my friend, Jon Selig up in Canada, who actually, literally, does workshops on how sales teams can integrate humor into their prospecting. It's freaking brilliant. But—Corey: Near and dear to my heart.Ashleigh: —if you're not actually trained in that stuff, don't do it. Don't do the alligator email. But I think I went off on that during one of our interviews just because I was just sick of seeing these things. And what kills me, again, it comes back to the beginning, is people who have no training, no experience coming in—I mean, it really kills me, too, because there's a real concerted effort in the sales community to get more diverse people into sales to, kind of, kill the sales bro just by washing them out, basically. And so, we're recruiting hard with veterans, with black and other racial minority groups, LGBTQ communities, all sorts of things, and indigenous peoples.And so, we're bringing people that also are maybe a little bit more mature, a little bit older, have families they're supporting, and we're throwing them in a role with no support and very little training. And then they wash out, and we wonder why. It's, like, well, maybe because you didn't—it's, like, when I explain this to other people who aren't in sales, like, “Really, imagine coming in to being hired for a coding job, being told you're going to be trained on, you know, Ruby on Rails or C# or whatever it is we're currently using”—my reference is probably super outdated—but then, being given a book, and that's it. And told, “Learn it. And by the way, your first project is due in a month.” That's what we're doing in sales—Corey: For a lot of folks, that's how we learned in the engineering spaces, but let's be clear, the people who do well in that, generally have tailwinds of privilege at their back. 
They don't have headwinds of, “You suck at this.” It was the you-were-born-on-third-but-didn't-hit-a-triple school of thought. It's—Ashleigh: Yeah.Corey: —the idea of building an onboarding pipeline, of making this stuff more accessible to people earlier on is incredibly important. One of my, I guess, awakening moments as we were building this company was it turns out that if you manage salespeople as if they were engineers, it doesn't go super well. Whereas, if you manage engineers like they're salespeople, they quit—rage quit—cry, and call you out as being an abusive manager.One of the best descriptions I ever heard from an advisor was that salespeople are sharks. But that's not intended to be unkind. It is simply a facet of their nature. They enjoy the hunt; they enjoy chasing things down, and they like playing games. Whereas, as soon as you start playing games with your engineers on how much money they're going to make this week, that turns out to be a very negative thing. It's a different mindset. It's about motivating people with whatever befits what it is that they want to be doing.Ashleigh: It is. And the other thing is it's a cultural conditioning. So, it's really interesting to say, you know, “People,” you know, “Playing games.” We do enjoy—there's definitely some enjoyment of the competition; there's the thrill of the hunt, absolutely, but at the same time, you want your salespeople to quit? Screw with their money.You screw with their money; we will bail so fast it'll make your head spin. So, it's like, people think, “Oh, we love this.” No, it's really more—think of it as we are gamblers.Corey: Yeah. To be clear when I say, “Playing games with money,” I'm talking about the idea of, “Sell to a company in this profile this quarter, and we'll throw a $5,000 bonus your way,” or something like that. 
It is: if the business wants to see something, great, make it worth the sales team's while to pursue it, or don't be surprised when no one really cares that much about those things—Ashleigh: Exactly.Corey: It's all upside. It is not about, “He, he. And if you don't sell to this weird thing that I can't really describe effectively to you, we're going to cut your bet—” Yeah, that goes over like a lead balloon. As it should. My belief is that compensation should always go up, not down.Ashleigh: Yeah. No, it should. Aside from that, here's a fun stat—I believe this came out of Forrester, it might've been out of [Topel 00:28:54]; I apologize, I don't remember exactly who said this, but a recent study found that less than 68 percent of sales reps make their quota every month. So, imagine that: if you're—we have this thing called OTE, which is On Target Earnings. So, if you have this number you're supposed to take home every month, only 68 percent of sales reps actually do that every month.So, that means we live with this number as our target, but we're living and budgeting anywhere from 30 to 50 percent below that. And then hoping and doing the work that goes in there. That's what we've been conditioned to accept, and that's why you end up with sales reps that use terms like ‘shark' and are aggressive and are in your face and can get—[unintelligible 00:29:30]—Corey: I didn't realize it was pejorative.Ashleigh: I know. No. But here's the thing too, somebody called it ‘commission breath,' which I love. It's, like, you can smell commission breath coming off us when we're desperate. You totally can. It's because of this antiquated way of building commissions.And this is something that I—this was really obvious to me, and apparently, I was a little bit ahead of the curve. When I started designing comp plans, everyone told me, “You want to design a comp plan? 
Tie it to what you want them to do very specifically.” So, if you want them to move a pen, design a comp plan that they get a buck when they put the pen from the heel of your hand to the tips of your fingers. Then they get a buck. And then they can do that repeatedly. That's literally how I was taught to design comp plans.In my head, that meant that I need to design it in such a way that it's doable for my team because I don't want my team worrying about how they're going to put food on the table while they're talking to a client because they're going to get commission breath and it'll piss off the client. That's not a good client experience; that's not going to lead to good performance. Apparently—Corey: Yeah. My concern as a business owner has nothing to do with salespeople making too much money. In fact, I am never happier than when I'm paying out commissions. The concern, then, therefore has to become the, “Okay, great. How do I keep the salespeople from being inadvertently incentivized to sell something for $10 that costs me $12 to fulfill?”It's a question of what behaviors do you incentivize that align what they're motivated by with what the company needs. And very often getting that wrong—which happens from time to time—is not viewed as the learning experience that it should be. But instead, “They're just out to screw us.” And I've seen so many company owners get so annoyed whenever their salespeople outperform. But what did you expect? That is the positive outcome. As opposed to what? The underperforming sales rep that can't close a deal? Please.Ashleigh: Well, no. And let's think about this too, especially if it's tied to commission and you're paying out commission. It's, like, okay, commission is always some sort of percentage—depending on a lot of things—but some sort of percent of what they're bringing in. If you design a comp plan that has you paying out more in commission than the sales that were earned to bring it in, that's on you; you screwed up. 
And you need to either be honest and say, “I screwed up; I can't pay this,” and know that you're going to lose some sales reps, but you won't lose as many as if you just refuse to pay it.But, honestly, and I'm not even kidding, I know people. I've worked at a company that I happen to know did this. That literally fired people because they didn't have the money to pay out the commission. And because they fired them before the commission was due to be paid out, then that person no longer had a legal claim to it. That's common. So, the commission goes both ways.Corey: To be clear, we've never done that, but I also would say that if we had, that's a screaming red flag for our consultancy, given the nature of what it is that we do here. It turns out that when we're building out comp plans, we model out various scenarios. Like, what is the worst way that this could wind up unfolding? And, okay, in some of our early drafts, yeah, it turns out that we would not be able to pay salaries because we wound up giving all of that in commission to people with uncapped upside. Okay, great.But we're also not going to cap people's commissions because that winds up being a freaking problem, so how do we wind up motivating in a way that continues to grow and continues to incentivize the behaviors we want? And it turns out it's super complicated, which is why we brought you in. It's easier.Ashleigh: Yeah, it's a pain. But the other side of this too, I think, is there is another force at play here, which is finance. A lot of traditional finance modeling is built around the assumption that 50 to 70 percent of people hit commission. So, if all of a sudden you design a comp plan in such a way that a hundred percent of the team is hitting commission, finance loses their shit. 
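The comp-plan failure mode described here, paying out more in commission than the revenue a deal actually brings in, boils down to a solvency check you can do in a few lines of arithmetic. A minimal sketch; the rates, accelerators, and margins below are made-up illustrative numbers, not anything cited in the episode:

```python
def commission_payout(deal_value, rate, accelerator=1.0):
    """Commission owed on a deal: base rate times any accelerator
    (e.g. a 1.5x multiplier above quota). Illustrative only."""
    return deal_value * rate * accelerator

def plan_is_solvent(deal_value, rate, accelerator, gross_margin):
    """The 'you screwed up' condition, inverted: the payout on a deal
    must stay below the gross margin that deal generates."""
    return commission_payout(deal_value, rate, accelerator) < deal_value * gross_margin

# A 10% rate with a 1.5x accelerator on a 30%-margin deal pays $15k
# against $30k of margin: fine.
print(plan_is_solvent(100_000, 0.10, 1.5, 0.30))  # True
# Stack a 4x accelerator on top and you now owe $40k against $30k.
print(plan_is_solvent(100_000, 0.10, 4.0, 0.30))  # False
```

Modeling the worst-case scenario, as Corey describes doing for Duckbill's plans, is running exactly this check at the most generous corner of the plan.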
So, you have to make sure that when you're designing these things, one of the things I learned, I learned the hard way—this is how I learned that not everyone does it this way—I built my first comp plan; my team's hitting it.My team's overperforming, not a ton, but we're doing really well. All of the sudden, I'm getting called to Finance and getting raked over the coals. And they're like, “What did you do?” I'm like, “What do you mean what did I do? I designed a comp plan; we're hitting goal. Why are you mad?” “Well, we only had this much budgeted for commission.”And I was, like, “That's not my fault.” “Well, that's what historic performance was.” “Okay, well that's not what we're going to do going forward. We're going to do this.” And they're like, “Oh, well, you need to notify us if you're going to change it like that.” And I was, like, “Wait a minute. You modeled so that my team would not hit OTE?” “Yes.” “That's how you've always done this?” “Yes.” “Okay. Well, that's not what we're going to do going forward, and if that's a problem, I'll go find a door.” Because, no.Especially when we're talking about people who are living in extremely expensive areas. I spent most of my career living and working in San Francisco, managing teams of people who made less than six figures. And that's rough when you're paying two grand in rent every month. And 60 percent of your pay is commission. Like, no. You need to know that money's coming.So, I talk about modern sales a lot because that's what I'm trying to use because there's Glengarry Glen Ross, kind of, Wolf of Wall Street school, which is not how anyone behaves anymore, and if you're in an environment that's like that or treats your salespeople like that? Please leave. 
And then you've got modern sales, which is all about, “Okay, let's figure out how we can set up our salespeople to be the best people they can be to give our clients the best experience they can.” That's where you get top performance out of, and that's where you never run into the terrible emails with the alligators, and the, “Clearly you like lighting piles of money on fire.” That's where you don't get emails to Corey Quinn asking him if he's interested in coming to work for AWS, the book company.It's by incentivizing the people and creating good humans where they can really thrive as salespeople and as people in general. The rest comes with time. But, it's this whole, new way of looking at things. And it's big, and it's scary, and it costs more upfront, but you get more on the back end every single time.Corey: Not that you care about this an awful lot, but you have your own podcast that talks about this, The Other Side of Sales. What inspired you to decide, not just to build sales teams through a different lens, but also to, “You know what? I'm going to go out and talk into microphones through the internet from time to time.” Which, let's be clear, it takes a little bit of a certain warped perspective. I say this myself, having done this far too often.Ashleigh: Yeah. No, it's a fun little origin story. So, I'm a huge Star Trek geek; obsessive. And I was listening to a Star Trek podcast run by a couple of guys who are a little bit embarrassed to run a Star Trek podcast, called The Greatest Generation. Definitely not safe for work, but a really good podcast if you're into Star Trek at all.And they always do, kind of, letters at the end of the shows. And one of the letters at the end of the show one day was, “Hey, I was really inspired by you guys and I started my own podcast on this random thing that I am super excited about.” And I'm literally driving in the car with my husband, and I'm, like, “Huh. I don't know why I'm not listening to sales podcasts. 
I listen to enough of these other random ones.” Jumped online, pulled up a list of sales podcasts, and I think I went through three or four articles of, like, every sales podcast that was big. And this was, like, January of 2019.Corey: “By Broseph McBrowerson, but Everyone Calls Him ‘Browie.'” Yeah.Ashleigh: Literally, there was Conversations with Women in Sales with the amazing Lori Richardson, who's now hosting it; she took over for a mentor of mine who passed in 2020, sadly. But there was that, and then there was one other that was hosted by a husband-and-wife team. And that was it out of, like, 30 podcasts. And [laugh] so it was this moment of, like, epiphany of, like, “I can start my own podcast,” and, “Oh, I probably need to,” because, literally, no one looks or sounds like someone who I would actually want to hang out with ever, or do business with, in a lot of cases. And that's really changed. I'm so grateful.But really, what it came down to was I didn't feel there was a podcast for me. There wasn't a podcast I could listen to about sales that could help me, that I felt like I identified with. So, I was, like, “All right, fine. I'll start my own.” I called up a friend, and she was, literally, going through the same thing at the same time, so we said, “Screw it. We'll do our own.”We went full Bender from Futurama. We're like, “Just screw it; we'll have our own podcast… with liquor… and heels… and honest conversations about what happens to us every day,” and random stuff. It's a lot of fun. And we've gone through a few iterations and it's been a long journey. 
We're about to hit our hundredth episode, which is really exciting.But yeah, we're—The Other Side of Sales is on a mission to make B2B sales culture truly inclusive so everyone can thrive, so, our conversations are all interviews with amazing sales pros who are trying to do amazing things and who are—I think over 90 percent—from a minority background, which is really exciting to, kind of, try and shift that conversation from Broseph McBrowerson. Our original tagline was the ‘anti-sales bro' podcast, but we thought that was a little too antagonistic. So…Corey: Yeah, being a little too antagonistic is, generally, my failure mode, so I hear you on that. I really want to thank you for taking so much time out of your day to speak with me. Because—well, not that I should thank you. It's one of those, I should really turn around and say, “Wait a minute. Why aren't you selling things? Why are you still talking to me?” But no—Ashleigh: No, I'm waiting for you to say, “Back to work.”Corey: Do appreciate your—exactly. I think that's a different podcast. Thank you so much for your time. If people want to learn more, where's the best place to find you?Ashleigh: Well, definitely please go check out duckbillgroup.com. We would love to talk with you about anything to do with your AWS bill. Got a ton of resources on there around how to get that managed and sorted.If you're interested in connecting with me you can always hit me up at—I'm on Twitter @ashleighatwork, which is another deep-cut Star Trek reference, or you can hit me up on LinkedIn. Just search Ashleigh Early. My name is spelled a little weird because I'm a little weird. It's A-S-H-L-E-I-G-H, and then Early, like ‘early in the morning.'Corey: And links to all of that will wind up in the [show notes 00:39:11]. Thanks so much for your time. 
It's appreciated.Ashleigh: This has been fun; we'll do it again soon.If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
About Peter
Peter's spent more than a decade building scalable and robust systems at startups across adtech and edtech. At Remind, where he's VP of Technology, Peter pushes for building a sustainable tech company with mature software engineering. He lives in Southern California and enjoys spending time at the beach with his family.

Links:
Redis: https://redis.com/
Remind: https://www.remind.com/
Remind Engineering Blog: https://engineering.remind.com
LinkedIn: https://www.linkedin.com/in/hamiltop
Email: peterh@remind101.com

Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Today's episode is brought to you in part by our friends at MinIO, the high-performance Kubernetes-native object store that's built for the multi-cloud, creating a consistent data storage layer for your public cloud instances, your private cloud instances, and even your edge instances, depending upon what the heck you're defining those as, which depends probably on where you work. Getting that unified is one of the greatest challenges facing developers and architects today. It requires S3 compatibility, enterprise-grade security and resiliency, the speed to run any workload, and the footprint to run anywhere, and that's exactly what MinIO offers. With superb read speeds in excess of 360 gigabytes per second and a 100 megabyte binary that doesn't eat all the data you've got on the system, it's exactly what you've been looking for. Check it out today at min.io/download, and see for yourself. That's min.io/download, and be sure to tell them that I sent you.Corey: This episode is sponsored in part by our friends at Vultr. 
Spelled V-U-L-T-R because they're all about helping save money, including on things like, you know, vowels. So, what they do is they are a cloud provider that provides surprisingly high performance cloud compute at a price that—while sure they claim it's better than AWS pricing, and when they say that, they mean it is less money; sure, I don't dispute that—but what I find interesting is that it's predictable. They tell you in advance on a monthly basis what it's going to cost. They have a bunch of advanced networking features. They have nineteen global locations and scale things elastically. Not to be confused with openly, because apparently elastic and open can mean the same thing sometimes. They have had over a million users. Deployments take less than sixty seconds across twelve pre-selected operating systems. Or, if you're one of those nutters like me, you can bring your own ISO and install basically any operating system you want. Starting with pricing as low as $2.50 a month for Vultr cloud compute, they have plans for developers and businesses of all sizes, except maybe Amazon, who stubbornly insists on having something to scale all on their own. Try Vultr today for free by visiting vultr.com/screaming, and you'll receive $100 in credit. That's V-U-L-T-R dot com slash screaming.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn and this is a fun episode. It is a promoted episode, which means that our friends at Redis have gone ahead and sponsored this entire episode. I asked them, “Great, who are you going to send me from, generally, your executive suite?” And they said, “Nah. You already know what we're going to say. We want you to talk to one of our customers.” And so here we are. My guest today is Peter Hamilton, VP of Technology at Remind. Peter, thank you for joining me.Peter: Thanks, Corey. 
Excited to be here.Corey: It's always interesting when I get to talk to people on promoted guest episodes when they're a customer of the sponsor because to be clear, you do not work for Redis. This is one of those stories you enjoy telling, but you don't personally have a stake in whether people love Redis, hate Redis, adopt it or not, which is exactly what I try and do on these shows. There's an authenticity to people who have in-the-trenches experience who aren't themselves trying to sell the thing because that is their entire job in this world.Peter: Yeah. You just presented three or four different opinions and I guarantee we've felt all of them at different times.Corey: [laugh]. So, let's start at the very beginning. What does Remind do?Peter: So, Remind is a messaging tool for education, largely K through 12. We support about 30 million active users across the country, over 2 million teachers, making sure that every student has, you know, equal opportunities to succeed and that we can facilitate as much learning as possible.Corey: When you say messaging that could mean a bunch of different things to a bunch of different people. Once on a lark, I wound up sitting down—this was years ago, so I'm sure the number is a woeful underestimate now—to count how many AWS services I could use to send a message from me to you. And this is without going into the lunacy territory of, “Well, I can tag a thing and then mail it to you like a Snowball Edge or something.” No, this is using them as intended, I think I got 15 or 16 of them. When you say messaging, what does that mean to you?Peter: So, for us, it's about communication to the end-user. We will do everything we can to deliver whatever message a teacher or district administrator has to the user. 
We go through SMS, text messaging, we go through Apple and Google's push services, we go through email, we go through voice call, really pulling out all the stops we can to make sure that these important messages get out.Corey: And I can only imagine some of the regulatory pressure you almost certainly experience. It feels like it's not quite to HIPAA levels, where ohh, there's a private cause of action if any of this stuff gets out, but people are inherently sensitive about communications involving their children. I always sort of knew this in a general sense, and then I had kids myself, and oh, yeah, suddenly I really care about those sorts of things.Peter: Yeah. One of the big challenges, you can build great systems that do the correct thing, but at the end of the day, we're relying on a teacher choosing the right recipient when they send a message. And so we've had to build a lot of processes and controls in place, so that we can, kind of, satisfy two conflicting needs: One is to provide a clear audit log because that's an important thing for districts to know if something does happen, that we have clear communication; and the other is to also be able to jump in and intervene when something inappropriate or mistaken is sent out to the wrong people.Corey: Remind has always been one of those companies that has a somewhat exalted reputation in the AWS space. You folks have been early adopters of a bunch of different services—which let's be clear, in the responsible way, not the, “Well, they said it on stage; time to go ahead and put everything they just listed into production because we for some Godforsaken reason, view it as a todo list.”—but you've been thoughtful about how you approach things, and you have been around as a company for a while. But you've also been making a significant push toward being cloud-native by certain definitions of that term. 
So, I know this sounds like a college entrance essay, but what does cloud-native mean to you?Peter: So, one of the big gaps—if you take an application that was written to be deployed in a traditional data center environment and just drop it in the cloud, what you're going to get is a flaky data center.Corey: Well, that's unfair. It's also going to be extremely expensive.Peter: [laugh]. Sorry, an expensive, flaky data center.Corey: There we go. There we go.Peter: What we've really looked at—and a lot of this goes back to our history in the earlier days; we ran on top of Heroku, and it was kind of the early days of what they call the Twelve-Factor Application—but making aggressive decisions about how you structure your architecture and application so that you fit in with some of the cloud tools that are available and that you fit in, you know, with the operating models that are out there.Corey: When you say an aggressive decision, what sort of thing are you talking about? Because when I think of being aggressive with an approach to things like AWS, it usually involves Twitter, and I'm guessing that is not the direction you intend that to go.Peter: No, I think if you look at Twitter or Netflix or some of these players that, quite frankly, have defined what AWS is to us today through their usage patterns, not quite that.Corey: Oh, I mean using Twitter to yell at them explicitly about things—Peter: Oh.Corey: —because I don't do passive-aggressive; I just do aggressive.Peter: Got it. No, I think in our case, it's been plotting a very narrow path that allows us to avoid some of the bigger pitfalls. We have our sponsor here, Redis. I'll talk a little bit about our usage of Redis and how that's helped us in some of these cases. 
One of the pitfalls you'll find with pulling a non-cloud-native application and putting it in the cloud is that state is hard to manage.If you put state on all your machines and machines go down, networks fail, all those things, you now no longer have access to that state and we start to see a lot of problems. One of the decisions we've made is to try to put as much data as we can into data stores like Redis or Postgres or something, in order to decouple our hardware from the state we're trying to manage and provide for users so that we're more resilient to those sorts of failures.Corey: I get the sense from the way that we're having this conversation, when you talk about Redis, you mean actual Redis itself, not ElastiCache for Redis, or, as I'm increasingly tending to think about AWS's services, Amazon Basics for Redis.Peter: Yeah. I mean, Amazon has launched a number of products. They have their ElastiCache, they have their new MemoryDB, there's a lot of different ways to use this. We've relied pretty heavily on Redis, previously known as Redis Labs, and their enterprise product in their cloud, in order to take care of our most important data—which we just don't want to manage ourselves—trying to manage that on our own using something like ElastiCache, there's so many pitfalls, so many ways that we can lose that data. This data is important to us. By having it in a trusted place and managed by a great ops team, like they have at Redis, we're able to then lean in on the other aspects of cloud data to really get as much value as we can out of AWS.Corey: I am curious. As I said, you've had a reputation as a company for a while in the AWS space of doing an awful lot of really interesting things. I mean, you have a robust GitHub presence, you have a whole bunch of tools that have come out of Remind that are great, I've linked to a number of them over the years in the newsletter. 
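Peter's point about decoupling state from hardware comes down to one rule: anything a request touches lives behind a store interface, never in process memory, so a crashed or replaced instance loses nothing. A runnable sketch of the pattern; in production the store would be Redis (redis-py's SET with an expiry, and GET), but here a dict stands in so the example has no server dependency. The key names and TTL are illustrative, not Remind's:

```python
import json
import time

class StateStore:
    """Minimal externalized-state interface. A dict stands in for Redis
    here; swapping in redis-py would keep the same set/get shape."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, ttl_seconds=None):
        # Serialize so the store holds bytes-like data, as Redis would.
        expires = time.time() + ttl_seconds if ttl_seconds else None
        self._data[key] = (json.dumps(value), expires)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires = item
        if expires is not None and time.time() > expires:
            del self._data[key]  # lazily expire, like a TTL'd Redis key
            return None
        return json.loads(value)

# Any app server can die and restart; delivery state survives because it
# lives in the store, not on the machine that handled the request.
store = StateStore()
store.set("message:42:status", {"delivered": True, "channel": "sms"}, ttl_seconds=3600)
print(store.get("message:42:status"))  # {'delivered': True, 'channel': 'sms'}
```

The payoff is exactly the resilience Peter describes: machines become interchangeable because none of them is the system of record.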
You are clearly not afraid, culturally, to get your hands dirty and build things yourself, but you are using Redis Enterprise as opposed to open-source Redis. What drove that decision? I have to assume it's not, “Wait. You mean, I can get it for free as an open-source project? Why didn't someone tell me?” What brought you to that decision?Peter: Yeah, a big part of this is what we could call operating leverage. Building a great set of tools that allow you to get more value out of AWS is a little different story than babysitting servers all day and making sure they stay up. So, if you look through, most of our contributions in the open-source space have really been around here's how to expand upon these foundational pieces from AWS; here's how to more efficiently launch a suite of servers into an auto-scaling group; here's, you know, our troposphere and other pieces there. This was all before Amazon's CDK product, but really, it was, here's how we can more effectively use CloudFormation to capture our Infrastructure as Code. And so we are not afraid in any way to invest in our tooling and invest in some of those things, but when we look at the trade-off of directly managing stateful services and dealing with all the uncertainty that comes, we feel our time is better spent working on our product and delivering value to our users and relying on partners like Redis in order to provide that stability we need.Corey: You raise a good point. An awful lot of the tools that you've put out there are the best, from my perspective, approach to working with AWS services. And that is a relatively thin layer built on top of them with an eye toward making the user experience more polished, but not being so heavily opinionated that as soon as the service goes in a different direction, the tool becomes completely useless. 
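The "thin layer over CloudFormation" idea Peter describes (which troposphere does with typed Python classes) can be shown with nothing but stdlib JSON: a small function that emits a template, rather than a framework that hides it. This hypothetical helper is a sketch of that style, not Remind's actual tooling; resource names and sizes are made up:

```python
import json

def make_autoscaling_stack(app_name, min_size=2, max_size=10):
    """Emit a minimal CloudFormation template as a plain dict.
    The 'thin layer' is just this: a function per pattern, producing
    standard CloudFormation the service evolves independently of."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            f"{app_name}Asg": {
                "Type": "AWS::AutoScaling::AutoScalingGroup",
                "Properties": {
                    # CloudFormation expects these as strings
                    "MinSize": str(min_size),
                    "MaxSize": str(max_size),
                },
            }
        },
    }

# The output is ordinary CloudFormation JSON you could diff, review in a
# PR, and feed to the service; nothing in the layer needs to change when
# AWS adds new AutoScalingGroup properties.
print(json.dumps(make_autoscaling_stack("Web"), indent=2))
```

The contrast with a heavily opinionated abstraction is the point: when AWS moves, a generator like this still emits valid templates, which is why the thin layer ages well.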
You just decide to make it a bit easier to wind up working with specific environment variables or profiles, rather than what appears to be the AWS UX approach of, “Oh, now type in your access key, your secret key and your session token, and we've disabled copy and paste. Go, have fun.” You've really done a lot of quality-of-life improvements, more so than a this-is-the-entire-system-of-how-we-do-deploys, start to finish. It's opinionated, sort of, like, a take on what Netflix did once upon a time with Asgard. It really feels like it's just the right level of abstraction.Peter: We did a pretty good job. I will say, you know, years later, we felt that we got it wrong a couple times. It's been really interesting to see that there are times when we say, “Oh, we could take these three or four services and wrap it up into this new concept of an application.” And over time, we just have to start poking holes in that new layer and we start to see we would have been better served by sticking with as thin a layer as possible that enables us, rather than trying to get these higher-level pieces.Corey: It's remarkably refreshing to hear you say that just because so many people love to tell the story on podcasts, or on conference stages, or whatever format they have of, “This is what we built.” And it is an aspirationally superficial story about this. They don't talk about that, “Well, firstly, without these three wrong paths first.” It's always a, “Oh, yes, obviously, we are smart people and we only make the correct decision.”And I remember in the before times sitting in conference talks, watching people talk about great things they'd done, and I'll turn next to the person next to me and say, “Wow, I wish I could be involved in a project like that.” And they'll say, “Yeah, so do I.” And it turns out they work at the company the speaker is from. Because all of these things tend to be the most positive story. 
Do you have an example of something that you have done in your production environment where, going back, you'd say, “Yeah, in hindsight, I would have done that completely differently”?Peter: Yeah. So, coming from Heroku moving into AWS, we had a great open-source project called Empire, which kind of bridged that gap between them, but used Amazon's ECS in order to launch applications. It was actually command-line compatible with the Heroku command when it first launched. So, a very big commitment there. And at the time—I mean, this comes back to the point I think you and I were talking about earlier, where architecture, costs, infrastructure, they're all interlinked.And I'm a big fan of Conway's Law, which says that an organization's structure needs to match its architecture. And so six, seven years ago, we were a heavy growth-based company and we had interns running around, doing all the things, and we wanted to have really strict guardrails and a narrow set of things that our development team could do. And so we built a pretty constrained model: You will launch, you will have one Docker image per ECS service, it can only do these specific things. And this allowed our development team to focus on pretty buttons on the screen and user engagement and experiments and whatnot, but as we've evolved as a company, as we built out a more robust business, we've started to track revenue and costs of goods sold more aggressively, we've seen there's a lot of inefficient things that come out of it.One particular example was we used PgBouncer for our connection pooling to our Postgres application. In the traditional model, we had an auto-scaling group for PgBouncer, and then our auto-scaling groups for the other applications would connect to it. And we saw additional latency, we saw additional cost, and we eventually kind of wound that down and packaged that PgBouncer alongside the applications that needed it. 
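Moving PgBouncer from its own auto-scaling group to a sidecar alongside each application mostly changes where the pooler listens and how the app reaches it: over localhost instead of across the network through another fleet. A sketch of what a minimal pgbouncer.ini for the sidecar layout might look like; the hostname, credentials path, and pool sizes are illustrative placeholders, not Remind's actual configuration:

```ini
; pgbouncer.ini: pooler co-located with the app container, so the app
; connects over localhost rather than to a separate PgBouncer fleet.
[databases]
; apps connect to localhost:6432/appdb; PgBouncer fans into Postgres
appdb = host=postgres.internal port=5432 dbname=appdb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; transaction pooling keeps the Postgres connection count low even
; with many app processes sharing the sidecar
pool_mode = transaction
default_pool_size = 20
```

The latency win Peter mentions falls out of the topology: one localhost hop replaces a hop through a load balancer to a separate pooling tier.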
And this was a configuration that wasn't available on our first pass; it was something we intentionally did not provide to our development team, and we had to unwind that. And when we did, we saw better performance, we saw better cost efficiency, all sorts of benefits that we care a lot about now that we didn't care about as much, many years ago.Corey: It sounds like you're describing some semblance of an internal platform, where instead of letting all your engineers effectively, “Well, here's the console. Ideally, you use some form of Infrastructure as Code. Good luck. Have fun.” You effectively gate access to that. Is that something that you're still doing or have you taken a different approach?Peter: So, our primary gate is our Infrastructure as Code repository. If you want to make a meaningful change, you open up a PR, got to go through code review, you need people to sign off on it. Anything that's not there may not exist tomorrow. There's no guarantees. And we've gone around, occasionally just shut random servers down that people spun up in our account.And sometimes people will be grumpy about it, but you really need to enforce that culture that we have to go through the correct channels and we have to have this cohesive platform, as you said, to support our development efforts.Corey: So, you're a messaging service in education. So, whenever I do a little bit of digging into backstories of companies and what has made, I guess, an impression, you look for certain things and explicit dates are one of them, where on March 13th of 2020, your business changed just a smidgen. What happened other than the obvious, we never went outside for two years?Peter: [laugh]. So, if we roll back a week—you know, that's March 13th, so if we roll back a week, we're looking at March 6th. On that day, we sent out about 60 million messages over all of our different mediums: Text, email, push notifications. 
On March 13th that was 100 million, and then, a few weeks later on March 30th, that was 177 million. And so our traffic effectively tripled over the course of those three weeks. And yeah, that's quite a ride, let me tell you.Corey: The opinion that a lot of folks have who have not gotten to play in sophisticated distributed systems is, “Well, what's the hard part? You have an auto-scaling group. Just spin up three times the number of servers in that fleet and problem solved. What's challenging?” A lot, but what did you find that the pressure points were?Peter: So, I love that example, that your auto-scaling group will just work. By default, Amazon's auto-scaling groups only support 1000 backends. So, when your auto-scaling group goes from 400 backends to 1200, things break, [laugh] and not in ways that you would have expected. You start to learn things about how database systems provided by Amazon have limits other than CPU and memory. And they're clearly laid out: there are network bandwidth limits and things you have to worry about.We had a pretty small team at that time and we'd gotten into this cadence where every Monday morning, we would wake up at 4 a.m. Pacific because, as part of the pandemic, our traffic shifted, so our East Coast users would be most active in the morning rather than the afternoon. And so about 7 a.m. on the East Coast is when everyone came online. And we had our Monday morning crew there, just looking to see where the next pain point was going to be.And we'd have Monday to walk through it all; Monday afternoon, we'd meet together and come up with our three or four hypotheses on what would break if our traffic doubled again, and we'd spend the rest of that week addressing those the best we could and repeat for the next Monday. And we did this for three, four, five weeks in a row, and finally, it stabilized.
But yeah, it's all the small little things, the things you don't know about, the limits in places you don't recognize that just catch up to you. And you need to have a team that can move fast and adapt quickly.Corey: You've been using Redis for six, seven years, something along those lines, as an enterprise offering. You've been working with the same vendor who provides this managed service for a while now. What are the fruits of that relationship? What is the value that you see in continuing to have a long-term relationship with vendors? Because let's be serious, most of us don't stay in jobs that long, let alone work with the same vendor.Peter: Yeah. So, coming back to the March 2020 story, many of our vendors started to see some issues here; various services weren't scaled properly. We made a lot of phone calls to a lot of vendors, and I was very impressed with how Redis Labs, at the time, was able to respond. We hopped on a call, and they said, “Here's what we think we need to do; we'll go ahead and do this. We'll sort this out in a few weeks and figure out what this means for your contract. We're here to help and support in this pandemic because we recognize how this is affecting everyone around the world.”And so I think when you get into those deeper relationships, those long-term relationships, it is so helpful to have that trust, to have a little bit of that give when you need it in times of crisis, and to know that they're there and willing to jump in right away.Corey: There's a lot to be said for having those working relationships before you need them. So often, I think that a lot of engineering teams just don't talk to their vendors, to the point where they may as well be strangers. And you'll see this most notably—at least I feel it most acutely—with AWS service teams.
They'll do a whole kickoff when the enterprise support deal is signed, three years go by, and both the AWS team and the customer's team have completely rotated since then, and they may as well be strangers. Being able to have that relationship to fall back on in those really weird, honestly high-stress moments has been one of those things where I didn't see the value myself until the first time I went through a hairy situation and found that it was useful.And now I bias the other way: instead of, “Oh, I can fit into the free tier of this service,” it's, “No, no, I'm going to pay and become a paying customer.” I'd rather be a customer that can have that relationship and pick up the phone than someone whining at people in a forum somewhere of, “Hey, I'm a free user, and I'm having some problems with production.” That just never felt right to me.Peter: Yeah, there's nothing worse than calling your account rep and being told, “Oh, I'm not your account rep anymore.” Somehow you missed the email, you missed who it was. Prior to Covid—and we saw this many, many years ago—one of the things about Remind is that every back-to-school season, our traffic 10Xes in about three weeks. And so we're used to emergencies happening and unforeseen things happening. We plan through our year and try to do capacity planning and everything, but we've been around the block a couple of times.And so we have a pretty strong culture now of leaning in hard with our support reps. We have them in our Slack channels. Our AWS team, we meet with often. Redis Labs, we have them on Slack as well. We're constantly talking about databases that may or may not be performing as we expect them to. They're an extension of our team. We have an incident; we get paged. If it's related to one of the services, we hit them in Slack immediately and have them start checking on the back end while we're checking on our side.
So.Corey: One of the biggest takeaways I wish more companies would have is that when you are dependent upon another company to effectively run your production infrastructure, they are no longer your vendor, they're your partner, whether you want them to be or not. And approaching it with that perspective really pays dividends down the road.Peter: Yeah. One of the things you get when you've been at a company for a long time, and in a relationship for a long time, is that you grow together. Sometimes there are painful points; sometimes you're on an old legacy version of their product that you're literally the last customer on, and you have to work with them to move off of it. But you were there six years ago when they were just starting out, and they've seen how you've grown, and you've seen how they've grown, and you've kind of been able to marry that experience together in a meaningful way.Corey: This episode is sponsored by our friends at Oracle Cloud. Counting the pennies, but still dreaming of deploying apps instead of “Hello, World” demos? Allow me to introduce you to Oracle's Always Free tier. It provides over 20 free services and infrastructure, networking, databases, observability, management, and security. And—let me be clear here—it's actually free. There's no surprise billing until you intentionally and proactively upgrade your account. This means you can provision a virtual machine instance or spin up an autonomous database that manages itself, all while gaining the networking, load balancing, and storage resources that somehow never quite make it into most free tiers needed to support the application that you want to build. With Always Free, you can do things like run small-scale applications or do proof-of-concept testing without spending a dime. You know that I always like to put asterisks next to the word free? This is actually free, no asterisk. Start now.
Visit snark.cloud/oci-free that's snark.cloud/oci-free.Corey: Redis is, these days, a data platform. Back once upon a time, I viewed it as more of a caching layer. And I admit that the capabilities of the platform have significantly advanced since those days when I viewed it purely through the lens of cache. But one of the interesting parts is that neither one of those use cases, in my mind, blends particularly well with heavy use of Spot Fleets, but you're doing exactly that. What are you folks doing over there?Peter: [laugh]. Yeah, so as I mentioned earlier, coming back to some of the Twelve-Factor App design, we heavily rely on Redis as sort of a distributed heap. One of our challenges of delivering all these messages is that every single message has its in-flight state: Here's the content, here's who we sent it to, we wait for them to respond. On a traditional application, you might have one big server that stores it all in-memory, and you get the incoming requests, and you match things up. By moving all that state to Redis, all of our workers, all of our application servers, we know they can disappear at any point in time.We use Amazon's Spot Instances and their Spot Fleet for all of our production traffic. Every single web service, every single worker that we have runs on this infrastructure, and we would not be able to do that if we didn't have a reliable and robust place to store this data that is in-flight and currently being accessed. So, we'll have a couple hundred gigs of data at any point in time in a Redis database, just representing in-flight work that's happening on various machines.Corey: It's really neat seeing Spot Fleets being used as something more than a theoretical possibility. It's something I've always been very interested in, obviously, given the potential cost savings; they approach “cheap as free” in some cases.
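As an aside, the “distributed heap” pattern Peter describes—keeping every message's in-flight state in Redis so any stateless worker, on any instance, can pick it up—can be sketched roughly like this. The key names and fields are illustrative guesses, not Remind's actual schema, and a small dict-backed stand-in replaces a real redis-py client so the snippet is self-contained; the `hset`/`hgetall` calls mirror the real client's methods.

```python
import time


class FakeRedis:
    """Dict-backed stand-in mirroring the redis-py calls used below."""

    def __init__(self):
        self._data = {}

    def hset(self, key, mapping):
        self._data.setdefault(key, {}).update(mapping)

    def hgetall(self, key):
        return dict(self._data.get(key, {}))

    def delete(self, key):
        self._data.pop(key, None)


def enqueue_message(r, msg_id, recipient, body):
    # Record the in-flight state centrally before any worker touches it, so
    # the worker holding this message can disappear without losing the send.
    r.hset(f"msg:{msg_id}", mapping={
        "recipient": recipient,
        "body": body,
        "status": "in_flight",
        "sent_at": str(time.time()),
    })


def record_delivery(r, msg_id):
    # Any worker, on any instance, can mark completion from the shared state.
    if r.hgetall(f"msg:{msg_id}").get("status") != "in_flight":
        return False
    r.hset(f"msg:{msg_id}", mapping={"status": "delivered"})
    return True


r = FakeRedis()
enqueue_message(r, "42", "parent@example.com", "School closed tomorrow")
assert record_delivery(r, "42")
assert r.hgetall("msg:42")["status"] == "delivered"
```

Because the state lives in Redis rather than in any worker's memory, losing a Spot instance mid-delivery loses nothing but the few seconds of work that instance was doing.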
But it turns out—we talked earlier about the idea of being cloud-native versus the rickety, expensive data center in the cloud—an awful lot of applications are simply not built in a way that tolerates, yeah, we're just going to randomly turn off a subset of your systems, ideally with two minutes of notice, but all right, have fun with that. And a lot of times, it just becomes a complete non-starter, even for stateless workloads, just based upon how all of these things are configured. It is really interesting to watch a company that's been entrusted with an awful lot of responsibility embrace that mindset. It's a lot more rare than you'd think.Peter: Yeah. And again, you know, sometimes we overbuild things, and sometimes we go down paths that may have been a little excessive, but it really comes down to your architecture. You know, it's not just having everything running on Spot. It's making effective use of SQS and other queueing products at Amazon to provide checkpointing abilities, so you know that should you lose an instance, you're only going to lose a few seconds of productive work on that particular workload and be able to pick up where you left off.It's properly using auto-scaling groups. From the financial side, there's all sorts of weird quirks you'll see. You know, the Spot market has a wonderful set of dynamics where the big instances are much, much cheaper per CPU than the small ones are. And so it's structuring things in a way that you can colocate different workloads onto the same hosts and hedge against a host going down by spreading across multiple availability zones.
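The checkpointing idea Peter mentions—only acknowledging a queue message after the work is done, so a reclaimed Spot instance costs at most one redone unit of work—can be sketched like so. A toy in-memory queue stands in for SQS here (whose `receive_message`/`delete_message` and visibility-timeout semantics the toy mimics), and the instance loss is simulated explicitly; none of this is Remind's actual code.

```python
class ToyQueue:
    """In-memory stand-in for an SQS queue with at-least-once delivery."""

    def __init__(self, items):
        self._visible = list(items)
        self._in_flight = {}
        self._next_handle = 0

    def receive(self):
        # Like SQS receive_message: the item becomes invisible, not deleted.
        if not self._visible:
            return None
        item = self._visible.pop(0)
        handle = self._next_handle
        self._next_handle += 1
        self._in_flight[handle] = item
        return handle, item

    def delete(self, handle):
        # The checkpoint: only now is the work permanently acknowledged.
        self._in_flight.pop(handle)

    def reclaim(self):
        # Simulates the visibility timeout expiring after an instance dies:
        # unacknowledged work reappears for other workers.
        self._visible.extend(self._in_flight.values())
        self._in_flight.clear()


def drain(queue, worker, crash_after=None):
    """Process items, deleting each only after it is done."""
    done = []
    processed = 0
    while (msg := queue.receive()) is not None:
        handle, item = msg
        if crash_after is not None and processed >= crash_after:
            # Instance reclaimed before deleting: the item survives in-flight.
            return done
        done.append(worker(item))
        queue.delete(handle)
        processed += 1
    return done
```

A worker that "crashes" holding a message loses nothing durable: after `reclaim()`, another `drain` pass finishes the remaining items.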
I think there's definitely a point where having enough workload, having enough scale, allows you to take advantage of these things, but it all comes down to the architecture and design that really enables it.Corey: So, you've been using Redis for longer than I think many of our listeners have been in tech.Peter: [laugh].Corey: And the key distinguishing point for me between someone who is an advocate for a technology and someone who's a zealot—or a pure critic—is that they can identify use cases for which it is great and use cases for which it is not likely to be a great experience. In your time with Redis, what have you found that it's been great at and what are some areas that you would encourage people to consider more carefully before diving into it?Peter: So, we like to joke that five, six years ago, most of our development process was, “I've hit a problem. Can I use Redis to solve that problem?” And so we've tried every solution possible with Redis. We've done all the things. We have a number of very complicated Lua scripts that are managing different keys in an atomic way.Some of these have been more successful than others, for sure. Right now, our biggest philosophy is, if it is data we need quickly and it is data that is important to us, we put it in Enterprise Redis, the cloud product from Redis. For other use cases, there's a dozen things that you can use for a cache; Redis is great for a cache, memcache does a decent job as well; you're not going to see a meaningful difference between those sorts of products. Where we've struggled a little bit has been when we have essentially relational data that we need fast access to.
And we're still trying to find a clear path forward here because you can do it, and you can have atomic updates, and you can kind of simulate some of the ACID characteristics you would have in a relational database, but it adds a lot of complexity.And that's a lot of overhead for our team as we're continuing to develop these products, to extend them, to fix any bugs we might have in there. And so we're recalibrating a bit, and some of those workloads are moving to other data stores where they're more appropriate. But at the end of the day, if it's data that we need fast and data that's important, we're sticking with what we've got here because it's been working pretty well.Corey: It sounds almost like you started off with the mindset of one database for a bunch of different use cases and you're starting to differentiate into purpose-built databases for certain things. Or is that not entirely accurate?Peter: There's a little bit of that. And I think coming back to some of our tooling, as we kind of jumped on a bit of the microservice bandwagon, we would see, here's a small service that only has a small amount of data that needs to be stored. It wouldn't make sense to bring up an RDS instance, or an Aurora instance, for that, you know, with Postgres. Let's just store it in an easy store like Redis. And some of those cases have been great; some of them have been a little problematic.And so as we've invested in our tooling to make all our databases accessible, and to make it less of a weird trade-off between what the product needs, what we can do right now, and what we want to do long-term, and to reduce that friction, we've been able to be much more deliberate about the data store that we choose in each case.Corey: It's very clear that you're speaking with a voice of experience on this; this is not something that you just woke up and figured out.
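For the curious, the “Lua scripts managing different keys in an atomic way” Peter mentions usually look something like the following. The script is illustrative, not Remind's: Redis executes a Lua script atomically, which is what lets one script read and update multiple keys without races. A pure-Python emulation of the same logic is included so the idea is runnable here; with redis-py, the script itself would be registered via `register_script` and invoked with keys and args.

```python
# Illustrative Lua: move 'amount' between two counters only if the source
# has enough. Redis runs the whole script atomically, so no other client
# can observe or interleave with the intermediate state.
TRANSFER_LUA = """
local from = tonumber(redis.call('GET', KEYS[1]) or '0')
local amount = tonumber(ARGV[1])
if from < amount then return 0 end
redis.call('DECRBY', KEYS[1], amount)
redis.call('INCRBY', KEYS[2], amount)
return 1
"""


def transfer(store, src, dst, amount):
    """Pure-Python emulation of TRANSFER_LUA against a plain dict.

    In real Redis, the atomicity comes from the server executing the
    script without interleaving other commands; this is only a model
    of the logic for illustration.
    """
    if store.get(src, 0) < amount:
        return 0
    store[src] = store.get(src, 0) - amount
    store[dst] = store.get(dst, 0) + amount
    return 1


balances = {"account:a": 10}
assert transfer(balances, "account:a", "account:b", 4) == 1
assert balances == {"account:a": 6, "account:b": 4}
assert transfer(balances, "account:a", "account:b", 100) == 0  # insufficient
```

This is exactly the sort of simulated-transaction complexity Peter is describing: it works, but every such script is logic the team must maintain, debug, and extend by hand.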
One last area I want to go into with you is, when I asked you what you care about primarily as an engineering leader looking to serve your customers well, you effectively had a dual answer, almost off the cuff, of stability and security. I find the two of those things are deeply intertwined in most of the conversations I have, but they're rarely called out explicitly in quite the way that you do. Talk to me about that.Peter: Yeah, so in our wild journey, stability has always been a challenge. And we've always, you know, been in early startup mode, where you're constantly pushing: What can we ship? How quickly can we ship it? And in our particular space, we feel that this communication that we foster between teachers and students and their parents is incredibly important, and is a thing that we take very, very seriously. And so, a couple years ago, we were trying to create this balance and create not just a language that we could talk about on a podcast like this, but really a framing of these concepts for our company internally: To our engineers, to help them think, as they're building a feature, about what the concerns are beyond the product spec; to our marketing and sales team, to help them understand why we're making these investments that may not get a particular feature out by X date but are still worthwhile.So, from the security side, we've really focused on building out robust practices and robust controls that don't necessarily lock us into a particular standard, like PCI compliance or things like that, but really focus on the maturity of our company and, you know, our culture as we go forward. And so we're in a place now where we are ISO 27001; we're heading into our third year.
We leaned in hard on our disaster recovery processes, we've leaned in hard on our bug bounties and pen tests, and kind of found this incremental approach. You know, day one, I remember we turned on our bug bounty and it was a scary day as the reports kept coming in. But we take on one thing at a time and continue to build on it and make it an essential part of how we build systems.Corey: It really has to be built in. It feels like security is not something that can be slapped on as an afterthought, however much companies try to do that. Especially, again, as we started this episode with: you're dealing with communication with people's kids. That is something that people have remarkably little sense of humor around. And rightfully so.Seeing that there is as much care, if not more, taken around security as around stability is generally the sign of a well-run organization. If there's a security lapse, I expect certain vendors to rip the power out of their data centers rather than run in an insecure fashion. And your job done correctly—which clearly you have done—means that you never have to make that decision because you've approached this the right way from the beginning. Nothing's perfect, but actually caring about it is always the first step.Peter: Yeah. And the other side of that, talking about stability: again, it's avoiding the either/or situation. Alongside those two—stability and security—we work in our cost of goods sold and our operating leverage in other aspects of our business. And in every single one of them, our co-number one priorities are stability and security. And if it costs us a bit more money, if it takes our dev team a little longer, there's not a choice at that point. We're doing the correct thing.Corey: Saving money is almost never the primary objective of any company that you really want to be dealing with unless something bizarre is going on.Peter: Yeah.
Our philosophy on, you know, any cost reduction has been: this should have zero negative impact on our stability. If we do not feel we can safely do this, we won't. And coming back to the Spot Instance piece, that was a journey for us. We tested the waters a bit, we worked very closely with Amazon's team, and we came to the conclusion that we could safely do this. And we've been doing it for over a year and seen no adverse effects.Corey: Yeah. And at a lot of shops I've talked to, when we go and do a consulting project, it's, “Okay. There's a lot of things that could have been done before we got here. Why hasn't any of that been addressed?” And the answer is, “Well. We tried to save money once and it caused an outage and then we weren't allowed to save money anymore. And here we are.” And I absolutely get that perspective. It's a hard balance to strike. It always is.Peter: Yeah. The other aspect where stability and security intertwine is that you can think about security as InfoSec, as locking our systems down, but at the end of the day, why are we doing all that? It's for the benefit of our users. And for Remind, as a communication platform, the safety and security of our users depends on us being up and available so that teachers can reach out to parents with important communication: things like attendance, natural disasters, lockdowns, or any number of difficult situations schools find themselves in. This is part of why we take the stewardship that we have so seriously; being up and protecting our users' data just has such a huge impact on education in this country.Corey: It's always interesting to talk to folks who insist they're making the world a better place.
And it's, “What do you do?” “We're improving ad relevance.” I mean, “Okay, great, good for you.” You're serving a need that I would not shy away from classifying, fundamentally, as critical infrastructure, and that is always a good conversation to have. It's nice being able to talk to folks who are doing things that you can unequivocally look at and say, “This is a good thing.”Peter: Yeah. And around 80% of public schools in the US are using Remind in some capacity. So we're not a product that's used in just a few regions; it's all across the board. One of my favorite things about working at Remind is meeting people and telling them where I work, and they recognize it.They say, “Oh, I have that app, I use that app. I love it.” And I spent years in ads before this, and you know, I've been there, and no one ever told me they were glad to see an ad. That's never the case. And it's been quite a rewarding experience coming in every day and, as you said, being part of this critical infrastructure. That's a special thing.Corey: I look forward to installing the app myself as my eldest prepares to enter public school in the fall. So, now at least I'll have a hotline of exactly where to complain when I don't get the attendance message because, you know, there's no customer quite like a whiny customer.Peter: They're still customers. [laugh]. Happy to have them.Corey: True. We tend to be. I want to thank you for taking so much time out of your day to speak with me. If people want to learn more about what you're up to, where's the best place to find you?Peter: So, from an engineering perspective at Remind, we have our blog, engineering.remind.com. If you want to reach out to me directly, I'm on LinkedIn—a good place to find me—or you can just reach out over email directly: peterh@remind101.com.Corey: And we will put all of that into the show notes. Thank you so much for your time.
I appreciate it.Peter: Thanks, Corey.Corey: Peter Hamilton, VP of Technology at Remind. This has been a promoted episode brought to us by our friends at Redis, and I'm Cloud Economist Corey Quinn. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry and insulting comment that you will then hope that Remind sends out to 20 million students all at once.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
About Philip
Philip Griffiths is VP Global Business Development and regularly speaks at events from DevOps to IoT to Cyber Security. Prior to this, he worked for Atos IT Services in various roles working with C-suite executives to realise their digital transformation. He lives in Cambridge with his wife and two daughters.Links: NetFoundry: https://netfoundry.io/ Blog article: https://netfoundry.io/demystifying-the-magic-of-zero-trust-with-my-daughter-and-opensource/ netfoundry.io/screaminginthecloud: https://netfoundry.io/screaminginthecloud TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Today's episode is brought to you in part by our friends at MinIO, the high-performance, Kubernetes-native object store that's built for the multi-cloud, creating a consistent data storage layer for your public cloud instances, your private cloud instances, and even your edge instances, depending upon what the heck you're defining those as, which depends probably on where you work. Getting that unified is one of the greatest challenges facing developers and architects today. It requires S3 compatibility, enterprise-grade security and resiliency, the speed to run any workload, and the footprint to run anywhere, and that's exactly what MinIO offers. With superb read speeds in excess of 360 gigs and a 100 megabyte binary that doesn't eat all the data you've got on the system, it's exactly what you've been looking for. Check it out today at min.io/download, and see for yourself. That's min.io/download, and be sure to tell them that I sent you.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Today's promoted episode is about a topic that is near and dear to my heart. In the AWS universe, we have seen over time that the networking has gotten more and more capable, going from EC2 Classic to the world of VPCs to a whole bunch of other things. But with that capability comes a stupendous amount of complexity, to the point where the easy answer to, “Do you understand how networking works within AWS?” is, of course, “No, I don't.”I'm joined today by Philip Griffiths, who's the Head of Business Development at NetFoundry. Philip, thank you for joining me.Philip: Pleasure to be here, Corey.Corey: So, NetFoundry has what I would argue to be one of the most intriguing-slash-differentiated approaches to handling that ever-increasing complexity around the networking story, not just in AWS, but in a number of different cloud providers, and between them, and that approach is to ignore it completely.
Have I nailed the salient approach here with that—I guess we'll call it a flippant statement?Philip: Yeah, I'd probably say so. It's the interesting thing where a lot of people say cloud networking is hard, and from our perspective, it should just be super easy: you should be able to provision it in a few minutes with only outbound ports, and set up your policy so that malicious actors can't get inside it. It should be that easy, and programmable, and it's a shame that the current world is not.Corey: One of the hard problems has always been, I guess, security, which is the thing that everyone pretends to care about right up front, but in practice often winds up bolting on after the fact. “We care about security,” is sort of the trademark phrase of the things that we see, usually in an email announcing a data breach when it was very clear that the company did not care about security. It's not just me complaining about how complex the network stack is, but about what directly flows from that. If you aren't able to fit all of that into your head as far as what's going on from a security perspective, the odds of misconfiguration creep in and you don't really become aware of what your risk exposure is. I'm really partial to the idea of just avoiding it entirely. Is NetFoundry, effectively, a network overlay? Is it something that goes a bit beyond that? Effectively, where do you folks start and where do you stop?Philip: Yes, that is precisely correct. We are a network overlay that's been built on the principles of zero trust. What is very unique is the ability to start it wherever you want. So yes, you can deploy it from the AWS Marketplace in a few minutes into your VPC or into your operating system, but we also have the ability to put it directly into the application stack itself, which has some very interesting implications. What I find to be the most interesting starting point is the oxymoron of secure networking.There are no secure networks.
It's not possible. Networks are designed to share information, and taking it to first principles, you can only isolate networks. And this is why we had the thought process that if we're going to put our overlay network into stuff and make it secure, we have to start at the application level, because then we can isolate it to an application communicating with an application, which has profound implications.Corey: The network part is relatively straightforward. I imagine it just becomes, more or less, what resembles a fairly flat network where everything internal is allowed to talk to each other, and then, in turn, this winds up effectively elevating what should be allowed to talk to what, and on what ports and whatnot, into something that's a lot closer to the application logic, and transcends whatever provider it happens to be traversing.Philip: Yeah, correct. Following the principles of zero trust, we utilize strong embedded identity as a function of what the endpoints are, what the source and destination is. And therefore you build up your policies and services to say what should communicate to what, on the basis that the default is least privilege: absolutely nothing. For your underlay, then, the only thing you need is commodity internet with outbound ports. The whole concept of north-south, east-west—if you're app-embedded, you don't even need public DNS; you don't even need DNS at all. Naming conventions go out the window; you don't need to conform to the standards. You know, you could say, “I want to hit Jenkins.” You go to Jenkins because that can be done.Corey: I would approach this entire endeavor with a fair bit of suspicion and no small amount of alarm if it were something that you had developed internally, as far as, “Well, we're just going to replace what amounts to your entire network stack and just go ahead and trust us. It's fine.” But you didn't do that. You're riding on top of the OpenZiti open-source project.
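The default-deny model Philip describes—strong identity on both ends, named services instead of DNS, and nothing reachable unless a policy explicitly allows it—reduces, conceptually, to something like this toy sketch. It illustrates the principle only; it is not NetFoundry's or OpenZiti's actual API, though in OpenZiti the rough equivalents are identities, services, and service policies.

```python
class Overlay:
    """Toy model: a dial succeeds only if (identity, service) was granted."""

    def __init__(self):
        # Least privilege: the allow set starts empty, so by default
        # absolutely nothing can talk to anything.
        self._allowed = set()

    def grant(self, identity, service):
        # A policy pairs a strong endpoint identity with a named service;
        # no IPs, no DNS, just "this identity may reach 'jenkins'".
        self._allowed.add((identity, service))

    def dial(self, identity, service):
        # Default deny: anything not explicitly granted is rejected.
        return (identity, service) in self._allowed


net = Overlay()
net.grant("ci-runner", "jenkins")
assert net.dial("ci-runner", "jenkins") is True
assert net.dial("random-laptop", "jenkins") is False  # invisible by default
```

Note what is absent: there is no notion of a reachable subnet or an open inbound port, which is why the underlay can be plain commodity internet with outbound-only connections.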
And that basically assuages a whole raft of concerns I would have if something like this were proprietary, and people who know what they're doing—who, let's be clear, aren't me—were not able to inspect it and say, "Okay, this passes muster"—as they have done—or alternately, "No, this is terrifyingly dangerous for a variety of excellent reasons." And it really feels like a lot of the zero-trust stories that we see these days that are taking advantage of either a network overlay approach or shifting authentication into a different layer have all taken a somewhat similar tack. I used to think it was a good idea; now I'm starting to suspect it might very well be the only viable model. Do you find that that's accurate, or was this a subject of some contention when you were starting out?

Philip: So, there's two very interesting [sigh] thoughts that came to me as you were saying that. The number one is yes, we drove forward with OpenZiti because we've seen open-source just completely dominate the industry and everything new that's been built. If you want to deploy an application, you're building on Linux. And in fact, you're probably [laugh] also running on Kubernetes if you're building new. And our objective was to be able to turn OpenZiti into, you know, the open-source zero-trust private network equivalent where it's just standard: You'll bake your application with Ziti, by design. It will become a check function that people say you have to comply to. When I look at other vendors and how they look at zero trust, I broadly see a few things that dishearten me. And again, it's a big market, a lot of people—everyone says they're zero-trust nowadays—but I broadly categorize it into a few ways. You have people who are effectively acting as a proxy and they're adding authentication as a way to check what people should have access to. And they may give access to the whole network, they may do granular; it varies between them. In fact, I've just written a blog on this where I effectively call that no-magic zero trust. It's a blog conceptualized within Harry Potter and [unintelligible 00:07:36] a conversation with my daughter.

Corey: Yeah, any way to tell a story that beats the traditional enterprise voice is very much appreciated over in this corner of the world.

Philip: [laugh]. Yeah, exactly. You have a second tier, which is what I like to think of as semi-magical. And that's where you start saying, I am going to use a software-defined perimeter. So, it's first-packet authenticate, or outbound-only based upon embedded identity. And in my eyes, this is basically an invisibility cloak. You then have app-embedded or magical zero trust. And this is where you're putting the invisibility cloak inside your application, but you're also giving it a port key so that when it needs to connect to something else on the other side of the world, it just happens; it's transparent. And broadly speaking, I think it's very good that the whole world, including the US government, is taking zero trust incredibly seriously, but the distribution of how people tackle a problem is wildly different. There are some zero-trust solutions which are going in the right direction, but fundamentally, if you're putting it in front of your—I won't name a vendor, but there was a vendor who in December released a report that said in 90 seconds, common vulnerabilities are exploited something like 96% of the time. 24 hours, 100%. A few days later, they had a 9.8 CVE on their zero-trust VPN concentrator with a public IP, to which I thought, "If you're not patching that immediately, you've got problems if someone is coming into your network."

Corey: Absolutely. We just completed our annual security awareness training here, and so much of it just… it really made my skin crawl. There was an entire module on how to effectively detect phishing emails, and I got to tell you, if they ever start running spellcheck on some of their [spear-phishing 00:09:23] campaigns, then we're all doomed, because that was what the entire training was here. My position is that, okay, if someone in your company clicks a bad link and it destroys the company's infrastructure, maybe it's the person who's clicking the link that is not necessarily the critical failure point here. Great, if someone compromises an employee workstation, there should be a way to contain the blast radius; they should not now be inside the walls and able to traverse into whatever it is that they want. There should be additional barriers, and zero trust—though it has become, as you say, a catch-all term—seems to be a serious way of looking at this type of compromise and this sort of mitigation against that sort of behavior.

Philip: Definitely. And I think that leads itself to, if you're using the correct zero-trust solution, you're able to close [unintelligible 00:10:12] ports, great, you've now massively reduced your attack surface. But what if someone does get a phishing injection of ransomware or something to their endpoint or into their servers? The two things that I like to think about are these: if you're creating your overlay network so that the only communication from your server is outbound into the public IPs of your private overlay, then effectively even if the ransomware gets in there, it can't then connect to its command and control module to then go through the kill cycle to other activities. The other is that if you then look at it [instead 00:10:46] of on the server-side, but actually on the client-side, if someone infects my Mac laptop with ransomware—we use this internal application called Mattermost. And it's basically Slack, but open-source. If my Mattermost is Ziti-fied, even if I've got ransomware on my device, it can't side-channel attack into Mattermost, because you would actually have to break into the Mattermost application and somehow get that Mattermost application to make a compromised query or whatever to get past the system. So really, when I look at zero trust, it's not about saying, "We're secure. Job done. You know, fire the security department because we don't need them anymore." It's all about saying—

Corey: Box check. Hand it off to the auditor.

Philip: [laugh]. Exactly. It's more about saying the cost of attack, the cost of compromise, is increased, ideally, to the point where the malicious actors don't have a return on investment. Because if they don't have a return on investment, they will find something else that's not your applications and your systems to try and compromise.

Corey: I want to make sure that I'm contextualizing this properly, because we're talking—I think—about what almost looks like two different worlds here. There's the, this is how things wind up working in the ecosystem as far as your server environment goes in a cloud provider, but then we're also talking about what goes on in your corporate network of people who are using laptops, which is increasingly being done from home these days. Where do you folks start? Where do you stop? Do you transcend into the corporate network as well, or is this primarily viewed as a production utility?

Philip: We do. One of our original design principles with OpenZiti was for it to be a platform rather than a point solution. So, we designed it from the ground up to be able to support any IP packets, TCP, UDP, et cetera, whether you're doing client-server, server-server, machine-server, server-initiated, client-initiated, yadda, yadda, yadda. So effectively, the same technology can be applied to many different use cases, depending on where you want to use it. We've been doing work recently to handle, let's call them, the hard use cases. Probably one of the hardest ones out there is VoIP. There is a playbook that is currently taking place where the VoIP-managed service provider gets DDoSed by malicious actors; the playbook is to move it onto a CDN so that you move the attack surface and you get respite for a few hours. And there's not really any way to solve it, because blocking DDoS attacks at layer 3, layer 4 is incredibly difficult unless you can make your PBX dark. And I've seen a couple of our OpenZiti engineers making calls from one device to another without going through the PBX by doing that over OpenZiti, and being able to solve some of the challenges that are normally associated with VoIP. Again, it was really one of our design principles: How can we make the platform so flexible that we can do X, Y, Zed today, and we're able to build it, again, to become a standard, because it can handle anything.

Corey: One of the big questions that people are going to have going into this is, and this may sound surprising, a little bit less about technical risk of things like encryption and the rest, and a lot more around the idea of okay, does this mean that what you are building becomes a central point of business risk? In other words, if the NetFoundry SaaS installation—or wherever they happen to be running as their primary—winds up going down, does that mean suddenly nothing can talk to one another? Because it turns out that, you know, computers are not particularly useful in 2022 if they aren't able to talk to other computers, by and large. "The network is the computer," as was famously stated. What is the failure mode in the event that you experience technical interruption?

Philip: We have these internal sessions, which we call Ziti Kitchens, where the engineering team that's creating Ziti educates us on stuff that they're building. And one of them in the Ziti Kitchen was around HA, HS, et cetera, and all of the functions that we've built in so that you have redundancy and availability within the different components. Because effectively it's an overlay network, so we've designed it to be a mesh overlay network. You can set it up with one point of failure, but then simultaneously, you can very easily set it up to have no points of failure, because it can have that redundancy, and the overlay has its own mechanisms to do things like smart routing and calculation of underlying costs. That cost in that instance would be, well, AWS has gone down, so the latency to send a packet or flow over it is incredibly high, therefore I'm going to avoid that route and send the traffic to another location. I always remember this Ziti Kitchen episode because the underlying technology that does it is called Terminators—Ziti has these things called Terminators—and on one of the slides there were these little Terminator heads with the red eyes, you know, the silver exoskeleton, which always made me laugh.

Corey: It's helpful to have things that fail out of band as opposed to—think of the traditional history in security before everything was branded with zero-trust as a prerequisite for exhibiting at RSA; before that, firewalls were the story, and the question always was, if a firewall fails, do you want it to fail open or fail closed? And believe it or not, there are legitimate answers in both directions; it depends on context and what you're doing. There are some things, for example, IAM in a cloud world, where you absolutely never want to fail open, full stop. You would rather someone bodily rip the power cable out of the back of the data center rather than let that happen.
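Philip's description of the mesh—every link carries a cost, and when a provider goes down its cost spikes and traffic reroutes around it—can be sketched as a toy shortest-path calculation. This is purely illustrative: the node names are invented and this is not NetFoundry's actual routing code.

```python
# Toy model of the overlay's "smart routing": each link carries a cost (here,
# latency), traffic takes the cheapest path, and when a provider degrades its
# link cost spikes, so the route recalculates around it.

import heapq

def cheapest_path(links, src, dst):
    """Dijkstra over a dict of {(a, b): cost}; links are treated as bidirectional."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    heap, seen = [(0, src, [src])], set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, c in graph.get(node, []):
            if nxt not in seen:
                heapq.heappush(heap, (cost + c, nxt, path + [nxt]))
    return float("inf"), []

links = {("client", "aws-edge"): 5, ("aws-edge", "app"): 5,
         ("client", "gcp-edge"): 20, ("gcp-edge", "app"): 20}
print(cheapest_path(links, "client", "app"))  # routes via the cheap aws-edge hop

links[("client", "aws-edge")] = 10_000        # "AWS has gone down": latency spikes
print(cheapest_path(links, "client", "app"))  # reroutes via gcp-edge instead
```

The overlay's per-link cost plays the same role as a routing protocol metric, except it is recomputed continuously from observed conditions rather than configured once.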
With something like this, where nothing is able to talk to one another if the entire system goes down, yeah, you want the control system that you folks run to be out of band; that is almost always the right answer. As I look at the various case studies that you have on your website and the serious companies that are using what you have built, do you find that they are primarily centralizing around individual cloud providers? Are you seeing that they're using this as an expression of multi-cloud? Because I can definitely see a story where, oh, it helps bring two cloud providers from a networking and security perspective onto the same page, but I can also see, even within one cloud provider, the idea that, hey, I don't have to play around with your ridiculous nonsense. What use cases are you seeing emerge among your customers?

Philip: Definitely, the multi-cloud challenge is one that we're seeing as an emerging trend. We do a lot of work with Oracle and, you know, their stated position is multi-cloud is a fact. In fact for them, if we make the secure networking easier, we can bring workloads into our cloud quicker [unintelligible 00:17:21] the main driver behind our partnership. We recently did a blog talking about Superclouds and the advent of organizations like Snowflake and HashiCorp and Confluent and Databricks basically building value and business applications which abstract away the underlying complexity. But you get into the problem of the standard shared security model, where the customer has to deal with DNS and VPNs and MPLS and AWS Private Endpoint or Azure Private Link or whatever they call it, and you have to assemble this Frankenstein of stuff just to enable a VM to communicate with another VM. And the posit of our blog—in fact, we use that exact quote—John Gage—"The network is the computer." If you can put a network inside the application, you've now given your supercloud superpowers because [unintelligible 00:18:13] natively—I mean, this is a very marketing term, but, "Develop once; deploy anywhere," and be multi-cloud-native.

Corey: The idea of being able to adapt to emerging usage patterns without a full-on redeploy is handy. What I also would like to highlight, too, is that you are, of course, a network overlay, and that is something that is fairly well understood and people have seen it, but your preferred adoption model goes up a couple of steps beyond that into altering the way that the application thinks about these things. And you offer an SDK that ranges from a single line of code implementation to, I think, up to 20, so it's not a massive rewrite of the application, but it does require modification of the stack. What does that buy you, for lack of a better term? Because once the application becomes aware of what is effectively its own "special network," quote-unquote—it's work to wind up modifying existing applications around something like this. What's the payoff?

Philip: So, there's three broad ones that immediately come to my mind. Number one is the highest security that effectively—your private network is inside the app, so you have to somehow break into the app, and that can be incredibly complicated, particularly if you run the app in something like a confidential compute enclave; you can now have a distributed confidential system. The second is what you're getting in programmability. You're able to effectively operate in a fully—even, you know, you get to a GitOps environment. We're currently working on documentation which says, "Hey, you can do all this stuff in GitOps and then it'll go into your CI/CD and that'll talk to the APIs." And it'll effectively do everything in a completely programmable manner so that you can treat your private networks as cattle rather than as pets. The third is transparency. You used the words earlier of bolt-on networking, because that's how we always think about networking security: We bolt it on. As a user, we have to jump through the VPN hoop, we have to go through the bastion, we have to interact with the network. If your private network's inside the application, then you interact with the application. I can have a mobile application on my device and have no idea that it's part of a private network and that the API is private and the malicious actors can't get to it. I just interact with the application. That is it. That is what no one else has the ability to do, and where OpenZiti has its most power, because then you get rid of the constant tug of war between the security team that wants to lock everything down and the users and the developers who want to move fast and give a great experience. You can effectively have your cake and eat it.

Corey: The challenge, of course, with rolling a lot of these things out in a way that becomes highly programmable is that it unlocks a bunch of capability, but the double-edged sword there is always one of complexity. I mean, we take a look at the way that AWS networking has progressed, and they finally rolled out the VPC Reachability Analyzer, so when two things can't talk to each other, well, you run this thing and it tells you exactly why, which is super handy.
And then, just as a way of twisting the knife a little bit, every time you run it, they charge you ten cents for the privilege, which doesn't actually matter in the context of what anyone is being compensated for, until and unless you build this into something programmatic, but it stings a little bit. And the idea of being able to program these things to abstract away a lot of that complexity is incredibly compelling, except for the part where now it feels like it really increases developer burden on a lot of these things. Have you found that to be true? Do you find that it is sort of like a sliding scale? What has the customer experience been around this?

Philip: I would say a sliding scale. You know, we had one organization who started with the OpenZiti Tunnelers, and then we convinced them to use the SDK and [unintelligible 00:21:51], "Oh, this was super easy." And now they just run OpenZiti themselves. But then they've also said at some point, we'll use the NetFoundry platform, which effectively gives us a SaaS experience in consuming that. One of the huge focuses—well, we've got a few big focuses for product development, but one of the really big areas is giving more visibility and monitoring, so that rather than people having to react to configuration problems or things which they need to fix in order to ensure your perfect network overlay, instead, those things are being seen and automatically dealt with—human-in-the-loop if you want it—in order to remove that burden. Because ultimately, if you can get the network to a point where, as long as you've got underlay and you've set your policy, the overlay is going to work, it's going to be secure, and it's going to give you the uptime you need, that is the Nirvana that we all have to strive for.

Corey: This episode is sponsored in part by our friends at Vultr. Spelled V-U-L-T-R because they're all about helping save money, including on things like, you know, vowels. So, what they do is they are a cloud provider that provides surprisingly high-performance cloud compute at a price that—while sure, they claim it's better than AWS pricing—and when they say that, they mean it is less money. Sure, I don't dispute that, but what I find interesting is that it's predictable. They tell you in advance on a monthly basis what it's going to cost. They have a bunch of advanced networking features. They have nineteen global locations and scale things elastically. Not to be confused with openly, because apparently elastic and open can mean the same thing sometimes. They have had over a million users. Deployments take less than sixty seconds across twelve pre-selected operating systems. Or, if you're one of those nutters like me, you can bring your own ISO and install basically any operating system you want. Starting with pricing as low as $2.50 a month for Vultr cloud compute, they have plans for developers and businesses of all sizes, except maybe Amazon, who stubbornly insists on having something to scale all on their own. Try Vultr today for free by visiting: vultr.com/screaming, and you'll receive $100 in credit. That's V-U-L-T-R.com slash screaming.

Corey: A common criticism of things that, shall we say, abstract away the network is a fairly common, predictable failure mode. I've been making fun of Kubernetes on this particular point for years, and I'm annoyed that at the time that we're recording this, that is still accurate. But from the cloud providers' perspective, when you run Kubernetes, it looks like one big, really strangely behaved single-tenant application. And Kubernetes itself is generally not aware of zone affinity, so it could just as easily wind up tossing traffic to the node next to it at zero cost, or across an availability zone at two cents per gigabyte, or, God forbid, across the internet at nine cents a gigabyte and counting, depending upon how it works. And the application side has absolutely no conception of this. How does OpenZiti address this in the real world? Because it's one of those things where it almost doesn't matter what you folks charge on top of it, but instead, oh wow, this winds up being so hellaciously expensive that we can't use it regardless of whatever benefit it provides, just because it becomes a non-starter.

Philip: So, when we built the overlay and the mesh, we did it from the perspective of making it as programmable and self-driven as possible. So, with the whole Terminator strategies that were mentioned earlier, it gives you the ability to start putting logic into how you want packets to flow. Today, it does it on a calculation of end-to-end latency, and chooses and reroutes traffic accordingly. But there's no reason that you couldn't hook it up into understanding what is the numerical, monetary cost for sending a packet along a certain path. Or even, what is my application performance monitoring tool saying? Because what that says versus what the network believes could be different things. And effectively you can ingest that information to make your smart routing decisions, so all of that logic can exist within the overlay that operates for you.

Corey: I will say that really harkens back, on some level, to what I was experimenting with back when I got my CCNA many years ago, where routing protocols have built into them the idea of the cost of a link. I will freely admit slash confess that at the time, I assumed the low-cost link was about what was congested or what would wind up having, theoretically, some transit versus peering agreement. It never occurred to me that I'd have to think about those things in a local network and have to calculate in the Byzantine pricing models of cloud providers.
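The idea of folding monetary cost into the routing decision can be sketched with the per-gigabyte rates Corey mentions above (free to the node next door, roughly two cents cross-AZ, nine cents over the internet). The replica names and the dollar weighting are invented for illustration; this is not how OpenZiti actually computes link costs.

```python
# Sketch: score each candidate path by latency plus a tunable penalty for
# egress dollars, so the free same-AZ replica wins until it degrades.

PER_GB = {"same-az": 0.00, "cross-az": 0.02, "internet": 0.09}

def path_cost(latency_ms: float, link_type: str, gb: float,
              dollar_weight: float = 100.0) -> float:
    """Composite cost: latency plus a weighted penalty for transfer dollars."""
    return latency_ms + dollar_weight * PER_GB[link_type] * gb

paths = {
    "replica-local":    path_cost(1.0,  "same-az",  gb=10),
    "replica-other-az": path_cost(1.5,  "cross-az", gb=10),
    "replica-remote":   path_cost(30.0, "internet", gb=10),
}
best = min(paths, key=paths.get)
print(best)  # replica-local: free and fast, until it degrades
```

Tuning `dollar_weight` is exactly the business knob discussed next: a high weight means "stay local even if it hurts," a low weight means "pay for the expensive link rather than degrade."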
But I've seen examples of folks who are using OpenZiti and NetFoundry alike to wind up building in these costing models, so that yeah, ideally, it just keeps everything local, but if that path degrades, then yes, we would prefer to go over an expensive link than to basically have TCP terminate on the floor until everything comes back up. It sort of feels like there's an awful lot of logic you can bake in that goes well beyond what routing protocols are capable of, just by virtue of exposing that programmability. Well, for this customer, because they're on the pre—on the extreme tier, then we want to have the expensive fallback; for low-tier customers, we might want to have them just have an outage until things end. And it really comes down to letting business decisions express themselves in terms of application behavior while in a degraded state. I love that idea.

Philip: Yeah, I understand. We don't do it today, but there will be a point in the future—I strongly believe—that we'll be able to say, hey, I'll give you an SLA on the internet. Because we'll have such path diversity and visibility of how the internet operates that we'll be able to say within certain risk parameters what we can deliver. But then you can take it to other logical extremes. You could say, "Hey, I want to build a green overlay. I want to make sure that I'm using Arm instances in data centers with renewable energy so that my network is green." Or you could say, I want a GDPR-compliant overlay so that my data stays within a certain country. You start being able to say—you know, really start dreaming up what are the different policies that I can apply to this, because you're applying a central policy to what is then a distributed system.

Corey: One last topic I want to cover before we call it an episode is that you are, effectively, a SaaS company that is built on top of an open-source project. And that has been an interesting path for a lot of companies that, early on, figured that since they wrote the software and their contributors were doing the lion's share of contribution, they were clearly the best people to run it. And Amazon's approach towards operational excellence—as they called it—wound up causing some challenges when they launched the Amazon Basics version of that service. I feel like there are some natural defenses built into OpenZiti to keep it from suffering that fate, but I'm very curious to get your take on it.

Philip: Fundamentally, our take is that—in fact, our mission is to take what was previously impossible and turn it into a standard. And the only way you can really create standards is to have an open-source project that is adopted by the wider community and that ecosystems get built around and into. And that means giving OpenZiti to absolutely everyone so that they can use it, they can innovate on top of it. We all know that very few people actually want to host their own infrastructure, so we assume a large percentage of people will come and go, "Hey, NetFoundry, you provide us the hosting, you provide us the SaaS capability so we don't have to do that ourselves." But fundamentally, in the knowledge that there's something bigger, because it's not just us maintaining this project; there's a bunch of people who are doing pull requests and finding cool, fun ways to build further value on what we can build ourselves. We believe recent history is littered with examples of the new world built on open-source. And fundamentally, we think that's really the only way to be able to change an industry as profoundly as we intend to.

Corey: I would also argue that, to be very direct—and I can probably get away with saying this in a way that I suspect you might not be able to—but if AWS had it in their character to simplify things and make it a lot easier for people to work with in a networking sense, what's stopping them? They didn't need to wait for an open-source company to wind up coming out of nowhere and demonstrating the value of this. Customers have been asking for it for years. I think that at this point, this is something that is unlikely to ever wind up being integrated into a cloud provider's primary offering. Until and unless the entire industry shifts, at which point we're having a radically different conversation very far down the road.

Philip: Yeah, potentially, because it opens up the interesting question that if you make it so easy for someone to take their data out, do they use your cloud less? There are some cloud providers that will lean into that because they do see more clouds in the future, and others that won't. I see it more myself that as those kinds of things happen, it'll be done on a product-by-product basis. For example, we're talking to an organization, and [unintelligible 00:29:49] like, "Oh, could you Ziti-fy our JDBC driver so that when users access our database, they don't have to use a VPN?" [unintelligible 00:29:55], "Yeah. We've already done that with JDBC. We called it ZDBC." So, instead of using the general industry one—probably the Oracle one or something, because that's kind of standard—we'll take the one that you've created for yourself and be able to solve that problem for you.

Corey: I really want to thank you for taking the time to speak with me today. If people want to learn more, where's the best place to find you?

Philip: The best place to go to is netfoundry.io/screaminginthecloud. From there, anyone can grab some free Ziggy swag. Ziggy's our little open-source mascot, a cute little piece of pasta with many different outfits. A little sass as well. And you can find further information both on OpenZiti and NetFoundry.

Corey: And we will put links to both of those in the [show notes 00:30:40]. Thanks so much for taking the time to speak with me today. I really appreciate it.

Philip: It's a pleasure. Thanks, Corey.

Corey: Philip Griffiths, Head of Business Development at NetFoundry. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment telling me exactly why I'm wrong about AWS's VPC complexity, and that comment will get moderated and I won't get to read it until you pay me ten cents to tell you how it got moderated.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.
About ClintClint is the CEO and a co-founder at Cribl, a company focused on making observability viable for any organization, giving customers visibility and control over their data while maximizing value from existing tools.Prior to co-founding Cribl, Clint spent two decades leading product management and IT operations at technology and software companies, including Splunk and Cricket Communications. As a former practitioner, he has deep expertise in network issues, database administration, and security operations.Links: Cribl: https://cribl.io/ Cribl.io: https://cribl.io Docs.cribl.io: https://docs.cribl.io Sandbox.cribl.io: https://sandbox.cribl.io TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Today's episode is brought to you in part by our friends at MinIO the high-performance Kubernetes native object store that's built for the multi-cloud, creating a consistent data storage layer for your public cloud instances, your private cloud instances, and even your edge instances, depending upon what the heck you're defining those as, which depends probably on where you work. It's getting that unified is one of the greatest challenges facing developers and architects today. It requires S3 compatibility, enterprise-grade security and resiliency, the speed to run any workload, and the footprint to run anywhere, and that's exactly what MinIO offers. With superb read speeds in excess of 360 gigs and 100 megabyte binary that doesn't eat all the data you've gotten on the system, it's exactly what you've been looking for. Check it out today at min.io/download, and see for yourself. 
That's min.io/download, and be sure to tell them that I sent you.Corey: This episode is sponsored in part by our friends at Sysdig. Sysdig is the solution for securing DevOps. They have a blog post that went up recently about how an insecure AWS Lambda function could be used as a pivot point to get access into your environment. They've also gone deep in-depth with a bunch of other approaches to how DevOps and security are inextricably linked. To learn more, visit sysdig.com and tell them I sent you. That's S-Y-S-D-I-G dot com. My thanks to them for their continued support of this ridiculous nonsense.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I have a repeat guest joining me on this promoted episode. Clint Sharp is the CEO and co-founder of Cribl. Clint, thanks for joining me.Clint: Hey, Corey, nice to be back.Corey: I was super excited when you gave me the premise for this recording because you said you had some news to talk about, and I was really excited that oh, great, they're finally going to buy a vowel so that people look at their name and understand how to pronounce it. And no, that's nowhere near forward-looking enough. It's instead it's some, I guess, I don't know, some product announcement or something. But you know, hope springs eternal. What have you got for us today?Clint: Well, one of the reasons I love talking to your audiences because product announcements actually matter to this audience. It's super interesting, as you get into starting a company, you're such, like, a product person, you're like, “Oh, I have this new set of things that's really going to make your life better.” And then you go out to, like, the general media, and you're like, “Hey, I have this product.” And they're like, “I don't care. What product? Do you have a funding announcement? Do you have something big in the market that—you know, do you have a new executive? 
Do you”—it's like, “No, but, like, these features, like these things, that we—the way we make our lives better for our customers. Isn't that interesting?” “No.”

Corey: Real depressing once you—“Do you have a security breach to announce?” It's, “No. God no. Why would I wind up being that excited about it?” “Well, I don't know. I'd be that excited about it.” And yeah, the stuff that mainstream media wants to write about in the context of tech companies is exactly the sort of thing that tech companies absolutely do not want to be written about for. But fortunately, that is neither here nor there.

Clint: Yeah, they want the thing that gets the clicks.

Corey: Exactly. You built a product that absolutely resonates in its target market and outside of that market. It's one of those, what is that thing, again? If you could give us a light refresher on what Cribl is and does, you'll probably do a better job of it than I will. We hope.

Clint: We'd love to. Yeah, so we are an observability company, fundamentally. I think one of the interesting things to talk about when it comes to observability is that observability and security are merging. And so I like to say observability and include security people. If you're a security person, and you don't feel included by the word observability, sorry. We also include you; you're under our tent here.

So, we sell to technology professionals, we help make their lives better. And we do that today through a flagship product called LogStream—which is part of this announcement, we're actually renaming to Stream. In some ways, we're dropping logs—and we are a pipeline company.
So, we help you take all of your existing agents, all of your existing data that's moving, and we help you process that data in the stream to control costs and to send it to multiple places.

And it sounds kind of silly, but one of the biggest problems that we end up solving for a lot of our enterprises is, “Hey, I've got, like, this old Syslog feed coming off of my firewalls”—like, you remember those things, right? Palo Alto firewalls, ASA firewalls—“How do I actually get that thing to multiple places? Because, hey, I want to get that data into another security solution. I want to get that data into a data lake. How do I do that?” Well, in today's world, that, it turns out, is sort of a neglected set of features: for the vendors who provide you logging solutions, being able to reshape that data, filter that data, and control costs wasn't necessarily at the top of their priority list.

It wasn't nefarious. It wasn't like people were like, “Oh, I'm going to make sure that they can't process this data before it comes into my solution.” It's more just, like, “I'll get around to it eventually.” And the eventually never actually comes. And so our streaming product helps people do that today.

And the big announcement that we're making this week is that we're extending that same processing technology down to the endpoint with a new product we're calling Cribl Edge. And so we're taking our existing best-in-class management technology, and we're turning it into an agent. And that seems kind of interesting because… I think everybody sort of assumed that the agent is dead. Okay, well, we've been building agents for a decade or two decades.
Isn't everything exactly the same as it was before? But we really saw kind of a dearth of innovation in that area in terms of being able to manage your agents, being able to understand what data is available to be collected, being able to auto-discover the data that needs to be able to be collected, turning those agents into interactive troubleshooting experiences so that we can, kind of, replicate the ability to zoom into a remote endpoint and replicate that Linux command line experience that we're not supposed to be getting anymore because we're not supposed to SSH into boxes anymore. Well, how do I replicate that? How do I see how much disk is on this given endpoint if I can't SSH into that box? And so Cribl Edge is a rethink about making this rich, interactive experience on top of all of these agents that become this really massive distributed system that we can process data all the way out at where the data is being emitted.

And so that means that now we don't nec—if you want to process that data in the stream, okay, great, but if you want to process that data at its origination point, we can actually provide you cheaper cost because now you're using a lot of that capacity that's sitting out there on your endpoints that isn't really being used today anyway—the average utilization of a Kubernetes cluster is like 30%—

Corey: It's that high. I'm sort of surprised.

Clint: Right? I know. So, Datadog puts out the survey every year, which I think is really interesting, and that's a number that always surprised me is just that people are already paying for this capacity, right? It's sitting there, it's on their AWS bill already, and with that average utilization, a lot of the stuff that we're doing in other clusters, or while we're moving that data can actually just be done right there where the data is being emitted.
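As a rough illustration of that idea—processing events on the host that emits them instead of after shipping—here is a toy filter-and-route step in Python. The event shape, severity field, and destination names are invented for the example; Cribl's actual pipeline configuration looks nothing like this.

```python
# Toy sketch of edge processing: drop low-value events, then fan the
# survivors out to several destinations *before* anything leaves the host.
# Event shape and destination names are made up for illustration.

def process(events, min_severity=3):
    """Filter events locally, then route one copy per destination."""
    kept = [e for e in events if e["severity"] >= min_severity]
    routes = {"siem": list(kept), "data_lake": list(kept)}
    dropped = len(events) - len(kept)  # events that never cross the network
    return routes, dropped

events = [
    {"msg": "deny tcp 10.0.0.5:443", "severity": 5},
    {"msg": "keepalive", "severity": 1},
]
routes, dropped = process(events)
```

Here the noisy keepalive is discarded at the origination point, so both downstream systems receive only the firewall deny and the egress bill reflects one event, not two.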
And also, if we're doing things like filtering, we can lower egress charges, there's lots of really, really good goodness that we can do by pushing that processing closer to its origination point.

Corey: You know, the timing of this episode is somewhat apt because as of the time that we're recording this, I spent most of yesterday troubleshooting and fixing my home wireless network, which is a whole Ubiquiti-managed thing. And the controller was one of their all-in-one box things that kept more or less power cycling for no apparent reason. How do I figure out why it's doing that? Well, I'm used to, these days, doing everything in a cloud environment where you can instrument things pretty easily, where things start and where things stop is well understood. Finally, I just gave up and used a controller that's sitting on an EC2 instance somewhere, and now, great, now I can get useful telemetry out of it because now it's stuff I know how to deal with.

It also turns out that, surprise, my EC2 instance is not magically restarting itself due to heat issues. What a concept. So, I have a newfound appreciation for the fact that oh, yeah, not everything lives in a cloud provider's regions. Who knew? This is a revelation that I think is going to be somewhat surprising for folks who've been building startups and believe that anything that's older than 18 months doesn't exist.

But there's a lot of data centers out there, there are a lot of agents living in all kinds of different places. And workloads continue to surprise me even now, just looking at my own client base. It's a very diverse world when we're talking about whether things are on-prem or whether they're in cloud environments.

Clint: Well, also, there's a lot of agents on every endpoint, period, just due to the fact that the security guys want an agent, the observability guys want an agent, the logging people want an agent.
And then suddenly, I'm, you know, I'm looking at every endpoint—cloud, on-prem, whatever—and there's 8, 10 agents sitting there. And so I think a lot of the opportunity that we saw was, we can unify the data collection for metric-type data. So, we have some really cool defaults. [unintelligible 00:07:30] this is one of the things where I think people don't focus much on, kind of, the end-user experience. Like, let's have reasonable defaults.

Let's have the thing turn on, and actually, most people's needs are met without tweaking any knobs or buttons, and no diving into YAML files and looking at documentation and trying to figure out exactly the way I need to configure this thing. Let's collect metric data, let's collect log data, let's do it all from one central place with one agent that can send that data to multiple places. And I can send it to Grafana Cloud, if I want to; I can send it to Logz.io, I can send it to Splunk, I can send it to Elasticsearch, I can send it to AWS's new Elasticsearch-y thing that we don't know what they're going to call yet after the lawsuit. Any of those can be done right from the endpoint from, like, a rich graphical experience, where I think there's really a desire now for people not to have to jump into these configuration files, because for a lot of these users, this is a part-time job. And so hey, if I need to go set up data collection, do I want to learn about this detailed YAML file configuration that I'm only going to do once or twice, or should I be able to do it in an easy, intuitive way, where I can just sit down in front of the product, get my job done, and move on without having to go learn some sort of new configuration language?

Corey: Once upon a time, I saw an early circa 2012, 2013 talk from Jordan Sissel, who is the creator of Logstash, and he talked a lot about how challenging it was to wind up parsing all of the variety of log files out there.
Even something as relatively straightforward—wink, wink, nudge, nudge—as timestamps was an absolute monstrosity. And a lot of people have been talking in recent years about OpenTelemetry being the lingua franca that everything speaks so that is the wave of the future, but I've got to level with you: looking around, it feels like these people are living in a very different reality than the one that I appear to have stumbled into, because the conversations people are having about how great it is sound amazing, but nothing that I'm looking at—granted, from a very particular point of view—seems to be embracing it or supporting it. Is that just because I'm hanging out in the wrong places, or is it still a great idea whose time has yet to come, or something else?

Clint: So, I think a couple things. One is every conversation I have about OpenTelemetry is always, “Will be.” It's always in the future. And there's certainly a lot of interest. We see this from customer after customer, they're very interested in OpenTelemetry and what the OpenTelemetry strategy is, but as an example, the OpenTelemetry logging specification is not yet finalized; they believe that they're still six months to a year out. It seems to be perpetually six months to a year out.

They are finalized for metrics and they are finalized for tracing. Where we see OpenTelemetry tends to be with companies like Honeycomb, companies like Datadog with their tracing product, or Lightstep. So, for tracing, we see OpenTelemetry adoption. But tracing adoption is also not that high either, relative to just general metrics or logs.

Corey: Yeah, the tracing implementations that I've seen, for example, Epsagon did this super well, where it would take a look at the Lambda functions built into an application, and ah, we're going to go ahead and instrument this automatically using layers or extensions for you.
And life was good because suddenly you got very detailed breakdowns of exactly how data was flowing in the course of a transaction through 15 Lambda functions. Great. With everything else I've seen, it's, “Oh, you have to instrument all these things by hand.” Let me shortcut that for you: That means no one's going to do it. They never are.

Anytime you have to do that undifferentiated heavy lifting of making sure that you put the finicky code just so into your application's logic, it's a shorthand for it's only going to happen when you have no other choice. And I think that trying to surface that burden to the developer, instead of building it into the platform so they don't have to think about it, is inherently the wrong move.

Clint: I think there's a strong belief in Silicon Valley that—similar to, like, Hollywood—the biggest export Silicon Valley is going to have is culture. And so that's going to be this culture of, like, developers supporting their stuff in production. I'm telling you, I sell to banks and governments and telcos and I don't see that culture prevailing. I see an application developed by Accenture that's operated by Tata. That's a lot of inertia to overcome and a lot of regulation to overcome as well, and so, like, we can say that, hey, separation of duties isn't really a thing and developers should be able to support all their own stuff in production.

I don't see that happening. It may happen. It'll certainly happen more than zero. And tracing is predicated on the whole idea that the developer is scratching their own itch. Like, I am in production and troubleshooting this, and so I need this high-fidelity trace-level information to understand what's going on with this one user's experience—but that doesn't tend to be how things are actually troubleshot in the enterprise.

And so I think that more than anything is the headwind that's slowing down distributed tracing adoption.
It's because you're putting the onus of solving the problem on a developer who never ends up using the distributed tracing solution to begin with, because there's another operations department over there that's actually operating the thing on a day-to-day basis.

Corey: Having come from one of those operations departments myself, the way that I would always fix things was—you know, in the era that I was operating, it made sense—you'd SSH into a box and kick the tires, poke around, see what's going on, look at the logs locally, look at the behavior the way you'd expect to. These days, that is considered a screamingly bad anti-pattern, and it's something that companies try their damnedest to avoid doing at all. When did that change? And what is the replacement for that? Because every time I asked people for the sorts of data that I would get from that sort of exploration when they're trying to track something down, I'm more or less met with blank stares.

Clint: Yeah. Well, I think that's a huge hole and one of the things that we're actually trying to do with our new product. And I think the… how do I replicate that Linux command line experience? So, for example, something as simple, like, we'd like to think that these nodes are all ephemeral, but there's still a disk, whether it's virtual or not; that thing sometimes fills up, so how do I even do the simple thing like df -kh and see how much disk is there if I don't already have all the metrics collected that I needed? Or I need to go dive deep into an application and understand what that application is doing or seeing, what files it's opening, or what log files it's writing even?

Let's give some good examples. Like, how do I even know what files an application has open? Actually, all that information is all there; we can go discover that.
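On Linux, much of that information really is just sitting there. A minimal sketch of the kind of interrogation Clint describes—disk headroom and the files a process currently has open—using only the Python standard library (a generic illustration, not how Cribl Edge is implemented):

```python
import os
import shutil

def disk_report(path="/"):
    """Rough stand-in for `df -kh` on a single mount point."""
    usage = shutil.disk_usage(path)

    def to_gb(n):
        return round(n / 1e9, 1)

    return {"total_gb": to_gb(usage.total),
            "used_gb": to_gb(usage.used),
            "free_gb": to_gb(usage.free)}

def open_files(pid):
    """Files a process holds open, read from /proc (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    if not os.path.isdir(fd_dir):
        return []  # not Linux, or no permission to inspect this PID
    targets = []
    for fd in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, fd)))
        except OSError:
            pass  # fd closed between listdir() and readlink()
    return targets
```

Calling `disk_report("/")` and `open_files(some_pid)` on the box itself costs a few system calls; shipping every metric and file-table snapshot to a central store just in case costs money whether or not anyone ever looks at it.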
And so some of the things that we're doing with Edge is trying to make this rich, interactive experience where you can actually teleport into the end node and see all the processes that are running and get a view that looks like top and be able to see how much disk is there and how much disk is being consumed. And really kind of replicating that whole troubleshooting experience that we used to get from the Linux command line, but now instead, it's a tightly controlled experience where you're not actually getting an arbitrary shell, where I could do anything that could give me root-level access, or exploit holes in various pieces of software, but really trying to replicate getting you that high-fidelity information, because you don't need any of that information until you need it.

And I think that's part of the problem that's hard with shipping all this data to some centralized platform and getting every metric and every log and moving all that data: the data is worthless until it isn't worthless anymore. And so why do we even move it? Why don't we provide a better experience for getting at the data at the time that we need to be able to get at the data? Or the other thing that we get to change fundamentally is if we have the edge available to us, we have way more capacity. I can store a lot of information in a few kilobytes of RAM on every node, but if I bring thousands of nodes into one central place, now I need a massive amount of RAM and a massive amount of cardinality, when really what I need is the ability to actually go interrogate what's running out there.

Corey: The thing that frustrates me the most is the way that I go back and find my old debug statements, which is, you know, I print out whatever it is that the current status is and so I can figure out where something's breaking.

Clint: [Got here 00:15:08].

Corey: Yeah. I do it within AWS Lambda functions, and that's great.
And I go back and I remove them later when I notice how expensive CloudWatch Logs are getting, because at 50 cents per gigabyte of ingest on those things, if you have that Lambda function firing off a fair bit, that starts to add up when you've been excessively wordy with your print statements. It sounds ridiculous, but okay, then you're storing it somewhere. If I want to take that log data and have something else consume it, that's nine cents a gigabyte to get it out of AWS, and then you're going to want to move it again from wherever it is over there—potentially to a third system, because why not?—and it seems like the entire purpose of this log data is to sit there and be moved around, because every time it gets moved, it winds up somehow costing me yet more money. Why do we do this?

Clint: I mean, it's a great question, because one of the things that I think we decided 15 years ago was that the reason to move this data was because that data may go poof. So, it was on a, you know, back in my day, it was an HP DL360 1U rackmount server that I threw in there, and it had RAID 0 disks, and so if that thing went dead, well, we didn't care, we'd replace it with another one. But if we wanted to find out why it went dead, we wanted to make sure that the data had moved before the thing went dead. But now that DL360 is a VM.

Corey: Yeah, or a container that is going to be gone in 20 minutes. So yeah, you don't want to store it locally on that container. But disks are also a fair bit more durable than they once were, as well. And S3 talks about its 11 nines of durability. That's great and all, but most of my application logs don't need that. So, I'm still trying to figure out where we went wrong.

Clint: Well, I think it was right for the time. And I think now we have durable storage at the edge, where that blob storage has already been replicated three times and we can reattach—if that box crashes, we can reattach new compute to that same block storage.
Actually, AWS has some cool features now: you can actually attach multiple VMs to the same block store. So, we could actually even have logs being written by one VM, but processed by another VM. And so there are new primitives available to us in the cloud, so we should be going back and re-questioning all of the things that we did 10 to 15 years ago and all the practices that we had, because they may not be relevant anymore, but we just never stopped to ask why.

Corey: Yeah, multi-attach was rolled out with their io2 volumes, which are spendy but great. And they do warn you that you need a file system that actively supports that and applications that are aware of it. But cool, they have specific use cases that they're clearly imagining this for. But ten years ago, we were building things out, and, “Ooh, EBS, how do I wind up attaching that from multiple instances?” The answer was, “Ohh, don't do that.”

And that shaped all of our perspectives on these things. Now suddenly, you can. Is that, “Ohh, don't do that,” visceral gut reaction still valid? People don't tend to go back and re-examine the why behind certain best practices until long after those best practices are now actively harmful.
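To put numbers on the "paying to move data around" point from a few minutes earlier: using the per-gigabyte rates Corey quoted ($0.50 CloudWatch Logs ingest, $0.09 AWS egress—treat both as illustrative, since pricing changes and varies by volume), the cost of shipping logs scales linearly with every extra hop:

```python
def monthly_log_cost_usd(gb_per_day, ingest_per_gb=0.50,
                         egress_per_gb=0.09, extra_hops=1):
    """Ingest the data once, then pay egress for each onward move.

    Rates default to the per-GB figures quoted in the conversation;
    they are illustrative, not current price-list values.
    """
    gb = gb_per_day * 30  # approximate a month as 30 days
    return gb * (ingest_per_gb + egress_per_gb * extra_hops)

# 5 GB/day of chatty Lambda print statements, forwarded to one
# external system: 150 GB * ($0.50 + $0.09) = $88.50 a month.
```

Each additional destination adds another egress term but no new information, which is the arithmetic behind filtering at the origination point instead.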
Maybe we can actually revisit a lot of these architectural assumptions, drive cost down, give more capability than we actually had before for fundamentally cheaper. And that's kind of what Cribl does: we look at software to say, “Man, like, let's question everything and let's go back to first principles.” “Why do we want this information?” “Well, I need to troubleshoot stuff.” “Okay, well, if I need to troubleshoot stuff, well, how do I do that?” “Well, today we move it, but do we have to? Do we have to move that data?” “No, we could probably give you an experience where you can dive right into that endpoint and get really, really high-fidelity data without having to pay to move that and store it forever.” Because also, like, telemetry information, it's basically worthless after 24 hours; like, if I'm moving that and paying to store it, then now I'm paying for something I'm never going to read back.

Corey: This episode is sponsored in part by our friends at Vultr. Spelled V-U-L-T-R because they're all about helping save money, including on things like, you know, vowels. So, what they do is they are a cloud provider that provides surprisingly high-performance cloud compute at a price that—while sure they claim it's better than AWS pricing—and when they say that they mean it is less money. Sure, I don't dispute that, but what I find interesting is that it's predictable. They tell you in advance on a monthly basis what it's going to cost. They have a bunch of advanced networking features. They have nineteen global locations and scale things elastically. Not to be confused with openly, because apparently elastic and open can mean the same thing sometimes. They have had over a million users. Deployments take less than sixty seconds across twelve pre-selected operating systems. Or, if you're one of those nutters like me, you can bring your own ISO and install basically any operating system you want.
Starting with pricing as low as $2.50 a month for Vultr cloud compute, they have plans for developers and businesses of all sizes, except maybe Amazon, who stubbornly insists on having something to scale all on their own. Try Vultr today for free by visiting: vultr.com/screaming, and you'll receive $100 in credit. That's V-U-L-T-R dot com slash screaming.

Corey: And worse, you wind up figuring out, okay, I'm going to store all that data going back to 2012, and it's petabytes upon petabytes. And great, how do I actually search for a thing? Well, I have to use some other expensive compute thing that's going to start diving through all of that, because the way I set up my partitioning, it isn't aligned with anything looking at, like, recency or based upon time period, so every time I want to look at what happened 20 minutes ago, I'm looking at what happened 20 years ago. And that just gets incredibly expensive, not just to maintain but to query and the rest. Now, to be clear, yes, this is an anti-pattern. It isn't how things should be set up. But how should they be set up? And is the collective answer to that right now actually what's best, or is it still harkening back to old patterns that no longer apply?

Clint: Well, the future is here, it's just unevenly distributed. So, you know, I think an important point about us, or how we think about building software, is with this customer-first attitude and fundamentally bringing them choice. Because the reality is that doing things the old way may be the right decision for you. You may have compliance requirements to say—there's a lot of financial services institutions, for example, like, they have to keep every byte of data written on any endpoint for seven years. And so we have to accommodate their requirements.

Like, is that the right requirement? Well, I don't know. The regulator wrote it that way, so therefore, I have to do it.
Whether it's the right thing or the wrong thing for the business, I have no choice. And their decisions are just as right as the person who says this data is worthless and should all just be thrown away.

We really want to be able to go and say, like, hey, what decision is right? We're going to give you the option to do it this way, we're going to give you the option to do it this way. Now, the hard part—and this is where it comes down to, like, marketing—is that you want to have this really simple message, like, “This is the one true path.” And a lot of vendors are this way: “There's this new wonderful, right, true path that we are going to take you on, and follow along behind me.” But the reality is, enterprise worlds are gritty and ugly, and they're full of old technology and new technology.

And they need to be able to support getting data off the mainframe the same way as they're doing a brand-new containerized microservices application. In fact, that brand-new containerized microservices application is probably talking to the mainframe through some API. And so all of that has to work at once.

Corey: Oh, yeah. And it's, all of our payment data is in our PCI environment, and PCI needs to have every byte logged. Great. Why is three-quarters of your infrastructure considered the PCI environment? Maybe you can constrain that at some point and suddenly save a whole bunch of effort, time, money, and regulatory drag on this.

But as you go through that journey, you need to not only have a tool that will work when you get there but a tool that will work where you are today. And a lot of companies miss that mark, too. It's, “Oh, once you modernize and become the serverless success story of the decade, then our product is going to be right for you.” “Great.
We'll send you a postcard if we ever get there and then you can follow up with us.”

Alternately, it's, “Well, yeah, this is how we are today, but we have visions of a brighter tomorrow.” You've got to be able to meet people where they are at any point of that journey. One of the things I've always respected about Cribl has been the way that you very fluidly tell both sides of that story.

Clint: And it's not their fault.

Corey: Yeah.

Clint: Most of the people who pick a job, they pick the job because, like—look, I live in Kansas City, Missouri, and there's this data processing company that works primarily on mainframes, it's right down the road. And they gave me a job and it pays me $150,000 a year, and I got a big house and things are great. And I'm a sysadmin sitting there. I don't get to play with the new technology. Like, that customer is just as applicable a customer; we want to help them exactly the same as the new Silicon Valley hip kid who's working at, you know, a venture-backed startup, doing everything natively in the cloud. Those are all right decisions, depending on where you happen to find yourself, and we want to support you with our products, no matter where you find yourself on the technology spectrum.

Corey: Speaking of old and new, and the trends of the industry, when you first set up this recording, you mentioned, “Oh, yeah, we should make it a point to maybe talk about the acquisition,” at which point I sprayed coffee across my iMac. Thanks for that. Turns out it wasn't your acquisition we were talking about so much as it is the—at the time we record this—yet-to-close, rumored acquisition of Splunk by Cisco.

Clint: I think it's both interesting and positive for some people, and sad for others. I think Cisco is obviously a phenomenal company. They run the networking world.
The fact that they've been moving into observability—they bought companies like AppDynamics, and, as we were talking about Epsagon before the show, they bought them too—ServiceNow just bought Lightstep recently. There's a lot of acquisitions in this space.

I think that when it comes to something like Splunk, Splunk is a fast-growing company compared to Cisco. And so for them, this is something that they think that they can put into their distribution channel, and what Cisco knows how to do is to sell things—like, they're very good at putting things through their existing sales force and really amplifying the sales of that particular thing that they have just acquired. That being said, I think for a company that was as innovative as Splunk, I do find it a bit sad with the idea that it's going to become part of this much larger behemoth and not really probably driving the observability and security industry forward anymore, because I don't think anybody really looks at Cisco as a company that's driving things—not to slam them or anything, but I don't really see them as driving the industry forward.

Corey: Somewhere along the way, they got stuck and I don't know how to reconcile that, because they were a phenomenally fast-paced, innovative company, briefly the most valuable company in the world during the dotcom bubble. And then they just sort of stalled out somewhere and, on some level, not to talk smack about it, but it feels like the level of innovation we've seen from Splunk has been curtailed over the past half-decade or so. And selling to Cisco feels almost like a tacit admission that they are effectively out of ideas. And maybe that's unfair.

Clint: I mean, we can look at the track record of what's been shipped over the last five years from Splunk. And again, they're a partner, their customers are great, I think they still have the best log indexing engine on the market. That was their core product and what has made them the majority of their money. But there's not been a lot new.
And I think objectively we can look at that without throwing stones and say, like, “Well, what net-new? You bought SignalFX. Like, good for you guys, that seems to be going well. You've launched your observability suite based off of these acquisitions.” But organic product-wise, there's not a lot coming out of the factory.

Corey: I'll take it a bit further-slash-sadder: we take a look at some great companies that were acquired—OpenDNS, Duo Security, SignalFX, as you mentioned, Epsagon, ThousandEyes—and once they've gotten acquired by Cisco, they all more or less seem to be frozen in time, like they're trapped in amber, which leads us up to the natural dinosaur analogy that I'll probably make in a less formal setting. It just feels like once a company is bought by Cisco, their velocity peters out, a lot of their staff leaves, and what you see is what you get. And I don't know if that's accurate or I'm just not looking in the right places, but every time I talk to folks in the industry about this, I get a lot of knowing nods that are tied to it. So, whether that's true or not, that is very clearly, at least in some corners of the market, the active perception.

Clint: There's a very real fact that if you look even at very large companies, innovation is driven from a core set of a handful of people. And when those people start to leave, the innovation really stops. It's those people who think about things back from first principles—like, why are we doing things? What different can we do?—and they're the type of drivers that drive change.

So, Frank Slootman wrote a book recently called Amp it Up that I've been reading over the last weekend, and he has this article that was on LinkedIn a while back called “Drivers vs. Passengers,” and he's always looking for drivers. And those drivers tend to not find themselves as happy in bigger companies, and they tend to head for the exits.
And so then you end up with the people who are a lot of the passenger type of people, the people who are like—they'll carry it forward, they'll continue to scale it, the business will continue to grow at whatever rate it's going to grow, but you're probably not going to see a lot of the net-new stuff. And I'll put it in comparison to a company like Datadog, who I have a vast amount of respect for—I think they're an incredibly innovative company, and I think they continue to innovate.

Still driven by the founders, the people who created the original product are still there driving the vision, driving forward innovation. And what tends to move the envelope is the people who have the moral authority inside of an even larger organization to say, “Get behind me. We're going in this direction. We're going to go take that hill. We're going to go make things better for our customers.” And when you start to lose those handful of really critical contributors, that's where you start to see the innovation dry up.

Corey: Where do you see the acquisitions coming from? Is it just at some point people shove money at these companies that got acquired that is beyond the wildest dreams of avarice? Is it that they believe that they'll be able to execute better on their mission than they could independently? These are still smart, driven people who have built something, and I don't know that they necessarily see an acquisition as, “Well, time to give up and coast for a while and then I'll leave.” But maybe it is.
I've never found myself in that situation, so I can't speak for sure.

Clint: You kind of, I think, have to look at the business and then whoever's running the business at that time—and I sit in the CEO chair—so you have to look at the business and say, “What do we have inside the house here?” Like, “What more can we do?” If we think that there's the next billion-dollar, multi-billion-dollar product sitting here, even just in our heads, but maybe in the factory and being worked on, then we should absolutely not sell, because the value is still there and we're going to grow the company much faster as an independent entity than we would, you know, inside of a larger organization. But if you're the board of directors and you're looking around and saying, like, hey look, I don't see another billion-dollar line of bus—at this scale, right, if you're Splunk scale, right? I don't see another billion-dollar line of business sitting here; we could probably go acquire it, we could try to add it in, but you know, in the case of something like a Splunk, I think part of—you know, they're looking for a new CEO right now, so now they have to go find a new leader who's going to come in, re-energize, and, kind of, reboot that.

But those are the options that they're considering, right? They're like, “Do I find a new CEO who's going to reinvigorate things and be able to attract the type of talent that's going to lead us to the next billion-dollar line of business that we can either build inside or we can acquire and bring in-house? Or is the right path for me just to say, ‘Okay, well, you know, somebody like Cisco's interested?'” Or the other path you may see them go down is something like Silver Lake—Silver Lake put a billion dollars into the company last year. And so they may be looking at it and say, “Okay, well, we really need to do some restructuring here and we want to do it outside the eyes of the public market.
We want to be able to change the pricing model, we want to be able to really do this without having to worry about the stock price's massive volatility because we're making big changes.”

And so I would say there's probably two big options they're considering. Like, do we sell to Cisco, do we sell to Silver Lake, or do we really take another run at this? And those are difficult decisions for the stewards of the business, and I think it's a different decision if you're the steward of the business that created the business versus the steward of the business for whom this is—the, I've been here for five years and I may be here for five years more. For somebody like me, a company like Cribl is literally the thing I plan to leave on this earth.

Corey: Yeah. Do you have that sense of personal attachment to it? On some level, The Duckbill Group, that's exactly what I'm staring at, where it's great: someone wants to buy the Last Week in AWS media side of the house. Great. Okay. What is that really, beyond me? Because so much of it's been shaped by my personality. There's an audience, sure, but it's a skeptical audience, one that doesn't generally tend to respond well to mass-market, generic advertisements, so monetizing that is not going to go super well.

“All right, we're going to start doing data mining on people.” Well, that's explicitly against the terms of service people signed up for, so good luck with that. So much starts becoming bizarre and strange when you start looking at building something with the idea of, oh, in three years, I'm going to unload this puppy and make it someone else's problem. The argument is that by building something with an eye toward selling it, you build a better-structured business, but it also means you potentially make trade-offs that are best not made.
I'm not sure there's a right answer here.

Clint: In my spare time, I do some investments, angel investments, and that sort of thing, and that's always a red flag for me when I meet a founder who's like, “In three to five years, I plan to sell it to these people.” If you don't have a vision for how you're fundamentally going to alter the marketplace and our perception of everything else, you're not dreaming big enough. And that to me doesn't look like a great investment. It doesn't look like the—how do you attract employees in that way? Like, “Okay, our goal is to work really hard for the next three years so that we will be attractive to this other bigger thing.” They may be thinking it on the inside as an available option, but if you think that's your default option when starting a company, I don't think you're going to end up with the outcome that's truly what you're hoping for.

Corey: Oh, yeah. In my case, the only acquisition story I see is some large company buying us just largely to shut me up. But—

Clint: [laugh].

Corey: —that turns out to be kind of expensive, so all right. I also don't think it would serve any of them nearly as well as they think it would.

Clint: Well, you'll just become somebody else on Twitter. [laugh].

Corey: Yeah, “Time to change my name again. Here we go.” So, if people want to go and learn more about Cribl Edge, where can they do that?

Clint: Yeah, cribl.io. And then if you're more of a technical person and you'd like to understand the specifics, docs.cribl.io. That's where I always go when I'm checking out a vendor; just skip past the main page and go straight to the docs. So, check that out. And then also, if you're wanting to play with the product, we make education available online, called Sandboxes, at sandbox.cribl.io, where you can go spin up your own version of the product, walk through some interactive tutorials, and get a view on how it might work for you.

Corey: Such a great pattern, at least for the way that I think about these things.
You can have flashy videos, you can have great screenshots, you can have documentation that is the finest thing on this earth, but let me play with it; let me kick the tires on it, even with a sample data set. Because until I can do that, I'm not really going to understand where the product starts and where it stops. That is the right answer from where I sit. Again, I understand that everyone's different, not everyone thinks like I do—thankfully—but for me, that's the best way I've ever learned something.

Clint: I love to get my hands on the product, and in fact, I'm always a little bit suspicious of any company when I go to their webpage and I can't either sign up for the product or get to the documentation, and I have to talk to somebody in order to learn. At that point I'm pretty much immediately going to the next person in that market, to go look for somebody who will let me.

Corey: [laugh]. Thank you again for taking so much time to speak with me. I appreciate it. As always, it's a pleasure.

Clint: Thanks, Corey. Always enjoy talking to you.

Corey: Clint Sharp, CEO and co-founder of Cribl. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment. And when you hit submit, be sure to follow it up with exactly how many distinct and disparate logging systems that obnoxious comment had to pass through on your end of things.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.
About Alex

Alex holds a Ph.D. in Computer Science and Engineering from UC San Diego, and has spent over a decade building high-performance, robust data management and processing systems. As an early member of a couple of fast-growing startups, he's had the opportunity to wear a lot of different hats, serving at various times as an individual contributor, tech lead, manager, and executive. Prior to joining The Duckbill Group, Alex spent a few years as a freelance data engineering consultant, helping his clients build, manage, and maintain their data infrastructure. He lives in Los Angeles, CA.

Links:
- Twitter: https://twitter.com/alexras/
- Personal page: https://alexras.info
- Old consulting website with blog: https://bitsondisk.com

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: The company 0x4447 builds products to increase standardization and security in AWS organizations. They do this with automated pipelines that use well-structured projects to create secure, easy-to-maintain, and fail-tolerant solutions, one of which is their VPN product built on top of the popular OpenVPN project, which has no license restrictions; you are only limited by the network card in the instance. To learn more visit: snark.cloud/deployandgo

Corey: Today's episode is brought to you in part by our friends at MinIO, the high-performance Kubernetes-native object store that's built for the multi-cloud, creating a consistent data storage layer for your public cloud instances, your private cloud instances, and even your edge instances, depending upon what the heck you're defining those as, which depends probably on where you work.
Getting that unified is one of the greatest challenges facing developers and architects today. It requires S3 compatibility, enterprise-grade security and resiliency, the speed to run any workload, and the footprint to run anywhere, and that's exactly what MinIO offers. With superb read speeds in excess of 360 gigs and a 100 megabyte binary that doesn't eat all the data you've gotten on the system, it's exactly what you've been looking for. Check it out today at min.io/download, and see for yourself. That's min.io/download, and be sure to tell them that I sent you.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I'm the Chief Cloud Economist at The Duckbill Group, which people are generally aware of. Today, I'm joined by our most recent Principal Cloud Economist, Alex Rasmussen. Alex, thank you for joining me today; it is a pleasure to talk to you, as if we aren't talking to each other constantly, now that you work here.

Alex: Thanks, Corey. It's great being here.

Corey: So, I followed a more, I'd say, traditional path for a cloud economist, but given that I basically had to invent the job myself, the more common path—because imagine that you start building a role from scratch, and the people you wind up looking for initially look a lot like you. And that is grumpy sysadmin, historically, turned into something, kind of begrudgingly, that looks like an SRE, which I still maintain are the same thing, but it is imperative people not email me about that. Yes, I know, you work at Google. But instead, what I found during my tenure as a sysadmin is that I was working with certain things an awful lot, like web servers, and other things almost never, like databases and data warehouses. Because if you screw up a web server, we all have a good laugh, the site's down for a couple of minutes, life goes on, and you have a shame trophy on your desk if that's your corporate culture; things continue.

Mess up the data severely enough, and you don't have a company anymore.
So, I was always told to keep my aura away from the expensive, spendy things that power a company. You are sort of the first of a cloud economist subtype that doesn't resemble that. Before you worked here, you were effectively an independent consultant working on data engineering. Before that, you had a couple of jobs, but you had gotten a PhD in computer science, which means, first, you are probably one of the people in this world most qualified to pass some crappy job interview of solving a sorting algorithm on a whiteboard, but how did you get here from where you were?

Alex: Great question. So, I like to joke that I kind of went to school until somebody told me that I had to stop. And I took that and went and started—or didn't start, but I was an early engineer at a startup and then was an executive at another early-stage one, and did a little bit of everything. And went freelance, did that for a couple of years, and worked with all kinds of different companies—the vast majority of those being startups—helping them with data infrastructure problems. I've done a little bit of everything throughout my career.

I've been, you know, IC, manager, manager, manager, IT guy, everything in between. I think on the data side of things, it just sort of happened, to be honest with you. It kind of started with the stuff that I did for my dissertation, and I parlayed that into a job back when the big data wave was starting to truly crest. And I've been working on data infrastructure basically my entire career. So, it wasn't necessarily something that was intentional; I've just been taking the opportunity that makes the most sense for me at kind of every juncture. And my career path has been a little bit strange, both by academic and industrial standards. But I like where I'm at, and I gained something really valuable from each of those experiences.
So.

Corey: It's been an interesting area of—I won't say weakness here, but it's definitely been a bit of a challenge. When we look at an AWS environment, even talking about a typical AWS customer without thinking of any of them in particular, I can already tell you a few things are likely to be true. For example, the number one most expensive line item in their bill is going to be EC2, and compute is the thing that powers it. Now, maybe that is they're running a bunch of instances the old-fashioned way; maybe they're running Kubernetes, but that's how it shows up. There's a lot of things that could be, and we look at what rounds that out.

Now, the next item down should almost certainly not be data transfer—and if so, we should have a conversation—but data in one form or another is very often going to be number two. And that can mean a bunch of different things, historically. It could mean, “Oh, you have a whole bunch of stuff in S3. Let's talk about access patterns. Let's talk about lifecycle policies. Let's talk about making sure the really important stuff is backed up somewhere. Maybe you want to spend more on that particular aspect of it.”

If it's on EBS volumes, that's interesting and definitely worth looking into and trying to understand the context of what's going on. Periodically we'll see a whole bunch of additional charges that speak to some of that EC2 charge in the form of EMR, AWS's Elastic MapReduce, which charges a per-hour instance charge, but also charges you for the instances that are running under the hood under the EC2 line item. So, there's a lot of data lifecycle stuff, there's a lot of data ecosystem stories, that historically we've consulted out with experts in that particular space. And that's great, but we were starting to have to drag those people in on more and more engagements as we saw them.
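The lifecycle-policy conversation Corey describes usually boils down to a handful of transition rules: move data to cheaper storage classes as it ages, then expire it. A minimal sketch of such a rule follows; the bucket name, prefix, and day thresholds are hypothetical, chosen only for illustration.

```python
# Sketch of the kind of S3 lifecycle policy discussed above: transition
# objects under a prefix to cheaper storage classes as they age, then
# expire them. All names and thresholds here are made up for the example.
lifecycle_config = {
    "Rules": [
        {
            "ID": "age-out-old-data",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},
            ],
            "Expiration": {"Days": 730},
        }
    ]
}

# Applying it would use boto3's put_bucket_lifecycle_configuration
# (left commented out because it needs real credentials and a bucket):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-bucket", LifecycleConfiguration=lifecycle_config
# )
print(lifecycle_config["Rules"][0]["ID"])
```

The point of the rule shape is that cost control happens declaratively, per prefix, rather than by someone remembering to clean up.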
And we realized that was really something we had to build out as a core competency for ourselves. And we started out not intending to hire for someone with that specialty, but the more we talked to you, the more it became clear that this was a very real and very growing need that we and our customers have. How closely does what you're doing now, as far as AWS bill analysis and data-pattern deep-dives, align with what you were doing as a freelance consultant in the space?

Alex: A lot more than you might expect. You know, I think that increasingly, what you're seeing now is that a company's core differentiator is its data, right—how much of it they have, what they do with it. And so, you know, to your point, I think when you look at any company's cloud spend, it's going to be pretty heavy on the data side in terms of, like, where have you put it? What are you doing to process it? Where is it going once it's been processed? And then how is that—

Corey: And data transfer is a very important first word in that two-word sequence.

Alex: Oh, sure is. And so I think that, like, in a lot of ways, the way that a customer's cloud architecture looks—and the way that their bill looks, kind of as a consequence of that—is kind of a reification, in a way, of the way that the data flows from one place to another and what's done with it at each step along the way. I think what complicates this is that companies that have been around for a little while have lived through this kind of very amorphous, kind of, polyglot way that we're approaching data. You know, back when I was first getting started in the big data days, it was MapReduce, MapReduce, MapReduce, right? And we quickly [crosstalk 00:07:29]—

Corey: Oh, yes. The MapReduce white paper out of Google, a beautiful April Fool's Day prank that the folks at Yahoo fell for hook, line, and sinker. They wrote Hadoop, and now we're all stuck with that pattern. Great gag; they really should have clarified they were kidding.
Here we are.

Alex: Exactly. So—

Corey: I mostly kid.

Alex: No, for sure. But I think, especially when it comes to data, we tend to over-index on what the large companies do and then quickly realize that we've made a mistake and correct backwards, right? So, there was this big push toward MapReduce for everything until people realized that it was just a pain in the neck to operate and to build. And so then we moved into Spark, so kind of up-leveled a little bit. And then there was this kind of explosion of NoSQL and NewSQL databases that hit the market.

And MongoDB inexplicably won that war, and now we're kind of in this world where everything is cloud data warehouse, right? And now we're trying to wrestle with, like, is it actually a good idea to put everything in one warehouse and have SQL be the lingua franca on top of it? But it's all changing so rapidly. And when you come into a customer that's been around for 10 or 15 years, and has, you know, been in the cloud for a substantial—

Corey: Yeah, one of those ancient customers. That is—

Alex: I know, right?

Corey: —basically old enough to almost get a driver's license? Oh, yeah.

Alex: Right. It's one of those things where it's like, “Ah, yes, in startup years, you're, like, a hundred years old,” right? But still, you know, I think you see this kind of—I wouldn't call it a graveyard of failed experiments, right, but it's a collection of, like, “Well, we tried this, and it kind of worked, and we're keeping it around because the cost of moving this stuff around—the kind of data gravity, so to speak—is high enough that we're not going to bother transitioning it over.” But then you get into this situation where you have to bend over backwards to integrate anything with anything else. And we're still kind of in the early days of fixing that.

Corey: And the AWS bill pattern that we see all the time across the board: those experiments were not successful and do not need to exist, but there's no context into that.
The person that set them up left five years ago, the jobs are still running on time. What's happening with them? Well, we could stop them and see who screams, but very often, that's not the right answer either.

Alex: And I think there's also something to note there, too, which is, like, getting rid of data is very scary, right? I mean, if you resize a Kubernetes cluster from 15 nodes to 10, nobody's going to look at you sideways. But if you go, “Hey, we're just going to drop these tables,” the immediate reaction that you get—particularly from your data science team, more often than not—is, “Oh, God, what if we need that?” And so the conversation never really happens, and that causes this kind of snowball of data debt that persists, in some cases, for many, many years.

Corey: Yeah, in some cases, what I found has been successful on those big unknown questions is: don't delete the data, but restrict access to it for a few weeks and see what happens. Look into it a bit and make sure that it's not like, “Oh, cool. We did without it for a month, and now we don't need that data. Let's get rid of it.” And then another month goes by and it's like, “So, time to report quarterly earnings. Where's the data?”

Oh, dear, that's not going to go well for anyone. And understanding what's happening—the idea of cloning a petabyte of data so you can run an experiment on it. And okay, turns out the experiment wasn't needed. Do we still need to keep all of that?

Alex: Yeah.

Corey: The underlying platform advancements have been helpful toward this as well; a petabyte of data now in Glacier Deep Archive costs the princely sum of a thousand bucks a month, which is pretty close to the idea of, why would I ever delete data ever again? I can get it back within a day if I need it, so let's just put it there instead.

Alex: Right. You know, funny story.
When I was in graduate school, we were dealing with, you know, 100-terabyte datasets on the regular that we had to generate every time, because we only had 200 terabytes of raw storage. [laugh]. And this was before cloud was yet mature enough that we could get the kind of performance numbers that we wanted off of it.

And we would end up having to delete the input data to make room for the output data. [laugh]. And thankfully, we don't need to do that anymore. But there are a lot of, kind of, anti-patterns that arise from that too, right? If data is easy to keep around forever, it stays around forever.

And if it's easy to, let's say, run a SQL command against your Snowflake instance that scans 20 terabytes of data, you're just going to do it, and the exposure of that to you is so minimal that you can end up causing a whole bunch of problems for yourself by the fact that you don't have to deal with stuff at that low level of abstraction anymore.

Corey: It's always fun watching how this stuff manifests—because I'm dipping a toe into it from time to time—the easy, naive answer that we could give every customer, but don't, is, “Huh. So, you have a whole bunch of EMR stuff? Well, you know, if you migrate that into something else, you'll save a whole bunch of money on that.” With no regard for the 500 jobs that run against that EMR cluster on a consistent basis that form a key part of business process. “Yeah, if you could just redo the entire flow of how data is operated on throughout your entire business, that would be swell, because you can save tens of thousands of dollars a month on that.” Yeah, how about we don't suggest things that are just absolute buffoonery.

Alex: Well, and it's like, you know, you hit on a good point. Like, one of my least favorite words in the English language is the word ‘just.'
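Corey's “restrict access and see who screams” trick maps naturally onto an S3 bucket policy: deny reads on a prefix for a trial period instead of deleting anything. A minimal sketch follows; the bucket name and prefix are hypothetical, and a real policy would usually scope the `Principal` more carefully than a blanket `*`.

```python
import json

# Sketch of a "quarantine before delete" bucket policy: deny GetObject
# on a suspect prefix for a few weeks and see who complains, rather
# than dropping the data outright. Names here are made up.
quarantine_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "QuarantineOldExperimentData",
            "Effect": "Deny",
            "Principal": "*",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bucket/old-experiment/*",
        }
    ],
}

# Applying it would use boto3's put_bucket_policy (commented out since
# it needs real credentials and a bucket):
# import boto3
# boto3.client("s3").put_bucket_policy(
#     Bucket="example-bucket", Policy=json.dumps(quarantine_policy)
# )
print(json.dumps(quarantine_policy, indent=2))
```

If nobody screams during the quarantine window, the prefix can be expired or deleted with much more confidence.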
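Corey's Deep Archive figure above is easy to sanity-check with back-of-the-envelope arithmetic. The per-GB rate used here is the commonly cited us-east-1 price of about $0.00099 per GB-month; treat it as an assumption, since pricing varies by region and over time.

```python
# Rough cost check for one pebibyte in S3 Glacier Deep Archive.
# Assumed rate: ~$0.00099 per GB-month (us-east-1 list price at the
# time; verify against current pricing before relying on it).
price_per_gb_month = 0.00099
petabyte_in_gb = 1024 ** 2  # 1 PiB = 1,048,576 GiB

monthly_cost = petabyte_in_gb * price_per_gb_month
print(f"~${monthly_cost:,.0f}/month")  # on the order of $1,000/month
```

Which lands almost exactly on the “thousand bucks a month” quoted in the conversation.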
And you know, I spent a few years as a freelance data consultant, and a lot of what I would hear sometimes from customers is, “Well, why don't we ‘just' deprecate X?”

Corey: “Why don't we just—” “I'm going to stop you there, because there is no ‘just.'”

Alex: Exactly.

Corey: There's always context that we cannot have as outsiders.

Alex: Precisely. Precisely. And digging into that really is—it's the fun part of the job, but it's also the hard part of the job.

Corey: Before we created The Duckbill Group—which was really when I took Mike Julian on as business partner and CEO and formed the entity—I had something in common with you; I was freelancing for a couple of years beforehand. Now, I know why I wound up deciding, all right, we're going to turn this into a company, but what was it that, I guess, made you decide, you know, freelancing is all well and good, but it's time to get something that looks a lot more like a quote-unquote, “traditional job?”

Alex: So, I think, on one level, I went freelance because I wasn't exactly sure what I wanted to do next. And I knew what I was good at. I knew what I had a lot of experience at, and I thought, “Well, I can just go out and kind of find a bunch of people that are willing to hire me to do what I'm good at doing, and then maybe eventually I'll find one of them that I like enough that I'll go and work for them. Or maybe I'll come up with some kind of a business model that I can repeat enough times that I don't have to worry that I wake up tomorrow and all of my clients are gone, and then I have to go live in a van down by the river.”

And I think when I heard about the opening at The Duckbill Group, I had been thinking for a little while about, well, this has been going fine for a long time, but effectively what I've been doing is being, you know, a staff-level data engineer for hire. And do I want to do something more than that, you know?
Do I want to do something more comp—perhaps more sophisticated or more complex than that? And I rapidly came to the conclusion that in order to do that, I would have to have sales and marketing, and I would have to, you know, spend a lot of my time bringing in business. And that's just not something that I really have any experience in or am any good at.

And, you know, I also recognize that I'm a relatively small fish in a relatively large pond, and if I wanted to get the kind of, like, large-scale, big, you know, Fortune 1000 company kind of customers, they may not pay attention to somebody like me. And so I think that ultimately, what I saw with The Duckbill Group was, number one, a group of people that were strongly aligned to the way that I wanted to keep doing this sort of work, right? Cultural alignment was really strong, good people, but also, you know, you folks have a thing that you figured out, and that puts you 10 to 15 steps ahead of where I was. And I was kind of staring down the barrel of, I'm going to have to take six months not doing client work so that I can figure out how to make this business sustain. And, you know, I think that ultimately, I just looked at it and said, this just makes sense to me as a next step. And so here we all are.

Corey: This episode is sponsored by our friends at Oracle Cloud. Counting the pennies, but still dreaming of deploying apps instead of “Hello, World” demos? Allow me to introduce you to Oracle's Always Free tier. It provides over 20 free services and infrastructure, networking, databases, observability, management, and security. And—let me be clear here—it's actually free. There's no surprise billing until you intentionally and proactively upgrade your account.
This means you can provision a virtual machine instance or spin up an autonomous database that manages itself, all while gaining the networking, load balancing, and storage resources that somehow never quite make it into most free tiers, needed to support the application that you want to build. With Always Free, you can do things like run small-scale applications or do proof-of-concept testing without spending a dime. You know that I always like to put asterisks next to the word free? This is actually free, no asterisk. Start now. Visit snark.cloud/oci-free. That's snark.cloud/oci-free.

Corey: It's always fun seeing how people perceive what we've done from the outside. Like, “Oh, yeah, you just stumbled right onto the thing that works, and you've just been going, like, gangbusters ever since.” Then you come aboard, and it's, “Here, look at this pile of things that didn't pan out over here.” And you get to see how the sausage is made in a way that we talk about from time to time externally, but surprisingly, most of our marketing efforts aren't really focused on, “And here's this other time we screwed up as well.” And we're honest about it, but it's not sort of the thing that we promote as the core message of what we do and who we are.

A question I like to ask people during job interviews—and I definitely asked you this, and I'll ask you now, which is going to probably throw some folks for a loop, because who talks to their current employees like this?—but what's next for you? When it comes time for you to leave The Duckbill Group, what do you want to do after this job?

Alex: That's a great question. So, I mean, as we've mentioned before, my career trajectory has been very weird and circuitous. And, you know, I would be lying to you if I said that I had absolute certainty about what the rest of that looks like. I've learned a few things about myself in the course of my career, such as it is. In my kind of warm, gooey center, I build stuff.
Like, that is what gives me joy; it is what makes me excited to wake up in the morning.

I love looking at big, complicated things, breaking them down into pieces, and figuring out how to make the pieces work in a way that makes sense. And, you know, I've spent a long time in the data ecosystem. I don't know, necessarily, if that's something that I'm going to do forever. I'm not necessarily pigeonholing myself into that part of the space just yet, but as long as I get to kind of wake up in the morning and say, “I'm going to go and build things, and it's not going to actively make the world any worse,” I'm happy with that. And so that's really—you know, I might go back to freelancing, might go and join another group, another company, big, small, who knows. I'm kind of leaving that up to the winds of destiny, so to speak.

Corey: One thing that I have found incredi—sorry. Let me just address that first. Like that—

Alex: Sure.

Corey: —is the right way to think about it. My belief has always been that you don't necessarily have, like, the ten-year plan or the five-year plan or whatever it is because that's where you're going to go, so much as it gives you direction and forces you to keep moving, so you don't wind up sitting in the same place for five years with one year of experience repeated five times. It helps you remember the bigger picture. Because I've always despised this fiction that we see in job interviews, where average tenure in our industry is 18 to 36 months, give or take, but somehow during the interviews we all talk like this is now your forever job, and after 25 years, you'll retire. And yeah, let's be a little more realistic than that.

My question is always, what is next, and how can we align in a way that helps you get to what's coming? That's the purpose behind the question, and the only way to make that not just a drippingly insincere question is to mean it and to continue to focus on it from time to time: great, what are you learning? What's next?
Now, at the time of this recording, you've been here, I believe, three weeks, if I'm not mistaken?

Alex: I've—this is week two for me at time of recording.

Corey: Excellent. Yes, my grasp of time is sort of hazy at the best of times. I have a—I do a lot of things.

Alex: For sure.

Corey: But yeah, it has been an eye-opening experience for me, not because, “Oh, wow, we have an employee”—yeah, we've done that a few times before—but rather because of your background, you are asking different questions than we typically get during onboarding. I had a blog post go out recently—or it will have by the time this airs—about a question that you asked: “Wow, onboarding into our internal account structure for AWS is way more polished than I've ever seen it before. Is that something you built in-house? What is that?”

And great. Oh, terrific, I'd forgotten that this is kind of a novel thing. No, what we're using is AWS's SSO offering, which is such a well-built, polished product that I can only assume it's under NDA, because Amazonians don't talk about it ever. But it's great.

It has a couple of annoyances, but beyond that, it's something that I'm a big fan of. But I'd forgotten how transformative that is compared to the usual approach of: all right, here's your username, here's a password you're going to have to change, here are your IAM credentials to store on disk forever. The ability to look at what we're doing through the eyes of someone who is clearly deep into the technical weeds, but not as exposed to all of the minutiae of the 300-some-odd AWS services, is really a refreshing thing for all of us, just because it helps us realize what it's like to see some of this stuff for the first time, as well as gives me content ideas—because if it's new to you, I promise you are not the only person who's seeing it that way.
And if you don't really understand something well enough to explain it, I would argue you don't really understand the thing, so it forces me to get more awareness around exactly how different facets work. It's been an absolutely fantastic experience so far, from my perspective.Alex: Thank you. Right back at you. I mean, spending so many years working with startups, my kind of level of expected sophistication is, “I'm going to write your password on the back of a napkin. I have fifteen other things to do. Go figure it out.” And so you know, it's always nice to see—particularly players like AWS that are such 800-pound gorillas—going in and trying to uplevel that experience in a way that feels like—because I mean, like, look, AWS could keep us with the, “Here's a CSV with your username and password. Good luck, have fun.” And you know, they would still make—Corey: And they're going to have to because so much automation is built around that—Alex: Oh yeah—Corey: In so many places.Alex: —so much.Corey: It's always net-additive, they never turn anything off, which is increasingly an operational burden.Alex: Yeah, absolutely. Absolutely. But yeah, it's nice to see them up-level this in a way that feels like they're paying attention to their customers' pain. And that's always nice to see.Corey: So, we met a few years ago—in the before times—at a mixer that we wound up throwing—slash meetup. It was in Southern California for some AWS event or another. You've been aware of who we are and what we do for a while now, so I'm very curious to know—and the joy of having these conversations is that I don't actually know what the answer is going to be, so this may never see the light of day if it goes to weird—Alex: [laugh].Corey: —in the wrong direction, but—no I'm kidding. 
What has been, I guess, the biggest points of dissonance or surprises based upon your perception of who we are and what we do externally, versus joining and seeing how the sausage is made?Alex: You know, I think the first thing is—um, well, how to put this. I think that a lot of what I was expecting, given how much work you all do and how big—well, ‘you all;' we do—and how big the list of clients is and how it gets bigger every day, I was expecting this to be, like, this very hyper put together, like, every little detail has been figured out kind of engagement where I would have to figure out how you all do this. And coming in and realizing that a lot of it is just having a lot of in-depth knowledge born from experience of a bunch of stuff inside of this ecosystem, and then the rest of it is kind of free jazz, is kind of encouraging. Because as someone that was you know, as a freelancer, right, who do you see, right? You see people who have big public presences or people who are giant firms, right?On the GCP side, SADA Systems is a great example. They're another local company for me here in Los Angeles, and—Corey: Oh, yes. [unintelligible 00:24:48] Miles has been a recurring guest on the show.Alex: Yeah. And he's great. And, like, they have this enormous company that's got, like, all these different specializations and they're basically kind of like the middleman for GCP on a lot of things. 
And, like, you see that, and then you kind of see the individual people that are like, “Yeah, you know, I'm not really going to tell you that I only have two clients and that if both of them go away, I'm screwed, but, like, I only have two clients, and if both of them go away, I'm screwed.” And so, you know, I think honestly seeing that, like, what you've built so far and what I hope to help you continue to build is, you know, you've got just enough structure around the thing so that it makes sense, and the rest of it, you're kind of admitting that no plan ever survives contact with the client, right, and that everybody's going to be different and that everybody's problems are going to be different. And that you can't just go in and say, “Here's a dashboard, here's a calculator, have fun, give me my money,” right? Because that feels like—in optimization spaces of any kind, be that cloud, or data or whatever, there's this, kind of, push toward, how do I automate myself out of a job, and the realization that you can't for something like this, and that ultimately, like, you're just going to have to go with what you know, is something that I kind of had a suspicion was the case, but this really made it clear to me that, like, oh, this is actually a reasonable way of going about this. Corey: We thought otherwise at one point. We thought that this was something that could be easily addressed with software. We launched our DuckTools SaaS platform in beta and two months later, did the—our incredible journey has come to an end—and took it off of a public offering. Because it doesn't lend itself to solving these problems in software in any reasonable way. I am ever more convinced over time that the idea of being able to solve cloud cost optimization with software at VC-scale is a red herring. And yeah, it just isn't going to work because it's one size fits some. 
Our customers are, by definition, exceptional in many respects, and understanding the context behind why things are the way that they are means that we can only go so far with process because then it becomes a let's have a conversation and let's be human. Otherwise, we try to overly codify the process, and congratulations, we just now look like really crappy software, but expensive because it's all people doing it. It doesn't work that way. We have tools internally that help smooth over a lot of those edges, but by and large, people who are capable of performing, especially at the principal level, for a cloud economics role inherently are going to find themselves stifled by too much process because they need to have the freedom to dig into the areas that are relevant to the customer. It's why we handcraft all of our statements of work in ways that tend to shy away from explicitly defined deliverables. Because we deliver an outcome, but it's going to depend entirely, in most cases, upon what we discover along the way. Maybe a full-on report isn't the best way of presenting the data in the way that we see it. Maybe it's a small proof of concept script or something like that. Maybe it's, I don't know, an interpretive dance in front of the company's board. Alex: [laugh]. Right. Corey: I'm open to exploring opportunities. But it comes down to what is right for the customer. There's a reason we only ever charge a fixed fee for these things, and it's because at that point, great, we're giving you the advice that we'd implement ourselves. We have no partnerships with any vendor in the space just to avoid bias or the perception of same. It's important that we are the authoritative source around these things. Honestly, the thing that surprised me the most about all this is how true to that vision we've stayed as we've fleshed out what works, what doesn't. And we consistently fail to go out of business every month. I am ecstatic about that. 
I expected this to wind up cratering into a mountain four months after I went freelance. Not yet. Alex: Well, I mean, I think there's another aspect of this too, right? Because I've spent a lot of my career working inside of venture capital-backed companies. And there's a lot of positive things to be said about having ready access to that kind of cash, but it does something to your business the second you take it. And I've been in a couple of situations where, like, once you actually have that big bucket of money, the incentive is grow, right? Hire more people, get more customers, go, go, go, go, go. And sometimes what you'll find is that you'll spend the time and the money on an initiative and it's clearly not working. And you just kind of have to keep doubling down because now you've got customers that are using this thing and now you have to maintain it, and before you know it, you've got this albatross hanging around your neck. And like one of the things that I really respect about the way that Duckbill Group is handling this, by not taking outside cash, is, like, it frees you up to make these kinds of bets, and then two months later say, “Well, that didn't work,” and try something else. And you know, that's very difficult to do once you have to go and convince someone with, you know, money flowing out of their ears, that that's the right thing to do. Corey: We have to be intentional about what we're doing. One of the benefits of bringing you aboard is that one, it does improve our capacity for handling more engagements at the same time, but it also improves the quality of the engagements that we are delivering. Instead of basically doing a round-robin assignment policy we can—Alex: Right. Corey: —we consult with each other; we talk about specific areas in which we have specific expertise. You get dragged into a lot of data portions of existing engagements, and the rest of us get pulled into other areas in which you might not be as strong. 
For example, “What are all of these ridiculous services? I can't make heads or tails of the ridiculous naming side of it.” Surprise, that's not a you problem. It comes down to being able to work collaboratively and let each other shine in a way that doesn't mean we load people up with work. We're very strict about having a 40-hour or less work week, just because we're not rushing for an exit. We want to enjoy our time working, we want to enjoy what we're doing, and then we want to go home and not think about work until it's time to come back and think about these things. Like, it's a lifestyle company, but that lifestyle doesn't need to be run, run, run, run, run all the time, and it doesn't need to be something that people barely tolerate. Alex: Yeah. And I think that, you know, especially coming from being an army of one in a lot of engagements, it is really refreshing to be able to—see, because, you know, I'm fortunate enough, I have friends in the industry that I can go and say like, “I have no idea how to make heads or tails of X.” And you know, I can get help that way, but ultimately, like, the only other outlet that I have here is the customer and they're not bringing me in if they have those answers readily to hand. And so being able to bounce stuff off of other people inside of an organization like this has been really refreshing. Corey: One of the things I've appreciated about your tenure here so far is the questions that you ask are pitched at the perfect level, by which I mean, it is never something you could answer with a three-second visit to Google, but it's also not something that you've spent three days spinning your wheels on trying to understand. You do a bit of digging; it's a little unclear, especially since there are multiple paths to go down, and then you flag it for clarification. And there's really so much to be said for that. 
Really, when we're looking for markers of seniority in the interview process, it's admitting you don't know something, but then also talking about how you would go about getting the answer. And it's—because no one has all this stuff in their head. I spend a disturbing amount of time looking at search engines and trying to reformulate queries and to get answers that make sense.I don't have the entirety of AWS shoved into my head. Yet. I'm sure there's something at re:Invent that's going to be scary and horrifying that will claim to do it and basically have a poor user interface, but all right. When that comes, we'll reevaluate then because this industry is always changing.Alex: For sure. For sure. And I think it's, it's worth pointing out that, like, one of the things that having done this for a long time gives you is this kind of scaffolding in your head that you can hang things over. We're like, you don't need to have every single AWS service memorized, but if you've got that scaffold in your head going, “Oh, like, this thing sounds like it hangs over this part of the mental scaffold, and I've seen other things that do that, so I wonder if it does this and this and this,” right? And that's a lot of it, honestly.Because especially, like, when I was solely in the data space, there's a new data wareho—or a new, like, data catalog system coming out every other week. You know, there are a thousand different things that claim to do MLOps, right? And whenever, like, someone comes to me and says, “Do you have experience with such and such?” And the answer was usually, “Well if you hum a few bars, I can fake it.” And, you know, that tends to help a great deal.Corey: Yeah. “No, but I'll find out and get back to you,” the right answer. Making it up and being wrong is the best way to get rejected from an environment. That's not just consulting; that's employment, too. 
If 95% of the time, you give the right answer, but that one time in 20 you're going to just make it up, well, I have to validate the other 19 because I never know when someone's faking it or not. There's that level of earned trust that's important. Alex: Well, yeah. And you're being brought in to be the expert in the room. That doesn't necessarily mean that you are the all-seeing, all-knowing oracle of knowledge but, like, if you say a thing, people are just going to believe you. And so, you know, it's beholden on you—Corey: If not, we have a different problem. Alex: Well, yeah, exactly. Hopefully, right? But yeah, I mean, it's beholden on you to be honest with your customer at a certain point, I think. Corey: I really want to thank you for taking the time out of your day to chat with me about this. And I would love to have you back on in a couple of months once you're fully up to speed and spinning at the proper RPMs and see what's happened then. I—Alex: Thank you. I'd—Corey: —really appreciate—Alex: —love to. Corey: —your time. Where's the best place for people to learn more about you if they haven't heard your name before? Alex: Well, let's see. I am @alexras on Twitter, A-L-E-X-R-A-S. My personal website is alexras.info. I've done some writing on data stuff, including a pretty big collection of blog posts on the data side of the AWS ecosystem that are still on my consulting page, bitsondisk.com. Other than that—I mean, yeah, Twitter is probably the best place to find me, so if you want to talk more about any weird, nerd data stuff, then please feel free to reach out there. Corey: And links to that will, of course, be in the [show notes 00:35:57]. Thanks again for your time. I really appreciate it. Alex: Thank you. It's been a pleasure. Corey: Alex Rasmussen, principal cloud economist here at The Duckbill Group. I am Corey Quinn, cloud economist to the stars, and this is Screaming in the Cloud. 
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry, insulting comment that you then submit to three other podcast platforms just to make sure you have a backup copy of that particular piece of data.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
About Aidan
Aidan is an AWS enthusiast, due in no small part to sharing initials with the cloud. He's been writing software for over 20 years and getting paid to do it for the last 10. He's still not sure what he wants to be when he grows up.
Links: Stedi: https://www.stedi.com/ GitHub: https://github.com/aidansteele Blog posts: https://awsteele.com/ Ipv6-ghost-ship: https://github.com/aidansteele/ipv6-ghost-ship Twitter: https://twitter.com/__steele
Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. Corey: Couchbase Capella Database-as-a-Service is flexible, full-featured and fully managed with built-in access via key-value, SQL, and full-text search. Flexible JSON documents aligned to your applications and workloads. Build faster with blazing fast in-memory performance and automated replication and scaling while reducing cost. Capella has the best price performance of any fully managed document database. Visit couchbase.com/screaminginthecloud to try Capella today for free and be up and running in three minutes with no credit card required. Couchbase Capella: make your data sing. Corey: Today's episode is brought to you in part by our friends at MinIO, the high-performance Kubernetes-native object store that's built for the multi-cloud, creating a consistent data storage layer for your public cloud instances, your private cloud instances, and even your edge instances, depending upon what the heck you're defining those as, which depends probably on where you work. Getting that unified is one of the greatest challenges facing developers and architects today. 
It requires S3 compatibility, enterprise-grade security and resiliency, the speed to run any workload, and the footprint to run anywhere, and that's exactly what MinIO offers. With superb read speeds in excess of 360 gigs and 100 megabyte binary that doesn't eat all the data you've gotten on the system, it's exactly what you've been looking for. Check it out today at min.io/download, and see for yourself. That's min.io/download, and be sure to tell them that I sent you.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I'm joined this week by someone who is honestly, feels like they're after my own heart. Aidan Steele by day is a serverless engineer at Stedi, but by night, he is an absolute treasure and a delight because not only does he write awesome third-party tooling and blog posts and whatnot around the AWS ecosystem, but he turns them into the most glorious, intricate, and technical shit posts that I think I've ever seen. Aidan, thank you for joining me.Aidan: Hi, Corey, thanks for having me. It's an honor to be here. Hopefully, we get to talk some AWS, and maybe also talk some nonsense as well.Corey: I would argue that in many ways, those things are one in the same. And one of the things I always appreciated about how you approach things is, you definitely seem to share that particular ethos with me. And there's been a lot of interesting content coming out from you in recent days. The thing that really wound up showing up on my radar in a big way was back at the start of January—2022, for those listening to this in the glorious future—about using IPv6 to use multi-factor auth, which it is so… I don't even have the adjectives to throw at this because, first it is ridiculous, two, it is effective, and three, it is just who thinks like that? What is this and what did you—what monstrosity have you built?Aidan: So, what did I end up calling it? I think it was ipv6-ghost-ship. 
And I think I called it that because I'd recently watched, oh, what was that series that was recently on Apple TV? Uh, the Isaac Asimov—Corey: If it's not Paw Patrol, I have no idea what it is because I have a four-year-old who is very insistent about these things. It is not so much a TV show as it is a way of life. My life is terrible. Please put me out of my misery.Aidan: Well, at least it's not Bluey. That's the one I usually hear about. That's Australia's greatest export. But it was one of the plot devices was a ship that would teleport around the place, and you could never predict where it was next. And so no one could access it. And I thought, “Oh, what about if I use the IPv6 address space?”Corey: Oh, Foundation?Aidan: That's the one. Foundation. That's how the name came about. The idea, honestly, it was because I saw—when was it?—sometime last year, AWS added support for those IP address prefixes. IPv4 prefixes were small; very useful and important, but IPv6 with more than 2 trillion IP addresses, per instance, I thought there's got to be fun to be had there.Corey: 281 trillion, I believe is the—Aidan: 281 trillion.Corey: Yeah. It is sarcastically large space. And that also has effectively, I would say in InfoSec sense, killed port scanning, the idea I'm going to scan the IP range and see what's there, just because that takes such a tremendous amount of time. Now here, in reality, you also wind up with people using compromised resources, and yeah, it turns out, I can absolutely scan trillions upon trillions of IP addresses as long as I'm using your AWS account and associated credit card in which to do it. But here in the real world, it is not an easily discoverable problem space.Aidan: Yeah. I made it as a novelty, really. 
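The claim that IPv6 effectively killed port scanning holds up to a quick back-of-envelope check. The figures below are illustrative assumptions, not anything AWS-specific: a single /64 (the common per-subnet IPv6 allocation) scanned at a very generous one million probes per second:

```python
# Rough arithmetic: how long would it take to brute-force scan one IPv6 /64?
addresses = 2 ** 64            # addresses in a single /64 subnet
probes_per_second = 1_000_000  # an extremely generous assumed scan rate
seconds = addresses / probes_per_second
years = seconds / (60 * 60 * 24 * 365)
print(round(years))  # on the order of half a million years
```

Which is why, as Corey notes, the practical threat is someone burning your compute and your credit card on the scan, not the scan ever finishing.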
I was looking for a reason to learn more about IPv6 and subnetting because it's the term I'd heard, a thing I didn't really understand, and the way I learn things is by trying to build them, realizing I have no idea what I'm doing, googling the error messages, reluctantly looking at the documentation, and then repeating until I've built something. And yeah, and then I built it, published it, and seemed to be pretty popular. It struck a chord. People retweeted it. It tickled your fancy. I think it spoke something in all of us who are trying not to take our jobs too seriously, you know, know we can have a little fun with this ludicrous tech that we get to play with.Corey: The idea being, you take the multi-factor auth code that your thing generates, and that is the last series of octets for the IP address you wind up going towards and that is such a large problem space that you're not going to find it in time, so whatever it is automatically connect to that particular IP address because that's the only one that's going to be listening for a 30 to 60-second span for the connection to be established. It is a great idea because SSH doesn't support this stuff natively. There's no good two-factor auth approach for this. And I love it. I'd be scared to death to run this in production for something that actually matters.And we also start caring a lot more about how accurate are the clocks on those instances, all of a sudden. But, oh, I just love the concept so much because it hits on the ethos of—I think—what so much of the cloud does were these really are fundamental building blocks that we can use to build incredible, awe-inspiring things that are globe-spanning, and also ridiculousness. And there's so much value of being able to do the same thing, sometimes at the same time.Aidan: Yeah, it's interesting, you mentioned, like, never using in prod, and I guess when I was building it, I thought, you know, that would be apparent. 
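The mechanism Corey describes can be sketched in a few lines. To be clear, this is a hypothetical illustration rather than the actual ipv6-ghost-ship code: it computes a standard RFC 6238 TOTP code from a shared secret and adds it to the low bits of an IPv6 prefix, so the client and the listener independently derive the same address for each 30-second window. The prefix and secret shown are made up.

```python
import base64
import hashlib
import hmac
import ipaddress
import struct
import time

def totp(secret_b32, t=None, step=30, digits=6):
    """Standard RFC 6238 TOTP: HMAC-SHA1 over the time counter, then dynamic truncation."""
    key = base64.b32decode(secret_b32)
    counter = int((time.time() if t is None else t) // step)
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return code % (10 ** digits)

def ghost_address(prefix, secret_b32, t=None):
    """Embed the current TOTP code into the low bits of an address inside the
    given IPv6 prefix; only a party holding the secret knows where to connect."""
    net = ipaddress.IPv6Network(prefix)
    return str(net.network_address + totp(secret_b32, t))

# Example with a made-up documentation prefix and demo secret; client and
# server compute the same address for the same 30-second window.
print(ghost_address("2001:db8:1234:5678::/64", "JBSWY3DPEHPK3PXP"))
```

On the listening side, one would presumably bind to the derived address only for the current window and rotate as the code changes, which is exactly why the clock-skew question that comes up next matters.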
Like, “Yes, this is very neat, but surely no one's going to use it.” And I did see someone raise an issue on the GitHub project which was talking about clock skew. And I mentioned—Corey: Here at the bank where I'm running this in production, we're—Aidan: [laugh]. Corey: —having some trouble with the clock. Yeah, it's—Aidan: You know, I mentioned that the underlying 2FA library did account for clock skew 30 seconds either way, but it made me realize, I might need to put a disclaimer on the project. While the code is probably reasonably sound, I personally wouldn't run it in production, and it was more meant to be a piece of performance art or something to tickle one's fancy and to move on, not to roll it out. But I don't know, different strokes for different folks. Corey: I have gotten a lot better about calling out my ridiculous shitpost things when I do them. And the thing that really drove that home for me was talking about using DNS TXT records to store information about what server a virtual machine lives on—or container or whatnot—thus using Route 53 as a database. And that was a great gag, and then someone did a Reddit post of “This seems like a really good idea, so I'm going to start doing it, and I'm having these questions.” And at that point, it's like, “Okay, I've got to break character at that point.” And it's, yeah, “Hi. That's my joke. Don't do it because X, Y, and Z are your failure modes; there are better tools for it. So yeah, there are ways you can do this with DNS, but it's not generally a great idea, and there are some risk factors to it. And okay, A, B, and C are the things you don't want to do, so let's instead do it in a halfway intelligent way because it's only funny if everyone's laughing. Otherwise, we fall into this trap where people take you seriously and they feel bad as a result when it doesn't work in production. So, calling it out as this is a joke tends to put a lot of that aside. It also keeps people from feeling left out. Aidan: Yeah. 
I realized that because the next novelty project I did a few days later—not sure if you caught it—it was a Rick Roll over ICMPv6 packets, where if you had run ping6 to a certain IP range, it would return the lyrics to music's greatest treasure. So, I think that was hopefully a bit more self-evident that this should never be taken seriously. Who knows, I'm sure someone will find a use for it in prod. Corey: And I was looking through this, this is great. I love some of the stuff that you're doing because it's just fantastic. And I started digging a bit more into things you had done. And at that point, it was whoa, whoa, whoa, wait a minute. Back in 2020, you found an example of an issue with AWS's security model where CloudTrail would just start—if asked nicely—spewing other people's credential sets and CloudTrail events and whatnot into your account. And, A, that's kind of a problem. B, it was something that didn't make that big of a splash when it came out—I don't even think I linked to it at the time—and, C, given the recent revelations around CloudFormation and Glue from the fine folks at Orca Security, it showed that this wasn't a one-off, because you'd done this a year beforehand. We have now an established track record of cross-account data sharing and, potentially, exploits, and I'm looking at this and I've got to level with you: I felt incredibly naive because I had assumed that since we hadn't heard of this stuff in any real big sense that it simply didn't happen. So, when we heard about Azure, obviously, it's because Azure is complete clown shoes and the excellent people at AWS would never make these sorts of mistakes. Except we now have evidence that they absolutely did and didn't talk about it publicly. And I've got to level with you. I feel more than a little bit foolish, betrayed, naive for all this. What's your take on it? Aidan: Yeah, so just to clarify, it wasn't actually in your account. 
It was the new AWS custom resource execution model, where you would upload a Lambda function that would run in an Amazon-managed account. And so that immediately set off my spidey sense because executing code in someone else's account seems fraught with peril. And so—Corey: Yeah, you can do all kinds of horrifying things there, like, use it to run containers. Aidan: Yeah. [laugh]. Thankfully, I didn't do anything that egregious. I stayed inside the Lambda function, but I look—I poked around at what credentials it had, and it would use CloudWatch to reinvoke itself, and CloudTrail kept recording those invocations. And I won't go into all the details, but it ended up being that you could see credentials being recorded in CloudTrail in that account, and I could, sort of, funnel them out of there. When I found this, I was a little scared, and I don't think I'd reported an issue to AWS before, so I didn't want to go too far and do anything that could be considered malicious. So, I didn't actively seek out other people's credentials. Corey: Yeah, as a general rule, it's best once you discover things like that to do the right thing and report it, not proceed to, you know, inadvertently commit felonies. Aidan: Yeah. Especially because it was my first time. I felt better safe than sorry. So, I didn't see other credentials, but I had no reason to believe that I wouldn't see them if I kept looking. I reported it to Amazon. Their security team was incredibly professional, made me feel very comfortable reporting it, and let me know when, you know, they'd remediated it, which was a matter of days later. But afterwards, it left me feeling a little surprised because I was able to publish about it, and a few people responded, you know, the sorts of people who pay close attention to the industry, but Amazon didn't publish anything as far as I was aware. And it changed the way I felt about AWS security, because like you, I sort of felt that AWS, more or less, had a pretty perfect track record. 
They would have advisories about possible Xen exploits, and so on. But they'd never published anything about potential for compromise. And it makes me wonder how many of the things might have been reported in the past where the third-party researcher either didn't end up publishing, or they published and it just disappeared into the blogosphere, and I hadn't seen it. Corey: They have a big earn trust principle over there, and I think that they always focus on the trust portion of it, but I think what got overlooked is the earn. When people are giving you trust that you haven't earned, on some level, the right thing to do is to call it out and be transparent around these things. Yes, I know, Wall Street's going to be annoyed and headlines, et cetera, et cetera, but I had always had the impression that had there been a cross-account vulnerability or a breach of some sort, they would communicate this and they would have their executives go on a speaking tour about it to explain how defense-in-depth mitigated some of it, and/or lessons learned, and/or what else we can learn. But it turns out that wasn't happening at all. And I feel like they have been given trust that was unearned and now I am not happy with it. I suddenly have a lot more of a, I guess, skeptical position toward them as a result, and I have very little tolerance left for what has previously been a staple of the AWS security discussions, which is an executive getting on stage for a while and droning on about the shared responsibility model with the very strong implication that “Oh, yeah, we're fine. It's all on your side of the fence that things are going to break.” Yeah, turns out, that's not so true. You just know about the things on your side of the fence in a way that you don't about the things that are on theirs. Aidan: Yeah, it's an interesting one. 
Like, I think about it and I think, “Well, they never made an explicit promise that they would publish these things,” so, on one hand, I say to myself, “Oh, maybe that's on me for making that assumption.” But, I don't know, I feel like the way we felt was justified. Maybe naive in hindsight, but then, you know, I guess… I'm still not sure how to feel because of, like, I think about recent issues and how a couple of AWS Distinguished Engineers jumped on Twitter, and to their credit were extremely proactive in engaging with the community.But is that enough? It might be enough for say, to set my mind at ease or your mind at ease because we are, [laugh] to put it mildly, highly engaged, perhaps a little too engaged in the AWS space, but Twitter's very ephemeral. Very few of AWS's customers—Corey: Yeah, I can't link to tweets by distinguished engineers to present to an executive leadership team as an official statement from Amazon. I just can't.Aidan: Yeah. Yeah.Corey: And so the lesson we can take from this is okay, so “Well, we never actually said this.” “So, let me get this straight. You're content to basically let people assume whatever they want until they ask you an explicit question around these things. Really? Is that the lesson you want me to take from this? Because I have a whole bunch of very explicit questions that I will be asking you going forward, if that is in fact, your position. And you are not going to like the fact that I'm asking these questions.”Even if the answer is a hard no, people who did not have this context are going to wonder why are people asking those questions? It's a massive footgun here for them if that is the position that they intend to have. I want to be clear as well; this is also a messaging problem. It is not in any way, a condemnation of their excellent folks working on the security implementation themselves. This stuff is hard and those people are all-stars. I want to be very clear on this. 
It is purely around the messaging and positioning of the security posture.

Aidan: Yeah, yeah. That's a good clarification, because like you, my understanding is that the service teams are doing a really stellar, above-average job, industry-wide, and the AWS Security Response Teams, I have absolute faith in them. It is a matter of messaging. And I guess what particularly brings it to front-of-mind is, it was earlier this month, or maybe it was last month, I received an email from a company called Sourcegraph. They do code search.

I'm not even a customer of theirs yet, you know? I'm on a free trial, and I got an email that—I'm paraphrasing here—was something to the effect of: we discovered that it was possible for your code to appear in other customers' code search results. It was discovered by one of our own engineers. We found that the circumstances hadn't cropped up, but we wanted to tell you that it was possible. It didn't happen, and we're working on making sure it won't happen again.

And I think about how radically different that is, where they didn't have a third-party researcher forcing their hand; they could have very easily swept it under the rug, but they were so proactive that, honestly, that's probably what's going to tip me over the edge into becoming a customer. I mean, other than them having a great product. But yeah, it's a big contrast. It's how I like to see other companies work, especially Amazon.

Corey: This episode is sponsored in part by our friends at Sysdig. Sysdig is the solution for securing DevOps. They have a blog post that went up recently about how an insecure AWS Lambda function could be used as a pivot point to get access into your environment. They've also gone in-depth with a bunch of other approaches to how DevOps and security are inextricably linked. To learn more, visit sysdig.com and tell them I sent you. That's S-Y-S-D-I-G dot com.
My thanks to them for their continued support of this ridiculous nonsense.

Corey: The two companies that I can think of that have had security problems have been CircleCI and Travis CI. Circle had an incredibly transparent early-on blog post, they engaged with customers on the forums, and they did super well. Travis basically denied and stonewalled for ages, and now the only people who use Travis are there because they haven't found a good way to get off of it yet. It is effectively DOA. And I don't think those two things are unrelated.

Aidan: Yeah. No, that's a great point. Because, you know, I've been in this industry long enough. You have to know that humans write code and humans make mistakes—I know I've made more than my fair share—and I'm not going to write off a company for making a mistake. It's entirely in their response. And yeah, you're right. That's why Circle is still a trustworthy business that should earn people's business, and why Travis is one I recommend everyone move away from.

Corey: Yeah, I like Orca Security as a company and as a product, but at the moment, I am not their customer. I am AWS's customer. So, why the hell am I hearing it from Orca and not AWS when this happens?

Aidan: Yeah, yeah. It's… not great. On one hand, I'm glad I'm not in charge of finding a solution to this because I don't have the skills or the expertise to manage that communication. Because, like I think you've said in the past, there are a lot of different audiences that they have to communicate with. They have to communicate with the stock market, they have to communicate with execs, they have to communicate with developers, and each of those audiences demands a different level of detail, a different focus. And it's tricky. And how do you manage that?
But, I don't know, I feel like you have an obligation to when people place that level of trust in you.

Corey: It's just a matter of doing right by your customers, on some level.

Aidan: Yeah.

Corey: How long have you been working on AWS-side environments? Clearly, this is not like, “Well, it's year two,” because if so, I'm going to feel remarkably behind.

Aidan: [laugh]. So, I've been writing code in some capacity or another for 20 years. It took about five years to get anyone to pay me to do so. But yeah, I guess the start of my professional career—and by ‘professional,' I want to use it in the strictest terms, meaning getting paid money; not that I [laugh] am necessarily a professional—coincided with the launch of AWS. So, I don't have experience with the before times of data centers, never had to think about Direct Connect, but it means I have been using AWS since sometime in 2008.

I was just looking at my bill earlier, and I saw that my first bill was for $70. I was using a c1.xlarge, which was 80 cents an hour, and it had eight-core CPUs. And to put that in context at the time—

Corey: Eight vCPUs, technically, I believe—

Aidan: And it basically is—

Corey: —or were they using the [eCPU 00:20:31] model back then?

Aidan: Yeah, no, that was vCPUs. But to me, that was extraordinary. You know, I was somewhere just after high school. It was—the Netflix Prize was around. If you're not sure what that was, Netflix had this open competition where they said anyone who could improve upon their movie recommendation algorithm could win a million dollars.

And obviously, being a teenager, I had a massive ego and [laugh] no self-doubt, so I thought I could win this, but I just didn't have enough CPUs or RAM on my laptop. And so when EC2 launched and I could pay 80 cents an hour, rather than signing up for a 12-month contract with a colocation company, it was just a dream come true. I was able to run my terrible algorithms, but I could run them eight times faster.
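[As a back-of-the-envelope aside: the rate and the bill total come from the conversation above, but the instance-hours figure is our own arithmetic, not something stated in the episode. A $70 first bill at 80 cents an hour works out to:]

```python
# Editorial aside: rough arithmetic on the numbers mentioned above.
hourly_rate = 0.80   # USD per hour for a c1.xlarge, circa 2008
first_bill = 70.00   # USD, the first month's bill mentioned

hours = first_bill / hourly_rate   # total billed instance-hours
days = hours / 24                  # equivalent continuous runtime

print(f"{hours:.1f} instance-hours (~{days:.1f} days of continuous runtime)")
# → 87.5 instance-hours (~3.6 days of continuous runtime)
```

[So the whole Netflix Prize experiment amounted to a bit under four days of continuous compute.]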
Unfortunately and obviously, I didn't win, because it turns out I'm not a world-class statistician. But—

Corey: Common mistake. I make that mistake myself all the time.

Aidan: [laugh]. Yeah. I mean, you know, I think I was probably 19 at the time, so my ego did make me think I was one, but it turned out not to be so. But I think that was what really blew my mind: that me, a nobody, could create an account with Amazon and get access to these incredibly powerful machines for less than a dollar. And so I was hooked.

Since then, I've worked at companies that are AWS customers. I've worked at places that have zero EC2 servers, worked at places that have had thousands, and places in between. And it's gotten to a point, actually, where, I guess, my career is so entwined with AWS that, one, my initials are actually AWS, but also—and this might sound ridiculous, and it's probably just a sign of my privilege—I wouldn't consider working somewhere that used another cloud. Not—

Corey: No, I think that's absolutely the right approach.

Aidan: Yeah.

Corey: I had a Twitter thread on this somewhat recently, and I'm going to turn it into a blog post because I got some pushback. If I were coming into the industry right now, my first choice would be Google Cloud because its developer experience is excellent. But I'm not coming to this without any experience. I have spent a decade or so learning not just how AWS works, but also how it breaks, understanding the failure modes and what they're going to look like, and what it's good at and what it's not. That's the valuable stuff for running things in a serious way.

Aidan: Yeah. It's an interesting one. And I mean, for better or worse, AWS is big.
I'm sure you know much better than I do the exact numbers, but if a junior developer came to me and said, “Which cloud should I learn, or should I learn all of them?”—I mean, you're right, Google Cloud does have a better developer experience, especially for new developers, but when I think about the sheer number of jobs that are available for developers, I feel like I would be doing them a disservice by not suggesting AWS, at least in Australia. It seems they've got such a huge footprint that you'll always be able to find a job working as an AWS-familiar engineer. It seems like that would be less the case with Google Cloud or Azure.

Corey: Again, I am not sitting here suggesting that anyone should say, “Oh, clouds are insecure. We're going to run our own stuff in our own data centers.” That is ridiculous in this era. They are still going to do a better job of security than any of us will individually, let's be clear here. And it empowers and unlocks an awful lot of stuff.

But with their privileged position as these hyperscale providers that are the default choice for building things, I think, comes a significant level of responsibility that I am displeased to discover they've been abdicating. And I don't love that.

Aidan: Yeah, it's an interesting one, right? Because, like you're saying, they have access and expertise that people doing it themselves will never match. So, you know, I'm never going to hesitate to recommend people use AWS on account of security, because your company's security posture will almost always be better for using AWS and following their guidelines, and so on. But yeah, like you say, with great power comes significant responsibility to earn trust, and to retain that trust by admitting and publicizing when mistakes are made.

Corey: One last topic I want to get into with you is one that you and I have talked about very briefly elsewhere. I feel like you and I are both relatively up-to-date on AWS intricacies.
I think that we are both better than the average bear working with the platform. But I know that I feel this way, and I suspect you do too: VPCs have gotten confusing as hell. Is that just me? Am I a secret moron that no one ever bothered to tell, and I should update my own self-awareness?

Aidan: [laugh]. Yeah, it's… I mean, that's been the story of my career with AWS. When I started, VPCs didn't exist. It was EC2-Classic—well, I guess at the time, it was just EC2—and it was simple. You launched an instance and you had an IP address.

And then along came VPCs, and I think at the time, I thought something to the effect of, “This seems like needless complexity. I'm not going to bother learning this. It will never be relevant.” In the end, that wasn't true. I worked in much larger deployments where VPCs made fantastic sense and made a lot of things possible, but I still didn't go into the weeds.

Since then, AWS has announced that EC2-Classic will be retired; the end of an era. I'm not personally still running anything in EC2-Classic, and I think they've done an incredible job of maintaining support for this long, but VPC complexity has certainly been growing year-on-year since then. I recently was using the AWS console—like we all do and no one ever admits to—to edit a VPC subnet route table. And I clicked the drop-down box for a target, and I was overwhelmed by the number of options. There were NAT gateways, internet gateways, carrier gateways, I think there was a thing called a Wavelength gateway, ENIs, and… [laugh] I think I was surprised because I just scrolled through the list and thought, “Wow, that is a lot of different options. Why is that?”

Especially because it's not so relevant to me. But I realized a big thing of what AWS has been doing lately is trying to make themselves available to people who haven't used the cloud yet. And those people have complicated networking needs, and it seems like AWS is trying to—reasonably successfully—make anything possible.
But with that comes, you know, additional complexity.

Corey: I appreciate that the capacity is there, but there has to be an abstraction model for getting rid of some of this complexity, because otherwise the failure mode is you wind up with this amazingly capable thing that can build marvels, but you also need to basically have a PhD in some of these things to wind up tying it all together. And if you bring someone else in to do it, then you have no idea how to run the thing. You're effectively a golden retriever trying to fly a space shuttle.

Aidan: Yeah. It's interesting. Like, clearly they must be acutely aware of this, because they have default VPCs, and for many use cases, that's all people should need. But as soon as you want, say, a private subnet, then you need to either modify that default VPC or create a new one, and it's sort of going from 0 to 100 complexity extremely quickly, because, you know, you need to create route tables to everyone's favorite NAT gateways, and it feels like the on-ramp needs to be not so steep. I'm not sure what the solution is; I hope they find one.

Corey: As do I. I really want to thank you for taking the time to speak with me about so many of these things. If people want to learn more about what you're up to, where's the best place to find you?

Aidan: Twitter's the best place. On Twitter, my username is @__Steele, which is S-T-E-E-L-E. From there, that's where I'll either speculate on the latest releases or link to some of the silly things I put on GitHub. Sometimes they're not-so-silly things. But yeah, that's where I can be found. And I'd love to chat to anyone about AWS. It's something I can geek out about all day, every day.

Corey: And we will certainly include links to that in the [show notes 00:29:50]. Thank you so much for taking the time to speak with me today. I really appreciate it.

Aidan: Well, thank you so much for having me.
It's been an absolute delight.

Corey: Aidan Steele, serverless engineer at Stedi, and shitposter extraordinaire. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an immediate request to correct the record about what I'm not fully understanding about AWS's piss-weak security communications.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.