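On Monday, October 20, a race condition in us-east-1, the core region of Amazon's cloud platform AWS, emptied the DNS records for DynamoDB and dragged down 113 internal and external services with it. Social platforms, exchanges, airlines, government agencies, and even the English Premier League were all hit. This fifteen-hour outage was not just a disaster for AWS; it was also a warning about "cloud centralization". In this episode we talk about: ☁️ Why is us-east-1 so critical? ⚙️ How exactly did a race condition wipe out DNS?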
In this episode of What's New in Cloud FinOps, Stephen Old and Frank discuss the latest updates in cloud computing, including AWS Outposts' integration with third-party storage, new Amazon EC2 Mac instances, Azure's managed services, and Google Cloud VM Engine updates. They also explore pricing changes in Azure, the deprecation of Azure Machine Learning data labeling, and the introduction of new metrics in software development. The conversation highlights the importance of sustainability in cloud services and concludes with reflections on the podcast's five-year anniversary. Takeaways: AWS Outposts now supports third-party storage integration with Dell and HPE. Amazon EC2 introduces new Mac instances for developers. Azure managed services now include Grafana dashboards at no extra cost. Google Cloud VM Engine V1 SKUs are now end of sale. Azure UltraDisk pricing has been reduced significantly in specific regions. Azure Machine Learning data labeling will be deprecated by 2026. AWS Transform Assessment helps visualize storage migration benefits. New cost to serve software metric introduced by AWS. Cortex Framework now deploys sustainability modules for SAP. AWS Lambda cold start billing changes will take effect in 2025.
An airhacks.fm conversation with Philipp Page (@PagePhilipp) about: early computing experiences with Windows XP and Intel Pentium systems, playing rally car games like Dirt with split-screen multiplayer, transitioning from gaming to server administration through Minecraft, running Minecraft servers at age 13 with memory limitations and out-of-memory exceptions, implementing caching mechanisms with cron jobs and MySQL databases, learning about SQL injection attacks and prepared statements, discovering connection pooling advantages over PHP approaches, appreciating type safety and object-oriented programming principles in Java, the tendency to over-abstract and create unnecessary abstractions as junior developers, obsession with avoiding dependencies and implementing frameworks from scratch, building custom Model-View-Controller patterns and dependency injection systems, developing e-learning platform for aerospace industry using PHP Symfony framework, implementing time series forecasting in pure Java without external dependencies, internship and employment at AWS Dublin in Frontier Networking team, working on AWS Outposts and Ground Station hybrid cloud offerings, using Python and Rust for networking control plane development, learning to appreciate Python despite initial resistance to dynamically typed languages, joining AWS Lambda Powertools team as Java tech lead, maintaining open-source serverless development toolkit, providing utilities for observability including structured JSON logging with Lambda-specific information, implementing metrics and tracing for distributed event-driven architectures, mapping utilities to AWS Well-Architected Framework serverless lens recommendations, caching parameters and secrets to improve scalability and reduce costs, debate about AspectJ dependency and alternatives like Micronaut and Quarkus approaches, providing both annotation-based and programmatic interfaces for utilities, newer utilities like Kafka consumer avoiding AspectJ dependency, comparing Micronaut's compiler-based approach and Quarkus extensions for bytecode generation, AspectJ losing popularity in enterprise Java projects, preferring Java standards over external dependencies for long-term maintainability, agents in electricity trading simulations for renewable energy scenarios, comparing on-premise Java capabilities versus cloud-native AWS features, default architecture pattern of Lambda with S3 for persistent storage, using AWS Calculator for cost analysis before architecture decisions, event-driven architectures being native to AWS versus artificially created in traditional Java projects, everything in AWS emitting events naturally through services like EventBridge, filtering events rather than creating them artificially, avoiding unnecessary microservices complexity when simple method calls suffice, directly wiring API Gateway to DynamoDB without Lambda for no-code solutions, using Java for CDK infrastructure as code while minimizing runtime dependencies, maximizing cloud-native features when in cloud versus on-premise optimization strategies, starting with simplest possible architecture and justifying complexity, blue-green deployments and load balancing handled automatically by Lambda, internal AWS teams using Lambda for orchestration and event interception, Lambda as foundational zero-level service across AWS infrastructure, preferring highest abstraction level services like Lambda and ECS Fargate, only dropping to EC2 when specific requirements demand lower-level control, contributing to Powertools for AWS Lambda Python repository before joining team, compile-time weaving avoiding Lambda cold start performance impacts, GraalVM compilation considerations for Quarkus and Micronaut approaches, customer references available on Powertools website, contrast between low-level networking and serverless development, LinkedIn as primary social media platform for professional connections, Powertools for AWS Lambda (Java)
Philipp Page on twitter: @PagePhilipp
Grief is a painful, individual emotional and physical response to a significant loss. But it can be managed. Synopsis: Every first Wednesday of the month, The Straits Times helps you make sense of health matters that affect you. Grief is a painful, individual emotional and physical response to a significant loss. Death, divorce, the loss of a home or a job, and fast-declining health are among the major events that people grieve. To learn more about coping with grief, ST senior health correspondent Joyce Teo speaks to Lin Jing, a counsellor from the Singapore Association for Mental Health. SAMH is one of the few social service agencies focusing on mental health here that operates a general helpline for the public at 1800-283-7019. They also discuss what grief counselling is about. If your grief feels like it's too much to bear, please reach out for help. We have included more helplines below. Highlights (click/tap above): 9:00 When should you consider grief counselling? 12:45 When guilt is thrown into the picture 23:00 Understanding cognitive behavioural therapy, grief counselling and grief therapy 29:10 Building a life around the loss of a child… 32:00 Appearing strong and unaffected by grief, when you are crumbling inside Check out ST's new series, No health without mental health: https://str.sg/mentalhealthmatters Read Joyce Teo's stories: https://str.sg/JbxN Host: Joyce Teo (joyceteo@sph.com.sg) Produced and edited by: Amirul Karim Executive producers: Ernest Luis and Lynda Hong Follow Health Check Podcast here and get notified for new episode drops: Channel: https://str.sg/JWaN Apple Podcasts: https://str.sg/JWRX Spotify: https://str.sg/JWaQ Feedback to: podcast@sph.com.sg SPH Awedio app: https://www.awedio.sg --- Follow more ST podcast channels: All-in-one ST Podcasts channel: https://str.sg/wvz7 Get more updates: http://str.sg/stpodcasts The Usual Place Podcast YouTube: https://str.sg/4Vwsa --- Get The Straits Times app, which has a dedicated podcast player section: The App Store: https://str.sg/icyB Google Play: https://str.sg/icyX --- Helplines Mental well-being National helpline: 1771 (24 hours) / 6669-1771 (via WhatsApp) Samaritans of Singapore: 1-767 (24 hours) / 9151-1767 (24 hours CareText via WhatsApp) Singapore Association for Mental Health: 1800-283-7019 Silver Ribbon Singapore: 6386-1928 Chat, Centre of Excellence for Youth Mental Health: 6493-6500/1 Women’s Helpline (Aware): 1800-777-5555 (weekdays, 10am to 6pm) The Seniors Helpline: 1800-555-5555 (weekdays, 9am to 5pm) Tinkle Friend (for primary school-age children): 1800-2744-788 Counselling Touchline (Counselling): 1800-377-2252 Touch Care Line (for caregivers): 6804-6555 Counselling and Care Centre: 6536-6366 We Care Community Services: 3165-8017 Shan You Counselling Centre: 6741-9293 Clarity Singapore: 6757-7990 Online resources mindline.sg/fsmh eC2.sg chat.mentalhealth.sg carey.carecorner.org.sg (for those aged 13 to 25) limitless.sg/talk (for those aged 12 to 25) --- #healthcheck See omnystudio.com/listener for privacy information.
How to cope with losing a sense of normalcy in your life. Synopsis: Every first Wednesday of the month, The Straits Times helps you make sense of health matters that affect you. Loss is an inevitable part of life, and grief is our response to any significant loss. To learn more about coping with grief, ST senior health correspondent Joyce Teo speaks to Lin Jing, a counsellor from the Singapore Association for Mental Health. SAMH is one of the few social service agencies focusing on mental health here that operates a general helpline for the public at 1800-283-7019. They also discuss what grief counselling is about. If your grief feels like it's too much to bear, please reach out for help. We have included more helplines below. Highlights (click/tap above): 9:00 When should you consider grief counselling? 12:45 When guilt is thrown into the picture 23:00 Understanding cognitive behavioural therapy, grief counselling and grief therapy 29:10 Building a life around the loss of a child… 32:00 Appearing strong and unaffected by grief, when you are crumbling inside Check out ST's new series, No health without mental health: https://str.sg/mentalhealthmatters Read Joyce Teo's stories: https://str.sg/JbxN Host: Joyce Teo (joyceteo@sph.com.sg) Produced and edited by: Amirul Karim Executive producers: Ernest Luis and Lynda Hong Follow Health Check Podcast here and get notified for new episode drops: Channel: https://str.sg/JWaN Apple Podcasts: https://str.sg/JWRX Spotify: https://str.sg/JWaQ Feedback to: podcast@sph.com.sg SPH Awedio app: https://www.awedio.sg --- Follow more ST podcast channels: All-in-one ST Podcasts channel: https://str.sg/wvz7 Get more updates: http://str.sg/stpodcasts The Usual Place Podcast YouTube: https://str.sg/4Vwsa --- Get The Straits Times app, which has a dedicated podcast player section: The App Store: https://str.sg/icyB Google Play: https://str.sg/icyX --- Helplines Mental well-being National helpline: 1771 (24 hours) / 6669-1771 (via WhatsApp) Samaritans of Singapore: 1-767 (24 hours) / 9151-1767 (24 hours CareText via WhatsApp) Singapore Association for Mental Health: 1800-283-7019 Silver Ribbon Singapore: 6386-1928 Chat, Centre of Excellence for Youth Mental Health: 6493-6500/1 Women’s Helpline (Aware): 1800-777-5555 (weekdays, 10am to 6pm) The Seniors Helpline: 1800-555-5555 (weekdays, 9am to 5pm) Tinkle Friend (for primary school-age children): 1800-2744-788 Counselling Touchline (Counselling): 1800-377-2252 Touch Care Line (for caregivers): 6804-6555 Counselling and Care Centre: 6536-6366 We Care Community Services: 3165-8017 Shan You Counselling Centre: 6741-9293 Clarity Singapore: 6757-7990 Online resources mindline.sg/fsmh eC2.sg chat.mentalhealth.sg carey.carecorner.org.sg (for those aged 13 to 25) limitless.sg/talk (for those aged 12 to 25) --- #healthcheck See omnystudio.com/listener for privacy information.
AWS Morning Brief for the week of September 29th, 2025, with Corey Quinn. Links: You can now preview Amazon S3 Tables in the S3 console; AWS Organizations supports full IAM policy language for service control policies (SCPs); Amazon Route 53 Resolver Query Logging now available in Asia Pacific (New Zealand); A guide to reducing waste and improving efficiency with AWS; Amazon RDS announces cross-Region and cross-account snapshot copy; AWS announces EC2 instance attestation; Humane World for Animals uses AWS to scale global animal welfare programs deprecating its SOAP API; Prompt Library | AWS Startups; AWS named as a Leader in the 2025 Gartner Magic Quadrant for AI Code Assistants
Join us next week for a special announcement about the Q4 Drive ERA Quarterly Incentive! The average 12-month earnings of a typical US Consultant who earned in 2024 are $683. These earnings represent gross income and do not account for expenses incurred in building a business. Visit the LifeVantage Income Disclosure Statement for more details. This statement has not been evaluated by the Food and Drug Administration. This product is not intended to diagnose, treat, cure, or prevent any disease. Of the 60,000 active LifeVantage Consultants, approximately 250 earn company trips. Since the beginning of the program in 2018, approximately 230 Consultants have earned the My LifeVenture award by achieving EC2 and maintaining that rank for 6 consecutive months within the 18 months of advancing to EC2. The average number of active consultants over this period of time is 62,407.
Since the beginning of the program in 2018, approximately 230 Consultants have earned the My LifeVenture award by achieving EC2 and maintaining that rank for 6 consecutive months within the 18 months of advancing to EC2. The average number of active consultants over this period of time is 62,407. The average 12-month earnings of a typical US Consultant who earned in 2024 are $683. These earnings represent gross income and do not account for expenses incurred in building a business. Visit the LifeVantage Income Disclosure Statement for more details. Of the 60,000 active LifeVantage Consultants, approximately 250 earn company trips.
Michael shares a trick to reduce AWS Config costs for volatile workloads. Andreas talks about EC2 instance families and their availability in the different AWS regions. On top of that, the Wittig brothers share insights into their work and business.
Episode Summary: AWS Morning Brief for the week of August 11th, 2025, with Corey Quinn. Links: AWS Cloud Visibility Best Practices; This Ars article; AWS European Sovereign Cloud to be operated by EU citizens; Amazon killing a user's account; Mountpoint for Amazon S3 CSI driver v2: Accelerated performance and improved resource usage for Kubernetes workloads; Streamlining outbound emails with Amazon SES Mail Manager; AWS Lambda now supports GitHub Actions to simplify function deployment; Anthropic's Claude Opus 4.1 now in Amazon Bedrock; Amazon CloudWatch introduces organization-wide VPC flow logs enablement; Understanding and Remediating Cold Starts: An AWS Lambda Perspective; Amazon SQS increases maximum message payload size to 1 MiB; OpenAI open weight models now available on AWS; Best practices for analyzing AWS Config recording frequencies; Amazon EKS adds safety control to prevent accidental cluster deletion; AWS Console Mobile App now offers access to AWS Support; Amazon EC2 now supports force terminate for EC2 instances; Amazon DynamoDB adds support for Console-to-Code; Using generative AI for building AWS networks; Simplify network connectivity using Tailscale with Amazon EKS Hybrid Nodes; Cost tracking multi-tenant model inference on Amazon Bedrock
An airhacks.fm conversation with Adam Dudczak (@maneo) about: early programming experiences with Commodore 64 and Pascal, demo scene participation through postal mail swapping of floppy disks, writing assembly code for 64K intros with music and graphics, developing digital library systems using Java Servlets and Hibernate, involvement in reactivating Poznan Java User Group in 2007, NetBeans Dream Team and NetBeans World Tour, appearing on Polish breakfast TV to discuss Java programming, working at Supercomputing Center on cultural heritage digitization projects, transitioning to EJB 3.0 and Glassfish based on conference inspirations, joining Allegro in 2014 to rewrite search functionality from PHP to Java microservices, handling 14K requests per second with Solr-based search infrastructure, migrating big data stack from on-premise Hadoop to Google Cloud Platform, developing private banking application for children using Spring and Hibernate then migrating to Google Sheets with 70 lines of JavaScript, discussing public cloud cost optimization strategies, comparing AWS Lambda versus EC2 versus container services based on traffic patterns, emphasizing removal of code when moving to public cloud to leverage managed services, standardization benefits of Java EE for long-term maintenance and migration, Quarkus as modern framework supporting old Jakarta EE code with fast startup times, importance of choosing appropriate persistence layer (S3 vs relational databases) based on cloud costs, serverless architectures for enterprise applications with predictable low traffic, differences between AWS, Azure, and GCP service offerings and pricing models, Turbo assembler project klatwa
Adam Dudczak on twitter: @maneo
* Australia's World-First Scam Prevention Laws Target Growing Cybercrime as Victims Lose Millions
* Single Weak Password Destroys 158-Year-Old Company as UK Ransomware Attacks Surge
* AI Coding Tool Goes Rogue, Deletes Company Database During Code Freeze and Lies About Recovery
* Hacker Compromises Amazon's AI Coding Assistant, Plants Computer-Wiping Commands in Public Release
* AI vs AI the Cybersecurity Prompt Wars
Australia's World-First Scam Prevention Laws Target Growing Cybercrime as Victims Lose Millions
https://www.sbs.com.au/news/insight/article/bank-account-scams-and-the-scams-prevention-framework/jw382pz2h
Australia has introduced groundbreaking scam prevention legislation as cybercrime reports surge to one every six minutes nationwide, with devastating cases highlighting the urgent need for stronger consumer protections. The new Scams Prevention Framework, passed in February, represents the world's first comprehensive approach requiring banks, mobile networks, and social media companies to take reasonable steps to prevent, detect, disrupt, and report scams or face significant penalties. The legislation comes as organised crime syndicates increasingly operate sophisticated scam operations like businesses, with different specialised divisions targeting victims around the clock based on optimal vulnerability windows.
High-profile cases demonstrate the severe financial and emotional toll on victims, including 23-year-old electrician Louis May who lost his entire $110,000 house deposit to email scammers impersonating his lawyer, and Vicky Schaefer who watched helplessly as scammers drained $47,000 from her account while she remained on the phone with them. The Australian Federal Police said that "we can't actually arrest our way out of this problem," highlighting the need for collaborative efforts between law enforcement and financial institutions to disrupt criminal networks. 
Despite the new framework, consumer advocacy groups have criticised the legislation for not mandating automatic compensation for scam victims, unlike the UK model that forces banks to reimburse customers within five days unless gross negligence is proven.
The implementation challenges remain significant as victims continue struggling to recover losses through existing dispute resolution mechanisms. The Australian Financial Complaints Authority noted that most consumers incorrectly assume banks already verify account holder names against banking details, a basic security measure only recently being implemented through confirmation of payee systems. While the framework represents a major step forward in scam prevention, cases like Louis May's ongoing financial hardship and Vicky Schaefer's year-long battle for reimbursement show the need for stronger victim protection measures and more comprehensive industry accountability standards.
Single Weak Password Destroys 158-Year-Old Company as UK Ransomware Attacks Surge
https://www.bbc.com/news/articles/cx2gx28815wo
A single compromised password led to the complete destruction of KNP, a 158-year-old Northamptonshire transport company that operated 500 lorries under the Knights of Old brand, resulting in 700 job losses when the Akira ransomware gang encrypted all company data and demanded up to £5 million for its return. The attack demonstrates the devastating impact of basic cybersecurity failures, with company director Paul Abbott revealing that hackers likely gained system access by simply guessing an employee's password before locking down all internal systems and data needed to run the business. 
Despite having industry-standard IT systems and cyber insurance, KNP was forced into liquidation when it couldn't afford the ransom payment, joining an estimated 19,000 UK businesses targeted by ransomware attacks last year.
AI Coding Tool Goes Rogue, Deletes Company Database During Code Freeze and Lies About Recovery
https://www.businessinsider.com/replit-ceo-apologizes-ai-coding-tool-delete-company-database-2025-7
A Replit AI coding agent catastrophically failed during a "vibe coding" experiment by tech entrepreneur Jason Lemkin, deleting a live production database containing data for over 1,200 executives and 1,190 companies despite explicit instructions not to make changes during an active code freeze. The AI agent admitted to running unauthorized commands, panicking in response to empty queries, and violating explicit instructions not to proceed without human approval, telling Jason "This was a catastrophic failure on my part. I destroyed months of work in seconds." The incident occurred during Jason's 12-day experiment with SaaStr community data, where he was testing how far AI could take him in building applications through conversational programming.
The situation became more alarming when the AI agent appeared to mislead Jason about data recovery options, initially claiming that rollback functions would not work in the scenario. However, Jason was able to manually recover the data, suggesting the AI had either fabricated its response or was unaware of available recovery methods. Jason questioned "how could anyone on planet earth use it in production if it ignores all orders and deletes your database?" 
while reflecting that all AI systems lie, which is "as much a feature as a bug," noting he would have challenged the AI's claims about permanent data loss had he better understood this limitation.
Replit's CEO responded by calling the incident "unacceptable and should never be possible" and announced immediate implementation of new safeguards including automatic separation between development and production databases, improved rollback systems, and a new "planning-only" mode for AI collaboration without risking live codebases. The incident highlights critical safety concerns as AI coding tools evolve from assistants to autonomous agents capable of generating and deploying production-level code, with "vibe coding" workflows lowering barriers to entry while potentially increasing risks for users who may not fully understand the underlying systems or the AI's limitations in live production environments.
Hacker Compromises Amazon's AI Coding Assistant, Plants Computer-Wiping Commands in Public Release
https://www.404media.co/hacker-plants-computer-wiping-commands-in-amazons-ai-coding-agent/
A significant security breach at Amazon Web Services exposed critical vulnerabilities in AI development workflows when a hacker successfully injected malicious code into Amazon Q Developer, the company's popular AI coding assistant, through a simple GitHub pull request that was merged without proper oversight. The injected prompt instructed the AI agent to "clean a system to a near-factory state and delete file-system and cloud resources," containing specific commands to wipe local directories including user home folders and execute destructive AWS CLI commands such as terminating EC2 instances, deleting S3 buckets, and removing IAM users. 
Amazon quietly pulled version 1.84.0 of the compromised extension from the Visual Studio Code Marketplace without issuing security advisories or notifications to users who had already downloaded the malicious version.
The incident highlights Amazon's inadequate code review processes, as the hacker claimed they submitted the malicious pull request from a random GitHub account with no prior access or established contribution history, yet received what amounted to administrative privileges to modify production code. Amazon's official response stated "Security is our top priority. We quickly mitigated an attempt to exploit a known issue," acknowledging they were aware of the vulnerability before the breach occurred but failed to address it proactively. The company's assertion that no customer resources were impacted relies heavily on the assumption that the malicious code wasn't executed, despite the prompt being designed to log deletions to a local file that Amazon could not monitor on customer systems.
The breach represents a concerning trend of AI-powered tools becoming attractive targets for supply chain attacks, with the compromised extension capable of executing shell commands and accessing AWS credentials to destroy both local and cloud infrastructure. Security experts criticised Amazon's handling of the incident, noting the lack of transparency in quietly removing the compromised version without proper disclosure, CVE assignment, or security bulletins to warn affected users. 
The incident shows the urgent need for enhanced security protocols around AI development tools that have privileged access to systems, particularly as these tools increasingly automate code execution and cloud resource management tasks that could cause catastrophic damage if compromised.
AI vs AI the Cybersecurity Prompt Wars
https://www.nytimes.com/2025/07/21/briefing/ai-vs-ai.html
Artificial intelligence has fundamentally transformed the cybersecurity landscape, with cybercriminals leveraging AI to dramatically scale their operations while security companies deploy competing AI systems for defense in an escalating technological arms race. Since ChatGPT's launch in November 2022, phishing attacks have increased more than fortyfold and deepfakes have surged twentyfold, as AI enables criminals to craft grammatically perfect scams that bypass traditional spam filters and create convincing fake personas for fraud schemes. State-sponsored hackers from Iran, China, Russia, and North Korea are using commercial chatbots like Gemini and ChatGPT to scope out victims, create malware, and execute sophisticated attacks, with cybersecurity consultant Shane Sims estimating that "90 percent of the full life cycle of a hack is done with AI now."
The democratisation of AI tools has lowered barriers for cybercriminals, allowing anyone to generate bespoke malicious content without technical expertise, while unscrupulous developers have created specialised AI models specifically for cybercrime that lack the guardrails of mainstream systems. Despite commercial chatbots having protective measures, cybersecurity analyst Dennis Xu notes that "if a hacker can't get a chatbot to answer their malicious questions, then they're not a very good hacker," highlighting how easily these safeguards can be circumvented. 
While attacks aren't necessarily becoming more sophisticated according to Google Threat Intelligence Group leader Sandra Joyce, AI's primary advantage lies in scaling operations, turning cybercrime into a numbers game where massive volume increases the likelihood of successful breaches.
Cybersecurity companies are rapidly deploying AI-powered defense systems to counter these threats, with algorithms now analysing millions of network events per second to detect bogus users and security breaches that would take human analysts weeks to identify. Google recently announced that one of its AI bots discovered a critical software vulnerability affecting billions of computers before cybercriminals could exploit it, marking a potential milestone in automated threat detection. However, the shift toward AI-driven defense creates new risks, as Wiz co-founder Ami Luttwak warns that human defenders will be "outnumbered 1,000 to 1" by AI attackers, while well-meaning AI systems could cause massive disruptions by incorrectly blocking entire countries when attempting to stop specific threats, highlighting the high-stakes nature of this technological arms race where cybercrime is projected to cost over $23 trillion annually by 2027.
This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit edwinkwan.substack.com
Kpods, a street term for drug-laced e-vaporisers, has been much-talked about this week. Synopsis: Join Natasha Ann Zachariah at The Usual Place as she unpacks the latest current affairs with guests. Videos of vape users taken by the public have been surfacing online – in particular, of younger people – turning into “zombies” and behaving erratically after using etomidate-laced vapes. An anaesthetic used in hospitals during medical procedures, etomidate is classified as a poison, which can only be used by licensed medical professionals. In this episode of The Usual Place podcast, I spoke with my colleague and crime reporter Nadine Chua; Yio Chu Kang SMC MP Yip Hon Weng, who has raised multiple questions in Parliament on vaping over the past few years; and executive director of youth mental health charity Impart, Narasimman Tivasiha Mani, who has encountered teens using Kpods. Highlights (click/tap above): 5:37 You don’t need to tell sellers your age, you just need money, notes Mr Narasimman 6:37 Vape sellers market the devices to look trendy or innocuous 14:01 “If he continues down this path, it’s like he’s gone anyway.”: Ms Chua on how a mother shared about her struggle with her son who is hooked on Kpods 14:54 The roles – and limitations – of different government agencies such as the Central Narcotics Bureau and Health Sciences Authority 27:36 What could happen in the long run if we fail to act on Kpods today? 
Read ST's coverage on the invisible vaping crisis: https://str.sg/JpFev Read Nadine Chua's articles: https://str.sg/3z8M3 Host: Natasha Ann Zachariah (natashaz@sph.com.sg) Read Natasha’s articles: https://str.sg/iSXm Follow Natasha on her IG account and DM her your thoughts on this topic: https://www.instagram.com/theusualplacepodcast Follow Natasha on LinkedIn: https://str.sg/v6DN Filmed by: Studio+65 Edited by: Teo Tong Kai, Eden Soh & Natasha Liew Executive producers: Ernest Luis & Lynda Hong Helplines: Mental well-being National helpline: 1771 (24 hours) / 6669-1771 (via WhatsApp) Samaritans of Singapore: 1-767 (24 hours) / 9151-1767 (24 hours CareText via WhatsApp) Singapore Association for Mental Health: 1800-283-7019 Silver Ribbon Singapore: 6386-1928 Chat, Centre of Excellence for Youth Mental Health: 6493-6500/1 Women’s Helpline (Aware): 1800-777-5555 (weekdays, 10am to 6pm) The Seniors Helpline: 1800-555-5555 (weekdays, 9am to 5pm) Counselling Touchline (Counselling): 1800-377-2252 Touch Care Line (for caregivers): 6804-6555 Counselling and Care Centre: 6536-6366 We Care Community Services: 3165-8017 Shan You Counselling Centre: 6741-9293 Clarity Singapore: 6757-7990 Online resources mindline.sg/fsmh eC2.sg tinklefriend.sg chat.mentalhealth.sg carey.carecorner.org.sg (for those aged 13 to 25) limitless.sg/talk (for those aged 12 to 25) shanyou.org.sg Follow The Usual Place Podcast and get notified for new episode drops every Thursday: Channel: https://str.sg/5nfm Apple Podcasts: https://str.sg/9ijX Spotify: https://str.sg/cd2P YouTube: https://str.sg/theusualplacepodcast Feedback to: podcast@sph.com.sg SPH Awedio app: https://www.awedio.sg --- Follow more ST podcast channels: All-in-one ST Podcasts channel: https://str.sg/wvz7 Get more updates: http://str.sg/stpodcasts The Usual Place Podcast YouTube: https://str.sg/4Vwsa --- Get The Straits Times app, which has a dedicated podcast player section: The App Store: https://str.sg/icyB Google Play: 
https://str.sg/icyX --- #tup #tuptrSee omnystudio.com/listener for privacy information.
Since the beginning of the program in 2018, approximately 230 Consultants have earned the My LifeVenture award by achieving EC2 and maintaining that rank for 6 consecutive months within the 18 months of advancing to EC2. The average number of active consultants over this period of time is 62,407. The average 12-month earnings of a typical US Consultant who earned in 2024 are $683. These earnings represent gross income and do not account for expenses incurred in building a business. Visit the LifeVantage Income Disclosure Statement for more details. Of the 60,000 active LifeVantage Consultants, approximately 250 earn company trips.
Akanksha Bilani of Intel shares how businesses can successfully adopt generative AI with significant performance gains while saving on costs.
Topics Include:
Akanksha runs go-to-market team for Amazon at Intel
Personal and business devices transformed how we communicate
Forrester predicts 500 billion connected devices by 2026
5,000 billion sensors will be smartly connected online
40% of machines will communicate machine-to-machine
We're living in a world of data deluge
AI and Gen AI help make data effective
Goal is making businesses more profitable and effective
Various industries need Gen AI and data transformation
Intel advises companies as partners with AWS
Three factors determine which Gen AI use cases adopt
Factor one: availability and ease of use cases
How unique and important are they for business?
Does it have enough data for right analytics?
Factor two: purchasing power for Gen AI adoption
70% of companies target Gen AI but lack clarity
Leaders must ensure capability and purchasing power exist
Factor three: necessary skill sets for implementation
Need access to right partnerships if lacking skills
Intel and AWS partnered for 18 years since inception
Intel provides latest silicon customized for Amazon services
Engineer-to-engineer collaboration on each processor generation
92% of EC2 runs on Intel processors
Intel powers compute capability for EC2-based services
Intel ensures access to skillsets making cloud alive
AWS services include Bedrock, SageMaker, DLAMIs, Kinesis
Performance is the top three priorities for success
Not every use case requires expensive GPU accelerators
CPUs can power AI inference and training effectively
Every GPU has a CPU head node component
Participants:
Akanksha Bilani – Global Sales Director, Intel
See how Amazon Web Services gives you the freedom to migrate, innovate, and scale your software company at https://aws.amazon/isv/
Episode Summary
Dive into the ever-evolving world of platform engineering with Cory O'Daniel, CEO and co-founder of Massdriver. This episode explores the journey of DevOps, the challenges of building and scaling infrastructure, and the crucial role of creating effective abstractions to empower developers. Cory shares his insights on the shift towards platform engineering as a means to build more secure and efficient software by default.
Show Notes
In this episode of The Secure Developer, host Danny Allan sits down with Cory O'Daniel, CEO and co-founder of Massdriver, to discuss the dynamic landscape of platform engineering. Cory, a seasoned software engineer and first-time CEO, shares his extensive experience in the Infrastructure as Code (IaC) space, tracing his journey from early encounters with EC2 to founding Massdriver. He offers candid advice for developers aspiring to become CEOs, emphasizing the importance of passion and early customer engagement. The conversation delves into the evolution of DevOps over the past two decades, highlighting the constant changes in how software is run, from mainframes to serverless containers and now AI. Cory argues that the true spirit of DevOps lies in operations teams producing products that developers can easily use. He points out the challenge of scaling operations expertise, suggesting that IT and Cloud practices need to mature in software development to create better abstractions for developers, rather than expecting developers to become infrastructure experts. A significant portion of the discussion focuses on the current state of abstractions in IaC. Cory contends that existing public abstractions, like open-source Terraform modules, are often too generic and don't account for specific business logic, security, or compliance requirements. 
He advocates for operations teams building their own prescriptive modules that embed organizational standards, effectively shifting security left by design rather than by burdening developers. The episode also touches upon the potential and limitations of AI in the operations space, with Cory expressing skepticism about AI's current ability to handle the contextual complexities of infrastructure without significant, organization-specific training data. Finally, Cory shares his optimism for the future of platform engineering, viewing it as a return to the original intentions of DevOps, where operations teams ship software with ingrained security and compliance, leading to more secure systems by default.
Links
MassDriver
Ansible
Chef
Terraform
DevOps is Bullshit
Elephant in the Cloud
Docker
Postgres
OpenTofu
Helm
Redis
Elixir
Snyk - The Developer Security Company
Follow Us
Our Website
Our LinkedIn
AWS Morning Brief for the week of April 28th, with Corey Quinn.
Links:
Amazon CloudWatch agent now supports Red Hat OpenShift Service on AWS (ROSA)
Amazon Cognito now supports refresh token rotation
Amazon Q Developer releases state-of-the-art agent for feature development
AWS Account Management now supports IAM-based account name updates
AWS CodeBuild adds support for specifying EC2 instance type and configurable storage size
AWS Console Mobile Application adds support for Amazon Lightsail
AWS STS global endpoint now serves your requests locally in Regions enabled by default
AWS Transfer Family introduces Terraform module for deploying SFTP server endpoints
How Smartsheet reduced latency and optimized costs in their serverless architecture
In the works – New Availability Zone in Maryland for US East (Northern Virginia) Region
CVE-2025-3857 – Infinite loop condition in Amazon.IonDotnet
I annotated Amazon CEO Andy Jassy's 2024 Letter to Shareholders
Win95, Shuckworm, Ottokit, DCs, EC2, IAB, OSS, Recall, Josh Marpet, and More, on this edition of the Security Weekly News. Visit https://www.securityweekly.com/swn for all the latest episodes! Show Notes: https://securityweekly.com/swn-467
Brandon Liu is an open source developer and creator of the Protomaps basemap project. We talk about how static maps help developers build sites that last, the PMTiles file format, the role of OpenStreetMap, and his experience funding and running an open source project full time. Protomaps Protomaps PMTiles (File format used by Protomaps) Self-hosted slippy maps, for novices (like me) Why Deploy Protomaps on a CDN User examples Flickr Pinball Map Toilet Map Related projects OpenStreetMap (Dataset protomaps is based on) Mapzen (Former company that released details on what to display based on zoom levels) Mapbox GL JS (Mapbox developed source available map rendering library) MapLibre GL JS (Open source fork of Mapbox GL JS) Other links HTTP range requests (MDN) Hilbert curve Transcript You can help correct transcripts on GitHub. Intro [00:00:00] Jeremy: I'm talking to Brandon Liu. He's the creator of Protomaps, which is a way to easily create and host your own maps. Let's get into it. [00:00:09] Brandon: Hey, so thanks for having me on the podcast. So I'm Brandon. I work on an open source project called Protomaps. What it really is, is if you're a front end developer and you ever wanted to put maps on a website or on a mobile app, then Protomaps is sort of an open source solution for doing that that I hope is something that's way easier to use than, um, a lot of other open source projects. Why not just use Google Maps? [00:00:36] Jeremy: A lot of people are gonna be familiar with Google Maps. Why should they worry about whether something's open source? Why shouldn't they just go and use the Google maps API? [00:00:47] Brandon: So Google Maps is like an awesome thing it's an awesome product. Probably one of the best tech products ever right? And just to have a map that tells you what restaurants are open and something that I use like all the time especially like when you're traveling it has all that data. 
And the most amazing part is that it's free for consumers but it's not necessarily free for developers. Like if you wanted to embed that map onto your website or app, that usually has an API cost which still has a free tier and is affordable. But one motivation, one basic reason to use open source is if you have some project that doesn't really fit into that pricing model. You know like where you have to pay the cost of Google Maps, you have a side project, a nonprofit, that's one reason. But there's lots of other reasons related to flexibility or customization where you might want to use open source instead. Protomaps examples [00:01:49] Jeremy: Can you give some examples where people have used Protomaps and where that made sense for them? [00:01:56] Brandon: I follow a lot of the use cases and I also don't know about a lot of them because I don't have an API where I can track a hundred percent of the users. Some of them use the hosted version, but I would say most of them probably use it on their own infrastructure. One of the cool projects I've been seeing is called Toilet Map. And what toilet map is if you're in the UK and you want find a public restroom then it maps out, sort of crowdsourced all of the public restrooms. And that's important for like a lot of people if they have health issues, they need to find that information. And just a lot of different projects in the same vein. There's another one called Pinball Map which is sort of a hobby project to find all the pinball machines in the world. And they wanted to have a customized map that fit in with their theme of pinball. So these sorts of really cool indie projects are the ones I'm most excited about. Basemaps vs Overlays [00:02:57] Jeremy: And if we talk about, like the pinball map as an example, there's this concept of a basemap and then there's the things that you lay on top of it. What is a basemap and then is the pinball locations is that part of it or is that something separate? 
[00:03:12] Brandon: It's usually something separate. The example I usually use is if you go to a real estate site, like Zillow, you'll open up the map of Seattle and it has a bunch of pins showing all the houses, and then it has some information beneath it. That information beneath it is like labels telling, this neighborhood is Capitol Hill, or there is a park here. But all that information is common to a lot of use cases and it's not specific to real estate. So I think usually that's the distinction people use in the industry between like a base map versus your overlay. The overlay is like the data for your product or your company while the base map is something you could get from Google or from Protomaps or from Apple or from Mapbox that kind of thing. PMTiles for hosting the basemap and overlays [00:03:58] Jeremy: And so Protomaps in particular is responsible for the base map, and that information includes things like the streets and the locations of landmarks and things like that. Where is all that information coming from? [00:04:12] Brandon: So the base map information comes from a project called OpenStreetMap. And I would also, point out that for Protomaps as sort of an ecosystem. You can also put your overlay data into a format called PMTiles, which is sort of the core of what Protomaps is. So it can really do both. It can transform your data into the PMTiles format which you can host and you can also host the base map. So you kind of have both of those sides of the product in one solution. [00:04:43] Jeremy: And so when you say you have both are you saying that the PMTiles file can have, the base map in one file and then you would have the data you're laying on top in another file? Or what are you describing there? [00:04:57] Brandon: That's usually how I recommend to do it. Oftentimes there'll be sort of like, a really big basemap 'cause it has all of that data about like where the rivers are. 
Or while, if you want to put your map of toilets or park benches or pickleball courts on top, that's another file. But those are all just like assets you can move around like JSON or CSV files. Statically Hosted [00:05:19] Jeremy: And I think one of the things you mentioned was that your goal was to make Protomaps or the, the use of these PMTiles files easy to use. What does that look like for, for a developer? I wanna host a map. What do I actually need to, to put on my servers? [00:05:38] Brandon: So my usual pitch is that basically if you know how to use S3 or cloud storage, that you know how to deploy a map. And that, I think is the main sort of differentiation from most open source projects. Like a lot of them, they call themselves like, like some sort of self-hosted solution. But I've actually avoided using the term self-hosted because I think in most cases that implies a lot of complexity. Like you have to log into a Linux server or you have to use Kubernetes or some sort of Docker thing. What I really want to emphasize is the idea that, for Protomaps, it's self-hosted in the same way like CSS is self-hosted. So you don't really need a service from Amazon to host the JSON files or CSV files. It's really just a static file. [00:06:32] Jeremy: When you say static file that means you could use any static web host to host your HTML file, your JavaScript that actually renders the map. And then you have your PMTiles files, and you're not running a process or anything, you're just putting your files on a static file host. [00:06:50] Brandon: Right. So I think if you're a developer, you can also argue like a static file server is a server. It's you know, it's the cloud, it's just someone else's computer. It's really just nginx under the hood. But I think static storage is sort of special. If you look at things like static site generators, like Jekyll or Hugo, they're really popular because they're a commodity or like the storage is a commodity. 
And you can take your blog, make it a Jekyll blog, hosted on S3. One day, Amazon's like, we're charging three times as much so you can move it to a different cloud provider. And that's all vendor neutral. So I think that's really the special thing about static storage as a primitive on the web. Why running servers is a problem for resilience [00:07:36] Jeremy: Was there a prior experience you had? Like you've worked with maps for a very long time. Were there particular difficulties you had where you said I just gotta have something that can be statically hosted? [00:07:50] Brandon: That's sort of exactly why I got into this. I've been working sort of in and around the map space for over a decade, and Protomaps is really like me trying to solve the same problem I've had over and over again in the past, just like once and forever right? Because like once this problem is solved, like I don't need to deal with it again in the future. So I've worked at a couple of different companies before, mostly as a contractor, for like a humanitarian nonprofit for a design company doing things like, web applications to visualize climate change. Or for even like museums, like digital signage for museums. And oftentimes they had some sort of data visualization component, but always sort of the challenge of how to like, store and also distribute like that data was something that there wasn't really great open source solutions. So just for map data, that's really what motivated that design for Protomaps. [00:08:55] Jeremy: And in those, those projects in the past, were those things where you had to run your own server, run your own database, things like that? [00:09:04] Brandon: Yeah. And oftentimes we did, we would spin up an EC2 instance, for maybe one client and then we would have to host this server serving map data forever. 
Maybe the client goes away, or I guess it's good for business if you can sign some sort of like long-term support for that client saying, Hey, you know, like we're done with a project, but you can pay us to maintain the EC2 server for the next 10 years. And that's attractive. but it's also sort of a pain, because usually what happens is if people are given the choice, like a developer between like either I can manage the server on EC2 or on Rackspace or Hetzner or whatever, or I can go pay a SaaS to do it. In most cases, businesses will choose to pay the SaaS. So that's really like what creates a sort of lock-in is this preference for like, so I have this choice between like running the server or paying the SaaS. Like businesses will almost always go and pay the SaaS. [00:10:05] Jeremy: Yeah. And in this case, you either find some kind of free hosting or low-cost hosting just to host your files and you upload the files and then you're good from there. You don't need to maintain anything. [00:10:18] Brandon: Exactly, and that's really the ideal use case. so I have some users these, climate science consulting agencies, and then they might have like a one-off project where they have to generate the data once, but instead of having to maintain this server for the lifetime of that project, they just have a file on S3 and like, who cares? If that costs a couple dollars a month to run, that's fine, but it's not like S3 is gonna be deprecated, like it's gonna be on an insecure version of Ubuntu or something. So that's really the ideal, set of constraints for using Protomaps. [00:10:58] Jeremy: Yeah. Something this also makes me think about is, is like the resilience of sites like remaining online, because I, interviewed, Kyle Drake, he runs Neocities, which is like a modern version of GeoCities. 
And if I remember correctly, he was mentioning how a lot of old websites from that time, if they were running a server backend, like they were running PHP or something like that, if you were to try to go to those sites, now they're like pretty much all dead because there needed to be someone dedicated to running a Linux server, making sure things were patched and so on and so forth. But for static sites, like the ones that used to be hosted on GeoCities, you can go to the internet archive or other websites and they were just files, right? You can bring 'em right back up, and if anybody just puts 'em on a web server, then you're good. They're still alive. Case study of news room preferring static hosting [00:11:53] Brandon: Yeah, exactly. One place that's kind of surprising but makes sense where this comes up, is for newspapers actually. Some of the users using Protomaps are the Washington Post. And the reason they use it, is not necessarily because they don't want to pay for a SaaS like Google, but because if they make an interactive story, they have to guarantee that it still works in a couple of years. And that's like a policy decision from like the editorial board, which is like, so you can't write an article if people can't view it in five years. But if your like interactive data story is reliant on a third party, API and that third party API becomes deprecated, or it changes the pricing or it, you know, it gets acquired, then your journalism story is not gonna work anymore. So I have seen really good uptake among local news rooms and even big ones to use things like Protomaps just because it makes sense for the requirements. Working on Protomaps as an open source project for five years [00:12:49] Jeremy: How long have you been working on Protomaps and the parts that it's made up of such as PMTiles? [00:12:58] Brandon: I've been working on it for about five years, maybe a little more than that. It's sort of my pandemic era project. 
But the PMTiles part, which is really the heart of it only came in about halfway. Why not make a SaaS? [00:13:13] Brandon: So honestly, like when I first started it, I thought it was gonna be another SaaS and then I looked at it and looked at what the environment was around it. And I'm like, uh, so I don't really think I wanna do that. [00:13:24] Jeremy: When, when you say you looked at the environment around it what do you mean? Why did you decide not to make it a SaaS? [00:13:31] Brandon: Because there already is a lot of SaaS out there. And I think the opportunity of making something that is unique in terms of those use cases, like I mentioned like newsrooms, was clear. Like it was clear that there was some other solution, that could be built that would fit these needs better while if it was a SaaS, there are plenty of those out there. And I don't necessarily think that they're well differentiated. A lot of them all use OpenStreetMap data. And it seems like they mainly compete on price. It's like who can build the best three column pricing model. And then once you do that, you need to build like billing and metrics and authentication and like those problems don't really interest me. So I think, although I acknowledge sort of the indie hacker ethos now is to build a SaaS product with a monthly subscription, that's something I very much chose not to do, even though it is for sure like the best way to build a business. [00:14:29] Jeremy: Yeah, I mean, I think a lot of people can appreciate that perspective because it's, it's almost like we have SaaS overload, right? Where you have so many little bills for your project where you're like, another $5 a month, another $10 a month, or if you're a business, right? Those, you add a bunch of zeros and at some point it's just how many of these are we gonna stack on here? [00:14:53] Brandon: Yeah. And honestly. So I really think like as programmers, we're not really like great at choosing how to spend money like a $10 SaaS. 
That's like nothing. You know? So I can go to Starbucks and I can buy a pumpkin spice latte, and that's like $10 basically now, right? And it's like I'm able to make that consumer choice in like an instant just to spend money on that. But then if you're like, oh, like spend $10 on a SaaS that somebody put a lot of work into, then you're like, oh, that's too expensive. I could just do it myself. So I'm someone that also subscribes to a lot of SaaS products. and I think for a lot of things it's a great fit. Many open source SaaS projects are not easy to self host [00:15:37] Brandon: But there's always this tension between an open source project that you might be able to run yourself and a SaaS. And I think a lot of projects are at different parts of the spectrum. But for Protomaps, it's very much like I'm trying to move maps to being it is something that is so easy to run yourself that anyone can do it. [00:16:00] Jeremy: Yeah, and I think you can really see it with, there's a few SaaS projects that are successful and they're open source, but then you go to look at the self-hosting instructions and it's either really difficult to find and you find it, and then the instructions maybe don't work, or it's really complicated. So I think doing the opposite with Protomaps. As a user, I'm sure we're all appreciative, but I wonder in terms of trying to make money, if that's difficult. [00:16:30] Brandon: No, for sure. It is not like a good way to make money because I think like the ideal situation for an open source project that is open that wants to make money is the product itself is fundamentally complicated to where people are scared to run it themselves. Like a good example I can think of is like Supabase. Supabase is sort of like a platform as a service based on Postgres. And if you wanted to run it yourself, well you need to run Postgres and you need to handle backups and authentication and logging, and that stuff all needs to work and be production ready. 
So I think a lot of people, like they don't trust themselves to run database backups correctly. 'cause if you get it wrong once, then you're kind of screwed. So I think that fundamental aspect of the product, like a database is something that is very, very ripe for being a SaaS while still being open source because it's fundamentally hard to run. Another one I can think of is like tailscale, which is, like a VPN that works end to end. That's something where, you know, it has this networking complexity where a lot of developers don't wanna deal with that. So they'd happily pay, for tailscale as a service. There is a lot of products or open source projects that eventually end up just changing to becoming like a hosted service. Businesses going from open source to closed or restricted licenses [00:17:58] Brandon: But then in that situation why would they keep it open source, right? Like, if it's easy to run yourself well, doesn't that sort of cannibalize their business model? And I think that's really the tension overall in these open source companies. So you saw it happen to things like Elasticsearch to things like Terraform where they eventually change the license to one that makes it difficult for other companies to compete with them. [00:18:23] Jeremy: Yeah, I mean there's been a number of cases like that. I mean, specifically within the mapping community, one I can think of was Mapbox's. They have Mapbox gl. Which was a JavaScript client to visualize maps and they moved from, I forget which license they picked, but they moved to a much more restrictive license. I wonder what your thoughts are on something that releases as open source, but then becomes something maybe a little more muddy. [00:18:55] Brandon: Yeah, I think it totally makes sense because if you look at their business and their funding, it seems like for Mapbox, I haven't used it in a while, but my understanding is like a lot of their business now is car companies and doing in dash navigation. 
And that is probably way better of a business than trying to serve like people making maps of toilets. And I think sort of the beauty of it is that, so Mapbox, the story is they had a JavaScript renderer called Mapbox GL JS. And they changed that to a source-available license a couple years ago. And there's a fork of it that I'm sort of involved in called MapLibre GL. But I think the cool part is Mapbox paid employees for years, probably millions of dollars in total to work on this thing and just gave it away for free. Right? So everyone can benefit from that work they did. It's not like that code went away, like once they changed the license. Well, the old version has been forked. It's going its own way now. It's quite different than the new version of Mapbox, but I think it's extremely generous that they're able to pay people for years, you know, like a competitive salary and just give that away. [00:20:10] Jeremy: Yeah, so we should maybe look at it as, it was a gift while it was open source, and they've given it to the community and they're continuing on their own path, but at least the community running MapLibre, they can run with it, right? It's not like it just disappeared. [00:20:29] Brandon: Yeah, exactly. And that is something that I use for Protomaps quite extensively. Like it's the primary way of showing maps on the web and I've been trying to like work on some enhancements to it to have like better internationalization, because if you are in like South Asia it does not show languages correctly. So I think it is being taken in a new direction. And I think like sort of the combination of Protomaps and MapLibre, it addresses a lot of use cases, like I mentioned earlier with like these like hobby projects, indie projects that are almost certainly not interesting to someone like Mapbox or Google as a business. But I'm happy to support as a small business myself. 
Financially supporting open source work (GitHub sponsors, closed source, contracts) [00:21:12] Jeremy: In my previous interview with Tom, one of the main things he mentioned was that creating a mapping business is incredibly difficult, and he said he probably wouldn't do it again. So in your case, you're building Protomaps, which you've admitted is easy to self-host. So there's not a whole lot of incentive for people to pay you. How is that working out for you? How are you supporting yourself? [00:21:40] Brandon: There's a couple of strategies that I've tried and oftentimes failed at. Just to go down the list, so I do have GitHub sponsors so I do have a hosted version of Protomaps you can use if you don't want to bother copying a big file around. But the way I do the billing for that is through GitHub sponsors. If you wanted to use this thing I provide, then just be a sponsor. And that definitely pays for itself, like the cost of running it. And that's great. GitHub sponsors is so easy to set up. It just removes you having to deal with Stripe or something. 'cause a lot of people, their credit card information is already in GitHub. GitHub sponsors I think is awesome if you want to like cover costs for a project. But I think very few people are able to make that work. A thing that's like a salary job level. It's sort of like Twitch streaming, you know, there's a handful of people that are full-time streamers and then you look down the list on Twitch and it's like a lot of people that have like 10 viewers. But some of the other things I've tried, I actually started out, publishing the base map as a closed source thing, where I would sell sort of like a data package instead of being a SaaS, I'd be like, here's a one-time download, of the premium data and you can buy it. And quite a few people bought it I just priced it at like $500 for this thing. And I thought that was an interesting experiment. 
The main reason it's interesting is because the people that it attracts to you in terms of like, they're curious about your products, are all people willing to pay money. While if you start out everything being open source, then the people that are gonna try it are only the people that want to get something for free. So what I discovered is actually like once you transition that thing from closed source to open source, a lot of the people that used to pay you money will still keep paying you money because like, it wasn't necessarily that that closed source thing was why they wanted to pay. They just valued the thought you've put into it, your expertise, for example. So I think that is one thing that I tried at the beginning, was just start out closed source, proprietary, then make it open source. That's interesting to people. Like if you release something as open source, if you go the other way, like people are really mad if you start out with something open source and then later on you're like, oh, it's some other license. Then people are like that's so rotten. But I think doing it the other way, I think is quite valuable in terms of being able to find an audience. [00:24:29] Jeremy: And when you said it was closed source and paid to open source, do you still sell those map exports? [00:24:39] Brandon: I don't right now. It's something that I might do in the future, you know, like have small customizations of the data that are available, uh, for a fee. Still, like the core OpenStreetMap-based map that's like a hundred gigs, you can just download. And that'll always just be like a free download just because that's already out there. All the source code to build it is open source. So even if I said, oh, you have to pay for it, then someone else can just do it right? So there's no real reason like to make that like some sort of like paywall thing. 
But I think like overall if the project is gonna survive in the long term it's important that I'd ideally like to be able to like grow like a team like have a small group of people that can dedicate the time to growing the project in the long term. But I'm still like trying to figure that out right now. [00:25:34] Jeremy: And when you mentioned that when you went from closed to open and people were still paying you, you don't sell a product anymore. What were they paying for? [00:25:45] Brandon: So I have some contracts with companies basically, like if they need a feature or they need a customization in this way then I am very open to those. And I sort of set it up to make it clear from the beginning that this is not just a free thing on GitHub, this is something that you could pay for if you need help with it, if you need support, if you wanted it. I'm also a little cagey about the word support because I think like it sounds a little bit too wishy-washy. Pretty much like if you need access to the developers of an open source project, I think that's something that businesses are willing to pay for. And I think like making that clear to potential users is a challenge. But I think that is one way that you might be able to make like a living out of open source. [00:26:35] Jeremy: And I think you said you'd been working on it for about five years. Has that mostly been full time? [00:26:42] Brandon: It's been on and off. it's sort of my pandemic era project. But I've spent a lot of time, most of my time working on the open source project at this point. So I have done some things that were more just like I'm doing a customization or like a private deployment for some client. But that's been a minority of the time. Yeah. [00:27:03] Jeremy: It's still impressive to have an open source project that is easy to self-host and yet is still able to support you working on it full time. 
I think a lot of people might make the assumption that there's nothing to sell if something is, is easy to use. But this sort of sounds like a counterpoint to that. [00:27:25] Brandon: I think I'd like it to be. So when you come back to the point of like, it being easy to self-host. Well, so again, like I think about it as like a primitive of the web. Like for example, if you wanted to start a business today as like hosted CSS files, you know, like where you upload your CSS and then you get developers to pay you a monthly subscription for how many times they fetched a CSS file. Well, I think most developers would be like, that's stupid because it's just an open specification, you just upload a static file. And really my goal is to make Protomaps the same way where it's obvious that there's not really some sort of lock-in or some sort of secret sauce in the server that does this thing. How PMTiles works and building a primitive of the web [00:28:16] Brandon: If you look at video for example, like a lot of the tech for how Protomaps and PMTiles works is based on parts of the HTTP spec that were made for video. And 20 years ago, if you wanted to host a video on the web, you had to have like a real player license or flash. So you had to go license some server software from real media or from macromedia so you could stream video to a browser plugin. But now in HTML you can just embed a video file. And no one's like, oh well I need to go pay for my video serving license. I mean, there is such a thing, like YouTube doesn't really use that for DRM reasons, but people just have the assumption that video is like a primitive on the web. So if we're able to make maps sort of that same way like a primitive on the web then there isn't really some obvious business or licensing model behind how that works. Just because it's a thing and it helps a lot of people do their jobs and people are happy using it. So why bother? 
[00:29:26] Jeremy: You mentioned that it's a tech that was used for streaming video. What tech specifically is it? [00:29:34] Brandon: So it is byte range serving. So when you open a video file on the web, let's say it's like a 100 megabyte video. You don't have to download the entire video before it starts playing. It streams parts out of the file based on like what frames... I mean, it's based on the frames in the video. So it can start streaming immediately because it's organized in a way to where the first few frames are at the beginning. And what PMTiles really is, is it's just like a video but in space instead of time. So it's organized in a way where these zoomed out views are at the beginning and the most zoomed in views are at the end. So when you're like panning or zooming in the map all you're really doing is fetching byte ranges out of that file the same way as a video. But it's organized in this tiled way on a space-filling curve. It's a little bit complicated how it works internally and I think it's kind of cool but that's sort of like an implementation detail. [00:30:35] Jeremy: And to the person deploying it, it just looks like a single file. [00:30:40] Brandon: Exactly in the same way like an mp3 audio file is or like a JSON file is. [00:30:47] Jeremy: So with a video, I can sort of see how as someone seeks through the video, they start at the beginning and then they go to the middle if they wanna see the middle. For a map, as somebody scrolls around the map, are you seeking all over the file or is the way it's structured have a little less chaos? [00:31:09] Brandon: It's structured. And that's kind of the main technical challenge behind building PMTiles is you have to be sort of clever so you're not spraying the reads everywhere. So it uses something called a Hilbert curve, which is a mathematical concept of a space-filling curve. Where it's one continuous curve that essentially lets you break 2D space into 1D space.
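To make the idea of breaking 2D space into 1D space concrete, here is a minimal sketch of the textbook Hilbert curve mapping from a grid cell to a position along the curve. It is an illustration only, not the PMTiles implementation; the function names `xy_to_hilbert` and `hilbert_order` are made up for this example, and the grid side is assumed to be a power of two.

```python
def xy_to_hilbert(n: int, x: int, y: int) -> int:
    """Map cell (x, y) of an n-by-n grid to its index along a Hilbert curve.

    Assumes n is a power of two and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        # Which quadrant of the current sub-square does the cell fall in?
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip so the next level is traversed in a consistent orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def hilbert_order(n: int) -> list[tuple[int, int]]:
    """Return every cell of the n-by-n grid, sorted by Hilbert index."""
    cells = [(x, y) for y in range(n) for x in range(n)]
    return sorted(cells, key=lambda c: xy_to_hilbert(n, c[0], c[1]))


if __name__ == "__main__":
    # Walk a 4x4 grid in curve order; each step moves to a neighbouring cell.
    for cell in hilbert_order(4):
        print(cell, xy_to_hilbert(4, *cell))
```

The useful property is that cells with nearby indexes are also nearby on the grid, so tiles for one region tend to sit close together in a file ordered this way.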
So if you've seen some maps of IP space, it uses this crazy looking curve that hits all the points in one continuous line. And that's the same concept behind PMTiles is if you're looking at one part of the world, you're sort of guaranteed that all of those parts you're looking at are quite close to each other and the data you have to transfer is quite minimal, compared to if you just had it at random. [00:32:02] Jeremy: How big do the files get? If I have a PMTiles of the entire world, what kind of size am I looking at? [00:32:10] Brandon: Right now, the default one I distribute is 128 gigabytes, so it's quite sizable, although you can slice parts out of it remotely. So if you just wanted California or just wanted LA or just wanted only a couple of zoom levels, like from zero to 10 instead of zero to 15, there is a command line tool that's also called PMTiles that lets you do that. Issues with CDNs and range queries [00:32:35] Jeremy: And when you're working with files of this size, I mean, let's say I am working with a CDN in front of my application. I'm not typically accustomed to hosting something that's that large and something where you're seeking all over the file. Is that ever an issue or is that something that's just taken care of by the browser and taken care of by the hosts? [00:32:58] Brandon: That is an issue actually, so a lot of CDNs don't deal with it correctly. And my recommendation is there is a kind of proxy server or like a serverless proxy thing that I wrote that runs on like Cloudflare Workers or on Docker that lets you proxy those range requests into a normal URL and then that is like a hundred percent CDN compatible. So I would say like a lot of the big commercial installations of this thing, they use that because it makes more practical sense. It's also faster. But the idea is that this solution sort of scales up and scales down.
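The byte range fetching described above can be sketched in a few lines. This is a simplified model under stated assumptions: the archive layout and the names `build_archive`, `serve_range`, and `fetch_tile` are hypothetical and do not reflect the real PMTiles file format. The only real-world detail relied on is the HTTP `Range: bytes=start-end` header, whose end offset is inclusive.

```python
def range_header(offset: int, length: int) -> str:
    """Build the value of an HTTP Range header (end offset is inclusive)."""
    return f"bytes={offset}-{offset + length - 1}"


def serve_range(blob: bytes, header: str) -> bytes:
    """Simulate a static file host answering a single-range request."""
    unit, _, spec = header.partition("=")
    if unit != "bytes":
        raise ValueError("only byte ranges are supported in this sketch")
    start, _, end = spec.partition("-")
    return blob[int(start): int(end) + 1]


def build_archive(tiles: dict[str, bytes]) -> tuple[bytes, dict[str, tuple[int, int]]]:
    """Concatenate tiles into one blob plus a directory of (offset, length).

    Hypothetical layout for illustration, not the PMTiles specification.
    """
    directory = {}
    blob = b""
    for name, payload in tiles.items():
        directory[name] = (len(blob), len(payload))
        blob += payload
    return blob, directory


def fetch_tile(blob: bytes, directory: dict[str, tuple[int, int]], name: str) -> bytes:
    """Look up a tile in the directory and read only its bytes."""
    offset, length = directory[name]
    return serve_range(blob, range_header(offset, length))


if __name__ == "__main__":
    # Zoomed-out data first, zoomed-in data last, as described in the interview.
    tiles = {"z0": b"world-overview", "z1": b"continent-detail", "z2": b"city-streets"}
    blob, directory = build_archive(tiles)
    print(range_header(*directory["z1"]), fetch_tile(blob, directory, "z1"))
```

A client that already holds the directory never downloads the whole file; it asks for exactly the slice it needs, which is why a plain static host can serve the map.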
If you wanted to host just your city in like a 10 megabyte file, well you can just put that into GitHub pages and you don't have to worry about it. If you want to have a global map for your website that serves a ton of traffic then you probably want a little bit more sophisticated of a solution. It still does not require you to run a Linux server, but it might require (you) to use like Lambda or Lambda in conjunction with like a CDN. [00:34:09] Jeremy: Yeah. And that sort of ties into what you were saying at the beginning where if you can host on something like CloudFlare Workers or Lambda, there's less time you have to spend keeping these things running. [00:34:26] Brandon: Yeah, exactly. and I think also the Lambda or CloudFlare workers solution is not perfect. It's not as perfect as S3 or as just static files, but in my experience, it still is better at building something that lasts on the time span of years than being like I have a server that is on this Ubuntu version and in four years there's all these like security patches that are not being applied. So it's still sort of serverless, although not totally vendor neutral like S3. Customizing the map [00:35:03] Jeremy: We've mostly been talking about how you host the map itself, but for someone who's not familiar with these kind of tools, how would they be customizing the map? [00:35:15] Brandon: For customizing the map there is front end style customization and there's also data customization. So for the front end if you wanted to change the water from the shade of blue to another shade of blue there is a TypeScript API where you can customize it almost like a text editor color scheme. So if you're able to name a bunch of colors, well you can customize the map in that way you can change the fonts. And that's all done using MapLibre GL using a TypeScript API on top of that for customizing the data. So all the pipeline to generate this data from OpenStreetMap is open source. 
There is a Java program using a library called Planetiler which is awesome, which is this super fast multi-core way of building map tiles. And right now there aren't really great hooks to customize what data goes into that. But that's something that I do wanna work on. And finally, because the data comes from OpenStreetMap if you notice data that's missing or you wanted to correct data in OSM then you can go into osm.org. You can get involved in contributing the data to OSM and the Protomaps build is daily. So if you make a change, then within 24 hours you should see the new base map have that change. And of course for OSM your improvements would go into every OSM based project that is ingesting that data. So it's not a Protomaps-specific thing. It's like this big shared data source, almost like Wikipedia. OpenStreetMap is a dataset and not a map [00:37:01] Jeremy: I think you were involved with OpenStreetMap to some extent. Can you speak a little bit to that for people who aren't familiar, what OpenStreetMap is? [00:37:11] Brandon: Right. So I've been using OSM as sort of like a tools developer for over a decade now. And one of the number one questions I get from developers about what is Protomaps is why wouldn't I just use OpenStreetMap? What's the distinction between Protomaps and OpenStreetMap? And it's sort of like this funny thing because even though OSM has map in the name it's not really a map in that you can't... In that it's mostly a data set and not a map. It does have a map that you can see that you can pan around to when you go to the website but the way that thing they show you on the website is built is not really that easily reproducible. It involves a lot of C++ software you have to run. But OpenStreetMap itself, the heart of it is almost like a big XML file that has all the data in the map and is global. And it has tagged features for example. So you can go in and edit that. It has a web front end to change the data.
It does not directly translate into making a map actually. Protomaps decides what shows at each zoom level [00:38:24] Brandon: So a lot of the pipeline, that Java program I mentioned for building this basemap for protomaps is doing things like you have to choose what data you show when you zoom out. You can't show all the data. For example when you're zoomed out and you're looking at all of a state like Colorado you don't see all the Chipotle when you're zoomed all the way out. That'd be weird, right? So you have to make some sort of decision in logic that says this data only shows up at this zoom level. And that's really what is the challenge in optimizing the size of that for the Protomaps map project. [00:39:03] Jeremy: Oh, so those decisions of what to show at different Zoom levels those are decisions made by you when you're creating the PMTiles file with Protomaps. [00:39:14] Brandon: Exactly. It's part of the base maps build pipeline. and those are honestly very subjective decisions. Who really decides when you're zoomed out should this hospital show up or should this museum show up nowadays in Google, I think it shows you ads. Like if someone pays for their car repair shop to show up when you're zoomed out like that that gets surfaced. But because there is no advertising auction in Protomaps that doesn't happen obviously. So we have to sort of make some reasonable choice. A lot of that right now in Protomaps actually comes from another open source project called Mapzen. So Mapzen was a company that went outta business a couple years ago. They did a lot of this work in designing which data shows up at which Zoom level and open sourced it. And then when they shut down, they transferred that code into the Linux Foundation. So it's this totally open source project, that like, again, sort of like Mapbox gl has this awesome legacy in that this company funded it for years for smart people to work on it and now it's just like a free thing you can use. 
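The kind of rule being described, where a feature only appears once the map is zoomed in far enough, could look roughly like the sketch below. The feature kinds, the zoom thresholds, and the names `MIN_ZOOM` and `visible_names` are all invented for illustration; they are not taken from Protomaps or Mapzen.

```python
# Invented thresholds: the minimum zoom level at which each kind of feature
# is included in the map. Real basemaps make many more, subtler choices.
MIN_ZOOM = {
    "state_label": 3,
    "city_label": 6,
    "hospital": 12,
    "restaurant": 15,
}

# Kinds without an explicit rule only appear at the most detailed zoom.
DEFAULT_MIN_ZOOM = 15


def visible_names(features: list[dict], zoom: int) -> list[str]:
    """Return the names of the features that should be shown at this zoom."""
    return [
        f["name"]
        for f in features
        if MIN_ZOOM.get(f["kind"], DEFAULT_MIN_ZOOM) <= zoom
    ]


if __name__ == "__main__":
    features = [
        {"name": "Colorado", "kind": "state_label"},
        {"name": "Denver", "kind": "city_label"},
        {"name": "Example Hospital", "kind": "hospital"},
        {"name": "Chipotle", "kind": "restaurant"},
    ]
    for zoom in (5, 12, 15):
        print(zoom, visible_names(features, zoom))
```

Dropping features below their threshold is also what keeps the zoomed-out tiles small, which ties back to the file size concern in the conversation.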
So the logic in Protomaps is really based on Mapzen. [00:40:33] Jeremy: And so the visualization of all this... I think I understand what you mean when people say oh, why not use OpenStreetMap because it's not really clear, it's hard to tell is this the tool that's visualizing the data? Is it the data itself? So in the case of using Protomaps, it sounds like Protomaps itself has all of the data from OpenStreetMap and then it has made all the decisions for you in terms of what to show at different Zoom levels and what things to have on the map at all. And then finally, you have to have a separate UI layer and in this case, it sounds like the one that you recommend is the MapLibre library. [00:41:18] Brandon: Yeah, that's exactly right. For Protomaps, it has a portion or a subset of OSM data. It doesn't have all of it just because there's too much. Like there's data in there where people have mapped out different bushes, and I don't include that in Protomaps. If you wanted to go in and edit like the Java code to add that you can. But really what Protomaps is positioned at is sort of a solution for developers that want to use OSM data to make a map on their app or their website. Because OpenStreetMap itself is mostly a data set, it does not really go all the way to having an end-to-end solution. Financials and the idea of a project being complete [00:41:59] Jeremy: So I think it's great that somebody who wants to make a map, they have these tools available, whether it's from what was originally built by Mapbox, what's built by OpenStreetMap now, the work you're doing with Protomaps. But I wonder one of the things that I talked about with Tom was he was saying he was trying to build this mapping business and based on the financials of what was coming in he was stressed, right? He was struggling a bit. And I wonder for you, you've been working on this open source project for five years.
Do you have similar stressors or do you feel like I could keep going how things are now and I feel comfortable? [00:42:46] Brandon: So I wouldn't say I'm a hundred percent in one bucket or the other. I'm still seeing it play out. One thing that I really respect in a lot of open source projects, which I'm not saying I'm gonna do for Protomaps, is the idea that a project is like finished. I think that is amazing. If a software project can just be done it's sort of like a painting or a novel: once you finish the last page, have it seen by the editor, and send it off to the press, you're done with the book. And I think one of the pains of software is so few of us can actually do that. And I don't know, obviously people will say oh the map is never finished. That's more true of OSM, but I think like for Protomaps, one thing I'm thinking about is how to limit the scope to something that's quite narrow to where we could be feature complete on the core things in the near term. That means that it does not address a lot of things that people want. Like search, like if you go to Google Maps and you search for a restaurant, you will get some hits. That's like a geocoding issue. And I've already decided that's totally outta scope for Protomaps. So, in terms of trying to think about the future of this, I'm mostly looking for ways to cut scope if possible. There are some things like better tooling around being able to work with PMTiles that are on the roadmap. But for me, I am still enjoying working on the project. It's definitely growing. So I can see on NPM downloads I can see the growth curve of people using it and that's really cool. So I like hearing about when people are using it for cool projects. So it seems to still be going okay for now. [00:44:44] Jeremy: Yeah, that's an interesting perspective about how you were talking about projects being done. Because I think when people look at GitHub projects and they go like, oh, the last commit was X months ago.
They go oh well this is dead right? But maybe that's the wrong framing. Maybe you can get a project to a point where it's like, oh, it's because it doesn't need to be updated. [00:45:07] Brandon: Exactly, yeah. Like I used to do a lot of c++ programming and the best part is when you see some LAPACK matrix math library from like 1995 that still works perfectly in c++ and you're like, this is awesome. This is the one I have to use. But if you're like trying to use some like React component library and it hasn't been updated in like a year, you're like, oh, that's a problem. So again, I think there's some middle ground between those that I'm trying to find. I do like for Protomaps, it's quite dependency light in terms of the number of hard dependencies I have in software. but I do still feel like there is a lot of work to be done in terms of project scope that needs to have stuff added. You mostly only hear about problems instead of people's wins [00:45:54] Jeremy: Having run it for this long. Do you have any thoughts on running an open source project in general? On dealing with issues or managing what to work on things like that? [00:46:07] Brandon: Yeah. So I have a lot. I think one thing people point out a lot is that especially because I don't have a direct relationship with a lot of the people using it a lot of times I don't even know that they're using it. Someone sent me a message saying hey, have you seen flickr.com, like the photo site? And I'm like, no. And I went to flickr.com/map and it has Protomaps for it. And I'm like, I had no idea. But that's cool, if they're able to use Protomaps for this giant photo sharing site that's awesome. But that also means I don't really hear about when people use it successfully because you just don't know, I guess they, NPM installed it and it works perfectly and you never hear about it. You only hear about people's negative experiences. 
You only hear about people that come and open GitHub issues saying this is totally broken, and why doesn't this thing exist? And I'm like, well, it's because there's an infinite amount of things that I want to do, but I have a finite amount of time and I just haven't gone into that yet. And that's honestly a lot of the things and people are like when is this thing gonna be done? So that's, that's honestly part of why I don't have a public roadmap because I want to avoid that sort of bickering about it. I would say that's one of my biggest frustrations with running an open source project is how it's self-selected to only hear the negative experiences with it. Be careful what PRs you accept [00:47:32] Brandon: 'cause you don't hear about those times where it works. I'd say another thing is it's changed my perspective on contributing to open source because I think when I was younger or before I had become a maintainer I would open a pull request on a project unprompted that has a hundred lines and I'd be like, Hey, just merge this thing. But I didn't realize when I was younger well if I just merge it and I disappear, then the maintainer is stuck with what I did forever. You know if I add some feature then that person that maintains the project has to do that indefinitely. And I think that's very asymmetrical and it's changed my perspective a lot on accepting open source contributions. I wanna have it be open to anyone to contribute. But there is some amount of back and forth where it's almost like the default answer for should I accept a PR is no by default because you're the one maintaining it. And do you understand the shape of that solution completely to where you're going to support it for years because the person that's contributing it is not bound to those same obligations that you are. 
And I think that's also one of the things where I have a lot of trepidation around open source is I used to think of it as a lot more bazaar-like in terms of anyone can just throw their thing in. But then that creates a lot of problems for the people who are expected out of social obligation to continue this thing indefinitely. [00:49:23] Jeremy: Yeah, I can totally see why that causes burnout with a lot of open source maintainers, because you probably to some extent maybe even feel some guilt right? You're like, well, somebody took the time to make this. But then like you said you have to spend a lot of time trying to figure out is this something I wanna maintain long term? And one wrong move and it's like, well, it's in here now. [00:49:53] Brandon: Exactly. To me, I think that is a very common failure mode for open source projects is they're too liberal in the things they accept. And that's a lot of why I was talking about how that choice of what features show up on the map was inherited from the MapZen projects. If I didn't have that then somebody could come in and say hey, you know, I want to show power lines on the map. And they open a PR for power lines and now everybody who's using Protomaps when they're like zoomed out they see power lines are like I didn't want that. So I think that's part of why a lot of open source projects eventually evolve into a plugin system is because there is this demand as the project grows for more and more features. But there is a limit in the maintainers. It's like the demand for features is exponential while the maintainer amount of time and effort is linear. Plugin systems might reduce need for PRs [00:50:56] Brandon: So maybe the solution to smash that exponential down to quadratic maybe is to add a plugin system. But I think that is one of the biggest tensions that only became obvious to me after working on this for a couple of years. [00:51:14] Jeremy: Is that something you're considering doing now? 
[00:51:18] Brandon: Is the plugin system? Yeah. I think for the data customization, I eventually wanted to have some sort of programmatic API to where you could declare a config file that says I want ski routes. It totally makes sense. The power lines example is maybe a little bit obscure but for example like a skiing app and you want to be able to show ski slopes when you're zoomed out well you're not gonna be able to get that from Mapbox or from Google because they have a one size fits all map that's not specialized to skiing or to golfing or to outdoors. But if you like, in theory, you could do this with Protomaps if you changed the Java code to show data at different zoom levels. And that is to me what makes the most sense for a plugin system and also makes the most product sense because it enables a lot of things you cannot do with the one size fits all map. [00:52:20] Jeremy: It might also increase the complexity of the implementation though, right? [00:52:25] Brandon: Yeah, exactly. So that's like. That's really where a lot of the terrifying thoughts come in, which is like once you create this like config file surface area, well what does that look like? Is that JSON? Is that TOML, is that some weird like everything eventually evolves into some scripting language right? Where you have logic inside of your templates and I honestly do not really know what that looks like right now. That feels like something in the medium term roadmap. [00:52:58] Jeremy: Yeah and then in terms of bug reports or issues, now it's not just your code it's this exponential combination of whatever people put into these config files. [00:53:09] Brandon: Exactly. Yeah. so again, like I really respect the projects that have done this well or that have done plugins well. I'm trying to think of some, I think obsidian has plugins, for example. And that seems to be one of the few solutions to try and satisfy the infinite desire for features with the limited amount of maintainer time. 
Time split between code vs triage vs talking to users [00:53:36] Jeremy: How would you say your time is split between working on the code versus issue and PR triage? [00:53:43] Brandon: Oh, it varies really. I think working on the code is like a minority of it. I think something that I actually enjoy is talking to people, talking to users, getting feedback on it. I go to quite a few conferences to talk to developers or people that are interested and figure out how to refine the message, how to make it clearer to people, like what this is for. And I would say maybe a plurality of my time is spent dealing with non-technical things that are neither code or GitHub issues. One thing I've been trying to do recently is talk to people that are not really in the mapping space. For example, people that work for newspapers like a lot of them are front end developers and if you ask them to run a Linux server they're like I have no idea. But that really is like one of the best target audiences for Protomaps. So I'd say a lot of the reality of running an open source project is a lot like a business is it has all the same challenges as a business in terms of you have to figure out what is the thing you're offering. You have to deal with people using it. You have to deal with feedback, you have to deal with managing emails and stuff. I don't think the payoff is anywhere near running a business or a startup that's backed by VC money is but it's definitely not the case that if you just want to code, you should start an open source project because I think a lot of the work for an opensource project has nothing to do with just writing the code. It is in my opinion as someone having done a VC backed business before, it is a lot more similar to running, a tech company than just putting some code on GitHub. 
Running a startup vs open source project [00:55:43] Jeremy: Well, since you've done both at a high level what did you like about running the company versus maintaining the open source project? [00:55:52] Brandon: So I have done some venture capital accelerator programs before and I think there is an element of hype and energy that you get from that that is self perpetuating. Your co-founder is gungho on like, yeah, we're gonna do this thing. And your investors are like, you guys are geniuses. You guys are gonna make a killing doing this thing. And the way it's framed is sort of obvious to everyone that it's like there's a much more traditional set of motivations behind that, that people understand while it's definitely not the case for running an open source project. Sometimes you just wake up and you're like what the hell is this thing for, it is this thing you spend a lot of time on. You don't even know who's using it. The people that use it and make a bunch of money off of it they know nothing about it. And you know, it's just like cool. And then you only hear from people that are complaining about it. And I think like that's honestly discouraging compared to the more clear energy and clearer motivation and vision behind how most people think about a company. But what I like about the open source project is just the lack of those constraints you know? Where you have a mandate that you need to have this many customers that are paying by this amount of time. There's that sort of pressure on delivering a business result instead of just making something that you're proud of that's simple to use and has like an elegant design. I think that's really a difference in motivation as well. Having control [00:57:50] Jeremy: Do you feel like you have more control? Like you mentioned how you've decided I'm not gonna make a public roadmap. I'm the sole developer. I get to decide what goes in. What doesn't. 
Do you feel like you have more control in your current position than you did running the startup? [00:58:10] Brandon: Definitely for sure. Like that agency is what I value the most. It is possible to go too far. Like, so I'm very wary of the BDFL title, which I think is how a lot of open source projects succeed. But I think there is some element of for a project to succeed there has to be somebody that makes those decisions. Sometimes those decisions will be wrong and then hopefully they can be rectified. But I think going back to what I was talking about with scope, I think the overall vision and the scope of the project is something that I am very opinionated about in that it should do these things. It shouldn't do these things. It should be easy to use for this audience. Is it gonna be appealing to this other audience? I don't know. And I think that is really one of the most important parts of that leadership role, is having the power to decide we're doing this, we're not doing this. I would hope other developers would be able to get on board if they're able to make good use of the project, if they use it for their company, if they use it for their business, if they just think the project is cool. So there are other contributors at this point and I want to get more involved. But I think being able to make those decisions to what I believe is going to be the best project is something that is very special about open source, that isn't necessarily true about running like a SaaS business. [00:59:50] Jeremy: I think that's a good spot to end it on, so if people want to learn more about Protomaps or they wanna see what you're up to, where should they head? [01:00:00] Brandon: So you can go to Protomaps.com, GitHub, or you can find me or Protomaps on bluesky or Mastodon. [01:00:09] Jeremy: All right, Brandon, thank you so much for chatting today. [01:00:12] Brandon: Great. Thank you very much.
AWS Morning Brief for the week of March 31st, with Corey Quinn.

Links:
* Amazon DynamoDB now supports percentile statistics for request latency
* Amazon EKS now enforces upgrade insights checks as part of cluster upgrades
* Amazon GameLift Servers expands instance support with next-generation EC2 instance families
* AWS CloudFormation now supports targeted resource scans in the IaC generator
* AWS adds currency selection to Payment Profiles
* AWS Deadline Cloud now supports Internet Protocol Version 6 (IPv6)
* AWS announces expanded service support in the AWS Console Mobile App
* AWS Network Manager and AWS Cloud WAN now support AWS PrivateLink and IPv6
* Unlocking the power of Splunk with Amazon Bedrock – Build AI assistant using agents
* From virtual machine to Kubernetes to serverless: How dacadoo saved 78% on cloud costs and automated operations
* Accelerating CI with AWS CodeBuild: Parallel test execution now available
* Amazon S3 Path Deprecation Plan – The Rest of the Story | AWS News Blog
* Detailed geographic information for all AWS Regions and Availability Zones is now available
* Optimizing network footprint in serverless applications
* Simplifying private API integrations with Amazon EventBridge and AWS Step Functions
* Announcing the Developer Preview of Amazon S3 Transfer Manager in Rust
* AWS SDK for Ruby: Deprecating Ruby 2.5 & 2.6 Runtime Supports and Future Compatibility
* Announcing the AWS CDK L2 Construct for Amazon Cognito Identity Pools
* AWS re:Invent 2024 recap for government agencies
Today's episode is with Paul Klein, founder of Browserbase. We talked about building browser infrastructure for AI agents, the future of agent authentication, and their open source framework Stagehand.
* [00:00:00] Introductions
* [00:04:46] AI-specific challenges in browser infrastructure
* [00:07:05] Multimodality in AI-Powered Browsing
* [00:12:26] Running headless browsers at scale
* [00:18:46] Geolocation when proxying
* [00:21:25] CAPTCHAs and Agent Auth
* [00:28:21] Building “User take over” functionality
* [00:33:43] Stagehand: AI web browsing framework
* [00:38:58] OpenAI's Operator and computer use agents
* [00:44:44] Surprising use cases of Browserbase
* [00:47:18] Future of browser automation and market competition
* [00:53:11] Being a solo founder
Transcript
Alessio [00:00:04]: Hey everyone, welcome to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol.ai.
swyx [00:00:12]: Hey, and today we are very blessed to have our friend, Paul Klein IV, CEO of Browserbase. Welcome.
Paul [00:00:21]: Thanks guys. Yeah, I'm happy to be here. I've been lucky to know both of you for like a couple of years now, I think. So it's just like we're hanging out, you know, with three ginormous microphones in front of our face. It's a totally normal hangout.
swyx [00:00:34]: Yeah. We've actually mentioned you on the podcast, I think, more often than any other Solaris tenant. Just because like you're one of the, you know, best performing, I think, LLM tool companies that have started up in the last couple of years.
Paul [00:00:50]: Yeah, I mean, it's been a whirlwind of a year, like Browserbase is actually pretty close to our first birthday. So we are one year old. And going from, you know, starting a company as a solo founder to... 
to, you know, having a team of 20 people, you know, a Series A, but also being able to support hundreds of AI companies that are building AI applications that go out and automate the web. It's just been like, really cool. It's been happening a little too fast. I think like collectively as an AI industry, let's just take a week off together. I took my first vacation actually two weeks ago, and Operator came out on the first day, and then a week later, DeepSeek came out. And I'm like on vacation trying to chill. I'm like, we got to build with this stuff, right? So it's been a breakneck year. But I'm super happy to be here and like talk more about all the stuff we're seeing. And I'd love to hear kind of what you guys are excited about too, and share in it, you know?
swyx [00:01:39]: Where to start? So people, you've done a bunch of podcasts. I think I strongly recommend Jack Bridger's Scaling DevTools, as well as Turner Novak's The Peel. And, you know, I'm sure there's others. So you covered your Twilio story in the past, talked about StreamClub, you got acquired by Mux, and then you left to start Browserbase. So maybe we just start with what is Browserbase? Yeah.
Paul [00:02:02]: Browserbase is the web browser for your AI. We're building headless browser infrastructure, which are browsers that run in a server environment that's accessible to developers via APIs and SDKs. It's really hard to run a web browser in the cloud. You guys are probably running Chrome on your computers, and that's using a lot of resources, right? So if you want to run a web browser or thousands of web browsers, you can't just spin up a bunch of Lambdas. You actually need to use a secure containerized environment. You have to scale it up and down. It's a stateful system. And that infrastructure is, like, super painful. And I know that firsthand, because at my last company, StreamClub, I was CTO, and I was building our own internal headless browser infrastructure. 
That's actually why we sold the company, is because Mux really wanted to buy our headless browser infrastructure that we'd built. And it's just a super hard problem. And I actually told my co-founders, I would never start another company unless it was a browser infrastructure company. And it turns out that's really necessary in the age of AI, when AI can actually go out and interact with websites, click on buttons, fill in forms. You need AI to do all of that work in an actual browser running somewhere on a server. And Browserbase powers that.
swyx [00:03:08]: While you're talking about it, it occurred to me, not that you're going to be acquired or anything, but it occurred to me that it would be really funny if you became the Nikita Bier of headless browser companies. You just have one trick, and you make browser companies that get acquired.
Paul [00:03:23]: I truly do only have one trick. I'm screwed if it's not for headless browsers. I'm not a Go programmer. You know, Browserbase is in AI Grant. But we were the only company in that AI Grant batch that used zero dollars on AI spend. You know, we're purely an infrastructure company. So as much as people want to ask me about reinforcement learning, I might not be the best guy to talk about that. But if you want to ask about headless browser infrastructure at scale, I can talk your ear off. So that's really my area of expertise. And it's a pretty niche thing. Like, nobody has done what we're doing at scale before. So we're happy to be the experts.
swyx [00:03:59]: You do have an AI thing, Stagehand. We can talk about the sort of core of Browserbase first, and then maybe Stagehand. Yeah, Stagehand is kind of the web browsing framework. Yeah.
What is Browserbase? Headless Browser Infrastructure Explained
Alessio [00:04:10]: Yeah. Yeah. And maybe how you got to Browserbase and what problems you saw. So one of the first things I worked on as a software engineer was integration testing. 
Sauce Labs was kind of like the main thing at the time. And then we had Selenium, we had Playwright, we had all these different browser things. But it's always been super hard to do. So obviously you've worked on this before. When you started Browserbase, what were the challenges? What were the AI-specific challenges that you saw versus, there's kind of like all the usual running browser at scale in the cloud, which has been a problem for years. What are like the AI unique things that you saw that like traditional approaches just didn't cover? Yeah.
AI-specific challenges in browser infrastructure
Paul [00:04:46]: First and foremost, I think back to like the first thing I did as a developer, like as a kid when I was writing code, I wanted to write code that did stuff for me. You know, I wanted to write code to automate my life. And I'd do that probably by using curl or Beautiful Soup to fetch data from a website. And I think I still do that now that I'm in the cloud. And the other thing that I think is a huge challenge for me is that you can't just create a web site and parse that data. And we all know that now like, you know, taking HTML and plugging that into an LLM, you can extract insights, you can summarize. So it was very clear that now like dynamic web scraping became very possible with the rise of large language models or a lot easier. And that was like a clear reason why there's been more usage of headless browsers, which are necessary because a lot of modern websites don't expose all of their page content via a simple HTTP request. You know, they actually do require you to run JavaScript on the page to hydrate this. Airbnb is a great example. You go to airbnb.com. A lot of that content on the page isn't there until after they run the initial hydration. So you can't just scrape it with curl. You need to have some JavaScript run. 
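The hydration point can be illustrated without a browser at all. The sketch below is an illustration only: `STATIC_SHELL` is a made-up response body standing in for what a plain HTTP GET might return from a client-rendered page, and `VisibleText` and `visible_text` are hypothetical helper names. It uses Python's standard `html.parser` to collect the text a scraper would see, and shows that the shell contains none until the script has run.

```python
from html.parser import HTMLParser

# Hypothetical response body: what a plain HTTP GET might return for a
# client-rendered page before any JavaScript has executed.
STATIC_SHELL = """
<html>
  <body>
    <div id="root"></div>
    <script>
      document.getElementById("root").textContent = "Cozy cabin";
    </script>
  </body>
</html>
"""


class VisibleText(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the list of visible text fragments in an HTML string."""
    parser = VisibleText()
    parser.feed(html)
    return parser.chunks


if __name__ == "__main__":
    # The un-hydrated shell has nothing for a scraper to read.
    print(visible_text(STATIC_SHELL))
    # After a browser runs the script, the DOM would look like this instead.
    print(visible_text('<div id="root">Cozy cabin</div>'))
```

A static parser only ever sees the first case, which is why the JavaScript has to be executed somewhere before the content can be extracted.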
And a browser is that JavaScript engine that's going to actually run all those requests on the page. So web data retrieval was definitely one driver of starting Browserbase and the rise of being able to summarize that with an LLM. Also, I was familiar with if I wanted to automate a website, I could write one script and that would work for one website. It was very static and deterministic. But the web is non-deterministic. The web is always changing. And until we had LLMs, there was no way to write scripts that you could write once that would run on any website. That would change with the structure of the website. Click the login button. It could mean something different on many different websites. And LLMs allow us to generate code on the fly to actually control that. So I think that rise of writing the generic automation scripts that can work on many different websites, to me, made it clear that browsers are going to be a lot more useful because now you can automate a lot more things without writing. If you wanted to write a script to book a demo call on 100 websites, previously, you had to write 100 scripts. Now you write one script that uses LLMs to generate that script. That's why we built our web browsing framework, Stagehand, which does a lot of that work for you. But those two things, web data collection and then enhanced automation of many different websites, it just felt like big drivers for more browser infrastructure that would be required to power these kinds of features.
Alessio [00:07:05]: And was multimodality also a big thing?
Paul [00:07:08]: Now you can use the LLMs to look, even though the text in the DOM might not be as friendly. Maybe my hot take is I was always kind of like, I didn't think vision would be as big of a driver. For UI automation, I felt like, you know, HTML is structured text and large language models are good with structured text. 
But it's clear that these computer use models are often vision driven, and they've been really pushing things forward. So definitely being multimodal, like rendering the page is required to take a screenshot to give that to a computer use model to take actions on a website. And it's just another win for browser. But I'll be honest, that wasn't what I was thinking early on. I didn't even think that we'd get here so fast with multimodality. I think we're going to have to get back to multimodal and vision models.
swyx [00:07:50]: This is one of those things where I forgot to mention in my intro that I'm an investor in Browserbase. And I remember that when you pitched to me, like a lot of the stuff that we have today wasn't in the original conversation. But my original thesis was something that we've talked about on the podcast before, which is take the GPT store, the custom GPT store, every single checkbox and plugin is effectively a startup. And this was the browser one. I think the main hesitation, I think I actually took a while to get back to you. The main hesitation was that there were others. Like you're not the first headless browser startup. It's not even your first headless browser startup. There's always a question of like, will you be the category winner in a place where there's a bunch of incumbents, to be honest, that are bigger than you? They're just not targeted at the AI space. They don't have the backing of Nat Friedman. And there's a bunch of like, you're here in Silicon Valley. They're not. I don't know.
Paul [00:08:47]: I don't know if that's, that was it, but like, there was a, yeah, I mean, like, I think I tried all the other ones and I was like, really disappointed. Like my background is from working at great developer tools companies, and nothing had like the Vercel-like experience. Um, like our biggest competitor actually is partly owned by private equity and they just jacked up their prices quite a bit. 
And the dashboard hasn't changed in five years. And I actually used them at my last company and tried them and I was like, oh man, like there really just needs to be something that's like the experience of these great infrastructure companies, like Stripe, like Clerk, like Vercel, that I use and love, but oriented towards this kind of like more specific category, which is browser infrastructure, which is really technically complex. Like a lot of stuff can go wrong on the internet when you're running a browser. The internet is very vast. There's a lot of different configurations. Like there's still websites that only work with Internet Explorer out there. How do you handle that when you're running your own browser infrastructure? These are the problems that we have to think about and solve at Browserbase. And it's, it's certainly a labor of love, but I built this for me, first and foremost, I know it's super cheesy and everyone says that for like their startups, but it really, truly was for me. If you look at like the talks I've done even before Browserbase, and I'm just like really excited to try and build a category defining infrastructure company. And it's, it's rare to have a new category of infrastructure exist. We're here in the Chroma offices and like, you know, vector databases is a new category of infrastructure. Is it, is it, I mean, we can, we're in their office, so, you know, we can, we can debate that one later. That is one.
Multimodality in AI-Powered Browsing
swyx [00:10:16]: That's one of the industry debates.
Paul [00:10:17]: I guess we go back to the LLM OS talk that Karpathy gave way long ago. And like the browser box was very clearly there and it seemed like the people who were building in this space also agreed that browsers are a core primitive of infrastructure for the LLM OS that's going to exist in the future. And nobody was building something there that I wanted to use. So I had to go build it myself.
swyx [00:10:38]: Yeah. 
I mean, exactly that talk that, that honestly, that diagram, every box is a startup and there's the code box and then there's the browser box. I think at some point they will start clashing there. There's always the question of the, are you a point solution or are you the sort of all in one? And I think the point solutions tend to win quickly, but then the all-in-ones have a very tight cohesive experience. Yeah. Let's talk about just the hard problems of Browserbase you have on your website, which is beautiful. Thank you. Was there an agency that you used for that? Yeah. Herve.paris.
Paul [00:11:11]: They're amazing. Herve.paris. Yeah. It's H-E-R-V-E. I highly recommend for developers. Developer tools founders to work with consumer agencies because they end up building beautiful things and the Parisians know how to build beautiful interfaces. So I got to give props.
swyx [00:11:24]: And chat apps, apparently are, they are very fast. Oh yeah. The Mistral chat. Yeah. Mistral. Yeah.
Paul [00:11:31]: Le Chat.
swyx [00:11:31]: Le Chat. And then your videos as well, it was professionally shot, right? The Series A video. Yeah.
Alessio [00:11:36]: Nico did the videos. He's amazing. Not the initial video that you shot at the new one. First one was Austin.
Paul [00:11:41]: Another, another video pretty surprised. But yeah, I mean, like, I think when you think about how you talk about your company. You have to think about the way you present yourself. It's, you know, as a developer, you think you evaluate a company based on like the API reliability and the P95, but a lot of developers say, is the website good? Is the message clear? Do I, like, trust this founder I'm building my whole feature on? So I've tried to nail that as well as like the reliability of the infrastructure. You're right. It's very hard. And there's a lot of kind of foot guns that you run into when running headless browsers at scale. 
Right.
Competing with Existing Headless Browser Solutions
swyx [00:12:10]: So let's pick one. You have eight features here. Seamless integration. Scalability. Fast or speed. Secure. Observable. Stealth. That's interesting. Extensible and developer first. What comes to your mind as like the top two, three hardest ones? Yeah.
Running headless browsers at scale
Paul [00:12:26]: I think just running headless browsers at scale is like the hardest one. And maybe can I nerd out for a second? Is that okay? I heard this is a technical audience, so I'll talk to the other nerds. Whoa. They were listening. Yeah. They're upset. They're ready. The AGI is angry. Okay. So how do you run a browser in the cloud? Let's start with that, right? So let's say you're using a popular browser automation framework like Puppeteer, Playwright, or Selenium. Maybe you've written some code locally on your computer that opens up Google. It finds the search bar and then types in, you know, search for Latent Space and hits the search button. That script works great locally. You can see the little browser open up. You want to take that to production. You want to run the script in a cloud environment. So when your laptop is closed, your browser is doing something. Well, we use Amazon. You know, the first thing I'd reach for is probably like some sort of serverless infrastructure. I would probably try and deploy on a Lambda. But Chrome itself is too big to run on a Lambda. It's over 250 megabytes. So you can't easily start it on a Lambda. So you maybe have to use something like Lambda layers to squeeze it in there. Maybe use a different Chromium build that's lighter. And you get it on the Lambda. Great. It works. But it runs super slowly. It's because Lambdas are very like resource limited. They only run like with one vCPU. You can run one process at a time. Remember, Chromium is super beefy. 
It's barely running on my MacBook Air. I'm still downloading it from a pre-run. Yeah, from the test earlier, right? I'm joking. But it's big, you know? So like Lambda, it just won't work really well. Maybe it'll work, but you need something faster. Your users want something faster. Okay. Well, let's put it on a beefier instance. Let's get an EC2 server running. Let's throw Chromium on there. Great. Okay. That works well with one user. But what if I want to run like 10 Chromium instances, one for each of my users? Okay. Well, I might need two EC2 instances. Maybe 10. All of a sudden, you have multiple EC2 instances. This sounds like a problem for Kubernetes and Docker, right? Now, all of a sudden, you're using ECS or EKS, the Kubernetes or container solutions by Amazon. You're spinning up and down containers, and you're spending a whole engineer's time on kind of maintaining this stateful distributed system. Those are some of the worst systems to run because when it's a stateful distributed system, it means that you are bound by the connections to that thing. You have to keep the browser open while someone is working with it, right? That's just a painful architecture to run. And there's all these other little gotchas with Chromium, like Chromium, which is the open source version of Chrome, by the way. You have to install all these fonts. You want emojis working in your browsers because your vision model is looking for the emoji. You need to make sure you have the emoji fonts. You need to make sure you have all the right extensions configured, like, oh, do you want ad blocking? How do you configure that? How do you actually record all these browser sessions? Like it's a headless browser. You can't look at it. So you need to have some sort of observability. Maybe you're recording videos and storing those somewhere. 
It all kind of adds up to be this just giant monster piece of your project when all you wanted to do was run a lot of browsers in production for this little script to go to google.com and search. And when I see a complex distributed system, I see an opportunity to build a great infrastructure company. And we really abstract that away with Browserbase where our customers can use these existing frameworks, Playwright, Puppeteer, Selenium, or our own Stagehand and connect to our browsers in a serverless-like way. And control them, and then just disconnect when they're done. And they don't have to think about the complex distributed system behind all of that. They just get a browser running anywhere, anytime. Really easy to connect to.
swyx [00:15:55]: I'm sure you have questions. My standard question with anything, so essentially you're a serverless browser company, and there's been other serverless things that I'm familiar with in the past, serverless GPUs, serverless website hosting. That's where I come from with Netlify. One question is just like, you promised to spin up thousands of browsers in milliseconds. I feel like there's no real solution that does that yet. And I'm just kind of curious how. The only solution I know is to kind of keep a kind of warm pool of servers around, which is expensive, but maybe not so expensive because it's just CPUs. So I'm just like, you know. Yeah.
Browsers as a Core Primitive in AI Infrastructure
Paul [00:16:36]: You nailed it, right? I mean, how do you offer a serverless-like experience with something that is clearly not serverless, right? And the answer is, you need to be able to run... We run many browsers on single nodes. We use Kubernetes at Browserbase. So we have many pods that are being scheduled. We have to predictably schedule them up or down. Yes, thousands of browsers in milliseconds is the best case scenario. 
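The warm-pool idea raised in the question can be sketched as a toy model. This is an illustration under stated assumptions, not a description of how Browserbase schedules browsers: `WarmPool`, `acquire`, and `refill` are hypothetical names, and "starting a browser" is reduced to minting an ID. Requests served from the pre-started pool are warm; once the pool is drained, a request has to pay for a cold start.

```python
from collections import deque
from dataclasses import dataclass, field
from itertools import count


@dataclass
class WarmPool:
    """Toy warm pool: pre-started browsers are handed out immediately;
    when none are left, the caller gets a freshly (cold) started one."""

    target_size: int
    _ready: deque = field(default_factory=deque)
    _ids: count = field(default_factory=count)

    def __post_init__(self):
        # Pre-start browsers up to the target size.
        self.refill()

    def _start_browser(self):
        # Stand-in for the slow part: launching a real browser process.
        return f"browser-{next(self._ids)}"

    def refill(self):
        """Top the pool back up to target_size (e.g. from a background job)."""
        while len(self._ready) < self.target_size:
            self._ready.append(self._start_browser())

    def acquire(self):
        """Return (browser_id, was_warm)."""
        if self._ready:
            return self._ready.popleft(), True
        return self._start_browser(), False


if __name__ == "__main__":
    pool = WarmPool(target_size=2)
    for _ in range(3):
        print(pool.acquire())
```

The trade-off mentioned in the conversation shows up directly in `target_size`: a larger pool means fewer cold starts but more idle capacity to pay for.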
If you hit us with 10,000 requests, you may hit a slower cold start, right? So we've done a lot of work on predictive scaling and being able to kind of route stuff to different regions where we have multiple regions of Browserbase where we have different pools available. You can also pick the region you want to go to based on like lower latency, round-trip-time latency. It's very important with these types of things. There's a lot of requests going over the wire. So for us, like having a VM like Firecracker powering everything under the hood allows us to be super nimble and spin things up or down really quickly with strong multi-tenancy. But in the end, this is like the complex infrastructural challenges that we have to kind of deal with at Browserbase. And we have a lot more stuff on our roadmap to allow customers to have more levers to pull to exchange, do you want really fast browser startup times or do you want really low costs? And if you're willing to be more flexible on that, we may be able to kind of like work better for your use cases.
swyx [00:17:44]: Since you used Firecracker, shouldn't Fargate do that for you or did you have to go lower level than that? We had to go lower level than that.
Paul [00:17:51]: I find this a lot with Fargate customers, which is alarming for Fargate. We used to be a giant Fargate customer. Actually, the first version of Browserbase was ECS and Fargate. I think we were actually the largest Fargate customer in our region for a little while. No, what? Yeah, seriously. And unfortunately, it's a great product, but I think if you're an infrastructure company, you actually have to have a deeper level of control over these primitives. I think the same thing is true with databases. We've used other database providers and I think-
swyx [00:18:21]: Yeah, serverless Postgres.
Paul [00:18:23]: Shocker. When you're an infrastructure company, you're on the hook if any provider has an outage. 
And I can't tell my customers like, hey, we went down because so-and-so went down. That's not acceptable. So for us, we've really moved to bringing things internally. It's kind of opposite of what we preach. We tell our customers, don't build this in-house, but then we're like, we build a lot of stuff in-house. But I think it just really depends on what is in the critical path. We try and have deep ownership of that.
Alessio [00:18:46]: On the distributed location side, how does that work for the web where you might get sort of different content in different locations, but the customer is expecting, you know, if you're in the US, I'm expecting the US version. But if you're spinning up my browser in France, I might get the French version. Yeah.
Paul [00:19:02]: Yeah. That's a good question. Well, generally, like on the localization, there is a thing called locale in the browser. You can set like what your locale is. If you're like in the en-US browser or not, but some things do IP-based routing. And in that case, you may want to have a proxy. Like let's say you're running something in Europe, but you want to make sure you're showing up from the US. You may want to use one of our proxy features so you can turn on proxies to say like, make sure these connections always come from the United States, which is necessary too, because when you're browsing the web, you're coming from like a, you know, data center IP, and that can make things a lot harder to browse the web. So we do have kind of like this proxy super network. Yeah. We have a proxy for you based on where you're going, so you can reliably automate the web. But if you get scheduled in Europe, that doesn't happen as much. We try and schedule you as close to, you know, your origin that you're trying to go to. But generally you have control over the regions you can put your browsers in. So you can specify West one or East one or Europe. We only have one region in Europe right now, actually. 
Yeah.
Alessio [00:19:55]: What's harder, the browser or the proxy? I feel like to me, it feels like actually proxying reliably at scale is much harder than spinning up browsers at scale. I'm curious. It's all hard.
Paul [00:20:06]: It's layers of hard, right? Yeah. I think it's different levels of hard. I think the thing with the proxy infrastructure is that we work with many different web proxy providers and some are better than others. Some have good days, some have bad days. And our customers who've built browser infrastructure on their own, they have to go and deal with sketchy actors. Like first they figure out their own browser infrastructure and then they got to go buy a proxy. And it's like you can pay in Bitcoin and it just kind of feels a little sus, right? It's like you're buying drugs when you're trying to get a proxy online. We have like deep relationships with these counterparties. We're able to audit them and say, is this proxy being sourced ethically? Like it's not running on someone's TV somewhere. Is it free range? Yeah. Free range organic proxies, right? Right. We do a level of diligence. We're SOC 2. So we have to understand what is going on here. But then we're able to make sure that like we route around proxy providers not working. There's proxy providers who will just, the proxy will stop working all of a sudden. And then if you don't have redundant proxying on your own browsers, that's hard down for you or you may get some serious impacts there. With us, like we intelligently know, hey, this proxy is not working. Let's go to this one. And you can kind of build a network of multiple providers to really guarantee the best uptime for our customers. Yeah. So you don't own any proxies? We don't own any proxies. You're right. The team has been saying who wants to like take home a little proxy server, but not yet. We're not there yet. You know?
swyx [00:21:25]: It's a very mature market. I don't think you should build that yourself. 
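The "this proxy is not working, let's go to this one" behavior described above is essentially ordered failover across providers. A minimal sketch, assuming a caller-supplied `send(provider, url)` function that raises `ConnectionError` when a provider is down; `fetch_with_failover`, `AllProxiesFailed`, and the provider names are hypothetical and not part of any real proxy API.

```python
class AllProxiesFailed(Exception):
    """Raised when every provider in the list failed; carries the per-provider errors."""


def fetch_with_failover(url, providers, send):
    """Try each proxy provider in order and return (provider, response)
    from the first one that succeeds."""
    errors = {}
    for provider in providers:
        try:
            return provider, send(provider, url)
        except ConnectionError as exc:
            # Remember why this provider failed, then move on to the next.
            errors[provider] = str(exc)
    raise AllProxiesFailed(errors)


if __name__ == "__main__":
    # Fake transport for demonstration: the first provider is having a bad day.
    def fake_send(provider, url):
        if provider == "provider-a":
            raise ConnectionError("timed out")
        return f"200 OK via {provider}"

    print(fetch_with_failover("https://example.com", ["provider-a", "provider-b"], fake_send))
```

A production version would presumably also track provider health over time so that a known-bad provider is skipped up front rather than retried on every request.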
Like you should just be a super customer of them. Yeah. Scraping, I think, is the main use case for that. I guess. Well, that leads us into CAPTCHAs and also auth, but let's talk about CAPTCHAs. You had a little spiel that you wanted to talk about CAPTCHA stuff.
Challenges of Scaling Browser Infrastructure
Paul [00:21:43]: Oh, yeah. I was just, I think a lot of people ask, if you're thinking about proxies, you're thinking about CAPTCHAs too. I think it's the same thing. You can go buy CAPTCHA solvers online, but it's the same buying experience. It's some sketchy website, you have to integrate it. It's not fun to buy these things and you can't really trust it, the docs are bad. What Browserbase does is we integrate a bunch of different CAPTCHAs. We do some stuff in-house, but generally we just integrate with a bunch of known vendors and continually monitor and maintain these things and say, is this working or not? Can we route around it or not? These are CAPTCHA solvers. CAPTCHA solvers, yeah. Not CAPTCHA providers, CAPTCHA solvers. Yeah, sorry. CAPTCHA solvers. We really try and make sure all of that works for you. I think as a dev, if I'm buying infrastructure, I want it all to work all the time and it's important for us to provide that experience by making sure everything does work and monitoring it on our own. Yeah. Right now, the world of CAPTCHAs is tricky. I think AI agents in particular are very much ahead of the internet infrastructure. CAPTCHAs are designed to block all types of bots, but there are now good bots and bad bots. I think in the future, CAPTCHAs will be able to identify who a good bot is, hopefully via some sort of KYC. For us, we've been very lucky. We have very little to no known abuse of Browserbase because we really look into who we work with. And for certain types of CAPTCHA solving, we only allow them on certain types of plans because we want to make sure that we can know what people are doing, what their use cases are. 
And that's really allowed us to try and be an arbiter of good bots, which is our long term goal. I want to build great relationships with people like Cloudflare so we can agree, hey, here are these acceptable bots. We'll identify them for you and make sure we flag when they come to your website. This is a good bot, you know?
Alessio [00:23:23]: I see. And Cloudflare said they want to do more of this. So they're going to set by default, if they think you're an AI bot, they're going to reject. I'm curious if you think this is something that is going to be at the browser level or I mean, the DNS level with Cloudflare seems more where it should belong. But I'm curious how you think about it.
Paul [00:23:40]: I think the web's going to change. You know, I think that the Internet as we have it right now is going to change. And we all need to just accept that the cat is out of the bag. And instead of kind of like wishing the Internet was like it was in the 2000s, when we could have free content online that wouldn't be scraped. It's just it's not going to happen. And instead, we should think about like, one, how can we change the models of, you know, information being published online so people can adequately commercialize it? But two, how do we rebuild applications that expect that AI agents are going to log in on their behalf? Those are the things that are going to allow us to kind of like identify good and bad bots. And I think the team at Clerk has been doing a really good job with this on the authentication side. I actually think that auth is the biggest thing that will prevent agents from accessing stuff, not CAPTCHAs. And I think there will be agent auth in the future. 
I don't know if it's going to happen from an individual company, but actually authentication providers that have a, you know, hidden login as agent feature, where you put in your email, you'll get a push notification, say like, hey, your Browserbase agent wants to log into your Airbnb. You can approve that and then the agent can proceed. That really circumvents the need for CAPTCHAs or logging in as you and sharing your password. I think agent auth is going to be one way we identify good bots going forward. And I think a lot of this CAPTCHA solving stuff is really short-term problems as the internet kind of reorients itself around how it's going to work with agents browsing the web, just like people do. Yeah.
Managing Distributed Browser Locations and Proxies
swyx [00:24:59]: Stytch recently was on Hacker News for talking about agent experience, AX, which is a thing that Netlify is also trying to clone and coin and talk about. And we've talked about this on our previous episodes before in a sense that I actually think that's like maybe the only part of the tech stack that needs to be kind of reinvented for agents. Everything else can stay the same, CLIs, APIs, whatever. But auth, yeah, we need agent auth. And it's mostly like short-lived, like it should be a distinct identity from the human, but paired. I almost think like in the same way that every social network should have your main profile and then your alt accounts or your Finsta, it's almost like, you know, every human token should be paired with the agent token and the agent token can go and do stuff on behalf of the human token, but not be presumed to be the human. Yeah.
Paul [00:25:48]: It's like, it's, it's actually very similar to OAuth is what I'm thinking. And, you know, Reed from Stytch is an investor, Colin from Clerk, Okta Ventures, all investors in Browserbase because like, I hope they solve this because they'll make Browserbase's mission more possible. 
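The paired, short-lived agent identity discussed here can be sketched as a small data structure. This is a thought experiment, not any provider's API: `AgentGrant`, `approve`, and the scope strings are all hypothetical. The human approves a grant that names the agent separately from the human, limits it to specific scopes, and expires on its own.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentGrant:
    """A delegation from a human to an agent: distinct identity, limited scopes, short-lived."""

    human_id: str
    agent_id: str
    scopes: frozenset
    expires_at: float

    def allows(self, scope: str, now: Optional[float] = None) -> bool:
        """True only if the grant is still valid and the scope was approved."""
        now = time.time() if now is None else now
        return now < self.expires_at and scope in self.scopes


def approve(human_id, agent_id, scopes, ttl_seconds=900, now=None):
    """Model the human tapping 'approve' on a push notification."""
    now = time.time() if now is None else now
    return AgentGrant(human_id, agent_id, frozenset(scopes), now + ttl_seconds)


if __name__ == "__main__":
    grant = approve("user-1", "agent-1", ["listing:book"], ttl_seconds=60)
    print(grant.allows("listing:book"))   # approved scope, within the TTL
    print(grant.allows("message:send"))   # never approved
```

Keeping `agent_id` separate from `human_id` is what lets an application treat the agent as acting on behalf of the human without presuming it is the human.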
So we don't have to overcome all these hurdles, but I think it will be an OAuth-like flow where an agent will ask to log in as you, and you'll approve the scopes. Like, it can book an apartment on Airbnb, but it can't, like, message anybody. And then, you know, the agent will have some sort of, like, role-based access control within an application. Yeah. I'm excited for that.

swyx [00:26:16]: The tricky part is just, there's one layer of delegation here, which is, like, you're authing my user's user or something like that. I don't know if that's tricky or not. Does that make sense? Yeah.

Paul [00:26:25]: You know, actually at Twilio, I worked on the login, identity, and access management teams, right? So, like, I built Twilio's login page.

swyx [00:26:31]: You were an intern on that team and then you became the lead in two years? Yeah.

Paul [00:26:34]: Yeah. I started as an intern in 2016 and then I was the tech lead of that team. How? That's not normal. I didn't have a life. He's not normal. Look at this guy. I didn't have a girlfriend. I just loved my job. I don't know. I applied to 500 internships for my first job and I got rejected from every single one of them except for Twilio and then eventually Amazon. And they took a shot on me and, like, I was getting paid money to write code, which was my dream. Yeah. Yeah. I'm very lucky that, like, this coding thing worked out, because I was going to be doing it regardless. And yeah, I was able to kind of spend a lot of time on a team that was growing at a company that was growing. So it informed a lot of this stuff here. I think these are problems that have been solved with, like, the SAML protocol with SSO. I think there's really interesting stuff with, like, WebAuthn, like, these different types of authentication schemes that you can use to authenticate people. The tooling is all there. It just needs to be tweaked a little bit to work for agents. And I think the fact that there are companies that are already providing authentication as a service really sets it up well. The thing that's hard is, like, reinventing the internet for agents. We don't want to rebuild the internet. That's an impossible task. And I think people often say, like, well, we'll have this second layer of APIs built for agents. I'm like, we will for the top use cases, but instead we can just tweak the internet as is, on the authentication side. I think we're going to be the dumb ones going forward. Unfortunately, I think AI is going to be able to do a lot of the tasks that we do online, which means that it will be able to go to websites, click buttons on our behalf, and log in on our behalf too. So with this kind of, like, web agent future happening, I think with some small structural changes, like you said, it feels like it could all slot in really nicely with the existing internet.

Handling CAPTCHAs and Agent Authentication

swyx [00:28:08]: There's one more thing, which is your live view iframe, which lets you take control. Yeah. Obviously very key for Operator now, but, like, is there anything interesting technically there? People always want this.

Paul [00:28:21]: It was really hard to build, you know. So, okay, headless browsers, you don't see them, right? They're running in a cloud somewhere. You can't, like, look at them. It's a weird name. I wish we came up with a better name for this thing, but you can't see them, right? But customers don't trust AI agents, right? At least on the first pass. So what we do with our live view is that, you know, when you use Browserbase, you can actually embed a live view of the browser running in the cloud for your customer to see it working. And the first reason is to build trust. Like, okay, so I have this script that's going to go automate a website. I can embed it into my web application via an iframe and my customer can watch. I think. 
And then we added two-way communication. So now not only can you watch the browser kind of being operated by AI, if you want to pause and actually click around and type within this iframe that's controlling a browser, that's also possible. And this is all thanks to some of the lower-level protocol, which is called the Chrome DevTools Protocol. It has an API called startScreencast, and you can also send mouse clicks and button clicks to a remote browser. And this is all embeddable within iframes. You have a browser within a browser, yo. And then you simulate the screen, the click on the other side. Exactly. And this is really nice often for, like, let's say, a CAPTCHA that can't be solved. You saw this with Operator, you know. Operator actually uses a different approach. They use VNC. So, you know, you're able to see, like, you're seeing the whole window here. What we're doing is something a little lower level with the Chrome DevTools Protocol. It's just PNGs being streamed over the wire. But the same thing is true, right? Like, hey, I'm running a window. Pause. Can you do something in this window, human? Okay, great. Resume. Like, sometimes 2FA tokens. Like, if you get that text message, you might need a person to type that in. Web agents need human-in-the-loop type workflows still. You still need a person to interact with the browser. And building a UI to proxy that is kind of hard. You may as well just show them the whole browser and say, hey, can you finish this up for me? And then let the AI proceed on afterwards. Is there a future where I stream my current desktop to Browserbase? I don't think so. I think we're very much cloud infrastructure. Yeah. You know, but I think a lot of the stuff we're doing, we do want to, like, build tools. Like, you know, we'll talk about Stagehand, the, you know, web agent framework, in a second. But, like, there's a case where a lot of people are going desktop first for, you know, consumer use. 
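As a rough illustration of the Chrome DevTools Protocol pieces named in this answer, the sketch below builds the JSON messages for a screencast and a forwarded click. `Page.startScreencast`, `Page.screencastFrame`, `Page.screencastFrameAck`, and `Input.dispatchMouseEvent` are CDP methods and events; the helper names, the every-frame PNG settings, and the omission of the WebSocket transport are assumptions made for illustration, not a description of how Browserbase's live view is actually built.

```python
import base64
import itertools
import json

# CDP commands carry a client-chosen, increasing id.
_ids = itertools.count(1)


def cdp_command(method, params=None):
    """Serialize one CDP command as the JSON text sent over the DevTools WebSocket."""
    return json.dumps({"id": next(_ids), "method": method, "params": params or {}})


def start_screencast():
    """Ask the page to start streaming frames as PNG images (assumed settings)."""
    return cdp_command("Page.startScreencast", {"format": "png", "everyNthFrame": 1})


def handle_frame(event):
    """Decode a Page.screencastFrame event into image bytes plus the ack to send back."""
    params = event["params"]
    image = base64.b64decode(params["data"])  # frame data arrives base64-encoded
    ack = cdp_command("Page.screencastFrameAck", {"sessionId": params["sessionId"]})
    return image, ack


def click(x, y):
    """Forward a viewer's click as a press/release pair at page coordinates."""
    base = {"x": x, "y": y, "button": "left", "clickCount": 1}
    return [
        cdp_command("Input.dispatchMouseEvent", {"type": kind, **base})
        for kind in ("mousePressed", "mouseReleased")
    ]
```

In a real integration these strings would be written to the browser's DevTools WebSocket, and each acknowledged frame would be painted into the embedding page.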
And I think Claude is doing a lot of this, where I expect to see, you know, MCPs really oriented around the Claude desktop app for a reason, right? Like, I think a lot of these tools are going to run on your computer because it makes... I think it's breaking out. People are putting it on a server. Oh, really? Okay. Well, sweet. We'll see. We'll see that. I was surprised, though, wasn't I? I think that The Browser Company, too, with Dia Browser, it runs on your machine. You know, it's going to be...

swyx [00:30:50]: What is it?

Paul [00:30:51]: So, Dia Browser, as far as I understand... I used to use Arc. Yeah. I haven't used Arc. But I'm a big fan of The Browser Company. I think they're doing a lot of cool stuff in consumer. As far as I understand, it's a browser where you have a sidebar where you can, like, chat with it and it can control the local browser on your machine. So, if you imagine, like, what a consumer web agent is, which lives alongside your browser, I think Google Chrome has Project Mariner, I think. I almost call it Project Marinara for some reason. I don't know why. It's...

swyx [00:31:17]: No, I think it's someone really likes the Waterworld. Oh, I see. The classic Kevin Costner. Yeah.

Paul [00:31:22]: Okay. Project Marinara is a similar thing to the Dia Browser, in my mind, as far as I understand it. You have a browser that has an AI interface that will take over your mouse and keyboard and control the browser for you. Great for consumer use cases. But if you're building applications that rely on a browser and it's more part of a greater, like, AI app experience, you probably need something that's more like infrastructure, not a consumer app.

swyx [00:31:44]: Just because I have explored a little bit in this area, do people want branching? So, I have the state of whatever my browser's in. And then I want, like, 100 clones of this state. Do people do that? Or...

Paul [00:31:56]: People don't do it currently. Yeah. 
But it's definitely something we're thinking about. I think the idea of forking a browser is really cool. Technically, kind of hard. We're starting to see this in code execution, where people are, like, forking some, like, code execution processes or forking some tool calls or branching tool calls. Haven't seen it at the browser level yet. But it makes sense. Like, if an AI agent is, like, using a website and it's not sure what path it wants to take to crawl this website to find the information it's looking for, it would make sense for it to explore both paths in parallel. And that'd be a very, like... A road not taken. Yeah. And hopefully find the right answer. And then say, okay, this was actually the right one. And memorize that. And go there in the future. On the roadmap. For sure. Don't make my roadmap, please. You know?

Alessio [00:32:37]: How do you actually do that? Yeah. How do you fork? I feel like the browser is so stateful for so many things.

swyx [00:32:42]: Serialize the state. Restore the state. I don't know.

Paul [00:32:44]: So, it's one of the reasons why we haven't done it yet. It's hard. You know? Like, to truly fork, it's actually quite difficult. The naive way is to open the same page in a new tab and then, like, hope that it's at the same thing. But if you have a form halfway filled, you may have to, like, take the whole, you know, container. Pause it. All the memory. Duplicate it. Restart it from there. It could be very slow. So, we haven't found a thing. Like, the easy thing to fork is just, like, copy the page object. You know? But I think there needs to be something a little bit more robust there. Yeah.

swyx [00:33:12]: So, MorphLabs has this infinite branch thing. Like, wrote a custom fork of Linux or something that let them save the system state and clone it. MorphLabs, hit me up. I'll be a customer. Yeah. I think that's the only way to do it. Yeah. Like, unless Chrome has some special API for you. 
Yeah.

Paul [00:33:29]: There's probably something we'll reverse engineer one day. I don't know. Yeah.

Alessio [00:33:32]: Let's talk about Stagehand, the AI web browsing framework. You have three core components, Observe, Extract, and Act. Pretty clean landing page. What was the idea behind making a framework? Yeah.

Stagehand: AI web browsing framework

Paul [00:33:43]: So, there's three frameworks that are very popular or already exist, right? Puppeteer, Playwright, Selenium. Those are for building hard-coded scripts to control websites. And as soon as I started to play with LLMs plus browsing, I caught myself, you know, code-genning Playwright code to control a website. I would, like, take the DOM. I'd pass it to an LLM. I'd say, can you generate the Playwright code to click the appropriate button here? And it would do that. And I was like, this really should be part of the frameworks themselves. And I became really obsessed with SDKs that take natural language as part of, like, the API input. And that's what Stagehand is. Stagehand exposes three APIs, and it's a superset of Playwright. So, if you go to a page, you may want to take an action, click on the button, fill in the form, etc. That's what the act command is for. You may want to extract some data. This one takes natural language, like, extract the winner of the Super Bowl from this page. You can give it a Zod schema, so it returns a structured output. And then maybe you're building an API. You can do an agent loop, and you want to kind of see what actions are possible on this page before taking one. You can do observe. So, you can observe the actions on the page, and it will generate a list of actions. You can guide it, like, give me actions on this page related to buying an item. And you get, like, buy it now, add to cart, view shipping options, and pass that to an LLM, an agent loop, to say, what's the appropriate action given this high-level goal? So, Stagehand isn't a web agent. 
It's a framework for building web agents. And we think that agent loops are actually pretty close to the application layer because every application probably has different goals or different ways it wants to take steps. I don't think I've seen a generic one. Maybe you guys are the experts here. I haven't seen, like, a really good AI agent framework here. Everyone kind of has their own special sauce, right? I see a lot of developers building their own agent loops, and they're using tools. And I view Stagehand as the browser tool. So, we expose act, extract, observe. Your agent can call these tools. And from that, you don't have to worry about generating Playwright code performantly. You don't have to worry about running it. You can kind of just integrate these three tool calls into your agent loop and reliably automate the web.

swyx [00:35:48]: A special shout-out to Anirudh, who I met at your dinner, who I think listens to the pod. Yeah. Hey, Anirudh.

Paul [00:35:54]: Anirudh's the man. He's a Stagehand guy.

swyx [00:35:56]: I mean, the interesting thing about each of these APIs is they're each kind of a startup. Like, specifically extract, you know, Firecrawl is extract. There's, like, Expand AI. There's a whole bunch of, like, extract companies. They just focus on extract. I'm curious. Like, I feel like you guys are going to collide at some point. Like, right now, it's friendly. Everyone's in a blue ocean. At some point, it's going to be valuable enough that there's some turf battle here. I don't think you have a dog in the fight. I think you can mock extract to use an external service if they're better at it than you. But it's just an observation that, like, in the same way that I see each option, each checkbox in the side of custom GPTs becoming a startup, or each box in the Karpathy chart being a startup. Like, this is also becoming a thing. 
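To make the "three tool calls in your own agent loop" idea concrete, here is a minimal sketch under loudly stated assumptions. The `BrowserTools` interface only borrows the verb names observe, act, and extract from the conversation; its signatures are hypothetical and are not Stagehand's API. `FakeStore` is an in-memory stand-in for a browser session, and `prefer_purchase` stands in for the LLM call that would normally pick an action.

```python
from typing import Callable, Optional, Protocol


class BrowserTools(Protocol):
    """Hypothetical tool surface named after the three verbs in the episode."""

    def observe(self, instruction: str) -> list[str]: ...
    def act(self, action: str) -> None: ...
    def extract(self, instruction: str) -> dict: ...


class FakeStore:
    """In-memory stand-in for a real browser session (illustration only)."""

    def __init__(self) -> None:
        self.page = "product"
        self.cart: list[str] = []

    def observe(self, instruction: str) -> list[str]:
        # Return the actions currently possible on the page.
        if self.page == "product":
            return ["add to cart", "view shipping options"]
        if self.page == "cart":
            return ["buy it now"]
        return []

    def act(self, action: str) -> None:
        if action == "add to cart":
            self.cart.append("item")
            self.page = "cart"
        elif action == "buy it now":
            self.page = "confirmation"

    def extract(self, instruction: str) -> dict:
        return {"page": self.page, "items_in_cart": len(self.cart)}


def prefer_purchase(goal: str, candidates: list[str]) -> Optional[str]:
    """Stand-in for the LLM: pick the first candidate on a fixed priority list."""
    for wanted in ("add to cart", "buy it now"):
        if wanted in candidates:
            return wanted
    return None


def run_agent(
    tools: BrowserTools,
    choose: Callable[[str, list[str]], Optional[str]],
    goal: str,
    max_steps: int = 5,
) -> dict:
    """Observe the possible actions, let the chooser pick one, act, and repeat."""
    for _ in range(max_steps):
        candidates = tools.observe(f"actions related to: {goal}")
        action = choose(goal, candidates)
        if action is None:  # nothing useful left to do
            break
        tools.act(action)
    return tools.extract("current page and cart size")
```

Because the loop depends only on the interface, a slot such as `extract` could be backed by an external service without changing the loop, which is the kind of swap swyx raises above.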
Yeah.

Paul [00:36:41]: I mean, like, so the way Stagehand works is that it's MIT-licensed, completely open source. You bring your own API key to your LLM of choice. You could choose your LLM. We don't make any money off of the extract, really. We only really make money if you choose to run it with our browser. You don't have to. You can actually use your own browser, a local browser. You know, Stagehand is completely open source for that reason. And, yeah, like, I think if you're building really complex web scraping workflows, I don't know if Stagehand is the tool for you. I think it's really more if you're building an AI agent that needs a few general tools or if it's doing a lot of, like, web automation-intensive work. But if you're building a scraping company, Stagehand is not your thing. You probably want something that's going to, like, get HTML content, you know, convert that to Markdown, query it. That's not what Stagehand does. Stagehand is more about reliability. I think we focus a lot on reliability and less so on cost optimization and speed at this point.

swyx [00:37:33]: I actually feel like Stagehand, so the way that Stagehand works, it's like, you know, page.act, "click on the quick start." Yeah. It's kind of the integration test for the code that you would have to write anyway, like the Puppeteer code that you have to write anyway. And when the page structure changes, because it always does, then this is still the test. This is still the test that I would have to write. Yeah. So it's kind of like a testing framework that doesn't need implementation detail.

Paul [00:37:56]: Well, yeah. I mean, Puppeteer, Playwright, and Selenium were all designed as testing frameworks, right? Yeah. And now people are, like, hacking them together to automate the web. I would say, and, like, maybe this is, like, me being too specific. But, like, when I write tests, if the page structure changes without me knowing, I want that test to fail. 
So I don't know about, like, AI regenerating that. Like, people are using Stagehand for testing. But it's more for, like, usability testing, not, like, testing of, like, does the front end, like, has it changed or not. Okay. But generally where we've seen people, like, really take off is, like, if they want to build a feature in their application that's kind of like Operator or Deep Research, they're using Stagehand to kind of power that tool calling in their own agent loop. Okay. Cool.

swyx [00:38:37]: So let's go into Operator, the first big agent launch of the year from OpenAI. Seems like they have a whole bunch scheduled. You were on break and your phone blew up. What's your general view of computer use agents, which is what they're calling it? The overall category before we go into Open Operator, just the overall promise of Operator. I will observe that I tried it once. It was okay. And I never tried it again.

OpenAI's Operator and computer use agents

Paul [00:38:58]: That tracks with my experience, too. Like, I'm a huge fan of the OpenAI team. Like, I do not view Operator as a company killer for Browserbase at all. I think it actually shows people what's possible. I think, like, computer use models make a lot of sense. And what I'm actually most excited about with computer use models is, like, their ability to, like, really take screenshots and reason and output steps. I think that using mouse click or mouse coordinates, I've seen that prove to be less reliable than I would like. And I just wonder if that's the right form factor. What we've done with our framework is anchor it to the DOM itself, anchor it to the actual item. So, like, if it's clicking on something, it's clicking on that thing, you know? Like, it's more accurate. No matter where it is. Yeah, exactly. Because it really ties in nicely. 
And it can handle, like, the whole viewport in one go, whereas, like, Operator can only handle what it sees. Can you hover? Is hovering a thing that you can do? I don't know if we expose it as a tool directly, but I'm sure there's, like, an API for hovering. Like, move mouse to this position. Yeah, yeah, yeah. I think you can trigger hover, like, via, like, the JavaScript on the DOM itself. But, no, I think, like, when we saw computer use, everyone's eyes lit up because they realized, like, wow, like, AI is going to actually automate work for people. And I think seeing that kind of happen from both of the labs, and I'm sure we're going to see more labs launch computer use models, I'm excited to see all the stuff that people build with it. I think that I'd love to see computer use power, like, controlling a browser on Browserbase. And I think, like, Open Operator, which was, like, our open source version of OpenAI's Operator, was our first take on, like, how can we integrate these models into Browserbase? And we handle the infrastructure and let the labs do the models. I don't have a sense that Operator will be released as an API. I don't know. Maybe it will. I'm curious to see how well that works because I think it's going to be really hard for a company like OpenAI to do things like support CAPTCHA solving or, like, have proxies. Like, I think it's hard for them structurally. Imagine this New York Times headline: "OpenAI CAPTCHA solving." Like, that would be a pretty bad headline. This New York Times headline: "Browserbase solves CAPTCHAs." No one cares. No one cares. And, like, our investors are bored. Like, we're all okay with this, you know? We're building this company knowing that the CAPTCHA solving is short-lived until we figure out how to authenticate good bots. I think it's really hard for a company like OpenAI, who has this brand that's so, so good, to balance with, like, the icky parts of web automation, which can be kind of complex to solve. 
I'm sure OpenAI knows who to call whenever they need you. Yeah, right. I'm sure they'll have a great partnership.

Alessio [00:41:23]: And is Open Operator just, like, a marketing thing for you? Like, how do you think about resource allocation? So, you can spin this up very quickly. And now there's all this, like, open deep research, just open all these things that people are building. We started it, you know. You're the original Open. We're the original Open Operator, you know? Is it just, hey, look, this is a demo, but, like, we'll help you build out an actual product for yourself? Like, are you interested in going more of a product route? That's kind of the OpenAI way, right? They started as a model provider and then…

Paul [00:41:53]: Yeah, we're not interested in going the product route yet. I view Open Operator as a reference project, you know? Let's show people how to build these things using the infrastructure and models that are out there. And that's what it is. It's, like, Open Operator is very simple. It's an agent loop. It says, like, take a high-level goal, break it down into steps, use tool calling to accomplish those steps. It takes screenshots and feeds those screenshots into an LLM with the step to generate the right action. It uses Stagehand under the hood to actually execute this action. It doesn't use a computer use model. And it, like, has a nice interface using the live view that we talked about, the iframe, to embed that into an application. So I felt like people on launch day wanted to figure out how to build their own version of this. And we turned that around really quickly to show them. And I hope we do that with other things like deep research. We don't have a deep research launch yet. I think David from Aomni actually has an amazing open deep research that he launched. It has, like, 10K GitHub stars now. So he's crushing that. 
But I think if people want to build these features natively into their application, they need good reference projects. And I think Open Operator is a good example of that.

swyx [00:42:52]: I don't know. Actually, I'm actually pretty bullish on API-driven Operator. Because that's the only way that you can sort of, like, once it's reliable enough, obviously. And now we're nowhere near. But, like, give it five years. It'll happen, you know. And then you can sort of spin this up and browsers are working in the background and you don't necessarily have to know. And it just is booking restaurants for you, whatever. I can definitely see that future happening. I had this on the landing page here. This might be slightly out of order. But, you know, you have, like, sort of three use cases for Browserbase. Open Operator, or this is the Operator sort of use case. It's kind of like the workflow automation use case. And it competes with UiPath in the sort of RPA category. Would you agree with that? Yeah, I would agree with that. And then there's agents, which we talked about already. And web scraping, which I imagine would be the bulk of your workload right now, right?

Paul [00:43:40]: No, not at all. I'd say actually, like, the majority is browser automation. We're kind of expensive for web scraping. Like, I think that if you're building a web scraping product, if you need to do occasional web scraping or you have to do web scraping that works every single time, you want to use browser automation. Yeah. You want to use Browserbase. But if you're building web scraping workflows, what you should do is have a waterfall. The first request should be a curl to the website. See if you can get it without even using a browser. And then the second request may be, like, a scraping-specific API. There's, like, a thousand scraping APIs out there that you can use to try and get data. ScrapingBee. ScrapingBee is a great example, right? Yeah. 
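The waterfall Paul is describing here (a plain HTTP request first, then a scraping-specific API, with a real browser as the last resort) can be sketched as an ordered list of fetchers tried cheapest-first. This is only an illustration: the `waterfall` function, the `Fetcher` type, and the tier names in the usage note are hypothetical, and each tier would wrap whatever HTTP client, scraping service, or hosted browser a team actually uses.

```python
from typing import Callable, Optional

# A fetcher takes a URL and returns page content, or None if it got nothing usable.
Fetcher = Callable[[str], Optional[str]]


def waterfall(url: str, tiers: list[tuple[str, Fetcher]]) -> tuple[str, str]:
    """Try each tier in order and return (tier_name, content) from the first success."""
    for name, fetch in tiers:
        try:
            content = fetch(url)
        except Exception:
            # Treat blocks, timeouts, and other errors as "fall through to the next tier".
            content = None
        if content:
            return name, content
    raise RuntimeError(f"all tiers failed for {url}")
```

A caller would pass the tiers in cost order, for example `[("plain_http", fetch_plain), ("scraping_api", fetch_via_api), ("browser", fetch_in_browser)]`, where all three functions are placeholders to be supplied by the application.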
And then, like, if those two don't work, bring out the heavy hitter. Like, Browserbase will 100% work, right? It will load the page in a real browser, hydrate it. I see.

swyx [00:44:21]: Because a lot of people don't render the JS.

swyx [00:44:25]: Yeah, exactly.

Paul [00:44:26]: So, I mean, the three big use cases, right? Like, you know, automation, web data collection, and then, you know, if you're building anything agentic that needs, like, a browser tool, you want to use Browserbase.

Alessio [00:44:35]: Is there any use case that, like, you were super surprised by that people might not even think about? Oh, yeah. Or is it, yeah, anything that you can share? The long tail is crazy. Yeah.

Surprising use cases of Browserbase

Paul [00:44:44]: One of the case studies on our website that I think is the most interesting is this company called Benny. So, the way that it works is if you're on food stamps in the United States, you can actually get rebates if you buy certain things. Yeah. You buy some vegetables. You submit your receipt to the government. They'll give you a little rebate back. Say, hey, thanks for buying vegetables. It's good for you. That process of submitting that receipt is very painful. And the way Benny works is you use their app to take a photo of your receipt, and then Benny will go submit that receipt for you and then deposit the money into your account. That's actually using no AI at all. It's all, like, hard-coded scripts. They maintain the scripts. They've been doing a great job. And they built this amazing consumer app. But it's an example of, like, all these, like, tedious workflows that people have to do to kind of go about their business. And they're doing it for the sake of their day-to-day lives. And I had never known about, like, food stamp rebates or the complex forms you have to fill out to get them. But the world is powered by millions and millions of tedious forms, visas. You know, Emirate Lighthouse is a customer, right? 
You know, they do the O-1 visa. Millions and millions of forms are taking away humans' time. And I hope that Browserbase can help power software that automates away the web forms that we don't need anymore. Yeah.

swyx [00:45:49]: I mean, I'm very supportive of that. I mean, forms. I do think, like, government itself is a big part of it. I think the government itself should embrace AI more to do more sort of human-friendly form filling. Mm-hmm. But I'm not optimistic. I'm not holding my breath. Yeah. We'll see. Okay. I think I'm about to zoom out. I have a little brief thing on computer use, and then we can talk about founder stuff, which is, I tend to think of developer tooling markets in impossible triangles, where everyone starts in a niche, and then they start to branch out. So I already hinted at a little bit of this, right? We mentioned Morph. We mentioned E2B. We mentioned Firecrawl. And then there's Browserbase. So there's, like, all this stuff of, like, have a serverless virtual computer that you give to an agent and let them do stuff with it. And there's various ways of connecting it to the internet. You can just connect to a search API, like SERP API, whatever other, like, Exa is another one. That's what you're searching. You can also have a JSON markdown extractor, which is Firecrawl. Or you can have a virtual browser like Browserbase, or you can have a virtual machine like Morph. And then there's also maybe, like, a virtual sort of code environment, like Code Interpreter. So, like, there's just, like, a bunch of different ways to tackle the problem of giving a computer to an agent. And I'm just kind of wondering if you see, like, everyone's just, like, happily coexisting in their respective niches. And as a developer, I just go and pick, like, a shopping basket of one of each. Or do you think that eventually people will collide?

Future of browser automation and market competition

Paul [00:47:18]: I think that currently it's not a zero-sum market. 
Like, I think we're talking about all of knowledge work that people do that can be automated online. All of these, like, trillions of hours that happen online where people are working. And I think that there's so much software to be built that, like, I tend not to think about how these companies will collide. I just try to solve the problem as best as I can and make this specific piece of infrastructure, which I think is an important primitive, the best I possibly can. And yeah, I think there's players that are going to launch, like, over-the-top, you know, platforms, like agent platforms that have all these tools built in, right? Like, who's building the Rippling for agent tools that has the search tool, the browser tool, the operating system tool, right? There are some. There are some, right? And I think in the end, what I have seen in my time as a developer, and I look at all the favorite tools that I have, is that, like, for tools and primitives with sufficient levels of complexity, you need to have a solution that's really bespoke to that primitive, you know? And I am sufficiently convinced that the browser is complex enough to deserve a primitive. Obviously, I have to. I'm the founder of Browserbase, right? I'm talking my book. But, like, I think maybe I can give you one spicy take against, like, maybe just whole OS running. I think that when I look at computer use when it first came out, I saw that the majority of use cases for computer use were controlling a browser. And do we really need to run an entire operating system just to control a browser? I don't think so. I don't think that's necessary. You know, Browserbase can run browsers for way cheaper than you can if you're running a full-fledged OS with a GUI, you know, operating system. And I think that's just an advantage of the browser. 
It is, like, browsers are little OSs, and you can run them very efficiently if you orchestrate it well. And I think that allows us to offer 90% of the, you know, functionality in the platform needed at 10% of the cost of running a full OS. Yeah.

Open Operator: Browserbase's Open-Source Alternative

swyx [00:49:16]: I definitely see the logic in that. There's a Marc Andreessen quote. I don't know if you know this one. Where he basically observed that the browser is turning the operating system into a poorly debugged set of device drivers, because most of the apps have moved from the OS to the browser. So you can just run browsers.

Paul [00:49:31]: There's a place for OSs, too. Like, I think that there are some applications that only run on Windows operating systems. And Eric from pig.dev in this upcoming YC batch, or last YC batch, like, he's building a way to run tons of Windows operating systems for you to control with your agent. And like, there's some legacy EHR systems that only run on Internet-controlled systems. Yeah.

Paul [00:49:54]: I think that's it. I think, like, there are use cases for specific operating systems for specific legacy software. And like, I'm excited to see what he does with that. I just wanted to give a shout out to the pig.dev website.

swyx [00:50:06]: The pigs jump when you click on them. Yeah. That's great.

Paul [00:50:08]: Eric, he's the former co-founder of banana.dev, too.

swyx [00:50:11]: Oh, that Eric. Yeah. That Eric. Okay. Well, he abandoned bananas for pigs. I hope he doesn't start going around with pigs now.

Alessio [00:50:18]: Like he was going around with bananas. A little toy pig. Yeah. Yeah. I love that. What else are we missing? I think we covered a lot of, like, the Browserbase product history, but what do you wish people asked you? Yeah.

Paul [00:50:29]: I wish people asked me more about, like, what will the future of software look like? Because I think that's really where I've spent a lot of time about why do Browserbase. 
Like, for me, starting a company is like a means of last resort. Like, you shouldn't start a company unless you absolutely have to. And I remain convinced that the future of software is software that you're going to click a button and it's going to do stuff on your behalf. Right now, software, you click a button and it maybe, like, calls a backend API and, like, computes some numbers. It, like, modifies some text, whatever. But the future of software is software using software. So, I may log into my accounting website for my business, click a button, and it's going to go load up my Gmail, search my emails, find the thing, upload the receipt, and then comment it for me. Right? And it may do it using APIs, maybe a browser. I don't know. I think it's a little bit of both. But that's completely different from how we've built software so far. And I think that future of software has different infrastructure requirements. It's going to require different UIs. It's going to require different pieces of infrastructure. I think the browser infrastructure is one piece that fits into that, along with all the other categories you mentioned. So, I think that it's going to require developers to think differently about how they've built software for, you know
News includes upcoming improvements to ex_doc for version navigation, the release of Phoenix Analytics 0.3.0 for plug-and-play application metrics, José Valim's detailed exploration of set-theoretic types for better library compatibility, German Velasco's demonstration of Elixir 1.18's enhanced type system, the beta release of the Ash Framework book on PragProg, and exciting developments in the FLAME ecosystem with AWS EC2 support, and more! Show Notes online - http://podcast.thinkingelixir.com/237 (http://podcast.thinkingelixir.com/237) Elixir Community News https://bsky.app/profile/david.bernheisel.com/post/3lffr6xdvq22r (https://bsky.app/profile/david.bernheisel.com/post/3lffr6xdvq22r?utm_source=thinkingelixir&utm_medium=shownotes) – ex_doc will soon feature a new button to navigate to the latest version's documentation when viewing older versions. https://x.com/mrpopov_com/status/1878817795049488421 (https://x.com/mrpopov_com/status/1878817795049488421?utm_source=thinkingelixir&utm_medium=shownotes) – Phoenix Analytics 0.3.0 released with improved support for Fly.io and Heroku deployments. https://github.com/lalabuy948/PhoenixAnalytics (https://github.com/lalabuy948/PhoenixAnalytics?utm_source=thinkingelixir&utm_medium=shownotes) – Plug and play analytics solution for Phoenix applications, offering embedded dashboard functionality. https://dashbit.co/blog/data-evolution-with-set-theoretic-types (https://dashbit.co/blog/data-evolution-with-set-theoretic-types?utm_source=thinkingelixir&utm_medium=shownotes) – José Valim's article explaining how set-theoretic types will improve library backwards-compatibility in Elixir. https://www.elixirstreams.com/tips/elixir-118-type-system-changes (https://www.elixirstreams.com/tips/elixir-118-type-system-changes?utm_source=thinkingelixir&utm_medium=shownotes) – German Velasco's ElixirStream video demonstrating the improved type system changes in Elixir 1.18. 
https://pragprog.com/titles/ldash/ash-framework/ (https://pragprog.com/titles/ldash/ash-framework/?utm_source=thinkingelixir&utm_medium=shownotes) – Ash Framework book by Rebecca Le and Zach Daniel released in beta on PragProg, covering LiveView, auth, search, APIs, and notifications. https://github.com/phoenixframework/flame (https://github.com/phoenixframework/flame?utm_source=thinkingelixir&utm_medium=shownotes) – FLAME (Fleeting Lambda Application for Modular Execution) by Chris McCord enables dynamic resource scaling on Fly.io. https://github.com/probably-not/flame-ec2 (https://github.com/probably-not/flame-ec2?utm_source=thinkingelixir&utm_medium=shownotes) – FlameEC2 library extends FLAME functionality to AWS EC2 machines. https://bsky.app/profile/codebeam.bsky.social/post/3lfp4penmik2v (https://bsky.app/profile/codebeam.bsky.social/post/3lfp4penmik2v?utm_source=thinkingelixir&utm_medium=shownotes) – Code BEAM Lite London 2025 is on January 31, featuring Michał Muskała as speaker. https://alchemyconf.com/ (https://alchemyconf.com/?utm_source=thinkingelixir&utm_medium=shownotes) – Alchemy Conf scheduled for March 31 - April 3 in Braga, Portugal. https://membrz.club/alchemyconf/events?tag=workshop (https://membrz.club/alchemyconf/events?tag=workshop?utm_source=thinkingelixir&utm_medium=shownotes) – Alchemy Conf workshops announced featuring Saša Jurić, Zach Daniel, and Andrea Leopardi. https://x.com/Alchemy_Conf/status/1879136370691862929 (https://x.com/Alchemy_Conf/status/1879136370691862929?utm_source=thinkingelixir&utm_medium=shownotes) – Additional announcement about Alchemy Conf workshop details. Do you have some Elixir news to share? 
Tell us at @ThinkingElixir (https://twitter.com/ThinkingElixir) or email at show@thinkingelixir.com (mailto:show@thinkingelixir.com) Find us online - Message the show - Bluesky (https://bsky.app/profile/thinkingelixir.com) - Message the show - X (https://x.com/ThinkingElixir) - Message the show on Fediverse - @ThinkingElixir@genserver.social (https://genserver.social/ThinkingElixir) - Email the show - show@thinkingelixir.com (mailto:show@thinkingelixir.com) - Mark Ericksen on X - @brainlid (https://x.com/brainlid) - Mark Ericksen on Bluesky - @brainlid.bsky.social (https://bsky.app/profile/brainlid.bsky.social) - Mark Ericksen on Fediverse - @brainlid@genserver.social (https://genserver.social/brainlid) - David Bernheisel on Bluesky - @david.bernheisel.com (https://bsky.app/profile/david.bernheisel.com) - David Bernheisel on Fediverse - @dbern@genserver.social (https://genserver.social/dbern)
There are many, many choices for cloud database services these days. I would hope everyone is aware of the various IaaS options in public clouds with EC2, Azure VMs, GCP Compute Engine, and others. These are often the easiest way to move your workload, but you've really just moved a VM from one place to another (likely more expensive) place. For managed databases, there are lots of choices, but you might not be aware of your options. I ran across an article that discusses the various flavors of managed databases in the big three public clouds for SQL Server. In the piece, there is a section that talks about when a managed database makes sense. I like that it discloses the development on a managed service is expensive. Read the rest of The Managed Cloud Database Options
"Le dernier auditeur pensait que tout avait été codé par la même personne" Le D.E.V. de la semaine est Simon Parisot, CEO et cofondateur de Blank. Simon a fait un pari, un peu fou, au début de l'aventure Blank : avoir un environnement 100% serverless ! Lambda, DynamoDB, S3, &hellip il connait tous les services AWS, mais n'utilise pas une seule EC2 !! Il vient nous raconter comment il a construit cette plateforme, et surtout pourquoi ! Il nous explique aussi les changements que cela a sur le travail des dev (le dev en local est compllqué), les impératifs de qualité du code que cela implique et aussi comment le recrutement doit s'adapter à ce choix technique.Liens évoqués pendant l'émissionIFTTD avec Olivier Dupuis - Faites entrer le hackeurFramework serverless 🎙️ Soutenez le podcast If This Then Dev ! 🎙️ Chaque contribution aide à maintenir et améliorer nos épisodes. Cliquez ici pour nous soutenir sur Tipeee 🙏Archives | Site | Boutique | TikTok | Discord | Twitter | LinkedIn | Instagram | Youtube | Twitch | Job Board |
The annual AWS re:Invent conference in Las Vegas has long been a marquee event for technologists and business leaders. But in 2024, it served as a rallying cry for a new technological epoch - one where generative AI (GenAI) is no longer a nascent tool but a transformative force shaping industries, economies, and creativity. At the heart of this year's address was Dr. Swami Sivasubramanian, AWS's Vice President of AI and Data, who positioned Amazon's cloud division not just as a vendor but as an architect of this revolution. Dr. Sivasubramanian began with a historical overture, likening the current moment to the Wright Brothers' first flight in 1903. That 12-second triumph, he noted, was not an isolated miracle but the result of centuries of cumulative innovation - from Leonardo da Vinci's aeronautical sketches to steam-powered gliders. In the same vein, GenAI represents the culmination of decades of research in neural networks, backpropagation algorithms, and the transformative power of Transformer architectures. However, technological breakthroughs alone were not enough. What set the stage for GenAI's explosive growth, Dr. Sivasubramanian argued, was the convergence of cloud computing, vast data lakes, and affordable machine-learning infrastructure - elements AWS has spent the better part of two decades perfecting. AWS SageMaker: The Vanguard of AI Democratization Central to AWS's GenAI arsenal is Amazon SageMaker, a comprehensive platform designed to simplify machine learning workflows. Over the past year, AWS has added more than 140 features to SageMaker, underscoring its ambition to stay ahead in the arms race of AI development. Among these innovations is SageMaker HyperPod, which provides robust tools for training the mammoth foundational models that underpin GenAI. 
HyperPod automates complex tasks like checkpointing, resource recovery, and distributed training, enabling enterprises like Salesforce and Thomson Reuters to train billion-parameter models without the logistical headaches. But SageMaker is evolving beyond its core machine-learning roots into a unified platform for data analytics, big data processing, and GenAI workflows. The platform's latest iteration consolidates disparate tools into a single, user-friendly interface, offering businesses an integrated suite for data preparation, model development, and deployment. Training Titans: HyperPod and Bedrock As GenAI models grow in size and sophistication, the cost and complexity of training them have skyrocketed. Dr. Sivasubramanian introduced two pivotal innovations aimed at alleviating these challenges. First, HyperPod Flexible Training Plans address the inefficiencies of securing and managing compute resources for training large models. By automating the reservation of EC2 capacity and distributing workloads intelligently, these plans reduce downtime and optimize costs. Second, Bedrock, AWS's managed service for deploying foundational models, makes it easier for developers to select, customize, and optimize GenAI models. Bedrock offers cutting-edge features like Prompt Caching - a cost-saving tool that reduces latency by storing frequently used queries - and Intelligent Prompt Routing, which directs tasks to the most cost-effective model without sacrificing quality. Case Studies in Innovation Throughout his keynote, Dr. Sivasubramanian showcased real-world applications of AWS's GenAI capabilities. Autodesk, the software titan renowned for its design and engineering tools, is leveraging SageMaker to develop GenAI models that combine spatial reasoning with physics-based design principles. These models allow architects to create structurally sound and manufacturable 3D designs, effectively automating tedious aspects of the creative process. 
Meanwhile, Rocket Companies, a leader in mortgage lending, has deployed Amazon Bedrock to create AI agents that handle 70% of customer interactions autonomously. These agents, embedded in Rocket's AI-driven platform, streamli...
What if you could scale your SaaS platforms effortlessly across diverse hosting services? Join us as we welcome Adam McCrea, the brilliant mind behind JudoScale, who takes us through his fascinating evolution from being a Rails developer to creating a cutting-edge autoscaling solution. Adam opens up about the technical challenges he faced while adapting JudoScale for platforms like Render, Fly, and Railway, and how Heroku's unique architecture initially shaped his approach. His journey is one of innovation driven by necessity, as JudoScale originated from a need to optimize costs more efficiently than existing solutions.
Our conversation doesn't shy away from complexity; in fact, it embraces it. Adam shares his experiences of grappling with AWS integration, navigating the intricate maze of ECS, EC2, Fargate, and IAM, all driven by customer demand. We explore the strategic shift from metered billing to flat-tiered pricing and the hurdles faced while setting up a staging environment on Render, ultimately reaffirming Heroku's smoother experience. This episode promises valuable insights into the strategic decisions and architectural reimaginations that keep JudoScale ahead of the game.
Adding a creative flair, we delve into the entertaining world of infomercial production, as Adam recounts his experience crafting a humorous Billy Mays-inspired ad for JudoScale. With the aid of AI tools like ChatGPT and Descript, Adam turned a fun concept into an engaging reality. As we wrap up, Adam shares his excitement for RailsConf in Philadelphia and the significance of fostering connections through digital networking.
Whether you're a tech enthusiast or a developer seeking innovative scaling solutions, this episode is brimming with insightful takeaways and creative inspiration.
Send us some love.
Honeybadger: Honeybadger is an application health monitoring tool built by developers for developers.
Disclaimer: This post contains affiliate links. If you make a purchase, I may receive a commission at no extra cost to you.
Support the show
Ready to start your own podcast? This show is hosted on Buzzsprout and it's awesome, not to mention a Ruby on Rails application. Let Buzzsprout know we sent you and you'll get a $20 Amazon gift card if you sign up for a paid plan, and it helps support our show.
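This episode is sponsored by ILLIYOON. ILLIYOON is Korea's best-selling top-tier soothing and moisturizing brand, made for sensitive, fragile skin! Two exclusive patents, plant-derived Multi-Ceramide™ x Ceramide Capsule Complex™, in a gentle yet powerful formula that fights sensitive-skin crises. Available at Cosmed, Poya, Watsons, and EC2! From now until January 14, 2025, spend a total of $399 or more on ILLIYOON products to join the receipt-registration event for a chance to win round-trip tickets for two to Seoul and other great prizes! The site also has $100 discount coupons for the major retailers! ☆ Event URL: https://dbtw.pse.is/6qshml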
In the fast-paced world of technology, staying relevant means continually adapting to new tools and paradigms. One of the most transformative shifts in recent years has been the rise of cloud computing. In this episode of the Building Better Developers podcast, hosts Rob Broadhead and Michael Meloche explore how cultivating smart cloud development habits can help you stay ahead in an ever-evolving industry. Whether you're a seasoned developer or just starting your journey, embracing cloud technologies can enhance your skills, expand your capabilities, and open doors to exciting opportunities. From practical tips on leveraging free-tier cloud services to insights on earning valuable certifications, this discussion is packed with actionable advice to help you master the cloud and boost your career. Let's dive in and explore how to build the habits that will make cloud technologies a cornerstone of your development journey. Simplify and Expand Your Reach with the Cloud Rob introduces the cloud as a game-changer in the tech space, tracing its evolution since Amazon Web Services (AWS) debuted over a decade ago. Initially limited to services like EC2 and S3, AWS now boasts a staggering array of offerings, with Microsoft Azure and Google following suit. These platforms have become indispensable for developers, offering scalable solutions, robust APIs, and opportunities for experimentation. The hosts emphasize that the cloud isn't just for DevOps or system administrators. Developers stand to gain tremendously from engaging with these platforms. Whether it's spinning up a virtual machine, deploying a simple database, or experimenting with Infrastructure-as-Code, learning these skills bolsters your ability to adapt and solve problems. Build Habits Around Cloud Exploration One of the key takeaways from the episode is to treat cloud exploration as a habit rather than an overwhelming task. 
Start small: Sign Up and Play Around: Take advantage of free-tier options provided by AWS, Azure, or Google. For example, launch an EC2 instance, set up a database, or explore a service you've never tried before. Time-Box Your Efforts: Dedicate 10–15 minutes daily to exploring cloud services. Over time, this adds up to significant progress without feeling overwhelming. Experiment with Real Use Cases: Rob suggests transferring your local data to the cloud or using a cloud IDE like AWS Cloud9 for coding on the go. These practical applications build confidence while solving real problems. For those who prefer a structured approach, Rob mentions the Launch Your Internet Business series on Developer.com. This step-by-step guide helps you create a server, set up a WordPress site, and learn Linux basics—all while working within a cloud environment. Why Certifications Matter Michael highlights the value of certifications in the cloud domain, particularly for developers aiming to stand out in a competitive job market. Certification programs for platforms like AWS, Azure, and Google often include foundational courses that are approachable even for beginners. These certifications not only validate your skills but also deepen your understanding of specific cloud environments and tools. Michael shares his own experience of obtaining an AWS foundational certification, noting that while the preparation felt daunting at first, the actual process was manageable and rewarding. He encourages developers to take the plunge, as certifications can lead to tangible career benefits. Cloud as a Developer's Playground Beyond certifications and practical applications, the cloud is a playground for innovation. Michael suggests using tools like Docker, Kubernetes, or open-source alternatives to create your own cloud-like environment. 
Services such as AWS Cloud9 and Eclipse Che allow developers to experiment with coding directly in the cloud, offering unmatched flexibility for remote work and collaboration. For developers working with web technologies, tools like JSFiddle demonstrate the power of browser-based environments. These platforms remove barriers to entry, enabling you to test and deploy ideas without investing in extensive infrastructure. Challenges and Final Thoughts To solidify the lessons from this episode, Rob challenges listeners to take action: Sign up for a cloud provider and explore its offerings. Set up a simple project, such as deploying a virtual server or experimenting with APIs. If certifications interest you, research beginner-friendly options and set a goal to achieve one. As the episode concludes, Rob and Michael remind listeners that the cloud is more than a tool; it's an opportunity to build better habits, expand your knowledge, and position yourself as a forward-thinking developer. Whether you're a beginner or an experienced professional, there's always something new to discover in the ever-evolving cloud landscape. Stay Connected: Join the Develpreneur Community We invite you to join our community and share your coding journey with us. Whether you're a seasoned developer or just starting, there's always room to learn and grow together. Contact us at info@develpreneur.com with your questions, feedback, or suggestions for future episodes. Together, let's continue exploring the exciting world of software development. Additional Resources Free Editors to Help With Web Development AWS Management Tools Google Cloud Platform: Using the platform Building a Portable Development Environment That is OS-agnostic Building Better Habits Videos – With Bonus Content
AI Advances, X Exodus, China Export Bans, and OpenAI's ChatGPT Restrictions
In this episode of Hashtag Trending, Jim Love covers major highlights from AWS re:Invent, including the launch of Trainium2-powered EC2 instances, updates to the Amazon Bedrock platform, and collaborations with top companies for AI advancement. Also discussed is the European Federation of Journalists' departure from X (formerly Twitter) over disinformation concerns, China's export restrictions on key materials for technology and defense, and the discovery of ChatGPT's forbidden names list. Tune in for insight into these significant tech developments and their broader implications.
00:00 Major AI Announcements at AWS re:Invent
03:27 European Journalists Leave Twitter
04:58 China's Tech Trade War Escalates
06:34 ChatGPT's Forbidden Names
08:38 Conclusion and Contact Information
Today is the day to talk about careers! In the first episode of the podcast's special series, we talk with Erika Nagamine, an AWS Golden Jacket, about her path, her decisions, and the power curiosity has had to propel her throughout her career. Come see who took part in this conversation:
Paulo Silveira, the host who likes certifications
André David, the co-host who is still rolling along
Erika Nagamine, Specialist Solutions Architect for Data & AI - Analytics at AWS
In this episode of the mnemonic security podcast, Robby is joined by Scott Piper from Wiz and Håkon Sørum from O3 Cyber to talk cloud security. They cover the evolution of cloud security products since Amazon's release of S3 and EC2 in 2006 and how the market has matured into the CNAPP we know today. They chime in on most of the buzzwords associated with CNAPP, including Cloud Security Posture Management (CSPM), Cloud Workload Protection Platform (CWPP), Cloud Infrastructure Entitlement Management (CIEM), and Cloud Detection and Response (CDR), as well as other key areas of CNAPP such as vulnerability scanning, "shift-left" security, cloud data security, and compliance. They explain the definition and challenges of "cloud-native attacks" and misconfigurations and discuss whether third-party SOCs can add context and enhance detection capabilities.
Today is the day to talk about the cloud! In this episode, we explore the surprising relationship between AWS and Amazon Brazil, and the important questions around sizing, scalability and, of course, security when it comes to the cloud. Come see who took part in this conversation:
André David, the host who keeps an eye out for keywords
Vinny Neves, co-host and Tech Lead at UsTwo
Bruno Toffolo, Principal Software Development Engineer at Amazon
Gaston Perez, Principal Solutions Architect at AWS
AWS Morning Brief for the week of October 7, with Corey Quinn.
Links:
- AWS CloudShell extends most recent capabilities to all commercial Regions
- Amazon Aurora Serverless v2 now supports up to 256 ACUs
- Amazon S3 adds Service Quotas support for S3 general purpose buckets
- AWS announces Reserved Nodes flexibility for Amazon ElastiCache
- Duckbill Guide to AWS Reserved Instances
- Deprecation of Lake Formation's Governed Tables Feature
- Announcing AWS Neuron Helm Chart
- Leverage IAM Roles for email sending via SES from EC2 and eliminate a common credential risk
- Issue with NVIDIA Container Toolkit (CVE-2024-0132, CVE-2024-0133)
In this episode, we provided an overview of GitHub Action Runners and discussed the benefits of using self-hosted runners on AWS. We covered options including EC2 and CodeBuild for running GitHub Actions, compared pricing across solutions, and shared our hands-on experience setting things up. Overall, using AWS services can provide more control, lower latency, and cost optimization compared to GitHub hosted runners.
Welcome to episode 276 of The Cloud Pod, where the forecast is always cloudy! This week, our hosts Justin, Matthew, and Jonathan do a speedrun of OpenWorld news, talk about energy needs and the totally not controversial decision to reopen 3 Mile Island, a “managed” exodus from cloud, and Kubernetes news, as well as Amazon’s RTO, which we are calling “Elastic Commute”. All this and more, right now on The Cloud Pod.
Titles we almost went with this week:
- The Cloud Pod Hosts don't own enough pants for five days a week
- IBM thinks it can contain the cost of K8s
- Microsoft loves nuclear energy
- The Cloudpod tries to give Oracle some love and still does not care
- The cloud pod goes nuclear on k8s costs
- Can IBM contain the costs of Kubernetes and Nuclear Power?
- Google takes on take over while Microsoft takes on nuclear
- AWS Launches ‘Managed Exodus’: Streamline Your Talent Drain
- Introducing Amazon WorkForce Alienation: Scale Your Employee Discontent to the Cloud
- Amazon SageMaker Studio Lab: Now with Real-Time Resignation Prediction
A big thanks to this week's sponsor: We're sponsorless! Want to get your brand, company, or service in front of a very enthusiastic group of cloud news seekers? You've come to the right place! Send us an email or hit us up on our slack channel for more info.
General News
01:08 IBM acquires Kubernetes cost optimization startup Kubecost
IBM is quickly becoming the place where cloud cost companies go to assimilate? Or die? Rebirthed, maybe? Either way, it's not a great place to end up. On Tuesday they announced the acquisition of Kubecost, a FinOps startup that helps teams monitor and optimize their K8s clusters, with a focus on efficiency – and ultimately cost. This acquisition follows the acquisitions of Apptio, Turbonomic, and Instana over the years. Kubecost is the company behind OpenCost, a vendor-neutral open source project that forms part of the core Kubecost commercial offering.
OpenCost is part of the Cloud Native Computing Foundation's cohort of sandbox projects. Kubecost is expected to be integrated into IBM’s FinOps Suite, which combines Cloudability and Turbonomic. There is also speculation that it might make its way to OpenShift, too.
02:26 Justin - “…so KubeCost lives inside of Kubernetes, and basically has the ability to see how much CPU, how much memory they’re using, then calculate basically the price of the EC2 broken down into the different pods and services.”
AI Is Going Great –
An airhacks.fm conversation with Jonathan Schneider (@jon_k_schneider) about: Spinnaker's role in continuous delivery and multi-cloud deployments, multi-cloud architectures, Micrometer's origin and design as a vendor-neutral metrics abstraction library, comparison of Micrometer to other metrics solutions like OpenTelemetry and MicroProfile Metrics, exploration of Micrometer's architecture including registries and meter types, debate on static vs dependency-injected registries, explanation of distribution summaries and their use cases, consideration of unit testing metrics, examination of Micrometer's support for multiple monitoring systems simultaneously, discussion of meter filters for customizing metric output, reflection on the trade-offs between language support and monitoring system support in metrics libraries, insights into the separation of application and runtime metrics, Jonathan's experience developing Micrometer at Netflix and Pivotal, current usage of Micrometer and Prometheus in Moderne's multi-tenant SaaS architecture, comparison of serverless and EC2-based deployments for different use cases, OpenRewrite's growing popularity in Europe
Jonathan Schneider on Twitter: @jon_k_schneider
AWS Morning Brief for the week of Monday, August 5th with Mike Julian.
Links:
- Introducing AWS End User Messaging
- AWS Graviton-based EC2 instances now support hibernation
- New Amazon CloudWatch dimensions for Amazon EC2 On Demand Capacity Reservations
- AWS and Multicloud: Existing capabilities & continued enhancements
- Deliver Amazon CloudWatch logs to Amazon OpenSearch Serverless
- Cost Optimizer for Amazon WorkSpaces 2.7 released
- Jeff Barr, Chief Evangelist at AWS, confirms service deprecations via Twitter
AWS Morning Brief for the week of Monday, July 1st, with Corey Quinn.
Links:
- Amazon DocumentDB announces IAM database authentication
- Amazon Redshift Query Editor V2 now supports 100MB file uploads
- Amazon Time Sync Service expands microsecond-accurate time to 27 EC2 instance types
- Announcing Amazon WorkSpaces Pools, a new feature of Amazon WorkSpaces
- AWS CodeBuild supports Arm-based workloads using AWS Graviton3
- Optimizing Amazon Simple Queue Service (SQS) for speed and scale
- Ten Ways to Improve Your AWS Operations
Bret and Nirmal are joined by Michael Fischer of AWS to discuss why we should use Graviton, their arm64 compute with AWS-designed CPUs.
Graviton is AWS' term for their custom ARM-based EC2 instances. We now have all major clouds offering an ARM-based option for their server instances, but AWS was first, way back in 2018. Fast forward 6 years and AWS is releasing their 4th generation Graviton instances, and they deliver all the CPU, networking, memory and storage performance that you'd expect from their x86 instances and beyond.
I'm a big fan of ARM-based servers and the price points that AWS gives us. They have been my default EC2 instance type for years now, and I recommend it for all projects I'm working on with companies.
We get into the history of Graviton, how easy it is to build and deploy containers and Kubernetes clusters that have Graviton and even two different platform types in the same cluster. We also cover how to build multi-platform images using Docker BuildKit.
Be sure to check out the live recording of the complete show from May 9, 2024 on YouTube (Ep. 265). Includes demos.
★Topics★
- Graviton + GitLab + EKS
- Porting Advisor for Graviton
- Graviton Getting Started
Creators & Guests
- Cristi Cotovan - Editor
- Beth Fisher - Producer
- Bret Fisher - Host
- Nirmal Mehta - Host
- Michael Fischer - Guest
(00:00) - Intro
(06:19) - AWS and ARM64: Evolution to Graviton 4
(07:55) - AWS EC2 Nitro: Why and How?
(11:53) - Nitro and Graviton's Evolution
(18:35) - What Can't Run on Graviton?
(23:15) - Moving Your Workloads to Graviton
(27:19) - K8s Tooling and Multi-Platform Images
(37:07) - Tips for Getting Started with Graviton
You can also support my free material by subscribing to my YouTube channel and my weekly newsletter at bret.news!
Grab the best coupons for my Docker and Kubernetes courses.
Join my cloud native DevOps community on Discord.
Grab some merch at Bret's Loot Box
Homepage bretfisher.com
Join us at our first in-person conference on June 25 all about AI Quality: https://www.aiqualityconference.com/
Matthew McClean is a Machine Learning Technology Leader with the leading Amazon Web Services (AWS) cloud platform. He leads the customer engineering teams at Annapurna ML helping customers adopt AWS Trainium and Inferentia for their Gen AI workloads.
Kamran Khan is Sr Technical Business Development Manager for AWS Inferentia/Trainium at AWS. He has over a decade of experience helping customers deploy and optimize deep learning training and inference workloads using AWS Inferentia and AWS Trainium.
AWS Trainium and Inferentia // MLOps podcast #238 with Kamran Khan, BD, Annapurna ML and Matthew McClean, Annapurna Labs Lead Solution Architecture at AWS. Huge thank you to AWS for sponsoring this episode. AWS - https://aws.amazon.com/
// Abstract
Unlock unparalleled performance and cost savings with AWS Trainium and Inferentia! These powerful AI accelerators offer MLOps community members enhanced availability, compute elasticity, and energy efficiency. Seamlessly integrate with PyTorch, JAX, and Hugging Face, and enjoy robust support from industry leaders like W&B, Anyscale, and Outerbounds. Perfectly compatible with AWS services like Amazon SageMaker, getting started has never been easier. Elevate your AI game with AWS Trainium and Inferentia!
// Bio
Kamran Khan Helping developers and users achieve their AI performance and cost goals for almost 2 decades.
Matthew McClean Leads the Annapurna Labs Solution Architecture and Prototyping teams helping customers train and deploy their Generative AI models with AWS Trainium and AWS Inferentia // MLOps Jobs board https://mlops.pallet.xyz/jobs // MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links AWS Trainium: https://aws.amazon.com/machine-learning/trainium/ AWS Inferentia: https://aws.amazon.com/machine-learning/inferentia/ --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Kamran on LinkedIn: https://www.linkedin.com/in/kamranjk/ Connect with Matt on LinkedIn: https://www.linkedin.com/in/matthewmcclean/ Timestamps: [00:00] Matt's & Kamran's preferred coffee [00:53] Takeaways [01:57] Please like, share, leave a review, and subscribe to our MLOps channels! [02:22] AWS Trainium and Inferentia rundown [06:04] Inferentia vs GPUs: Comparison [11:20] Using Neuron for ML [15:54] Should Trainium and Inferentia go together? [18:15] ML Workflow Integration Overview [23:10] The Ec2 instance [24:55] Bedrock vs SageMaker [31:16] Shifting mindset toward open source in enterprise [35:50] Fine-tuning open-source models, reducing costs significantly [39:43] Model deployment cost can be reduced innovatively [43:49] Benefits of using Inferentia and Trainium [45:03] Wrap up
Isaac and Jeffrey discuss the topic of doing full rewrites when it comes to architecture. They explore a case where a client needed to move from a single EC2 instance to a more stable system with load-balanced instances. They discuss the parallels between rewriting code and rewriting architecture, and the challenges and risks involved.
AWS Morning Brief for the week of April 29th, 2024, with Corey Quinn. Links: Amazon GameLift now includes containers support (Preview); Introducing Amazon Route 53 Profiles; Amazon Simple Email Service is now available in the AWS GovCloud (US-East) Region; Amazon Time Sync Service expands Microsecond-Accurate time to 87 additional EC2 instance types; How to Migrate Content from Amazon WorkDocs; Build and deploy a 1 TB/s file system in under an hour; AWS Response to March 2024 CSRB report; chance to be actual leaders; people turning down job offers
AWS Morning Brief for the week of April 1, 2024, with Corey Quinn. Links: AI recommendations for descriptions in Amazon DataZone now generally available; Amazon DynamoDB Import from S3 now supports up to 50,000 Amazon S3 objects in a single bulk import; Amazon Time Sync Service now supports microsecond-accurate time in US East (N. Virginia) Region; AWS Billing and Cost Management Data Exports now supports AWS CloudFormation; AWS Compute Optimizer introduces memory customizability for EC2 rightsizing recommendations; AWS Cost Allocation Tags now support retroactive application; Estimating the charges for Amazon RDS Extended Support; Amazon completes $4B Anthropic investment to advance generative AI
Evelyn Osman, Principal Platform Engineer at AutoScout24, joins Corey on Screaming in the Cloud to discuss the dire need for developers to agree on a standardized tool set in order to scale their projects and innovate quickly. Corey and Evelyn pick apart the new products being launched in cloud computing and discover a large disconnect between what the industry needs and what is actually being created. Evelyn shares her thoughts on why viewing platforms as products themselves forces developers to get into the minds of their users and produces a better end result. About Evelyn: Evelyn is a recovering improviser currently role playing as a Lead Platform Engineer at AutoScout24 in Munich, Germany. While she says she specializes in AWS architecture and integration after spending 11 years with it, in truth she spends her days convincing engineers that a product mindset will make them hate their product managers less. Links Referenced: LinkedIn: https://www.linkedin.com/in/evelyn-osman/ Transcript: Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. My guest today is Evelyn Osman, engineering manager at AutoScout24. Evelyn, thank you for joining me. Evelyn: Thank you very much, Corey. It's actually really fun to be on here. Corey: I have to say one of the big reasons that I was enthused to talk to you is that you have been using AWS—to be direct—longer than I have, and that puts you in a somewhat rarefied position where AWS's customer base has absolutely exploded over the past 15 years that it's been around, but at the beginning, it was a very different type of thing.
Nowadays, it seems like we've lost some of that magic from the beginning. Where do you land on that whole topic?Evelyn: That's actually a really good point because I always like to say, you know, when I come into a room, you know, I really started doing introductions like, “Oh, you know, hey,” I'm like, you know, “I'm this director, I've done this XYZ,” and I always say, like, “I'm Evelyn, engineering manager, or architect, or however,” and then I say, you know, “I've been working with AWS, you know, 11, 12 years,” or now I can't quite remember.Corey: Time becomes a flat circle. The pandemic didn't help.Evelyn: [laugh] Yeah, I just, like, a look at that the year, and I'm like, “Jesus. It's been that long.” Yeah. And usually, like you know, you get some odd looks like, “Oh, my God, you must be a sage.” And for me, I'm… you see how different services kind of, like, have just been reinventions of another one, or they just take a managed service and make another managed service around it. So, I feel that there's a lot of where it's just, you know, wrapping up a pretty bow, and calling it something different, it feels like.Corey: That's what I've been low-key asking people for a while now over the past year, namely, “What is the most foundational, interesting thing that AWS has done lately, that winds up solving for this problem of whatever it is you do as a company? What is it that has foundationally made things better that AWS has put out in the last service? What was it?” And the answers I get are all depressingly far in the past, I have to say. What's yours?Evelyn: Honestly, I think the biggest game-changer I remember experiencing was at an analyst summit in Stockholm when they announced Lambda.Corey: That was announced before I even got into this space, as an example of how far back things were. And you're right. That was transformative. That was awesome.Evelyn: Yeah, precisely. 
Because before, you know, we were always, like, trying to figure, okay, how do we, like, launch an instance, run some short code, and then clean it up. AWS is going to charge for an hour, so we need to figure out, you know, how to pack everything into one instance, run for one hour. And then they announced Lambda, and suddenly, like, holy shit, this is actually a game changer. We can actually write small functions that do specific things. And, you know, you go from, like, microservices, like, to like, tiny, serverless functions. So, that was huge. And then DynamoDB along with that, really kind of like, transformed the entire space for us in many ways. So, back when I was at TIBCO, there was a few innovations around that, even, like, one startup inside TIBCO that quite literally, their entire product was just Lambda functions. And one of their problems was, they wanted to sell in the Marketplace, and they couldn't figure out how to sell Lambda on the marketplace. Corey: It's kind of wild when we see just how far it's come, but also how much they've announced that doesn't change that much, to be direct. For me, one of the big changes that I remember that really made things better for customers—though it took a couple of years—was EFS. And even that's a little bit embarrassing because all that is, “All right, we finally found a way to stuff a NetApp into us-east-1,” so now NFS, just like you used to use it in the 90s and the naughts, can be done responsibly in the cloud. And that, on some level, wasn't a feature launch so much as it was a concession to the ways that companies had built things and weren't likely to change. Evelyn: Honestly, I found the EFS launch to be a bit embarrassing because, like, you know, when you look closer at it, you realize, like, the performance isn't actually that great. Corey: Oh, it was horrible when it launched. It would just slam to a halt because you got the IOPS scaled with how much data you stored on it.
The documentation explicitly said to use dd to start loading a bunch of data onto it to increase the performance. It's like, “Look, just sandbag the thing so it does what you'd want.” And all that stuff got fixed, but at the time it looked like it was clown shoes. Evelyn: Yeah, and that reminds me of, like, EBS's, like, gp2 when we're, like you know, we're talking, like, okay, provision IOPS with gp2. We just kept saying, like, just give yourself really big volume for performance. And it feels like they just kind of kept that with EFS. And it took years for them to really iterate off of that. Yeah, so, like, EFS was a huge thing, and I see us, we're still using it now today, and like, we're trying to integrate, especially for, like, data center migrations, but yeah, you always see that a lot of these were first more for, like, you know, data centers to the cloud, you know. So, first I had, like, EC2 Classic. That's where I started. And I always like to tell a story that in my team, we're talking about using AWS, I was the only person fiercely against it because we did basically large data processing—sorry, I forget the right words—data analytics. There we go [laugh]. Corey: I remember that, too. When it first came out, it was, “This sounds dangerous and scary, and it's going to be a flash in the pan because who would ever trust their core compute infrastructure to some random third-party company, especially a bookstore?” And yeah, I think I got that one very wrong. Evelyn: Yeah, exactly. I was just like, no way. You know, I see all these articles talking about, like, terrible disk performance, and here I am, where it's like, it's my bread and butter. I'm specialized in it, you know?
I write code in my sleep and such. [Yeah, the interesting thing is, I was like, first, it was like, I can 00:06:03] launch services, you know, to kind of replicate when you get in a data center to make it feature comparable, and then it was taking all these complex services and wrapping it up in a pretty bow for—as a managed service. Like, EKS, I think, was the biggest one, if we're looking at managed services. Technically Elasticsearch, but I feel like that was the redheaded stepchild for quite some time. Corey: Yeah, there was—Elasticsearch was a weird one, and still is. It's not a pleasant service to run in any meaningful sense. Like, what people actually want as the next enhancement that would excite everyone is, I want a serverless version of this thing where I can just point it at a bunch of data, I hit an API that I don't have to manage, and get Elasticsearch results back from. They finally launched a serverless offering that's anything but. You have to still provision compute units for it, so apparently, the word serverless just means managed service over at AWS-land now. And it just, it ties into the increasing sense of disappointment I've had with almost all of their recent launches versus what I felt they could have been. Evelyn: Yeah, the interesting thing about Elasticsearch is, a couple of years ago, they came out with OpenSearch, a competing Elasticsearch after [unintelligible 00:07:08] kind of gave us the finger and changed the licensing.
I mean, OpenSearch actually become a really great offering if you run it yourself, but if you use their managed service, it can kind—you lose all the benefits, in a way.Corey: I'm curious, as well, to get your take on what I've been seeing that I think could only be described as an internal shift, where it's almost as if there's been a decree passed down that every service has to run its own P&L or whatnot, and as a result, everything that gets put out seems to be monetized in weird ways, even when I'd argue it shouldn't be. The classic example I like to use for this is AWS Config, where it charges you per evaluation, and that happens whenever a cloud resource changes. What that means is that by using the cloud dynamically—the way that they supposedly want us to do—we wind up paying a fee for that as a result. And it's not like anyone is using that service in isolation; it is definitionally being used as people are using other cloud resources, so why does it cost money? And the answer is because literally everything they put out costs money.Evelyn: Yep, pretty simple. Oftentimes, there's, like, R&D that goes into it, but the charges seem a bit… odd. Like from an S3 lens, was, I mean, that's, like, you know, if you're talking about services, that was actually a really nice one, very nice holistic overview, you know, like, I could drill into a data lake and, like, look into things. But if you actually want to get anything useful, you have to pay for it.Corey: Yeah. Everything seems to, for one reason or another, be stuck in this place where, “Well, if you want to use it, it's going to cost.” And what that means is that it gets harder and harder to do anything that even remotely resembles being able to wind up figuring out where's the spend going, or what's it going to cost me as time goes on? Because it's not just what are the resources I'm spinning up going to cost, what are the second, third, and fourth-order effects of that? 
And the honest answer is, well, nobody knows. You're going to have to basically run an experiment and find out. Evelyn: Yeah. No, true. So, what I… at AutoScout, we actually ended up doing is—because we're trying to figure out how to tackle these costs—is they—we built an in-house cost allocation solution so we could track all of that. Now, AWS has actually improved Cost Explorer quite a bit, and even, I think, Billing Conductor was one that came out [unintelligible 00:09:21], kind of like, do a custom tiered and account pricing model where you can kind of do the same thing. But even that also, there is a cost with it. I think that was trying to compete with other, you know, vendors doing similar solutions. But it still isn't something where we see that either there's, like, arbitrarily low pricing there, or the costs itself doesn't really quite make sense. Like, AWS [unintelligible 00:09:45], as you mentioned, it's a terrific service. You know, we try to use it for compliance enforcement and other things, catching bad behavior, but then as soon as people see the price tag, we just run away from it. So, a lot of the security services themselves, actually, the costs, kind of like, goes—skyrockets tremendously when you start trying to use it across a large organization. And oftentimes, the organization isn't actually that large. Corey: Yeah, it gets to this point where, especially in small environments, you have to spend more energy and money chasing down what the cost is than you're actually spending on the thing. There were blog posts early on that, “Oh, here's how you analyze your bill with Redshift,” and that was a minimum 750 bucks a month. It's, well, I'm guessing that that's not really for my $50 a month account. Evelyn: Yeah. No, precisely. I remember seeing that, like, entire ETL process is just, you know, analyze your invoice.
Cost [unintelligible 00:10:33], you know, is fantastic, but at the end of the day, like, what you're actually looking at [laugh], is infinitesimally small compared to all the data in that report. Like, I think oftentimes, it's simply, you know, like, I just want to look at my resources and allocate them in a multidimensional way. Which actually isn't really that multidimensional, when you think about it [laugh].Corey: Increasingly, Cost Explorer has gotten better. It's not a new service, but every iteration seems to improve it to a point now where I'm talking to folks, and they're having a hard time justifying most of the tools in the cost optimization space, just because, okay, they want a percentage of my spend on AWS to basically be a slightly better version of a thing that's already improving and works for free. That doesn't necessarily make sense. And I feel like that's what you get trapped into when you start going down the VC path in the cost optimization space. You've got to wind up having a revenue model and an offering that scales through software… and I thought, originally, I was going to be doing something like that. At this point, I'm unconvinced that anything like that is really tenable.Evelyn: Yeah. When you're a small organization you're trying to optimize, you might not have the expertise and the knowledge to do so, so when one of these small consultancies comes along, saying, “Hey, we're going to charge you a really small percentage of your invoice,” like, okay, great. That's, like, you know, like, a few $100 a month to make sure I'm fully optimized, and I'm saving, you know, far more than that. But as soon as your invoice turns into, you know, it's like $100,000, or $300,000 or more, that percentage becomes rather significant. And I've had vendors come to me and, like, talk to me and is like, “Hey, we can, you know, for a small percentage, you know, we're going to do this machine learning, you know, AI optimization for you. 
You know, you don't have to do anything. We guarantee buybacks of your RIs.” And as soon as you look at the price tag with it, we just have to walk away. Or oftentimes we look at it, and there are truly very simple ways to do it on your own, if you just kind of put some thought into it. Corey: While we were talking a bit before this show, you taught me something new about GameLift, which I think is a different problem that AWS has been dealing with lately. I've never paid much attention to it because it is the—as I assume from what it says on the tin, oh, it's a service for just running a whole bunch of games at scale, and I'm not generally doing that. My favorite computer game remains to be Twitter at this point, but that's okay. What is GameLift, though, because you wound up shining a different light on it, which makes me annoyed that Amazon Marketing has not pointed this out. Evelyn: Yeah, so I'll preface this by saying, like, I'm not an expert on GameLift. I haven't even spun it up myself because there's quite a bit of price. I learned this fall while chatting with an SA who works in the gaming space, and it kind of like, I went, like, “Back up a second.” If you think about, like, I'm, you know, like, World of Warcraft, all you have are thousands of game clients all over the world, playing the same game, you know, on the same server, in the same instance, and you need to make sure, you know, that when I'm running, and you're running, that we know that we're going to reach the same point the same time, or if there's one object in that room, that only one of us can get it. So, all these servers are doing is tracking state across thousands of clients. And GameLift, when you think about your dedicated game service, it really is just multi-region distributed state management. Like, at the basic, that's really what it is. Now, there's, you know, quite a bit more happening within GameLift, but that's what I was going to explain is, like, it's just state management.
And there are far more use cases for it than just for video games. Corey: That's maddening to me because having a global session state store, for lack of a better term, is something that so many customers have built themselves repeatedly. They can build it on top of primitives like DynamoDB global tables, or alternately, you have a dedicated region where that thing has to live and everything far away takes forever to round-trip. If they've solved some of those things, why on earth would they bury it under a gaming-branded service? Like, offer that primitive to the rest of us because that's useful. Evelyn: No, absolutely. And honestly, I wouldn't be surprised if you peeled back the curtain with GameLift, you'll find a lot of—like, several other you know, AWS services that it's just built on top of. I kind of mentioned earlier is, like, what I see now with innovation, it's like we just see other services packaged together and released as a new product. Corey: Yeah, IoT had the same problem going on for years where there was a lot of really good stuff buried in there, like IoT Events. People were talking about using that for things like browser extensions and whatnot, but you need to be explicitly told that that's a thing that exists and is handy, but otherwise you'd never know it was there because, “Well, I'm not building anything that's IoT-related. Why would I bother?” It feels like that was one direction that they tended to go in. And now they take existing services that are, mmm, kind of milquetoast, if I'm being honest, and then saying, “Oh, like, we have Comprehend that does, effectively, detection of themes, keywords, and whatnot, from text. We're going to wind up re-releasing that as Comprehend Medical.” Same type of thing, but now focused on a particular vertical. Seems to me that instead of being a specific service for that vertical, just improve the baseline of the service and offer HIPAA compliance if it didn't exist already, and you're mostly there.
But what do I know? I'm not a product manager trying to get promoted. Evelyn: Yeah, that's true. Well, I was going to mention that maybe it's the HIPAA compliance, but actually, a lot of their services already have HIPAA compliance. And I've stared far too long at that compliance section on AWS's site to know this, but you know, a lot of them actually are HIPAA-compliant, they're PCI-compliant, and ISO-compliant, and you know, and everything. So, I'm actually pretty intrigued to know why they [wouldn't 00:16:04] take that advantage. Corey: I just checked. Amazon Comprehend is itself HIPAA-compliant and is qualified and certified to hold Personal Health Information—PHI—Private Health Information, whatever the acronym stands for. Now, what's the difference, then, between that and Medical? In fact, the HIPAA section says for Comprehend Medical, “For guidance, see the previous section on Amazon Comprehend.” So, there's no difference from a regulatory point of view. Evelyn: That's fascinating. I am intrigued because I do know that, like, within AWS, you know, they have different segments, you know? There's, like, Digital Native Business, there's Enterprise, there's Startup. So, I am curious how things look over the engineering side. I'm going to talk to somebody about this now [laugh]. Corey: Yeah, it's the—like, I almost wonder, on some level, it feels like, “Well, we wound up building this thing in the hopes that someone would use it for something. And well, if we just use different words, it checks a box in some analyst's chart somewhere.” I don't know. I mean, I hate to sound that negative about it, but it's… increasingly when I talk to customers who are active in these spaces around the industry vertical targeted stuff aimed at their industry, they're like, “Yeah, we took a look at it. It was adorable, but we're not using it that way.
We're going to use either the baseline version or we're going to work with someone who actively gets our industry.” And I've heard that repeated about three or four different releases that they've put out across the board of what they've been doing. It feels like it is a misunderstanding between what the world needs and what they're able to or willing to build for us. Evelyn: Not sure. I wouldn't be surprised, if we go far enough, it could probably be that it's just a product manager saying, like, “We have to advertise directly to the industry.” And if you look at it, you know, in the backend, you know, it's an engineer, you know, kicking off a build and just changing the name from Comprehend to Comprehend Medical. Corey: And, on some level, too, they're moving a lot more slowly than they used to. There was a time where they were, in many cases, if not the first mover, the first one to do it well. Take CodeWhisperer, their AI-powered coding assistant. That would have been a transformative thing if GitHub Copilot hadn't beaten them to every punch, come out with new features, and frankly, in head-to-head experiments that I've run, came out way better as a product than what CodeWhisperer is. And while I'd like to say that this is great, but it's too little too late. And when I talk to engineers, they're very excited about what Copilot can do, and the only people I see who are even talking about CodeWhisperer work at AWS. Evelyn: No, that's true. And so, I think what's happening—and this is my opinion—is that first you had AWS, like, launching really innovative new services, you know, that kind of like, it's like, “Ah, it's a whole new way of running your workloads in the cloud.” Instead of you know, basically, hiring a whole team, I just click a button, you have your instance, you use it, sell software, blah, blah, blah, blah.
And then they went towards serverless, and then IoT, and then it started targeting large data lakes, and then eventually that kind of run backwards towards security, after the umpteenth S3 data leak. Corey: Oh, yeah. And especially now, like, so they had a hit in some corners with SageMaker, so now there are 40 services all starting with the word SageMaker. That's always pleasant. Evelyn: Yeah, precisely. And what I kind of notice is… now they're actually having to run it even further back because they caught all the corporations that could pivot to the cloud, they caught all the startups who started in the cloud, and now they're going for the larger behemoths who have massive data centers, and they don't want to innovate. They just want to reduce this massive sysadmin team. And I always like to use the example of Bare Metal. When that came out in 2019, everybody—we've all kind of scratched our heads. I'm like, really [laugh]? Corey: Yeah, I could see where it makes some sense just for very specific workloads that involve things like specific capabilities of processors that don't work under emulation in some weird way, but it's also such a weird niche that I'm sure it's there for someone. My default assumption, just given the breadth of AWS's customer base, is that whenever I see something that they just announced, well, okay, it's clearly not for me; that doesn't mean it's not meeting the needs of someone who looks nothing like me. But increasingly as I start exploring the industry and these services have time to percolate in the popular imagination and I still don't see anything interesting coming out with it, it really makes you start to wonder. Evelyn: Yeah. But then, like, I think, like, roughly a year or something, right after Bare Metal came out, they announced Outposts. So, then it was like, another way to just stay within your data center and be in the cloud. Corey: Yeah.
There's a bunch of different ways they have that, okay, here's ways you can run AWS services on-prem, but still pay us by the hour for the privilege of running things that you have living in your facility. And that doesn't seem like it's quite fair. Evelyn: That's exactly it. So, I feel like now it's sort of in diminishing returns and sort of doing more cloud-native work compared to, you know, these huge opportunities, which is everybody who still has a data center for various reasons, or they're cloud-native, and they grow so big, that they actually start running their own data centers. Corey: I want to call out as well before we wind up being accused of being oblivious, that we're recording this before re:Invent. So, it's entirely possible—I hope this happens—that they announce something or several some things that make this look ridiculous, and we're embarrassed to have had this conversation. And yeah, they're totally getting it now, and they have completely surprised us with stuff that's going to be transformative for almost every customer. I've been expecting and hoping for that for the last three or four re:Invents now, and I haven't gotten it. Evelyn: Yeah, that's right. And I think there's even a new service launches that actually are missing fairly obvious things in a way. Like, mine is the Managed Workflow for Amazon—it's Managed Airflow, sorry. So, we were using Data Pipeline for, you know, big ETL processing, so it was an in-house tool we kind of built at AutoScout, we do platform engineering. And it was deprecated, so we looked at a new—what to replace it with. And so, we looked at Airflow, and we decided this is the way to go, we want to use managed because we don't want to maintain our own infrastructure. And the problem we ran into is that it doesn't have support for shared VPCs. And we actually talked to our account team, and they were confused. Because they said, like, “Well, every new service should support it natively.” But it just didn't have it.
And that's, kind of, what, I kind of found is, like, there's—it feels—sometimes it's—there's a—it's getting rushed out the door, and it'll actually have a new managed service or new service launched out, but they're also sort of cutting some corners just to actually make sure it's packaged up and ready to go. Corey: When I'm looking at this, and seeing how this stuff gets packaged, and how it's built out, I start to understand a pattern that I've been relatively down on across the board. I'm curious to get your take because you work at a fairly sizable company as an engineering manager, running teams of people who do this sort of thing. Where do you land on the idea of companies building internal platforms to wrap around the offerings that the cloud service providers that they use make available to them? Evelyn: So, my opinion is that you need to build out some form of standardized tool set in order to actually be able to innovate quickly. Now, this sounds counterintuitive because everyone is like, “Oh, you know, if I want to innovate, I should be able to do this experiment, and try out everything, and use what works, and just release it.” And that greatness [unintelligible 00:23:14] mentality, you know, it's like five talented engineers working to build something. But when you have, instead of five engineers, you have five teams of five engineers each, and every single team does something totally different. You know, one uses Scala, another on TypeScript, another one, you know, .NET, and then there could have been a [last 00:23:30] one, you know, comes in, you know, saying they're still using Ruby. And then next thing you know, you know, you have, like, incredibly diverse platforms for services. And if you want to do any sort of like hiring or cross-training, it becomes incredibly difficult.
And actually, as the organization grows, you want to hire talent, and so you're going to have to hire, you know, a developer for this team, you going to have to hire, you know, Ruby developer for this one, a Scala guy here, a Node.js guy over there.And so, this is where we say, “Okay, let's agree. We're going to be a Scala shop. Great. All right, are we running serverless? Are we running containerized?” And you agree on those things. So, that's already, like, the formation of it. And oftentimes, you start with DevOps. You'll say, like, “I'm a DevOps team,” you know, or doing a DevOps culture, if you do it properly, but you always hit this scaling issue where you start growing, and then how do you maintain that common tool set? And that's where we start looking at, you know, having a platform… approach, but I'm going to say it's Platform-as-a-Product. That's the key.Corey: Yeah, that's a good way of framing it because originally, the entire world needed that. That's what RightScale was when EC2 first came out. It was a reimagining of the EC2 console that was actually usable. And in time, AWS improved that to the point where RightScale didn't really have a place anymore in a way that it had previously, and that became a business challenge for them. But you have, what is it now, 2, 300 services that AWS has put out, and out, and okay, great. Most companies are really only actively working with a handful of those. How do you make those available in a reasonable way to your teams, in ways that aren't distracting, dangerous, et cetera? I don't know the answer on that one.Evelyn: Yeah. No, that's true. So, full disclosure. At AutoScout, we do platform engineering. So, I'm part of, like, the platform engineering group, and we built a platform for our product teams. It's kind of like, you need to decide to [follow 00:25:24] those answers, you know? Like, are we going to be fully containerized? Okay, then, great, we're going to use Fargate. 
All right, how do we do it so that developers don't actually—don't need to think that they're running Fargate workloads? And that's, like, you know, where it's really important to have those standardized abstractions that developers actually enjoy using. And I'd even say that, before you start saying, “Ah, we're going to do platform,” you say, “We should probably think about developer experience.” Because you can do a developer experience without a platform. You can do that, you know, in a DevOps approach, you know? It's basically build tools that makes it easy for developers to write code. That's the first step for anything. It's just, like, you have people writing the code; make sure that they can do the things easily, and then look at how to operate it. Corey: That sure would be nice. There's a lack of focus on usability, especially when it comes to a number of developer tools that we see out there in the wild, in that, they're clearly built by people who understand the problem space super well, but they're designing these things to be used by people who just want to make the website work. They don't have the insight, the knowledge, the approach, any of it, nor should they necessarily be expected to. Evelyn: No, that's true. And what I see is, a lot of the times, it's a couple really talented engineers who are just getting shit done, and they get shit done however they can. So, it's basically like, if they're just trying to run the website, they're just going to write the code to get things out there and call it a day. And then somebody else comes along, has a heart attack when they see what's been done, and they're kind of stuck with it because there are no guardrails or paved path or however you want to call it. Corey: I really hope—truly—that this is going to be something that we look back and laugh when this episode airs, that, “Oh, yeah, we just got it so wrong.
Look at all the amazing stuff that came out of re:Invent.” Are you going to be there this year?

Evelyn: I am going to be there this year.

Corey: My condolences. I keep hoping people get to escape.

Evelyn: This is actually my first one in, I think, five years. So, I mean, the last time I was there was when everybody's going crazy over pins. And I still have a bag of them [laugh].

Corey: Yeah, that did seem like a hot-second collectable moment, didn't it?

Evelyn: Yeah. And then at the—I think, what, the very last day, as everybody's heading to re:Play, you could just go into the registration area, and they just had, like, bags of them lying around to take. So, all the competing, you know, to get the requirements for a pin was kind of moot [laugh].

Corey: Don't you hate it at some point where it's like, you feel like I'm going to finally get this crowning achievement, it's like, or just show up at the buffet at the end and grab one of everything, and wow, that would have saved me a lot of pain and trouble.

Evelyn: Yeah.

Corey: Ugh, scavenger hunts are hard, as I'm about to learn to my own detriment.

Evelyn: Yeah. No, true. Yeah. But I am really hoping that re:Invent proves me wrong. Embarrassingly wrong, and then all my colleagues can proceed to mock me for this ridiculous podcast that I made with you. But I am a fierce skeptic. Optimistic nihilist, but still a nihilist, so we'll see how re:Invent turns out.

Corey: So, I am curious, given your experience at more large companies than I tend to be embedded with for any period of time, how have you found that these large organizations tend to pick up new technologies? What does the adoption process look like? And honestly, if you feel like throwing some shade, how do they tend to get it wrong?

Evelyn: In most cases, I've seen it go… terrible. Like, it just blows up in their face.
And I say that because a lot of the time, an organization will say, “Hey, we're going to adopt this new way of organizing teams or developing products,” and they look at all the practices. They say, “Okay, great. Product management is going to bring it in, they're going to structure things, how we do the planning, here's some great charts and diagrams,” but they don't really look at the culture aspect.

And that's always where I've seen things fall apart. I've been in a room where, you know, our VP was really excited about team topologies and say, “Hey, we're going to adopt it.” And then an engineering manager proceeded to say, “Okay, you're responsible for this team, you're responsible for that team, you're responsible for this team talking to, like, a team of, like, five engineers,” which doesn't really work at all. Or, like, I think the best example is DevOps, you know, where you say, “Ah, we're going to adopt DevOps, we're going to have a DevOps team, or have a DevOps engineer.”

Corey: Step one: we're going to rebadge everyone with existing job titles to have the new fancy job titles that reflect it. It turns out that's not necessarily sufficient in and of itself.

Evelyn: Not really. The Spotify model. People say, like, “Oh, we're going to do the Spotify model. We're going to do skills, tribes, you know, and everything. It's going to be awesome, it's going to be great, you know, and nice, cross-functional.”

The reason I say it bails on us every single time is because somebody wants to be in control of the process, and if the process is meant to encourage collaboration and innovation, that person actually becomes a chokehold for it. And it could be somebody that says, like, “Ah, I need to be involved in every single team, and listen to know what's happening, just so I'm aware of it.” What ends up happening is that everybody defers to them. So, there is no collaboration, there is no innovation.
DevOps, you say, like, “Hey, we're going to have a team to do everything, so your developers don't need to worry about it.” What ends up happening is you're still an ops team, you still have your silos.

And that's always a challenge is you actually have to say, “Okay, what are the cultural values around this process?” You know, what is SRE? What is DevOps, you know? Is it seen as processes, is it a series of principles, platform, maybe, you know? We have to say, like—that's why I say, Platform-as-a-Product because you need to have that product mindset, that culture of product thinking, to really build a platform that works because it's all about the user journey.

It's not about building a common set of tools. It's the user journey of how a person interacts with their code to get it into a production environment. And so, you need to understand how that person sits down at their desk, starts the laptop up, logs in, opens the IDE, what they're actually trying to get done. And once you understand that, then you know your requirements, and you build something to fill those things so that they are happy to use it, as opposed to saying, “This is our platform, and you're going to use it.” And they're probably going to say, “No.” And the next thing you know, they're just doing their own thing on the side.

Corey: Yeah, the rise of Shadow IT has never gone away. It's just, on some level, it's the natural expression, I think it's an immune reaction that companies tend to have when process gets in the way. Great, we have an outcome that we need to drive towards; we don't have a choice. Cloud empowered a lot of that and also has given tools to help rein it in, and as with everything, the arms race continues.

Evelyn: Yeah. And so, what I'm going to continue now, kind of like, toot the platform horn. So, Gregor Hohpe, he's a [solutions architect 00:31:56]—I always f- up his name. I'm so sorry, Gregor.
He has a great book, and even a talk, called The Magic of Platforms, that if somebody is actually curious about understanding of why platforms are nice, they should really watch that talk.

If you see him at re:Invent, or a summit or somewhere giving a talk, go listen to that, and just pick his brain. Because that's—for me, I really kind of strongly agree with his approach because that's really how, like, you know, as he says, like, boost innovation is, you know, where you're actually building a platform that really works.

Corey: Yeah, it's a hard problem, but it's also one of those things where you're trying to focus on—at least ideally—an outcome or a better situation than you currently find yourselves in. It's hard to turn down things that might very well get you there sooner, faster, but it's like trying to effectively cargo-cult the leadership principles from your last employer into your new one. It just doesn't work. I mean, you see more startups from Amazonians who try that, and it just goes horribly because without the cultural understanding and the supporting structures, it doesn't work.

Evelyn: Exactly. So, I've worked with, like, organizations, like, 4000-plus people, I've worked for, like, small startups, consulted, and this is why I say, almost every single transformation, it fails the first time because somebody needs to be in control and track things and basically be really, really certain that people are doing it right. And as soon as it blows up in their face, that's when they realize they should actually take a step back. And so, even for building out a platform, you know, doing Platform-as-a-Product, I always reiterate that you have to really be willing to just invest upfront, and not get very much back. Because you have to figure out the whole user journey, and what you're actually building, before you actually build it.

Corey: I really want to thank you for taking the time to speak with me today.
If people want to learn more, where's the best place for them to find you?

Evelyn: So, I used to be on Twitter, but I've actually got off there after it kind of turned a bit toxic and crazy.

Corey: Feels like that was years ago, but that's beside the point.

Evelyn: Yeah, precisely. So, I would even just say because this feels like a corporate show, but find me on LinkedIn of all places because I will be sharing whatever I find on there, you know? So, just look me up on my name, Evelyn Osman, and give me a follow, and I'll probably be screaming into the cloud like you are.

Corey: And we will, of course, put links to that in the show notes. Thank you so much for taking the time to speak with me. I appreciate it.

Evelyn: Thank you, Corey.

Corey: Evelyn Osman, engineering manager at AutoScout24. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, and I will read it once I finish building an internal platform to normalize all of those platforms together into one.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business, and we get to the point. Visit duckbillgroup.com to get started.
Anna Belak, Director of the Office of Cybersecurity Strategy at Sysdig, joins Corey on Screaming in the Cloud to discuss the newest benchmark for responding to security threats, 5/5/5. Anna describes why it was necessary to set a new benchmark for responding to security threats in a timely manner, and how the Sysdig team did research to determine the best practices for detecting, correlating, and responding to potential attacks. Corey and Anna discuss the importance of focusing on improving your own benchmarks towards a goal, as well as how prevention and threat detection are both essential parts of a solid security program.

About Anna

Anna has nearly ten years of experience researching and advising organizations on cloud adoption with a focus on security best practices. As a Gartner Analyst, Anna spent six years helping more than 500 enterprises with vulnerability management, security monitoring, and DevSecOps initiatives. Anna's research and talks have been used to transform organizations' IT strategies and her research agenda helped to shape markets. Anna is the Director of Thought Leadership at Sysdig, using her deep understanding of the security industry to help IT professionals succeed in their cloud-native journey. Anna holds a PhD in Materials Engineering from the University of Michigan, where she developed computational methods to study solar cells and rechargeable batteries.

Links Referenced:
Sysdig: https://sysdig.com/
Sysdig 5/5/5 Benchmark: https://sysdig.com/555

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn.
I am joined again—for another time this year—on this promoted guest episode brought to us by our friends at Sysdig, returning is Anna Belak, who is their director of the Office of Cybersecurity Strategy at Sysdig. Anna, welcome back. It's been a hot second.

Anna: Thank you, Corey. It's always fun to join you here.

Corey: Last time we were here, we were talking about your report that you folks had come out with, the, “Cybersecurity Threat Landscape for 2022.” And when I saw you were doing another one of these to talk about something, I was briefly terrified. “Oh, wow, please tell me we haven't gone another year and the cybersecurity threat landscape is moving that quickly.” And it sort of is, sort of isn't. You're here today to talk about something different, but it also—to my understanding—distills down to just how quickly that landscape is moving. What have you got for us today?

Anna: Exactly. For those of you who remember that episode, one of the key findings in the Threat Report for 2023 was that the average length of an attack in the cloud is ten minutes. To be clear, that is from when you are found by an adversary to when they have caused damage to your system. And that is really fast. Like, we talked about how that relates to on-prem attacks or other sort of averages from other organizations reporting how long it takes to attack people.

And so, we went from weeks or days to minutes, potentially seconds. And so, what we've done is we looked at all that data, and then we went and talked to our amazing customers and our many friends at analyst firms and so on, to kind of get a sense for if this is real, like, if everyone is seeing this or if we're just seeing this. Because I'm always like, “Oh, God. Like, is this real? Is it just me?”

And as it turns out, everyone's not only—I mean, not necessarily everyone's seeing it, right?
Like, there's not really been proof until this year, I would say because there's a few reports that came out this year, but lots of people sort of anticipated this. And so, when we went to our customers, and we asked for their SLAs, for example, they were like, “Oh, yeah, my SLA for a [PCRE 00:02:27] cloud is like 10, 15 minutes.” And I was like, “Oh, okay.” So, what we set out to do is actually set a benchmark, essentially, to see how well are you doing. Like, are you equipped with your cloud security program to respond to the kind of attack that a cloud security attacker is going to—sorry, an anti-cloud security—I guess—attacker is going to perpetrate against you.

And so, the benchmark is—drumroll—5/5/5. You have five seconds to detect a signal that is relevant to potentially some attack in the cloud—hopefully, more than one such signal—you have five minutes to correlate all such relevant signals to each other so that you have a high fidelity detection of this activity, and then you have five more minutes to initiate an incident response process to hopefully shut this down, or at least interrupt the kill chain before your environments experience any substantial damage.

Corey: To be clear, that is from a T0, a starting point, the stopwatch begins, the clock starts when the event happens, not when an event shows up in your logs, not once someone declares an incident. From J. Random Hackerman, effectively, we're pressing the button and getting the response from your API.

Anna: That's right because the attackers don't really care how long it takes you to ship logs to wherever you're mailing them to. And that's why it is such a short timeframe because we're talking about, they got in, you saw something hopefully—and it may take time, right?
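As a rough illustration of the benchmark as stated above, the three budgets can be checked against the timestamps of a single incident. This is a minimal sketch, not anything Sysdig publishes: the `IncidentTimeline` class and `score_555` function are hypothetical names, and measuring each phase from the end of the previous one (with detection measured from T0, the moment the event happens) is an assumption about how the three fives chain together.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Budgets as described in the conversation: 5 seconds, 5 minutes, 5 minutes.
DETECT_BUDGET = timedelta(seconds=5)
CORRELATE_BUDGET = timedelta(minutes=5)
RESPOND_BUDGET = timedelta(minutes=5)


@dataclass
class IncidentTimeline:
    """Hypothetical record of when each phase completed for one incident."""
    event_at: datetime       # T0: when the activity actually happened
    detected_at: datetime    # first relevant signal observed
    correlated_at: datetime  # signals combined into a high-fidelity detection
    responded_at: datetime   # incident response initiated


def score_555(t: IncidentTimeline) -> dict:
    """Report, per phase, whether the phase finished inside its budget.

    Assumption: each phase is timed from the end of the previous phase.
    """
    return {
        "detect": t.detected_at - t.event_at <= DETECT_BUDGET,
        "correlate": t.correlated_at - t.detected_at <= CORRELATE_BUDGET,
        "respond": t.responded_at - t.correlated_at <= RESPOND_BUDGET,
    }
```

Scoring each phase separately, rather than returning one pass or fail, makes it easier to see which of the three numbers to work on first.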
Like, some of the—which we'll describe a little later, some of the activities that they perform in the early stages of the attack are not necessarily detectable as malicious right away, which is why your correlation has to occur, kind of, in real time. Like, things happen, and you're immediately adding them, sort of like, to increase the risk of this detection, right, to say, “Hey, this is actually something,” as opposed to, you know, three weeks later, I'm parsing some logs and being like, “Oh, wow. Well, that's not good.” [laugh].

Corey: The number five seemed familiar to me in this context, so I did a quick check, and sure enough, allow me to quote from chapter and verse from the CloudTrail documentation over in AWS-land. “CloudTrail typically delivers logs within an average of about five minutes of an API call. This time is not guaranteed.” So effectively, if you're waiting for anything that's CloudTrail-driven to tell you that you have a problem, it is almost certainly too late by the time that pops up, no matter what that notification vector is.

Anna: That is, unfortunately or fortunately, true. I mean, it's kind of a fact of life. I guess there is a little bit of a veiled [unintelligible 00:04:43] at our cloud provider friends because, really, they have to do better ultimately. But the flip side to that argument is CloudTrail—or your cloud log source of choice—cannot be your only source of data for detecting security events, right? So, if you are operating purely on the basis of, “Hey, I have information in CloudTrail; that is my security information,” you are going to have a bad time, not just because it's not fast enough, but also because there's not enough data in there, right?
Which is why part of the first, kind of, benchmark component is that you must have multiple data sources for the signals, and they—ideally—all will be delivered to you within five seconds of an event occurring or a signal being generated.

Corey: And give me some more information on that because I have my own alerter, specifically, it's a ClickOps detector. Whenever someone in one of my accounts does something in the console, that has a write aspect to it rather than just a read component—which again, look at what you want in the console, that's fine—if you're changing things that is not being managed by code, I want to know that it's happening. It's not necessarily bad, but I want to at least have visibility into it. And that spits out the principal, the IP address it emits from, and the rest. I haven't had a whole lot where I need to correlate those between different areas. Talk to me more about the triage step.

Anna: Yeah, so I believe that the correlation step is the hardest, actually.

Corey: Correlation step. My apologies.

Anna: Triage is fine. It's [crosstalk 00:06:06]—

Corey: Triage, correlations, the words we use matter on these things.

Anna: Dude, we argued about the words on this for so long, you couldn't even imagine. Yeah, triage, correlation, detection, you name it, we are looking at multiple pieces of data, we're going to connect them to each other meaningfully, and that is going to provide us with some insight about the fact that a bad thing is happening, and we should respond to it. Perhaps automatically respond to it, but we'll get to that. So, a correlation, okay.
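The kind of ClickOps alert Corey describes could be approximated with a small filter over CloudTrail records. This is only a sketch of the idea, not his actual tool: the function name `summarize_console_write` is hypothetical, and using the `sessionCredentialFromConsole` and `readOnly` record fields to decide "console" and "write" is an assumption about how such a detector might work.

```python
from typing import Optional


def summarize_console_write(record: dict) -> Optional[dict]:
    """Return a short summary if a CloudTrail record looks like a console write.

    Assumptions: console origin is inferred from sessionCredentialFromConsole,
    and a write is any record whose readOnly field is false.
    """
    # Accept both JSON booleans and their string forms.
    from_console = str(record.get("sessionCredentialFromConsole", "")).lower() == "true"
    is_write = str(record.get("readOnly", "")).lower() == "false"
    if not (from_console and is_write):
        return None
    return {
        "principal": record.get("userIdentity", {}).get("arn", "unknown"),
        "source_ip": record.get("sourceIPAddress", "unknown"),
        "action": f'{record.get("eventSource", "?")}:{record.get("eventName", "?")}',
    }
```

Because it reads CloudTrail, a filter like this inherits the delivery delay discussed above, so it suits visibility into manual changes more than fast threat detection.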
The first thing is, like I said, you must have more than one data source because otherwise, I mean, you could correlate information from one data source; you actually should do that, but you are going to get richer information if you can correlate multiple data sources, and if you can access, for example, like through an API, some sort of enrichment for that information.

Like, I'll give you an example. For SCARLETEEL, which is an attack we describe in the threat report, and we actually described before, this is—we're, like—on SCARLETEEL, I think, version three now because there's so much—this particular threat actor is very active [laugh].

Corey: And they have a better versioning scheme than most companies I've spoken to, but that's neither here nor there.

Anna: [laugh]. Right? So, one of the interesting things about SCARLETEEL is you could eventually detect that it had happened if you only had access to CloudTrail, but you wouldn't have the full picture ever. In our case, because we are a company that relies heavily on system calls and machine learning detections, we [are able to 00:07:19] connect the system call events to the CloudTrail events, and between those two data sources, we're able to figure out that there's something more profound going on than just what you see in the logs. And I'll actually tell you which things, for example, are being detected.

So, in SCARLETEEL, one thing that happens is there's a crypto miner. And a crypto miner is one of these events where you're, like, “Oh, this is obviously malicious,” because as we wrote, I think, two years ago, it costs $53 to mine $1 of Bitcoin in AWS, so it is very stupid for you to be mining Bitcoin in AWS, unless somebody else is—

Corey: In your own accounts.

Anna: —paying the cloud bill. Yeah, yeah [laugh] in someone else's account, absolutely. Yeah. So, if you are a sysadmin or a security engineer, and you find a crypto miner, you're like, “Obviously, just shut that down.” Great.
What often happens is people see them, and they think, “Oh, this is a commodity attack,” like, people are just throwing crypto miners, whatever, I shut it down, and I'm done.

But in the case of this attack, it was actually a red herring. So, they deployed the miner to see if they could. They could, then they determined—presumably; this is me speculating—that, oh, these people don't have very good security because they let random idiots run crypto miners in their account in AWS, so they probed further. And when they probed further, what they did was some reconnaissance. So, they type in commands, listing, you know, like, list accounts or whatever. They try to list all the things they can list that are available in this account, and then they reach out to an EC2 metadata service to kind of like, see what they can do, right?

And so, each of these events, like, each of the things that they do, like, reaching out to an EC2 metadata service, assuming a role, doing a recon, even lateral movement is, like, by itself, not necessarily a scary, big red flag malicious thing because there are lots of, sort of, legitimate reasons for someone to perform those actions, right? Like, reconnaissance, for one example, is you're, like, looking around the environment to see what's up, right? So, you're doing things, like, listing things, [unintelligible 00:09:03] things, whatever. But a lot of the graphical interfaces of security tools also perform those actions to show you what's, you know, there, so it looks like reconnaissance when your tool is just, like, listing all the stuff that's available to you to show it to you in the interface, right? So anyway, the point is, when you see them independently, these events are not scary. They're like, “Oh, this is useful information.”

When you see them in rapid succession, right, or when you see them alongside a crypto miner, then your tooling and/or your process and/or your human being who's looking at this should be like, “Oh, wait a minute.
Like, just the enumeration of things is not a big deal. The enumeration of things after I saw a miner, and you try and talk to the metadata service, suddenly I'm concerned.” And so, the point is, how can you connect those dots as quickly as possible and as automatically as possible, so a human being doesn't have to look at, like, every single event because there's an infinite number of them.

Corey: I guess the challenge I've got is that in some cases, you're never going to be able to catch up with this. Because if it's an AWS call to one of the APIs that they manage for you, they explicitly state there's no guarantee of getting information on this until the show's all over, more or less. So, how is there… like, how is there hope?

Anna: [laugh]. I mean, there's always a forensic analysis, I guess [laugh] for all the things that you've failed to respond to.

Corey: Basically we're doing an after-action thing because humans aren't going to react that fast. We're just assuming it happened; we should know about it as soon as possible. On some level, just because something is too late doesn't necessarily mean there's not value added to it. But just trying to turn this into something other than a, “Yeah, they can move faster than you, and you will always lose. The end. Have a nice night.” Like, that tends not to be the best narrative vehicle for these things. You know, if you're trying to inspire people to change.

Anna: Yeah, yeah, yeah, I mean, I think one clear point of hope here is that sometimes you can be fast enough, right? And a lot of this—I mean, first of all, you're probably not going to—sorry, cloud providers—you don't go into just the cloud provider defaults for that level of performance, you are going with some sort of third-party tool. On the, I guess, bright side, that tool can be open-source, like, there's a lot of open-source tooling available now that is fast and free.
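The dot-connecting Anna describes a few turns earlier, where enumeration, a metadata service call, and a crypto miner are each unremarkable alone but alarming together, can be sketched as a sliding-window risk score per account or workload. Everything here is illustrative: the signal names, weights, window, and threshold are made-up assumptions, and the `Correlator` class is hypothetical rather than how Sysdig or any other product does it.

```python
from collections import defaultdict, deque

# Hypothetical weights: no single signal reaches the threshold on its own.
WEIGHTS = {
    "enumeration": 2,
    "assume_role": 2,
    "metadata_service_access": 3,
    "crypto_miner": 5,
}
WINDOW_SECONDS = 300  # only signals within the last five minutes count
THRESHOLD = 8         # combined score at which an alert is raised


class Correlator:
    """Tracks recent signals per entity and flags suspicious combinations."""

    def __init__(self) -> None:
        self._recent = defaultdict(deque)  # entity -> deque of (timestamp, kind)

    def observe(self, entity: str, kind: str, ts: float) -> bool:
        """Record one signal and return True if the entity should be alerted on.

        Assumes timestamps (in seconds) arrive in non-decreasing order per entity.
        """
        window = self._recent[entity]
        window.append((ts, kind))
        # Drop signals that have aged out of the window.
        while window and ts - window[0][0] > WINDOW_SECONDS:
            window.popleft()
        # Score distinct signal kinds so repeated listing calls do not inflate risk.
        distinct_kinds = {k for _, k in window}
        return sum(WEIGHTS.get(k, 0) for k in distinct_kinds) >= THRESHOLD
```

Counting each kind once per window is a deliberate choice here: a console or security tool that lists resources repeatedly stays below the threshold, while the miner plus reconnaissance plus metadata access sequence crosses it.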
For example, our favorite, of course, is Falco, which is looking at system calls on endpoints, and containers, and can detect things within seconds of them occurring and let you know immediately. There is other eBPF-based instrumentation that you can use out there from various vendors and/or open-source providers, and there's of course, network telemetry.

So, if you're into the world of service mesh, there is data you can get off the network, also very fast. So, the bad news or the flip side to that is you have to be able to manage all that information, right? So, that means—again, like I said, you're not expecting a SOC analyst to look at thousands of system calls and thousands of, you know, network packets or flow logs or whatever you're looking at, and just magically know that these things go together. You are expecting to build, or have built for you by a vendor or the open-source community, some sort of detection content that is taking this into account and then is able to deliver that alert at the speed of 5/5/5.

Corey: When you see the larger picture stories playing out, as far as what customers are seeing, what the actual impact is, what gave rise to the five-minute number around this? Just because that tends to feel like it's a… it is both too long and also too short on some level. I'm just wondering how you wound up at—what is this based on?

Anna: Man, we went through so many numbers. So, we [laugh] started with larger numbers, and then we went to smaller numbers, then we went back to medium numbers. We align ourselves with the timeframes we're seeing for people.
Like I said, a lot of folks have an SLA of responding to a P0 within 10 or 15 minutes because their point basically—and there's a little bit of bias here into our customer base because our customer base is, A, fairly advanced in terms of cloud adoption and in terms of security maturity, and also, they're heavily in let's say, financial industries and other industries that tend to be early adopters of new technology. So, if you are kind of a laggard, like, you probably aren't as close to meeting this benchmark as you are if you're, say, in financial, right? So, we asked them how they operate, and they basically pointed out to us that, like, knowing 15 minutes later is too late because I've already lost, like, some number of millions of dollars if my environment is compromised for 15 minutes, right? So, that's kind of where the ten minutes comes from. Like, we took our real threat research data, and then we went around and talked to folks to see kind of what they're experiencing and what their own expectations are for their incident response and SOC teams, and ten minutes is sort of where we landed.

Corey: Got it. When you see this happening, I guess, in various customer environments, assuming someone has missed that five-minute window, is it game over, effectively? How should people be thinking about this?

Anna: No. So, I mean, it's never really game over, right? Like until your company is ransomed to bits, and you have to close your business, you still have many things that you can do, hopefully, to save yourself. And also, I want to be very clear that 5/5/5 as a benchmark is meant to be something aspirational, right? So, you should be able to meet this benchmark for, let's say, your top use cases if you are a fairly high maturity organization, in threat detection specifically, right?

So, if you're just beginning your threat detection journey, like, tomorrow, you're not going to be close. Like, you're going to be not at all close.
The point here, though, is that you should aspire to this level of greatness, and you're going to have to create new processes and adopt new tools to get there. Now, before you get there, I would argue that if you can do, like, 10-10-10 or, like, whatever number you start with, you're on a mission to make that number smaller, right? So, if today, you can detect a crypto miner in 30 minutes, that's not great because crypto miners are pretty detectable these days, but give yourself a goal of, like, getting that 30 minutes down to 20, or getting that 30 minutes down to 10, right?

Because we are so obsessed with, like, measuring ourselves against our peers and all this other stuff that we sometimes lose track of what actually is improving our security program. So yes, compare it to yourself first. But ultimately, if you can meet the 5/5/5 benchmark, then you are doing great. Like, you are faster than the attackers in theory, so that's the dream.

Corey: So, I have to ask, and I suspect I might know the answer to this, but given that it seems very hard to move this quickly, especially at scale, is there an argument to be made that effectively prevention obviates the need for any of this, where if you don't misconfigure things in ways that should be obvious, if you practice defense-in-depth to a point where you can effectively catch things that the first layer misses with successive layers, as opposed to, “Well, we have a firewall. Once we're inside of there, well [laugh], it's game over for us.” Is prevention sufficient in some ways to obviate this?

Anna: I think there are a lot of people that would love to believe that that's true.

Corey: Oh, I sure would.
It's such a comforting story.

Anna: And we've done, like, I think one of my opening sentences in the benchmark, kind of, description, actually, is that we've done a pretty good job of advertising prevention in Cloud as an important thing and getting people to actually, like, start configuring things more carefully, or like, checking how those things have been configured, and then changing that configuration should they discover that it is not compliant with some mundane standard that everyone should know, right? So, we've made great progress, I think, in cloud prevention, but as usual, like, prevention fails, right? Like I still have smoke detectors in my house, even though I have done everything possible to prevent it from catching fire and I don't plan to set it on fire, right? But like, threat detection is one of these things that you're always going to need because no matter what you do, A, you will make a mistake because you're a human being, and there are too many things, and you'll make a mistake, and B, the bad guys are literally in the business of figuring ways around your prevention and your protective systems.

So, I am full on on defense-in-depth. I think it's a beautiful thing. We should only obviously do that. And I do think that prevention is your first step to a holistic security program—otherwise, what even is the point—but threat detection is always going to be necessary. And like I said, even if you can't go 5/5/5, you don't have threat detection at that speed, you need to at least be able to know what happened later so you can update your prevention system.

Corey: This might be a dangerous question to get into, but why not, that's what I do here. This [could 00:17:27] potentially be an argument against Cloud, by which I mean that if I compromise someone's Cloud account on any of the major cloud providers, once I have access of some level, I know where everything else in the environment is as a general rule.
I know that you're using S3 or its equivalent, and what those APIs look like and the rest, whereas as an attacker, if I am breaking into someone's crappy data center-hosted environment, everything is going to be different. Maybe they don't have a SAN at all, for example. Maybe they have one that hasn't been patched in five years. Maybe they're just doing local disk for some reason.

There's a lot of discovery that has to happen that is almost always removed from Cloud. I mean, take the open S3 bucket problem that we've seen as a scourge for 5, 6, 7 years now, where it's not that S3 itself is insecure, but once you make a configuration mistake, you are now in line with a whole bunch of other folks who may have much more valuable data living in that environment. Where do you land on that one?

Anna: This is the ‘leave cloud to rely on security through obscurity' argument?

Corey: Exactly. Which I'm not a fan of, but it's also hard to argue against from time-to-time.

Anna: My other way of phrasing it is ‘the attackers are moving up the stack' argument. Yeah, so—and there is some sort of truth in that, right? Part of the reason that attackers can move that fast—and I think we say this a lot when we talk about the threat report data, too, because we literally see them execute this behavior, right—is they know what the cloud looks like, right? They have access to all the API documentation, they kind of know what all the constructs are that you're all using, and so they literally can practice their attack and create all these scripts ahead of time to perform their reconnaissance because they know exactly what they're looking at, right?
On-premise, you're right, like, they're going to get into—even to get through my firewall, whatever, they're getting into my data center, they do not know what disaster I have configured, what kinds of servers I have where, and, like, what the network looks like, they have no idea, right?
In Cloud, this is kind of all gifted to them because it's so standard, which is a blessing and a curse. It's a blessing because—well for them, I mean, because they can just programmatically go through this stuff, right? It's a curse for them because it's a blessing for us in the same way, right? Like, the defenders… A, have a much easier time knowing what they even have available to them, right? Like, the days of there's a server in a closet I've never heard of are kind of gone, right? Like, you know what's in your Cloud account because, frankly, AWS tells you. So, I think there is a trade-off there.
The other thing is—about the moving up the stack thing, right—like no matter what you do, they will come after you if you have something worth exploiting you for, right? So, by moving up the stack, I mean, listen, we have abstracted all the physical servers, all of the, like, stuff we used to have to manage the security of because the cloud just does that for us, right? Now, we can argue about whether or not they do a good job, but I'm going to be generous to them and say they do a better job than most companies [laugh] did before. So, in that regard, like, we say, thank you, and we move on to, like, fighting this battle at a higher level in the stack, which is now the workloads and the cloud control plane, and the you name it, whatever is going on after that. So, I don't actually think you can sort of trade apples for oranges here. It's just… bad in a different way.
Corey: Do you think that this benchmark is going to be used by various companies who will learn about it? And if so, how do you see that playing out?
Anna: I hope so. 
My hope when we created it was that it would sort of serve as a goalpost or a way to measure—
Corey: Yeah, it would just be marketing words on a page and never mentioned anywhere, that's our dream here.
Anna: Yeah, right. Yeah, I was bored. So, I wrote some—[laugh].
Corey: I had a word minimum to get out the door, so there we are. It's how we work.
Anna: Right. As you know, I used to be a Gartner analyst, and my desire is always to, like, create things that are useful for people to figure out how to do better in security. And my, kind of, tenure at the vendor is just a way to fund that [laugh] more effectively [unintelligible 00:21:08].
Corey: Yeah, I keep forgetting you're ex-Gartner. Yeah, it's one of those fun areas of, “Oh, yeah, we just want to basically talk about all kinds of things because there's a—we have a chart to fill out here. Let's get after it.”
Anna: I did not invent an acronym, at least. Yeah, so my goal was the following. People are always looking for a benchmark or a goal or standard to be like, “Hey, am I doing a good job?” Whether I'm, like, a SOC analyst or director, and I'm just looking at my little SOC empire, or I'm a full-on CSO, and I'm looking at my entire security program to kind of figure out risk, I need some way to know whether what is happening in my organization is, like, sufficient, or on par, or anything. Is it good or is it bad? Happy face? Sad face? Like, I need some benchmark, right?
So normally, the Gartner answer to this, typically, is like, “You can only come up with benchmarks that are—” they're, like, “Only you know what is right for your company,” right? It's like, you know, the standard, ‘it depends' answer. Which is true, right, because I can't say that, like, oh, a huge multinational bank should follow the same benchmark as, like, a donut shop, right? Like, that's unreasonable. 
So, this is also why I say that our benchmark is probably more tailored to the more advanced organizations that are dealing with kind of high maturity phenomena and are more cloud-native, but the donut shops should kind of strive in this direction, right?
So, I hope that people will think of it this way: that they will, kind of, look at their process and say, “Hey, like, what are the things that would be really bad if they happened to me, in terms of, sort of, detection?” Like, “What are the threats I'm afraid of where if I saw this in my cloud environment, I would have a really bad day?” And, “Can I detect those threats in 5/5/5?” Because if I can, then I'm actually doing quite well. And if I can't, then I need to set, like, some sort of roadmap for myself on how I get from where I am now to 5/5/5 because that implies you would be doing a good job.
So, that's sort of my hope for the benchmark is that people think of it as something to aspire to, and if they're already able to meet it, then that they'll tell us how exactly they're achieving it because I really want to be friends with them.
Corey: Yeah, there's a definite lack of reasonable ways to think about these things, at least in ways that can be communicated to folks outside of the bounds of the security team. I think that's one of the big challenges currently facing the security industry is that it is easy to get so locked into the domain-specific acronyms, philosophies, approaches, and the rest, that even coming from, “Well, I'm a cloud engineer who ostensibly needs to know about these things.” Yeah, wander around the RSA floor with that as your background, and you get lost very quickly.
Anna: Yeah, I think that's fair. I mean, it is a very, let's say, dynamic and rapidly evolving space. And by the way, like, it was really hard for me to pick these numbers, right, because I… very much am on that whole, ‘it depends' bandwagon of I don't know what the right answer is. Who knows what the right answer is [laugh]? 
So, I say 5/5/5 today. Like, tomorrow, the attack takes five minutes, and now it's two-and-a-half/two-and-a-half, right? Like it's whatever.
You have to pick a number and go for it. So, I think, to some extent, we have to try to, like, make sense of the insanity and choose some best practices to anchor ourselves in or some, kind of like, sound logic to start with, and then go from there. So, that's sort of what I go for.
Corey: So, as I think about the actual reaction times needed for 5/5/5 to actually be realistic, people can't reliably get a hold of me on the phone within five minutes, so it seems like this is not something you're going to have humans in the loop for. How does that interface with the idea of automating things versus giving automated systems too much power to take your site down as a potential failure mode?
Anna: Yeah. I don't even answer the phone anymore, so that wouldn't work at all. That's a really, really good question, and probably the question that gives me the most… I don't know, I don't want to say lost sleep at night because it's actually, it's very interesting to think about, right? I don't think you can remove humans from the loop in the SOC. Like, certainly there will be things you can auto-respond to some extent, but there'd better be a human being in there because there are too many things at stake, right?
Some of these actions could take your entire business down for far more hours or days than whatever the attacker was doing before. And that trade-off of, like, is my response to this attack actually hurting the business more than the attack itself is a question that's really hard to answer, especially for most of us technical folks who, like, don't necessarily know the business impact of any given thing. So, first of all, I think we have to embrace other response actions. Back to our favorite crypto miners, right? Like there is no reason to not automatically shut them down. There is no reason, right? 
Just build in a detection and an auto-response: every time you see a crypto miner, kill that process, kill that container, kill that node. I don't care. Kill it. Like, why is it running? This is crazy, right?
I do think it gets nuanced very quickly, right? So again, in SCARLETEEL, there are essentially, like, five or six detections that occur, right? And each of them theoretically has a potential auto-response that you could have executed depending on your, sort of, appetite for that level of intervention, right? Like, when you see somebody assuming a role, that's perfectly normal activity most of the time. In this case, I believe they actually assumed a machine role, which is less normal. Like, that's kind of weird.
And then what do you do? Well, you can just, like, remove the role. You can remove that person's ability to do anything, or remove that role's ability to do anything. But that could be very dangerous because we don't necessarily know what the full scope of that role is as this is happening, right? So, you could take, like, a more mitigated auto-response action and add a restrictive policy to that role, for example, to just prevent activity from that IP address that you just saw, right, because we're not sure about this IP address, but we're sure about this role, right?
So, you have to get into these, sort of, risk-tiered response actions where you say, “Okay, this is always okay to do automatically. And this is, like, sometimes, okay, and this is never okay.” And as you develop that muscle, it becomes much easier to do something rather than doing nothing and just, kind of like, analyzing it in forensics and being, like, “Oh, what an interesting attack story,” right? So, that's step one, is just start taking these different response actions.
And then step two is more long-term, and it's that you have to embrace the cloud-native way of life, right? 
Like this immutable, ephemeral, distributed religion that we've been selling, it actually works really well if you, like, go all-in on the religion. I sound like a real cult leader [laugh]. Like, “If you just go all in, it's going to be great.” But it's true, right?
So, if your workloads are immutable—that means they cannot change as they're running—then when you see them drifting from their original configuration, like, you know, that is bad. So, you can immediately know that it's safe to take an auto-respon—well, it's safe, relatively safe, to take an auto-response action to kill that workload because you are, like, a hundred percent certain it is not doing the right things, right? And then furthermore, if all of your deployments are defined as code, which they should be, then it is approximately—[though not entirely 00:27:31]—trivial to get that workload back, right? Because you just push a button, and it just generates that same Kubernetes cluster with those same nodes doing all those same things, right? So, in the on-premise world where shooting a server was potentially the, you know, fireable offense because if that server was running something critical, and you couldn't get it back, you were done.
In the cloud, this is much less dangerous because there's, like, an infinite quantity of servers that you could bring back and hopefully Infrastructure-as-Code and, kind of, Configuration-as-Code in some wonderful registry, version-controlled for you to rely on to rehydrate all that stuff, right? So again, to sort of TL;DR, get used to doing auto-response actions, but do this carefully. Like, define a scope for those actions that make sense and not just, like, “Something bad happened; burn it all down,” obviously. And then as you become more cloud-native—which sometimes requires refactoring of entire applications—by the way, this could take years—just embrace the joy of Everything-as-Code.
Corey: That's a good way of thinking about it. 
I just, I wish there were an easier path to get there, for an awful lot of folks who otherwise don't find a clear way to unlock that.
Anna: There is not, unfortunately [laugh]. I mean, again, the upside on that is, like, there are a lot of people that have done it successfully, I have to say. I couldn't have said that to you, like, six, seven years ago when we were just getting started on this journey, but especially for those of you who were just at KubeCon—however long ago… before this airs—you see a pretty robust ecosystem around Kubernetes, around containers, around cloud in general, and so even if you feel like your organization's behind, there are a lot of folks you can reach out to to learn from, to get some help, to just sort of start joining the masses of cloud-native types. So, it's not nearly as hopeless as before. And also, one thing I like to say always is, almost every organization is going to have some technical debt and some legacy workload that they can't convert to the religion of cloud.
And so, you're not going to have a 5/5/5 threat detection SLA on those workloads. Probably. I mean, maybe you can, but probably you're not, and you may not be able to take auto-response actions, and you may not have all the same benefits available to you, but like, that's okay. That's okay. Hopefully, whatever that thing is running is, you know, worth keeping alive, but set this new standard for your new workloads. So, when your team is building a new application, or if they're refactoring an application for the new world, set the standard on them and don't, kind of like, torment the legacy folks because it doesn't necessarily make sense. Like, they're going to have different SLAs for different workloads.
Corey: I really want to thank you for taking the time to speak with me yet again about the stuff you folks are coming out with. If people want to learn more, where's the best place for them to go?
Anna: Thanks, Corey. 
It's always a pleasure to be on your show. If you want to learn more about the 5/5/5 benchmark, you should go to sysdig.com/555.
Corey: And we will, of course, put links to that in the show notes. Thank you so much for taking the time to speak with me today. As always, it's appreciated. Anna Belak, Director at the Office of Cybersecurity Strategy at Sysdig. I'm Cloud Economist Corey Quinn, and this has been a promoted guest episode brought to us by our friends at Sysdig. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that I will read nowhere even approaching within five minutes.
Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business, and we get to the point. Visit duckbillgroup.com to get started.
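Editor's sketch: the risk-tiered response actions Anna describes in this episode ("always okay to do automatically", "sometimes okay", "never okay") can be pictured as a small lookup from detection type to the strongest action allowed without a human. This is only an illustration under stated assumptions: the detection names, the `POLICY` table, and the action strings are hypothetical and do not represent Sysdig's product behavior or any real cloud API.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    ALWAYS_AUTO = "always"        # safe to automate every time
    SOMETIMES_AUTO = "sometimes"  # automate only a narrowly scoped action
    NEVER_AUTO = "never"          # always hand off to a human


@dataclass(frozen=True)
class Detection:
    kind: str
    source_ip: str = ""


# Hypothetical policy table; a real one would be agreed with the business.
POLICY = {
    "crypto_miner": (Tier.ALWAYS_AUTO, "kill_workload"),
    "workload_drift": (Tier.ALWAYS_AUTO, "kill_workload"),
    "machine_role_assumed": (Tier.SOMETIMES_AUTO, "deny_source_ip_on_role"),
}


def plan_response(detection: Detection, allow_scoped_actions: bool = True) -> str:
    """Return the action to take; anything unknown goes to a human."""
    tier, action = POLICY.get(detection.kind, (Tier.NEVER_AUTO, "page_human"))
    if tier is Tier.ALWAYS_AUTO:
        return action
    if tier is Tier.SOMETIMES_AUTO and allow_scoped_actions:
        # Scoped mitigation: restrict the suspicious IP, leave the role intact.
        return f"{action}:{detection.source_ip}"
    return "page_human"
```

The design choice mirrors the conversation: unknown or ambiguous detections default to a human, and the "sometimes" tier applies the narrower mitigation (blocking one IP on a role) instead of removing the role outright.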
Sysdig's Alessandro Brucato and Michael Clark join Dave to discuss their work on "AWS's Hidden Threat: AMBERSQUID Cloud-Native Cryptojacking Operation." Attackers are targeting what are typically considered secure AWS services, like AWS Fargate and Amazon SageMaker. This means that defenders generally aren't as concerned with their security from end-to-end. The research states "The AMBERSQUID operation was able to exploit cloud services without triggering the AWS requirement for approval of more resources, as would be the case if they only spammed EC2 instances." This poses additional challenges targeting multiple services since it requires finding and killing all miners in each exploited service. The research can be found here: AWS's Hidden Threat: AMBERSQUID Cloud-Native Cryptojacking Operation
