Podcasts about cockroachdb

  • 80 podcasts
  • 113 episodes
  • 50m average duration
  • 1 new episode per month
  • Latest episode: Dec 10, 2024

POPULARITY (chart by year, 2017 to 2024)


Latest podcast episodes about cockroachdb

Screaming in the Cloud
Looking at the Current State of Resilience with Spencer Kimball

Dec 10, 2024 · 38:35


Spencer Kimball, CEO of Cockroach Labs, joins Corey Quinn to discuss the evolving challenges of database resilience in 2025. They discuss the State of Resilience 2025 report, revealing widespread operational concerns, costly outages, and gaps in failover preparedness. Modern resilience strategies, like active-active configurations and consensus replication, reduce risks but require expertise and investment. Spencer highlights growing regulatory pressures, such as the EU's Digital Operational Resilience Act, and the rising complexity of distributed systems. Despite challenges, Cockroach Labs aims to simplify resilience, enabling organizations to modernize while balancing risk, cost, and customer trust.

Show Highlights
  • (0:00) Intro
  • (0:36) Cockroach Labs sponsor read
  • (3:14) The foundational nature of databases
  • (3:55) Cockroach Labs' State of Resilience 2025 report
  • (8:55) CrowdStrike as an example of why database resilience is so important
  • (11:04) What Spencer found most surprising in the report's results
  • (15:13) Understanding the multi-cloud strategy as safety in numbers
  • (18:29) Cockroach Labs sponsor read
  • (19:23) Why cost isn't the Achilles' heel of the multi-cloud strategy that some people think
  • (23:52) Executives are blaming IT people for outages as much
  • (28:21) The importance of active-active configurations
  • (32:01) Why anxiety about operational resiliency will never fully go away
  • (37:52) How to access the State of Resilience 2025 report

About Spencer Kimball
Spencer Kimball is the CEO and co-founder of Cockroach Labs, a company dedicated to building resilient, cloud-native databases. Before founding Cockroach Labs, Spencer had a distinguished career in technology, including contributions to Google's Colossus file system. Alongside co-founders Peter Mattis and Ben Darnell, he launched CockroachDB, a globally distributed SQL database designed to handle modern data challenges like resilience, multi-cloud deployment, and compliance with evolving data sovereignty laws. CockroachDB is renowned for its innovative architecture, enabling consistent and scalable database performance across regions and clouds. Under Spencer's leadership, the company continues to redefine operational resilience for enterprises worldwide.

Links
  • Cockroach Labs: https://www.cockroachlabs.com/
  • The State of Resilience 2025 report: https://www.cockroachlabs.com/guides/the-state-of-resilience-2025/

Sponsor
  • Cockroach Labs: cockroachlabs.com/lastweek

Spring Office Hours
S3E41 - Spring's Hidden Powers with Greg Turnquist

Dec 3, 2024 · 63:22


Join Dan Vega and DaShaun Carter as they welcome Greg Turnquist, Senior Staff Technical Content Engineer at CockroachDB and former Spring Data JPA lead. In this episode, dive deep into Spring's powerful yet often overlooked features for building high-performance, highly available systems. Greg shares expert insights on combining JdbcTemplate, Spring Retry, and transaction management, demonstrating why Spring Boot's balance of convenience and power makes it a lasting force in enterprise development. You can participate in our live stream to ask questions or catch the replay on your preferred podcast platform.

Show Notes
  • Cockroach DB
  • Procoder

Software Defined Talk
Episode 481: There Never Was a Rug

Aug 23, 2024 · 69:25


This week, we discuss CockroachDB's relicensing, the ongoing debate about remote work, and platform engineering. Plus, some thoughts on the use of speakerphones in public.

Watch the YouTube Live Recording of Episode 481 (https://www.youtube.com/watch?v=m1iHd2XPB48)

Runner-up Titles
  • Put in your AirPods
  • People's lives are boring
  • More than zero
  • Know your risks
  • Violating the social contract
  • Keeping my “Rug Pull” card
  • Love of the code
  • I call it a waiting room

Rundown
  • Cockroach Labs shakes up its licensing to force bigger companies to pay (https://techcrunch.com/2024/08/15/cockroach-labs-shakes-up-its-licensing-to-force-bigger-companies-to-pay/)
  • Justin Warren's newsletter this week (https://pivotnine.com/the-crux/).
  • Oxide: Whither CockroachDB? (RFD) (https://rfd.shared.oxide.computer/rfd/0508) and Whither CockroachDB? (Podcast) (https://oxide.computer/podcasts/oxide-and-friends/2052742)
  • Remote Work
      • Eric Schmidt Walks Back Claim Google Is Behind on AI Because of Remote Work (https://www.wsj.com/tech/ai/google-eric-schmidt-ai-remote-work-stanford-f92f4ca5?st=wq34bupg4eqific&reflink=article_copyURL_share)
      • New Starbuck's CEO gets remote office (https://www.threads.net/@matthewsamuelphillips/post/C-scpeqRuVb/?xmt=AQGzZAiR7spKgmL8b-wyyZDacAbjVSQJcg4-qsOzEivroA)
  • Platform engineering problems: can ops actually do product management? (https://newsletter.cote.io/p/platform-engineering-problems-can?r=2d4o&utm_campaign=post&utm_medium=web)

Relevant to your Interests
  • Exclusive: Sonos considers relaunching its old app (https://www.theverge.com/2024/8/14/24220421/sonos-s2-app-relaunch)
  • Sonos lays off 100 employees as its app crisis continues (https://www.theverge.com/2024/8/14/24220357/sonos-layoffs-august-2024-app)
  • Palo Alto Networks apologizes as sexist marketing misfires (https://www.theregister.com/2024/08/14/palo_alto_networks_execs_apologize/)
  • NIST Releases First 3 Finalized Post-Quantum Encryption Standards (https://www.nist.gov/news-events/news/2024/08/nist-releases-first-3-finalized-post-quantum-encryption-standards?_bhlid=1ff5eef8914205413c93c758a30c7afce5305655)
  • Threads is testing a slew of new features like scheduling and analytics (https://www.theverge.com/2024/8/15/24220224/meta-threads-features-scheduling-insights-drafts)
  • Google and Meta ignored their own rules in secret teen-targeting ad deals (https://arstechnica.com/tech-policy/2024/08/google-and-meta-ignored-their-own-rules-in-secret-teen-targeting-ad-deals/)
  • FTC finalizes rule to prohibit sale or purchase of fake reviews (https://www.retaildive.com/news/ftc-prohibit-sale-purchase-fake-reviews/724333/)
  • Goodfire raises $7M for its 'brain surgery'-like AI observability platform (https://venturebeat.com/ai/goodfire-raises-7m-for-its-brain-surgery-like-ai-observability-platform/)
  • Microsoft is finally removing the FAT32 partition size limit in Windows 11 (https://www.theverge.com/2024/8/16/24221635/microsoft-fat32-partition-size-limit-windows-11)
  • The US lays out a road safety plan that will see cars 'talk' to each other (https://www.engadget.com/transportation/the-us-lays-out-a-road-safety-plan-that-will-see-cars-talk-to-each-other-170043265.html)
  • Jeff Bezos' famed leadership rules are being tested inside Amazon (https://fortune.com/2024/08/05/amazon-leadership-principles-changes-jeff-bezos/)
  • Banned TED Talk: Nick Hanauer "Rich people don't create jobs" (https://www.youtube.com/watch?v=CKCvf8E7V1g)
  • Swiss Startup Connects 16 Human Mini-Brains to Create Low Energy 'Biocomputer' (https://www.sciencealert.com/swiss-startup-connects-16-human-mini-brains-to-create-low-energy-biocomputer)
  • AMD to acquire server builder ZT Systems for $4.9 billion in cash and stock (https://www.cnbc.com/2024/08/19/amd-to-acquire-server-builder-zt-systems.html)
  • The unique promise of 'biological computers' made from living things (https://www.newscientist.com/article/mg25834422-100-the-unique-promise-of-biological-computers-made-from-living-things/)
  • Former a16z VC Balaji Srinivasan obtained a private island for his new longevity 'technocapitalist' school (https://techcrunch.com/2024/08/19/former-a16z-vc-balaji-srinivasan-obtained-a-private-island-for-his-new-longevity-technocapitalist-school/)
  • Inside the Snowflake-Databricks Rivalry, and Why Both Fear Microsoft (https://www.bloomberg.com/news/articles/2024-08-14/inside-the-snowflake-databricks-rivalry-and-why-both-fear-microsoft)
  • Assessing Broadcom VMware Eight Months On (https://thecuberesearch.com/243-breaking-analysis-assessing-broadcom-vmware-eight-months-on/)
  • Morpheus Data Origin Story (https://morpheusdata.com/about/)
  • Hewlett Packard Enterprise to acquire Morpheus Data (https://www.hpe.com/us/en/newsroom/press-release/2024/08/hewlett-packard-enterprise-to-acquire-morpheus-data.html)

Nonsense
  • Waymos honking (https://x.com/ajtourville/status/1823509421357719763?s=09)
  • Saudi man earns world record for 444 game consoles hooked to one TV (https://arstechnica.com/gaming/2024/08/how-to-hook-a-record-setting-444-game-consoles-to-a-single-tv/)

Conferences
  • SpringOne (https://springone.io/?utm_source=cote&utm_campaign=devrel&utm_medium=newsletter&utm_content=newsletterUpcoming)/VMware Explore US (https://blogs.vmware.com/explore/2024/04/23/want-to-attend-vmware-explore-convince-your-manager-with-these/?utm_source=cote&utm_campaign=devrel&utm_medium=newsletter&utm_content=newsletterUpcoming), Aug 26-29, 2024
  • DevOpsDays Antwerp (https://devopsdays.org/events/2024-antwerp/welcome/), Sept 4–5, 2024, 15th anniversary
  • Civo Navigate Europe, Berlin (https://www.civo.com/navigate/europe), Sept 10-11, 2024
  • SREday London 2024 (https://sreday.com/2024-london/), Sept 19–20, 2024. Coté speaking, 20% off with code SRE20DAY
  • Cloud Foundry Day EU (https://events.linuxfoundation.org/cloud-foundry-day-europe/), Karlsruhe, GER, Oct 9, 2024, 20% off with code CFEU24VMW

SDT News & Community
  • Join our Slack community (https://softwaredefinedtalk.slack.com/join/shared_invite/zt-1hn55iv5d-UTfN7mVX1D9D5ExRt3ZJYQ#/shared-invite/email)
  • Email the show: questions@softwaredefinedtalk.com (mailto:questions@softwaredefinedtalk.com)
  • Free stickers: Email your address to stickers@softwaredefinedtalk.com (mailto:stickers@softwaredefinedtalk.com)
  • Follow us on social media: Twitter (https://twitter.com/softwaredeftalk), Threads (https://www.threads.net/@softwaredefinedtalk), Mastodon (https://hachyderm.io/@softwaredefinedtalk), LinkedIn (https://www.linkedin.com/company/software-defined-talk/), BlueSky (https://bsky.app/profile/softwaredefinedtalk.com)
  • Watch us on: Twitch (https://www.twitch.tv/sdtpodcast), YouTube (https://www.youtube.com/channel/UCi3OJPV6h9tp-hbsGBLGsDQ/featured), Instagram (https://www.instagram.com/softwaredefinedtalk/), TikTok (https://www.tiktok.com/@softwaredefinedtalk)
  • Book offer: Use code SDT for $20 off "Digital WTF" by Coté (https://leanpub.com/digitalwtf/c/sdt)
  • Sponsor the show (https://www.softwaredefinedtalk.com/ads): ads@softwaredefinedtalk.com (mailto:ads@softwaredefinedtalk.com)

Recommendations
  • Brandon: Furiosa: A Mad Max Saga (https://www.rottentomatoes.com/m/furiosa_a_mad_max_saga).
  • Coté: Long walks with VLC (https://www.videolan.org/vlc/).

Photo Credits
  • Header (https://unsplash.com/photos/a-man-standing-in-front-of-a-tv-holding-a-cell-phone-jfEXaUYUjp8)
  • Artwork (https://unsplash.com/photos/pair-of-white-shoes-hX3SLYPe3f0)

Oxide and Friends
Whither CockroachDB?

Aug 21, 2024 · 94:07 · Transcription available


Lots of engineering decisions get made on vibes. Popularity, anecdotes—they can lead to expedient decisions rather than rigorous ones. At Oxide, our choice to go with CockroachDB was hardly hasty! Dave Pacheco joins Bryan and Adam to talk about why we chose CRDB… and how Cockroach Labs' recent switch to a proprietary license impacts that.

In addition to Bryan Cantrill and Adam Leventhal, our special guest was Dave Pacheco.

Some of the topics we hit on, in the order that we hit them:
  • TechCrunch: Cockroach Labs shakes up its licensing to force bigger companies to pay
  • Kelsey's Tweet
  • Oxide RFD 53: Control plane data storage requirements
  • Oxide RFD 110: CockroachDB for the control plane database
  • Oxide RFD 508: Whither CockroachDB
  • Joyent blog post on the outage due to postgres autovacuum
  • Jepsen
  • Dave's CRDB exploration repo
  • Chrony
  • OxF: A Debugging Odyssey -- debugging an issue that manifested in CRDB
  • The Liberation of RethinkDB

If we got something wrong or missed something, please file a PR! Our next show will likely be on Monday at 5p Pacific Time on our Discord server; stay tuned to our Mastodon feeds for details, or subscribe to this calendar. We'd love to have you join us, as we always love to hear from new speakers!

Hacker News Recap
August 15th, 2024 | CEOs are running companies from afar even as workers return to office

Aug 16, 2024 · 13:09


This is a recap of the top 10 posts on Hacker News on August 15th, 2024.
This podcast was generated by wondercraft.ai

  • (00:38): Kim Dotcom's extradition to the U.S. given green light by New Zealand
    Original post: https://news.ycombinator.com/item?id=41254989&utm_source=wondercraft_ai
  • (01:48): Galois Theory
    Original post: https://news.ycombinator.com/item?id=41255456&utm_source=wondercraft_ai
  • (03:05): Nomad, communicate off-grid mesh, forward secrecy and extreme privacy
    Original post: https://news.ycombinator.com/item?id=41253922&utm_source=wondercraft_ai
  • (04:18): CEOs are running companies from afar even as workers return to office
    Original post: https://news.ycombinator.com/item?id=41261986&utm_source=wondercraft_ai
  • (05:28): CockroachDB license change
    Original post: https://news.ycombinator.com/item?id=41256222&utm_source=wondercraft_ai
  • (06:37): Google is a monopoly – the fix isn't obvious
    Original post: https://news.ycombinator.com/item?id=41254976&utm_source=wondercraft_ai
  • (07:50): Exact Polygonal Filtering: Using Green's Theorem and Clipping for Anti-Aliasing
    Original post: https://news.ycombinator.com/item?id=41253461&utm_source=wondercraft_ai
  • (09:11): Markdown is meant to be shown (2021)
    Original post: https://news.ycombinator.com/item?id=41254936&utm_source=wondercraft_ai
  • (10:17): WriteFreely: An open source platform for building a writing space on the web
    Original post: https://news.ycombinator.com/item?id=41253870&utm_source=wondercraft_ai
  • (11:20): It's the land, stupid: How the homebuilder cartel drives high housing prices
    Original post: https://news.ycombinator.com/item?id=41259229&utm_source=wondercraft_ai

This is a third-party project, independent from HN and YC. Text and audio generated using AI, by wondercraft.ai. Create your own studio quality podcast with text as the only input in seconds at app.wondercraft.ai. Issues or feedback? We'd love to hear from you: team@wondercraft.ai

GOTO - Today, Tomorrow and the Future
Patterns of Distributed Systems • Unmesh Joshi & James Lewis

May 17, 2024 · 41:29 · Transcription available


This interview was recorded for the GOTO Book Club.
http://gotopia.tech/bookclub

Read the full transcription of the interview here

Unmesh Joshi - Principal Consultant at Thoughtworks & Author of "Patterns of Distributed Systems"
James Lewis - Principal Consultant & Technical Director at Thoughtworks

RESOURCES
Unmesh
  • https://twitter.com/unmeshjoshi
  • https://www.linkedin.com/in/unmesh-joshi-9487635
  • https://www.thoughtworks.com/profiles/u/unmesh-joshi
James
  • https://twitter.com/boicy
  • https://linkedin.com/in/james-lewis-microservices
  • https://github.com/boicy
  • https://www.bovon.org
  • https://www.thoughtworks.com/profiles/j/james-lewis

DESCRIPTION
A Patterns Approach to Designing Distributed Systems and Solving Common Implementation Problems

More and more enterprises today are dependent on cloud services from providers like AWS, Microsoft Azure, and GCP. They also use products, such as Kafka and Kubernetes, or databases, such as YugabyteDB, Cassandra, MongoDB, and Neo4j, that are distributed by nature. Because these distributed systems are inherently stateful systems, enterprise architects and developers need to be prepared for all the things that can and will go wrong when data is stored on multiple servers--from process crashes to network delays and unsynchronized clocks.

"Patterns of Distributed Systems" describes a set of patterns that have been observed in mainstream open-source distributed systems. Studying the common problems and the solutions that are embodied by the patterns in this guide will give you a better understanding of how these systems work, as well as a solid foundation in distributed system design principles.

* Book description: © O'Reilly

RECOMMENDED BOOKS
  • Unmesh Joshi • Patterns of Distributed Systems
  • Darnell, Harrison & Seldess • CockroachDB: The Definitive Guide
  • Guy Harrison • Next Generation Databases
  • Burns, Beda & Hightower • Kubernetes: Up & Running
  • Jez Humble & Dave Farley • Continuous Delivery

Looking for a unique learning experience?
Attend the next GOTO conference near you! Get your ticket: gotopia.tech

SUBSCRIBE TO OUR YOUTUBE CHANNEL - new videos posted daily!

The Tech Blog Writer Podcast
2865: Navigating the Multi-Cloud Maze with CockroachDB

Apr 16, 2024 · 46:19


Are you curious how businesses can thrive in the rapidly evolving landscape of cloud computing and multi-cloud environments? In the upcoming episode of Tech Talks Daily, we're diving into the world of resilient databases with Spencer Kimball, CEO of Cockroach Labs. Spencer will share insights on CockroachDB, a pioneering technology designed to ensure business continuity even during outages that would cripple traditional databases. In our discussion, we'll explore how CockroachDB's ability to replicate data across regions and cloud providers maximizes uptime and facilitates massive scaling. We'll discuss the strategic importance of data locality in improving performance and complying with regulatory demands. We'll also discuss how CockroachDB's flexible architecture helps businesses avoid vendor lock-in and seamlessly manage data across multiple clouds. Originally founded by three former Googlers, Cockroach Labs has become a key player in the database market, challenging giants like Oracle and cloud provider databases. With high-profile users like Netflix, Bose, and Comcast, CockroachDB stands out for its robust data replication capabilities and distributed architecture, which were once confined to single data centers. Join me as Spencer elucidates on the evolution of Cockroach Labs in the competitive database market, the growing trend towards multi-cloud strategies among large enterprises, and the future of cloud portability. How is CockroachDB enabling companies to build above the cloud and avoid restrictive vendor lock-ins? As businesses continue to navigate the complexities of digital transformation, understanding the tools and technologies that facilitate this shift is more important than ever. What challenges and opportunities do you think lie ahead in the journey toward multi-cloud adoption? Please share your thoughts with us after the episode.

Software Engineering Daily
CockroachDB with Jordan Lewis

Jan 4, 2024 · 48:53


SQL databases were built for data consistency and vertical scalability. They did this very well for the long era of monolithic applications running in dedicated, single-server environments. However, their design presented a problem when the paradigm changed to distributed applications in the cloud. This shift eventually ushered in the rise of distributed SQL databases.

DMRadio Podcast
The Indestructible Database? You Decide!

Nov 30, 2023 · 44:48


“We have literally been unable to kill this thing. No matter what we've thrown at it.” So said Dylan O'Mahony, then of Bose. What is this thing, this unkillable database? It's CockroachDB! Learn more by checking out this special Holiday Episode of DM Radio, as Host @eric_kavanagh interviews CEO Spencer Kimball. They'll talk all things distributed SQL, cloud native, and bulletproof architectures!

World of DaaS
Spencer Kimball, CEO of Cockroach Labs: Future of Open Source

Nov 7, 2023 · 43:30


Spencer Kimball is the founder and CEO of Cockroach Labs, a $5 billion company that makes the CockroachDB database used by Netflix, Nubank, Shipt, and many other leading tech companies. Spencer was building and contributing to major open source projects before he graduated from college. In this episode of World of DaaS, Spencer and Auren do a deep dive on open source companies: How they make money, what the ecosystem looks like in 2023, and competing with the giants like AWS and Oracle. Auren and Spencer open with an in-depth history lesson on the software industry and open source's role in it. Spencer explains how open source and open core companies evolved and how the rise of cloud giants like AWS has changed the industry. They also discuss how belief and optimism affects founders, Spencer's “hierarchy of success,” and essential advice for first time founders.

World of DaaS is brought to you by SafeGraph & Flex Capital. For more episodes, visit safegraph.com/podcasts.

You can find Auren Hoffman on Twitter at @auren and Spencer on LinkedIn.

Data Engineering Podcast
Building An Internal Database As A Service Platform At Cloudflare

Aug 28, 2023 · 61:09


Summary
Data persistence is one of the most challenging aspects of computer systems. In the era of the cloud most developers rely on hosted services to manage their databases, but what if you are a cloud service? In this episode Vignesh Ravichandran explains how his team at Cloudflare provides PostgreSQL as a service to their developers for low latency and high uptime services at global scale. This is an interesting and insightful look at pragmatic engineering for reliability and scale.

Announcements
  • Hello and welcome to the Data Engineering Podcast, the show about modern data management
  • Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack (https://www.dataengineeringpodcast.com/rudderstack)
  • This episode is brought to you by Datafold – a testing automation platform for data engineers that finds data quality issues before the code and data are deployed to production. Datafold leverages data-diffing to compare production and development environments and column-level lineage to show you the exact impact of every code change on data, metrics, and BI tools, keeping your team productive and stakeholders happy. Datafold integrates with dbt, the modern data stack, and seamlessly plugs in your data CI for team-wide and automated testing. If you are migrating to a modern data stack, Datafold can also help you automate data and code validation to speed up the migration. Learn more about Datafold by visiting dataengineeringpodcast.com/datafold (https://www.dataengineeringpodcast.com/datafold)
  • You shouldn't have to throw away the database to build with fast-changing data. You should be able to keep the familiarity of SQL and the proven architecture of cloud warehouses, but swap the decades-old batch computation model for an efficient incremental engine to get complex queries that are always up-to-date. With Materialize, you can! It's the only true SQL streaming database built from the ground up to meet the needs of modern data products. Whether it's real-time dashboarding and analytics, personalization and segmentation or automation and alerting, Materialize gives you the ability to work with fresh, correct, and scalable results — all in a familiar SQL interface. Go to dataengineeringpodcast.com/materialize (https://www.dataengineeringpodcast.com/materialize) today to get 2 weeks free!
  • Your host is Tobias Macey and today I'm interviewing Vignesh Ravichandran about building an internal database as a service platform at Cloudflare

Interview
  • Introduction
  • How did you get involved in the area of data management?
  • Can you start by describing the different database workloads that you have at Cloudflare?
  • What are the different methods that you have used for managing database instances?
  • What are the requirements and constraints that you had to account for in designing your current system?
  • Why Postgres?
      • optimizations for Postgres
      • simplification from not supporting multiple engines
      • limitations in postgres that make multi-tenancy challenging
      • scale of operation (data volume, request rate)
  • What are the most interesting, innovative, or unexpected ways that you have seen your DBaaS used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on your internal database platform?
  • When is an internal database as a service the wrong choice?
  • What do you have planned for the future of Postgres hosting at Cloudflare?

Contact Info
  • LinkedIn (https://www.linkedin.com/in/vigneshravichandran28/)
  • Website (https://viggy28.dev/)

Parting Question
  • From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements
  • Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning.
  • Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com) with your story.
  • To help other people find the show please leave a review on Apple Podcasts (https://podcasts.apple.com/us/podcast/data-engineering-podcast/id1193040557) and tell your friends and co-workers

Links
  • Cloudflare (https://www.cloudflare.com/)
  • PostgreSQL (https://www.postgresql.org/)
      • Podcast Episode (https://www.dataengineeringpodcast.com/postgresql-with-jonathan-katz-episode-42/)
  • IP Address Data Type in Postgres (https://www.postgresql.org/docs/current/datatype-net-types.html)
  • CockroachDB (https://www.cockroachlabs.com/)
      • Podcast Episode (https://www.dataengineeringpodcast.com/cockroachdb-with-peter-mattis-episode-35/)
  • Citus (https://www.citusdata.com/)
      • Podcast Episode (https://www.dataengineeringpodcast.com/citus-data-with-ozgun-erdogan-and-craig-kerstiens-episode-13/)
  • Yugabyte (https://www.yugabyte.com/)
      • Podcast Episode (https://www.dataengineeringpodcast.com/yugabytedb-planet-scale-sql-episode-115/)
  • Stolon (https://github.com/sorintlab/stolon)
  • pg_rewind (https://www.postgresql.org/docs/current/app-pgrewind.html)
  • PGBouncer (https://www.pgbouncer.org/)
  • HAProxy Presentation (https://www.youtube.com/watch?v=HIOo4j-Tiq4)
  • Etcd (https://etcd.io/)
  • Patroni (https://patroni.readthedocs.io/en/latest/)
  • pg_upgrade (https://www.postgresql.org/docs/current/pgupgrade.html)
  • Edge Computing (https://en.wikipedia.org/wiki/Edge_computing)

The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)

The MAD Podcast with Matt Turck
Cockroach Labs: A Cloud SQL Database Built for Survival with CEO Spencer Kimball

Aug 2, 2023 · 44:05


Relational databases, data cloud's effect on infrastructure, serverless databases, and GTM strategies: Matt Turck and CockroachDB's Spencer Kimball cover it all in today's episode.

The CTO Advisor
Career Changes from Salesforce to Twilio to CockroachDB

Jun 28, 2023


About the session This episode features a conversation between Keith Townsend and Kevin Schwaba, an account executive for Cockroach DB. They discuss Kevin's career journey, his experiences working in different sales roles, and the challenges and insights he has gained along the way. It's a unique journey from talking to direct business value to the [...]

VMware Podcasts
S2:E22 - Building for Indestructible, with Spencer Kimball

Apr 27, 2023 · 56:37


Spencer Kimball, co-founder and CEO of Cockroach Labs, discusses building a next-generation distributed database that can efficiently use commodity hardware found in the cloud and scale rapidly. He emphasizes the importance of distributed databases for businesses operating across multiple regions with diverse databases to maintain continuity and protect information. Kimball also talks about the importance of consistent replication and how CockroachDB can help push the envelope for multi-region operations. He advises starting off at a company where there's a lot to learn.

Tech Disruptors
Cockroach Labs Cracking Into Database Industry

Mar 22, 2023 · 42:05


Cockroach Labs is disrupting the legacy model of databases by embracing distributed computing in the cloud and providing scale to modern applications, co-founder and CEO Spencer Kimball explains to Bloomberg Intelligence. In this Tech Disruptors podcast episode, Kimball sits down with BI analyst Anurag Rana for an in-depth conversation about the driving forces behind migration to the cloud and operational database workloads, as well as how CockroachDB's technology differentiates it from other Database-as-a-Service companies and hyperscale providers.

Software Sessions
Luca Casonato on Deno

Mar 2, 2023 · 80:27


Luca Casonato is the tech lead for Deno Deploy and a TC39 delegate. Deno is a JavaScript runtime from the original creator of NodeJS, Ryan Dahl. Topics covered: What's a JavaScript runtime How V8 is used Why Deno was created The W3C WinterCG for server-side JavaScript Why it's difficult to ship new features in Node The benefits of web standards Creating an all-inclusive toolset like Rust and Go Deno's node compatibility layer Use cases for WebAssembly Benefits and implementation of Deno Deploy Reasons to deploy on the edge What's coming next Luca Luca Casonato @lcasdev Deno Homepage Deploy Showcase Subhosting Fresh web framework The anatomy of an Isolate Cloud Deno Users Netlify Edge Functions Deno at Slack GitHub Flat Data Shopify Oxygen Other related links Cache Web API V8 (JavaScript and WebAssembly engine) TC39 (JavaScript specification group) Web-interoperable Runtimes Community Group (WinterCG) Cloudflare Workers (Deno Deploy competitor) How Cloudflare KV works CockroachDB (Distributed database) XKCD Standards Comic Transcript You can help edit this transcript on GitHub. [00:00:07] Jeremy: Today I'm talking to Luca Casonato. He's a member of the Deno Core team and a TC 39 Delegate. [00:00:06] Luca: Hey, thanks for having me. What's a runtime? [00:00:07] Jeremy: So today we're gonna talk about Deno, and on the website it says, Deno is a runtime for JavaScript and TypeScript. So I thought we could start with defining what a runtime is. [00:00:21] Luca: Yeah, that's a great question. I think this question actually comes up a lot. It's, it's like sometimes we also define Deno as a headless browser, or I don't know, a, a JavaScript script execution tool. what actually defines runtime? I, I think what makes a runtime a runtime is that it is a, it's implemented in native code. It cannot be self-hosted. Like you cannot self-host a JavaScript runtime. 
and it executes JavaScript or TypeScript or some other scripting language, without relying on, well, yeah, I guess it's the self-hosting thing. Like it's, it's essentially a, a JavaScript execution engine, which is not self-hosted. So yeah, it, it maybe has IO bindings, but it doesn't necessarily need to. Like, maybe it allows you to read from the file system or, or make network calls. Um, but it doesn't necessarily have to. It's, I think the, the primary definition is something which can execute JavaScript without already being written in JavaScript.

How V8 and JavaScript runtimes are related

[00:01:20] Jeremy: And when we hear about JavaScript runtimes, whether it's Deno or Node or Bun, or anything else, we also hear about it in the context of V8. Could you explain the relationship between V8 and a JavaScript runtime?

[00:01:36] Luca: Yeah. So V8 and, and JavaScriptCore and SpiderMonkey, these are all JavaScript engines. So these are the low level virtual machines that can execute or that can parse your JavaScript code, turn it into byte code, maybe turn it into compiled machine code, and then execute that code. But these engines do not implement any IO functions. They implement the JavaScript spec as is written, and then they provide extension hooks for, they call these host environments, um, like environments that embed these engines to provide custom functionalities to essentially poke out of the sandbox, out of the, out of the virtual machine. Um, and this is used in browsers. Like browsers have, have these engines built in. This is where they originated from. Um, and then they poke holes into this, um, sandbox virtual machine to do things like, I don't know, writing to the DOM or, or console logging or making fetch calls and all these kinds of things. And what a runtime essentially does, a JavaScript runtime, is it takes one of these engines and it then provides its own set of host APIs, like essentially its own set of holes.
It pokes into the sandbox. And depending on what the runtime is trying to do, um, the way it will do this is gonna be different, and, and the sort of API that is ultimately exposed to the end user is going to be different. For example, if you compare Deno and Node, like Node is very loosey goosey about how it pokes holes into the sandbox, it sort of just pokes them everywhere. And this makes it difficult to enforce things like runtime permissions, for example. Whereas Deno is much more strict about how it, um, pokes holes into its sandbox. Like everything is either a web API or it's behind this Deno namespace, which means that it's, it's really easy to find, um, places where, where you're poking out of the sandbox. And really you can also compare these to browsers. Like browsers are also JavaScript runtimes. Um, they're just not headless JavaScript runtimes, but JavaScript runtimes that also have a UI. And, yeah, like there, there's, there's a whole bunch of different kinds of JavaScript runtimes, and I think we're also seeing a lot more like embedded JavaScript runtimes. Like for example, if you've used React Native before, you, you may be using Hermes as a, um, JavaScript engine in your Android app, which is like a custom JavaScript engine written just for, for, for React Native. Um, and this also is embedded within a, like React Native runtime, which is specific to React Native. So it's also possible to have runtimes, for example, where the backing engine can be exchanged, which is kind of cool.

[00:04:08] Jeremy: So it sounds like V8's role, one way to look at it is it can execute JavaScript code, but only pure functions. I suppose you

[00:04:19] Luca: Pretty much. Yep.
[00:04:21] Jeremy: Do anything that doesn't interact with IO. So you think about browsers, you were mentioning you need to interact with a DOM, or if you're writing a server side application, you probably need to receive or make HTTP requests, that sort of thing. And all of that is not handled by V8. That has to be handled by an external runtime.

[00:04:43] Luca: Exactly. Like, there's, there's like some exceptions to this. For example, JavaScript technically has some IO built in with, within its standard library, like Math.random. Like random number generation is technically an IO operation, so technically V8 has some IO built in, right? And like getting the current date from the user, that's also technically IO. So, like there, there's some very limited edge cases. It's, it's not that it's purely pure, but V8 for example, has a flag to turn it completely deterministic, which means that it really is completely pure. And this is not something which runtimes usually have. This is something like the feature of an engine, because the engine is like so low level that it can essentially, there's so little IO that it's very easy to make deterministic, where a runtime, higher level, um, has, has IO, um, much more difficult to make deterministic.

[00:05:39] Jeremy: And, and for things like when you're working with JavaScript, there's, uh, asynchronous programming

[00:05:46] Luca: Mm-hmm.

Concurrent JavaScript execution

[00:05:47] Jeremy: So you have concurrency and things like that. Is that a part of V8 or is that the responsibility of the runtime?

[00:05:54] Luca: That's a great question. So there's multiple parts to this. There's the part, um, there, there's JavaScript promises, um, and sort of concurrent Java or well, yes, concurrent JavaScript execution, which is sort of handled by V8. Like, in pure V8, you can create a promise, and you can execute some code within that promise.
But without IO there's actually no way to defer time, uh, which means that with pure V8, you can create a promise which executes right now, or you can create a promise that never executes, but you can't create a promise that executes in 10 seconds, because there's no way to measure 10 seconds asynchronously. What runtimes do is they add something called an event loop on top of this, um, on top of the base engine. And that event loop, for example, like a very simple event loop, might have a timer in it, which every second looks at if there's a timer scheduled to run within that second. And if it does, if, if that timer exists, it'll go call out to V8 and say, you can now execute that promise. But V8 is still the one that's keeping track of, of like which promises exist, and the code that is meant to be invoked when they resolve, all that kind of thing. Um, but the underlying infrastructure that actually invokes which promises get resolved at what point in time, like the asynchronous, asynchronous IO is what this is called. This is driven by the event loop, um, which is implemented by a runtime. So Deno, for example, it uses Tokio for its event loop. This is a, um, an event loop written in Rust. It's very popular in the Rust ecosystem. Um, Node uses libuv. This is a relatively popular runtime or, or event loop, um, implementation for, uh, C++. And, uh, libuv was written for Node. Tokio was not written for Deno. But um, yeah, Chrome has its own event loop implementation. Bun has its own event loop implementation.

[00:07:50] Jeremy: So we, we might go a little bit more into that later, but I think what we should probably go into now is why make Deno, because you have Node that's, uh, currently very popular. The co-creator of Deno, to my understanding, actually created Node. So maybe you could explain to our audience what was missing or what was wrong with Node, where they decided I need to create a new runtime.
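Editor's sketch: the timer-driven event loop Luca describes can be illustrated in a few lines. This is a toy under stated assumptions, not how Tokio, libuv, or Deno actually work: `TinyEventLoop`, its methods, and the virtual clock are all hypothetical names invented for illustration, and time is simulated so the example runs instantly and deterministically.

```typescript
// Toy event loop with a virtual clock (hypothetical; not a real runtime's design).
type Task = () => void;

interface Timer {
  dueAt: number; // virtual time in milliseconds at which the task may run
  task: Task;
}

class TinyEventLoop {
  private now = 0;
  private timers: Timer[] = [];

  // Register a task to run once at least `delayMs` of virtual time has passed.
  setTimeout(task: Task, delayMs: number): void {
    this.timers.push({ dueAt: this.now + delayMs, task });
  }

  currentTime(): number {
    return this.now;
  }

  // Advance the clock one tick at a time and run whatever has become due,
  // mirroring "every second, check whether a timer is scheduled to run".
  run(tickMs = 1000): void {
    while (this.timers.length > 0) {
      this.now += tickMs;
      const due = this.timers.filter((t) => t.dueAt <= this.now);
      this.timers = this.timers.filter((t) => t.dueAt > this.now);
      due.sort((a, b) => a.dueAt - b.dueAt);
      for (const timer of due) {
        timer.task(); // in a real runtime, this is the call back into the engine
      }
    }
  }
}

const loop = new TinyEventLoop();
const log: string[] = [];
loop.setTimeout(() => log.push(`ten at ${loop.currentTime()}`), 10000);
loop.setTimeout(() => log.push(`two at ${loop.currentTime()}`), 2000);
loop.run();
console.log(log);
```

The tasks themselves know nothing about time; only the loop decides when each one is invoked, which is the division of labor described above between the engine (tracking what to run) and the runtime (deciding when).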
Why create a new runtime? (standards compliance)

[00:08:20] Luca: Yeah. So the, the primary point of concern here was that Node was slowly diverging from browser standards with no real path to, to, to reconverging. Um, like there was nothing that was pushing Node in the direction of standards compliance and there was nothing that was like sort of forcing Node to innovate. And we really saw this because in the time between, I don't know, 2015, 2018, like Node was slowly working on ESM while browsers had already shipped ESM for like three years. Um, Node did not have fetch. Node only got fetch last year, right? Six, seven years after browsers got fetch. Node's stream implementation is still very divergent from, from standard web streams. Node was very reliant on callbacks. It still is, um, like promises in many places of the Node API are, are an afterthought, which makes sense because Node was created in a time before promises existed. Um, but there was really nothing that was pushing Node forward, right? Like nobody was actively investing in, in, in improving the API of Node to be more standards compliant. And so what we really needed was a new like greenfield project, which could demonstrate that actually writing a new server-side runtime is, A, viable, and B, totally doable with an API that is more standards compliant. Like essentially you can write a browser, like a headless browser, and have that be an excellent to use JavaScript runtime, right? And then there were some things on top of that, like TypeScript support, because TypeScript was incredibly, or is still incredibly popular, even more so than it was four years ago when, when Deno was created or envisioned. Um, this permission system, like Node really poked holes into the V8 sandbox very early on with, with like, it's gonna be very difficult for Node to ever, ever, uh, reconcile this, this.
Especially cuz the, some, some of the APIs that it, that it exposes are just so incredibly low level that like, I don't know, you can mutate random memory within your process. Um, which like if you want to have a, a secure sandbox, like that just doesn't work. Um, it's not compatible. So there really needed to be a place where you could explore this, um, direction and, and see if it worked. And Deno was that. Deno still is that, and I think Deno has outgrown that now into something which is much more usable as, as like a production ready runtime. And many people do use it in production. And now Deno is on the path of slowly converging back with Node, um, from both directions. Like Node is slowly becoming more standards compliant, and depending on who you ask, this was, this was done because of Deno, and some people said it had already been going on and Deno just accelerated it. But that's not really relevant, because the point is that like Node is becoming more standards compliant, and, and the other direction is Deno is becoming more Node compliant. Like Deno is implementing Node compatibility layers that allow you to run code that was originally written for the Node ecosystem in the standards compliant runtime. So through those two directions, the, the runtimes are sort of, um, going back towards each other. I don't think they'll ever merge, but we're, we're, we're getting to a point here pretty soon, I think, where it doesn't really matter what runtime you write for, um, because you'll be able to run code written for one runtime in the other runtime relatively easily.

[00:12:03] Jeremy: If you're saying the two are becoming closer to one another, becoming closer to the web standard that runs in the browser, if you're talking to someone who's currently developing in Node, what's the incentive for them to switch to Deno versus using Node and then hope that eventually they'll kind of meet in the middle?
[00:12:26] Luca: Yeah, so I think, like Deno is a lot more than just a runtime, right? Like a runtime executes JavaScript, Deno executes JavaScript, it executes TypeScript. But Deno is so much more than that. Like Deno has a built-in formatter, it has a built-in linter. It has a built-in testing framework, a built-in benching framework. It has a built-in bundler, it, it like can create self-contained, um, executables. Yeah, like bundle your code and the Deno executable into a single executable that you can ship off to someone. Um, it has a dependency analyzer. It has editor integrations. It has, yeah, like I could go on for hours (laughs) about all of the auxiliary tooling that's inside of Deno, that's not a JavaScript runtime. And also Deno as a JavaScript runtime is just more standards compliant than any of the other server-side runtimes right now. So if, if you're really looking for something which is standards compliant, which is gonna like live on forever, then it's, you know, like you cannot kill off the Fetch API ever. The Fetch API is going to live forever because Chrome supports it. Um, and the same goes for local storage and, and like, I don't know, the Blob API and all these other web APIs, like they, they have shipped in browsers, which means that they will be supported until the end of time. And yeah, maybe Node has also reached that with its API, probably to some extent. But yeah, don't underestimate the power of like 3 billion Chrome users that would scream immediately if the Fetch API stopped working, right?

[00:13:50] Jeremy: Yeah, I, I think maybe what it sounds like also is that because you're using the API that's used in the browser, places where you deploy JavaScript applications in the future, you would hope that those would all settle on using that same API, so that if you were using Deno, you could host it at different places and not worry about, do I need to use a special API maybe that you would in Node?
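Editor's sketch: the portability argument rests on code that touches only web-standard globals. Below is a minimal illustration, assuming a runtime that exposes `Request`, `Response`, and `URL` globally (browsers, Deno, and recent Node versions do). The `handler` name and the example.com URL are placeholders chosen for illustration; how a given platform wires such a function to incoming traffic is platform-specific and not shown.

```typescript
// A request handler written against web-standard types only.
// Nothing here imports a runtime-specific module.
function handler(request: Request): Response {
  const url = new URL(request.url);
  // Fall back to a default greeting target when no ?name= is supplied.
  const who = url.searchParams.get("name") ?? "world";
  return new Response(`Hello, ${who}!`, {
    status: 200,
    headers: { "content-type": "text/plain; charset=utf-8" },
  });
}

// Exercising the handler directly, without any server.
const response = handler(new Request("https://example.com/?name=Deno"));
console.log(response.status); // 200
```

Because the function's whole contract is "standard `Request` in, standard `Response` out", it can be unit-tested without a server and, in principle, moved between hosts that agree on those types.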
WinterCG (W3C group for server side JavaScript)

[00:14:21] Luca: Yeah, exactly. And this is actually something which we're specifically working towards. So, I don't know if you've, you've heard of WinterCG? It's a, it's a community group at the W3C that, um, Cloudflare and, and Deno and some others, including Shopify, have started last year. Um, where essentially we're trying to standardize the concept of what a server-side JavaScript runtime is and what APIs it needs to have available to be standards compliant. Um, and essentially making this portability sort of written down somewhere, and like write down exactly what code you can write and expect to be portable. And we can see like that all of the big, all of the big players that are involved in, in, um, building JavaScript runtimes right now are, are actively engaged with us at WinterCG and are actively building towards this future. So I would expect that any code that you write today, which runs in Deno, runs in Cloudflare Workers, runs on Netlify Edge Functions, runs on Vercel's Edge Runtime, runs on Shopify Oxygen, is going to run on the other four, um, of, of those within the next couple years here. Like I think the APIs of these are gonna converge to be essentially the same. There's obviously gonna always be some, some nuances. Um, like, I don't know, Chrome and Firefox and Safari don't perfectly have the same API everywhere, right? Like Chrome has some Web Bluetooth capabilities that Safari doesn't, or Firefox has some, I don't know, non-standard extensions to the error object, which none of the other runtimes do. But overall you can expect these runtimes to mostly be aligned.
Yeah, and I, I think that's, that's really, really, really excellent, and that, that's I think really one of the reasons why one should really consider, like building for, for this standard runtime, because it, it just guarantees that you'll be able to host this somewhere in five years time and 10 years time, with, with very little effort. Like even if Deno goes under or Cloudflare goes under, or, I don't know, nobody decides to maintain Node anymore, it'll be easy to, to run somewhere else. And also I expect that the big cloud vendors will ultimately, um, provide managed offerings for, for the standards compliant JavaScript runtime as well.

Is Node part of WinterCG?

[00:16:36] Jeremy: And this WinterCG group, is Node a part of that as well?

[00:16:41] Luca: Um, yes, we've invited Node, um, to join. Um, due to the complexities of how Node's internal decision making system works, Node is not officially a member of WinterCG. Um, there are some individual members of the Node, um, technical steering committee which are participating. For example, um, James M Snell is, is the co-chair, is my co-chair on, on WinterCG. He also works at Cloudflare. He's also a Node, um, TSC member. Matteo Collina, who has been, um, instrumental to getting fetch landed in Node, um, is also actively involved. So Node is involved, but because Node is Node, and, and Node's decision making process works the way it does, Node is not officially listed anywhere as, as a member. But yeah, they're involved and maybe they'll be a member at some point. But, yeah, let's see. (laughs)

[00:17:34] Jeremy: Yeah. And, and it, so it, it sounds like you're thinking that's more of a, a governance or an organizational aspect of Node than it is a, a technical limitation. Is that right?

[00:17:47] Luca: Yeah. I obviously can't speak for the Node technical steering committee, but I know that there's a significant chunk of the Node technical steering committee that is very favorable towards, uh, standards compliance.
but parts of the Node technical steering committee are also not, they are either indifferent or are actively, I dunno if they're still actively working against this, but have actively worked against standards compliance in the past. And because the Node governance structure is very, yeah, is, is so, so open and lets, um, lets all these voices be heard, um, that just means that decision making processes within Node can take so long. Like, this is also why the fetch API took eight years to ship. Like this was not a technical problem. And it is also not a technical problem that Node does not have URL pattern support or the File global, or, um, that the Web Crypto API was not on this, on the global object until like late last year, right? Like, these are not technical problems, these are decision making problems. Um, and yeah, that was also part of the reason why we started Deno as, as like a separate thing, because like you can try to innovate Node from the inside, but innovating Node from the inside is very slow, very tedious, and requires a lot of fighting. And sometimes just showing somebody from the outside, like, look, this is the bright future you could have, makes them more inclined to do something.

Why it takes so long to ship new features in Node

[00:19:17] Jeremy: Do, do you have a sense for, you gave the example of fetch taking eight years to, to get into Node. Do you, do you have a sense of what the typical objection is to, to something like that? Like I, I understand there's a lot of people involved, but why would somebody say, I, I don't want this

[00:19:35] Luca: Yeah. So for, for fetch specifically, there was a, there were many different kinds of concerns. Um, one of the, I, I can maybe list two of them. One of them was, for example, that the fetch API is not a good API and as such, Node should not have it, which is sort of
missing the point of, because it's a standard API, how good or bad the API is is much less relevant, because if you can share the API, you can also share a wrapper that's written around the API, right? And then the other concern was, Node doesn't need fetch because Node already has an HTTP API. Um, so, so these are both kind of examples of, of concerns that people had for a long time, which it took a long time to either convince these people or, or to push the change through anyway. And this is also the case for, for other things like, for example, Web Crypto, um, like why do we need Web Crypto? We already have Node crypto. Or why do we need yet another streams implementation? Node already has four different streams implementations. Like, why do we need web streams? And, like, I don't know if you know this XKCD of, there's 14 competing standards, so let's write a 15th standard to unify them all. And then at the end we just have 15 competing standards. Um, so I think this is also the kind of concern that people were concerned about, but I, I think what we've seen here is that this is really not a concern that one needs to have, because it ends up that, or it turns out in the end that if you implement web APIs, people will use web APIs and will use web APIs only for their new code. It takes a while, but we're seeing this with ESM versus require, like new code written with require is much less common than it was two years ago. And new code now using like XHR, whatever it's called, XMLHttpRequest, you know the one I mean, compared to using Fetch, like nobody uses that anymore. Everybody uses Fetch. Um, and like in Node, if you write a little script, like you're gonna use Fetch, you're not gonna use like Node's http.get API or whatever. And we're gonna see the same thing with ReadableStream. We're gonna see the same thing with Web Crypto. We're gonna see, see the same thing with Blob.
I think one of the big ones where, where Node is still, I, I, I don't think this is one that's ever gonna get solved, is the, the Buffer global in Node. Like we have the Uint8Array, this Uint8Array global, um, in like all the runtimes including browsers, um, and Buffer is like a superset of that, but it's in global scope. So it, it's sort of this non-standard extension of Uint8Array that people in Node like to use, and it's not compatible with anything else. Um, but because it's so easy to get at, people use it anyway. So those are, those are also kind of problems that, that we'll have to deal with eventually. And maybe that means that at some point the Buffer global gets deprecated and, I don't know, probably can never get removed. But, um, yeah, these are kinds of conversations that the Node TSC is going to have to have internally in, I don't know, maybe five years.

Write once, have it run on any hosting platform

[00:22:37] Jeremy: Yeah, so at a high level, what's shipped in the browser, it went through the ECMAScript approval process. People got it into the browser. Once it's in the browser, probably never going away. And because of that, it's safe to build on top of that for these, these server runtimes, because it's never going away from the browser. And so everybody can kind of use it into the future and not worry about it. Yeah.

[00:23:05] Luca: Exactly. Yeah. And that's, and that's excluding the benefit that also if you have code that you can write once and use in both the browser and the server-side runtime, like that's really nice. Um, like that, that's the other benefit.

[00:23:18] Jeremy: Yeah. I think that's really powerful. And that right now, when someone's looking at running something in Cloudflare Workers versus running something in the browser versus running something in.
But it, it, there are at least currently differences in what APIs are available to you.

[00:23:43] Luca: Yep. Yep.

Why bundle so many things into Deno?

[00:23:46] Jeremy: Earlier you were talking about how Deno is more than just the runtime. It has a linter, formatter, file watcher, there, there's all sorts of stuff in there. And I wonder if you could talk a little bit to the, the reasoning behind that

[00:24:00] Luca: Mm-hmm.

[00:24:01] Jeremy: Having them all be separate things.

[00:24:04] Luca: Yeah, so the, the reasoning here is essentially, if you look at other modern runtimes or other modern languages, like Rust is a great example. Go is a great example. Even though Go was designed around the same time as Node, it has a lot of these same tools built in. And what it really shows is that if the ecosystem converges, like is essentially forced to converge on a single set of built-in tooling, A, that built-in tooling becomes really, really excellent because everybody's using it. And also, it means that if you open any project written by any Go developer, any, any Rust developer, and you look at the tests, you immediately understand how the test framework works and you immediately understand how the assertions work. Um, and you immediately understand how the build system works and you immediately understand how the dependency imports work. And you immediately understand like, I wanna run this project and I wanna restart it when my file changes. Like, you immediately know how to do that because it's the same everywhere. Um, and this kind of feeling of having to learn one tool and then being able to use all of the projects, like being able to contribute to open source, when you're moving jobs, whatever, like between personal projects that you haven't touched in two years, you know, like being able to learn this once and then use it everywhere is such an incredibly powerful tool.
Like, people don't appreciate this until they've used a runtime or, or, or language which provides this to them. Like, you can go to any Go developer and ask them if they would like. There, there's this, there's this saying in the Go ecosystem, um, that gofmt is nobody's favorite, but, or, uh, wait, no, I don't remember what the, how the saying goes, but the saying essentially implies that the way that gofmt formats code, maybe not everybody likes, but everybody loves gofmt anyway, because it just makes everything look the same. And like, you can read your friend's code, your, your colleagues' code, your new job's code, the same way that you did your code from two years ago. And that's such an incredibly powerful feeling. Especially if it's like well integrated into your IDE, you clone a repository, open that repository, and like your testing panel on the left hand side just populates with all the tests, and you can click on them and run them. And if an assertion fails, it's like the standard output format that you're already familiar with. And it's, it's, it's a really great feeling. And if you don't believe me, just go try it out and, and then you will believe me. (laughs)

[00:26:25] Jeremy: Yeah. No, I, I'm totally with you. I, I think it's interesting because with JavaScript in particular, it feels like the default in the community is the opposite, right? There's so many different ways. Uh, there are so many different build tools and testing frameworks and formatters, and it's very different than, like you were mentioning, a Go or a Rust that are more recent languages where they just include that, all bundled in. Yeah.

[00:26:57] Luca: Yeah, and I, I think you can see this as well in, in the time that the average JavaScript developer spends configuring their tooling compared to a Rust developer. Like if I write Rust, I write Rust, like all day, every day, and I spend maybe two, 3% of my time configuring Rust tooling, like
doing dependency imports, opening a new project, creating a formatter config file, I don't know, deleting the build directory, stuff like that. Like that's, that's essentially what it means for me to configure my Rust tooling. Whereas if you compare this to like a front-end JavaScript project, like you have to deal with making sure that your React version is compatible with your React DOM version, is compatible with your Next version, is compatible with your Vite version, is compatible with your whatever version, right? This, this is all not automatic. Making sure that you use the right, like as, as a front-end developer, you don't have just NPM installed, no. You have NPM installed, you have Yarn installed, you have pnpm installed. You probably have like, Bun installed. And, and, and I don't know, to use any of these, you need to have corepack enabled in Node, and like you need to have all of their global bin directories symlinked into your, or, or, or, uh, included in your path. And then if you install something and you wanna update it, you don't know, did I install it with Yarn? Did I install it with pnpm? Like this is, uh, significant complexity, and you, you tend to spend a lot of time dealing with dependencies and dealing with package management and dealing with like tooling configuration, setting up ESLint, setting up Prettier. And I, I think that like, especially Prettier, for example, really showed, was, was one of the first things in the JavaScript ecosystem which was like, no, we're not gonna give you a config where you, that you can spend like six hours configuring, it's gonna be like seven options and here you go. And everybody used it because nobody likes configuring things, it turns out. Um, and even though there's always the people that say, oh, well, I won't use your tool unless, like, we, we get this all the time.
Like, I'm not gonna use deno fmt because I can't, I don't know, remove the semicolons or, or use single quotes or change my tab width to 16. Right? Like, wait until all of your coworkers are gonna scream at you because you set the tab width to 16, and then see what they change it to. And then you'll see that it's actually the exact default that everybody uses. So it'll, it'll take a couple more years. But I think we're also gonna get there, uh, like Node is starting to implement a, a test runner. And I, I think over time we're also gonna converge on, on, on, on like some standard build tools. Like I think Vite, for example, is a great example of this. Like, doing a front end project nowadays, um, like building new front end tooling that's not built on Vite? Yeah, don't. Like, Vite has become the standard, and I think we're gonna see that in a lot more places.

We should settle on what tools to use

[00:29:52] Jeremy: Yeah, though I, I think it's, it's tricky, right? Because you have so many people with their existing projects. You have people who are starting new projects and they're just searching the internet for what they should use. So you're, you're gonna have people on webpack, you're gonna have people on Vite, I guess now there's gonna be Turbopack, I think is another one that's

[00:30:15] Luca: Mm-hmm.

[00:30:16] Jeremy: There's, there's, there's all these different choices, right? And I, I think it's, it's hard to, to really settle on one, I guess,

[00:30:26] Luca: Yeah,

[00:30:27] Jeremy: uh, yeah.

[00:30:27] Luca: like I, I, I think this is, this is in my personal opinion also a failure of the Node technical steering committee, for the longest time to not decide that yes, we're going to bless this as the standard formatter for Node, and this is the standard package manager for Node. And they did, they sort of did, like, they, for example, Node blessed NPM as the standard package manager for, for Node. But it didn't innovate on NPM.
Like the Node technical steering committee did not force NPM to innovate. NPM's a private company, ultimately bought by GitHub, and they had full control over how the NPM CLI, um, evolved, and nobody forced NPM to, to make sure that package install times are six times faster than they were three years ago, like nobody did that. So it didn't happen. And I think this is, this is really a failure of, of the, the, the, yeah, the Node technical steering committee and also the wider JavaScript ecosystem of not being persistent enough with, with like focus on performance, focus on user experience, and, and focus on simplicity. Like things got so out of hand, and I'm happy we're going in the right direction now, but, yeah, it was terrible for some time. (laughs)

Node compatibility layer

[00:31:41] Jeremy: I wanna talk a little bit about how we've been talking about Deno in the context of you just using Deno using its own standard library, but just recently last year you added a compatibility shim where people are able to use Node libraries in Deno.

[00:32:01] Luca: Mm-hmm.

[00:32:01] Jeremy: And I wonder if you could talk to, like earlier you had mentioned that Deno has a different permissions model. On the website it mentions that Deno's HTTP server is two times faster than Node in a Hello World example. And I'm wondering what kind of benefits people will still get from Deno if they choose to use packages from Node.

[00:32:27] Luca: Yeah, it's a great question. Um, so I think a, again, this is sort of a like, so just to clarify what we actually implemented, like what we have is we have support for you to import NPM packages. Um, so you can import any NPM package from NPM, from your TypeScript or JavaScript ECMAScript module, um, that you have, you already have for your Deno code. Um, and we will under the hood make sure that is installed somewhere in some directory globally, like pnpm does. There's no local node_modules folder you have to deal with.
There's no package.json you have to deal with. And there's no package.json versioning things you need to deal with. What you do is you do import cowsay from "npm:cowsay@1", and that will import cowsay with, like, the semver tag 1. And it'll do the semver resolution the same way Node does, or the same way npm does, rather. And what you get from that is that essentially it gives you this backdoor to call out to all of the existing Node code that has already been written, right? Like, you cannot expect that Deno developers write, like, I don't know. There was this time when Deno did not really have that many third party modules yet. It was very early on, and if you wanted to connect to Postgres and there was no Postgres driver available, then the solution was to write your own Postgres driver. And that is obviously not great. (laughs) So the better solution here is to let users, for these packages where there's no Deno native or web native or standards native package yet that is importable with URL specifiers, import this from npm. So it's sort of this backdoor into the existing npm ecosystem. And we explicitly, for example, don't allow you to create a package.json file or import bare Node specifiers, because we want to stay standards compliant here. But to make this work effectively, we need to give you this little backdoor. And inside of this backdoor, all hell is, like, everything is terrible inside there, right? Inside there you can do bare specifiers, and there's package.json, and there's crazy Node resolution and __dirname and CommonJS. And all of that stuff is supported inside of this backdoor to make all the npm packages work. But on the outside it's exposed as these nice, ESM-only npm specifiers.
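To make the shape of an `npm:` specifier concrete, here is a minimal TypeScript sketch that splits a specifier such as `npm:cowsay@1` into a package name and a version range. `parseNpmSpecifier` and `NpmSpecifier` are hypothetical names used only for illustration; this is not how Deno itself resolves packages, and the sketch ignores subpaths and does no semver matching.

```typescript
// Hypothetical helper: split "npm:<name>[@<range>]" into its parts.
// Illustration only; Deno's real resolver is not shown here.
interface NpmSpecifier {
  name: string;
  range: string; // "*" when no version is given (an assumption of this sketch)
}

function parseNpmSpecifier(specifier: string): NpmSpecifier {
  const prefix = "npm:";
  if (!specifier.startsWith(prefix)) {
    throw new Error(`not an npm specifier: ${specifier}`);
  }
  const rest = specifier.slice(prefix.length);
  // Scoped packages start with "@", so look for the version "@" after index 0.
  const at = rest.indexOf("@", 1);
  if (at === -1) {
    return { name: rest, range: "*" };
  }
  return { name: rest.slice(0, at), range: rest.slice(at + 1) };
}

const parsed = parseNpmSpecifier("npm:cowsay@1");
console.log(parsed.name, parsed.range);
```

The search for `@` starts at index 1 so that a scoped name such as `@scope/pkg` keeps its leading `@` as part of the package name.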
And the reason you would want to use this over just using Node directly is because, again, you wanna use TypeScript with no config necessary. You wanna have a formatter, you wanna have a linter, you wanna have tooling that does testing and benchmarking and compiling or whatever. All of that's built in. You wanna run this on the edge, close to your users, in like 35 different points of presence. It's like: okay, push it to your Git repository, go to this website, click a button two times, and it's running in 35 data centers. This is the kind of developer experience that, I will argue, you cannot get with Node right now. Even if you're using something like ts-node, it is not possible to get the same level of developer experience that you do with Deno. And the speed at which you can iterate on your projects, like create new projects, iterate on them, is incredibly fast in Deno. Like, I can open a folder on my computer, create a single file, main.ts, put some code in there, and then call deno run main.ts. And that's it. I did not need to do npm install, I did not need to do npm init -y and remove the license and version fields from the generated package.json and set private to true and whatever else, right? It just all works out of the box. And I think that's what a lot of people come to Deno for and then ultimately stay for. And also, yeah, standards compliance. So things you build in Deno now are gonna work in five, 10 years, with no hassle.

Node shims and testing

[00:36:39] Jeremy: And so with this compatibility layer, or this shim, is it where the Node code is calling out to Node APIs and you're replacing those with Deno compatible equivalents?

[00:36:54] Luca: Yeah, exactly.
Like, for example, we have a shim in place that shims out the Node crypto API on top of the Web Crypto API. Some people may be familiar with this in the form of Browserify shims, if anybody still remembers those. Essentially, in your front end tooling, you were able to import from, like, node crypto in your front end projects, and then behind the scenes your webpacks or your Browserifys or whatever would take that import from node crypto and would replace it with a shim that essentially exposed the same APIs as node crypto, but under the hood wasn't implemented with native calls, but was implemented on top of Web Crypto, or implemented in userland even. And Deno does something similar. There's a couple edge cases of APIs where we do not expose the underlying thing that we shim to to end users outside of the Node shim. So there's some APIs, I don't know if I have a good example, like Node's nextTick, for example. To properly be able to shim nextTick, you need to implement this within the event loop in the runtime. And you don't need this in Deno, because in Deno you use the web standard queueMicrotask to do this kind of thing. But to be able to shim it correctly and run Node applications correctly, we need to have this sort of backdoor into some ugly APIs which natively integrate in the runtime, but, yeah, allow this Node code to run.

[00:38:21] Jeremy: Anytime you're replacing a component with a shim, I think there's concerns about additional bugs or changes in behavior that can be introduced. Is that something that you're seeing, and how are you accounting for that?

[00:38:38] Luca: Yeah, that's an excellent question. So this is actually a great concern that we have all the time. And it's not even just introducing bugs, sometimes it's removing bugs.
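The nextTick point can be illustrated with a deliberately naive sketch: a nextTick-style function layered on the web standard `queueMicrotask`. `nextTickApprox` is a hypothetical name, and this is not Deno's shim. It only shows why a userland approximation is tempting, and the comments note where it falls short of what Node's own queue guarantees.

```typescript
// Naive approximation of a nextTick-style API on top of queueMicrotask.
// The callback is deferred until the current synchronous code finishes, but
// its ordering relative to promise callbacks is not the same as Node's
// nextTick queue, which is why a faithful shim needs runtime integration.
function nextTickApprox<A extends unknown[]>(
  callback: (...args: A) => void,
  ...args: A
): void {
  queueMicrotask(() => callback(...args));
}

const order: string[] = [];
nextTickApprox((label: string) => { order.push(label); }, "deferred");
order.push("sync");
// At this point only "sync" has been recorded; "deferred" is appended once
// the microtask queue is drained.
```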
Like, sometimes there's bugs in the Node standard library which are there, and people are relying on these bugs to be there for their applications to function correctly. And we've seen this a lot, and then we implement this from scratch and we don't make that same bug. And then the test fails or then the application fails. So what we do is we actually run Node's test suite against Deno's shim layer. So Node has a very extensive test suite for its own standard library, and we can run this suite against our shims to find things like this. And there's still edge cases, obviously. Like, maybe there's a bug which Node was not even aware of existing, where maybe it's now, like, intended behavior because somebody relies on it, right? Like, the second somebody relies on some non-standard or some buggy behavior, it becomes intended. But maybe there was no test that explicitly tests for this behavior. So in that case we'll add our own tests to ensure that. But overall we can already catch a lot of these by just testing against Node's tests. And then the other thing is we run a lot of real code. Like, we'll try to run Prisma, and we'll try to run Vite, and we'll try to run Next.js, and we'll try to run, I don't know, a bunch of other things that people throw at us and check that they work. And if they work and there's no bugs, then we did our job well and our shims are implemented correctly. And then there's obviously always the edge cases where somebody did something absolutely crazy that nobody thought possible, and then they'll open an issue on the Deno repo and we scratch our heads for three days and then we'll fix it. And then in the next release there'll be a new bug that we added to make the compatibility with Node better. So yeah, running tests is the main thing, running Node's tests.
Performance should be equal or better

[00:40:32] Jeremy: Are there performance implications? If someone is running an Express app or a Next.js app in Deno, will they get any benefits from the Deno runtime and performance?

[00:40:45] Luca: Yeah. There are actually performance implications, and they're usually the opposite of what people think they are. Like, usually when you think of performance implications, it's always a negative thing, right? It's always, okay, it's like a compromise. Like, the shim layer must be slower than the real Node, right? It's not like we can run Express faster than Node can run Express. And obviously not everything is faster in Deno than it is in Node, and not everything is faster in Node than it is in Deno. It's dependent on the API, dependent on what each team decided to optimize. And this also extends to other runtimes. Like, you can always cherry pick results to make your runtime look faster in certain benchmarks. But overall, what really matters, the first important step for good Node compatibility, is to make sure that if somebody runs their Node code in Deno or your other runtime or whatever, it performs at least the same. And then anything on top of that: great, cherry on top, perfect. But make sure the baseline is at least the same. And I think, yeah, we have very few APIs where there's a significant performance degradation in Deno compared to Node. And we're actively working on these things. Like, Deno is not a project that's done, right? We have, I think at this point, like 15 or 16 or 17 engineers working on Deno, spanning across all of our different projects. And we have a whole team that's dedicated to performance, and a whole team that's dedicated to Node compatibility.
So these things get addressed, and we make patch releases every week and a minor release every four weeks. So yeah, it's not at a standstill. It's constantly improving.

What should go into the standard library?

[00:42:27] Jeremy: Something that kind of makes Deno stand out is its standard library. There's a lot more in there than there is in the Node one.

[00:42:38] Luca: Mm-hmm.

[00:42:39] Jeremy: I wonder if you could speak to how you make decisions on what should go into it.

[00:42:46] Luca: Yeah, so early on it was easier. Early on, the decision making process was essentially: is this something that a top 100 or top 1000 npm library implements? And if it is, let's include it. And the decision making is still sort of based on that. But right now we've already implemented most of the low hanging fruit. So things that we implement now have discussion around them, whether we should implement them. And we have a process where, well, we have a whole team of engineers on our side and we also have community members that will review PRs and make comments, open issues and review those issues, to sort of discuss the pros and cons of adding any certain new API. And sometimes it's also that somebody opens an issue that's like, for example, I want an API to concatenate two Uint8Arrays together, which is something you can really easily do in Node with Buffer.concat, like the scary Buffer thing. And there's no standards way of doing that right now. So we have to have a little utility function that does that. But in parallel, we're thinking about, okay, how do we propose an addition to the web standards now that makes it easy to concatenate Uint8Arrays in the web standards, right? Yeah, there's a lot to it. But it's all open, like all of our discussions for additions to the standard library and things like that.
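The "little utility function" for joining byte arrays can be sketched in a few lines of TypeScript using only standard typed-array operations. `concatBytes` is a hypothetical name for illustration and is not claimed to match the helper in Deno's standard library.

```typescript
// Concatenate any number of Uint8Arrays into one newly allocated Uint8Array.
function concatBytes(...chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const result = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    result.set(chunk, offset); // copy the chunk at the current write position
    offset += chunk.length;
  }
  return result;
}

const joined = concatBytes(new Uint8Array([1, 2]), new Uint8Array([3, 4, 5]));
console.log(joined); // holds the bytes 1, 2, 3, 4, 5
```

Computing the total length first means the result is allocated once, instead of growing a buffer for every chunk.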
It's all public on GitHub, in the GitHub issues and GitHub discussions and GitHub PRs. So yeah, that's where we do that.

[00:44:18] Jeremy: Yeah, cuz to give an example, I was a little surprised to see that there is support for Markdown front matter built into the standard library. But when you describe it as, we look at the top hundred or thousand packages, are people looking at Markdown? Are they looking at front matter? I'm sure there's a fair amount that are, so that makes sense.

[00:44:41] Luca: Yeah, like, that one specifically was driven by, like, our team was just building a lot of little blog pages and things like that. And every time it was either you roll your own front matter parser or you look for one, and one has a subtle bug here and the other one has a subtle bug there, and we were really not satisfied with any of them. So we rolled that into the standard library. We add good test coverage for it, add good documentation for it, and then it's just a resource that people can rely on. And you then don't have to make the choice of, like, do I use this library to do my front matter parsing or the other library? No, you just use the one that's in the standard library. It's also part of this user experience thing, right? It's just a much nicer user experience not having to make a choice about stuff like that. Like, completely inconsequential stuff, like which library do we use to do front matter parsing? (laughs)

[00:45:32] Jeremy: Yeah. I mean, I think when that stuff is not there, then the temptation is to go, okay, let me see what Node modules there are that will let me parse the front matter. Right. And then it sounds like, probably, ideally you want people to lean more on what's either in the standard library or what's native to the Deno ecosystem. Yeah.

[00:46:00] Luca: Yeah.
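For readers unfamiliar with front matter, a rough TypeScript sketch of the "roll your own" approach is below. `extractFrontMatter` and `Extracted` are hypothetical names, and the sketch only handles flat `key: value` lines between `---` fences, where real front matter is usually full YAML. That gap is exactly the kind of subtle-bug territory described above, so this is not a stand-in for the standard library module.

```typescript
// Hypothetical sketch: split a document into front matter and body.
// Only flat "key: value" lines are handled; nested YAML is out of scope.
interface Extracted {
  attrs: Record<string, string>;
  body: string;
}

function extractFrontMatter(text: string): Extracted {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(text);
  if (match === null) {
    return { attrs: {}, body: text }; // no front matter block found
  }
  const attrs: Record<string, string> = {};
  for (const line of match[1].split(/\r?\n/)) {
    const colon = line.indexOf(":");
    if (colon === -1) continue; // skip lines that are not "key: value"
    attrs[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }
  return { attrs, body: text.slice(match[0].length) };
}

const doc = "---\ntitle: Hello\ndraft: true\n---\n# Post\n";
const extracted = extractFrontMatter(doc);
console.log(extracted.attrs.title, JSON.stringify(extracted.body));
```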
Like, one of the big benefits is that the Deno standard library is implemented on top of web standards, right? It's implemented on top of these standard APIs. So for example, there's Node front matter libraries which do not run in the browser because the browser does not have the Buffer global. Maybe it's a nice library to do front matter parsing with, but you choose it and then three days later you decide that actually this code also needs to run in the browser, and then you need to go switch your front matter library. So those are also kind of reasons why we may include something in the standard library. Like, maybe there's even a really good module already to do something, but if there's certain reliance on specific Node features, and we would like that library to also be compatible with web standards, we might include it in the standard library. Like, for example, the YAML parser in the standard library is a fork of the Node YAML module. And it's essentially that, but cleaned up and made to use more standard APIs rather than Node built-ins.

[00:47:00] Jeremy: Yeah, it kind of reminds me a little bit of when you're writing a front end application, sometimes you'll use Node packages to do certain things and they won't work unless you have a compatibility shim where the browser can make use of certain Node APIs. And if you use the APIs that are built into the browser already, then you won't need to deal with that sort of thing.

[00:47:26] Luca: Yeah. Also, less bundle size, right? Like, if you don't have to shim that, that's less code you have to ship to the client.

WebAssembly use cases

[00:47:33] Jeremy: Another thing I've seen with Deno is it supports running WebAssembly.

[00:47:40] Luca: Mm-hmm.

[00:47:40] Jeremy: So you can export functions and call them from TypeScript.
I was curious if you've seen practical uses of this in production within the context of Deno.

[00:47:53] Luca: Yeah, there's actually a bunch of really practical use cases. So probably the most executed bit of WebAssembly inside of Deno right now is actually esbuild. Like, esbuild has a WebAssembly build. esbuild is something that's written in Go. You have the choice of either running it natively in machine code as, like, an ELF process on Linux, or on Windows or whatever, or you can use the WebAssembly build and then it runs in WebAssembly. And the WebAssembly build is maybe 50% slower than the native build, but that is still significantly faster than Rollup or, I don't know, whatever else people use nowadays to do JavaScript bundling. I don't know, I just use esbuild always. So, for example, the Deno website is running on Deno Deploy. And Deno Deploy does not allow you to run subprocesses because it's this edge runtime which has certain security permissions that are not granted, one of them being subprocesses. So it needs to execute esbuild, and the way it executes esbuild is by running it inside WebAssembly. Because WebAssembly is secure. WebAssembly is something which is part of the JavaScript sandbox. It's inside the JavaScript sandbox. It doesn't poke any holes out. So it's able to run within very strict security contexts. And then other examples are, I don't know, you want to have an HTML sanitizer which is actually built on the real HTML parser in a browser. We have an HTML sanitizer called com or, uh, ammonia, I don't remember. There's an HTML sanitizer library on deno.land/x which is built on the HTML parser from Firefox.
Which, like, ensures essentially that your HTML, like, if you do HTML sanitization, you need to make sure your HTML parser is correct, because if it's not, your browser might parse some HTML one way and your sanitizer parses it another way, and then it doesn't sanitize everything correctly. So there's the Firefox HTML parser compiled to WebAssembly. You can use that to do HTML sanitization. Or the Deno documentation generation tool, for example, deno doc, there's a WebAssembly build for it that allows you to programmatically generate documentation for your TypeScript modules. Yeah, and also, like, deno fmt is available as a WebAssembly module for programmatic access, and a bunch of other internal Deno components as well.

[00:50:20] Jeremy: What are some of the current limitations of WebAssembly in Deno? For example, from WebAssembly, can I make HTTP requests? Can I read files? That sort of thing.

[00:50:34] Luca: Mm-hmm. Yeah. So WebAssembly, like, when you spawn WebAssembly, they're called instances, WebAssembly instances. It runs inside of the same VM, like the same V8 isolate is what they're called, but it's like a completely fresh sandbox, sort of in the sense that I told you about the difference between a runtime and an engine: an engine essentially implements no IO calls, right? And a runtime does, like, a runtime pokes holes into the engine. WebAssembly by default works the same way, in that there are no holes poked into its sandbox. So you have to explicitly poke some holes if you want to do HTTP calls, for example. When you create a WebAssembly instance, you can give it something called imports, which are essentially JavaScript function bindings which you can call from within the WebAssembly. And you can use those function bindings to do anything you can from JavaScript.
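A small TypeScript sketch of the imports mechanism: the byte array below is a hand-assembled WebAssembly module that imports one function, `env.log`, and exports `run`, which calls it with the constant 42. The module, the `env`/`log`/`run` names, and the `received` array are all made up for illustration; only the standard `WebAssembly.Module` and `WebAssembly.Instance` constructors are used.

```typescript
// Hand-assembled WebAssembly binary, equivalent to this text format:
//   (module
//     (import "env" "log" (func $log (param i32)))
//     (func (export "run") i32.const 42 call $log))
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic number + version
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00, // type section
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76, 0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00, // import env.log
  0x03, 0x02, 0x01, 0x01, // function section: one function of type 1
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01, // export "run"
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x41, 0x2a, 0x10, 0x00, 0x0b, // code section
]);

// The only "hole" poked into the sandbox is this one JavaScript binding.
const received: number[] = [];
const imports = {
  env: {
    log: (value: number): void => { received.push(value); },
  },
};

const wasmModule = new WebAssembly.Module(wasmBytes);
const instance = new WebAssembly.Instance(wasmModule, imports);
(instance.exports.run as unknown as () => void)();
console.log(received);
```

Anything the module should be able to do beyond pure computation, such as an HTTP call or a file read, would have to be passed in the same way, as an explicit entry in the imports object.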
You just have to pass them through explicitly. And, yeah, depending on how you write your WebAssembly, like if you write it in Rust, for example, the tooling is very nice and you can just call some JavaScript code from your Rust, and then the build system will automatically make sure that the right function bindings are passed through with the right names. And you don't have to deal with anything. And if you're writing Go, it's slightly more complicated. And if you're writing raw WebAssembly, like the WebAssembly text format, and compiling that to a binary, then you have to do everything yourself. Right? It's sort of the difference between writing C and writing JavaScript. Like, yeah, what level of abstraction do you want? It's definitely possible though. And as for limitations, the same limitations as existing browsers apply. Like, the WebAssembly support in Deno is equivalent to the WebAssembly support in Chrome. So you can do many things like multi-threading and stuff like that already. But especially around shared mutable memory and having access to that memory from JavaScript, that's something which is a real difficulty with WebAssembly right now. Yeah, growing WebAssembly memory is also rather difficult right now. There's a couple inherent limitations right now with WebAssembly itself. But those will be worked out over time. And Deno is very up to date with the version of the standard it implements through V8. Like, we're up to date with Chrome Beta essentially all the time. So, yeah, anything you see in Chrome Beta is gonna be in Deno already.

Deno Deploy

[00:52:58] Jeremy: So you talked a little bit about this before: the Deno team, they have their own hosting platform called Deno Deploy. So I wonder if you could explain what that is.
[00:53:12] Luca: Yeah, so Deno has this really nice concept of permissions which allow you to, sorry, I'm gonna start somewhere slightly unrelated. Maybe it sounds like it's unrelated, but you'll see in a second it's not unrelated. Deno has this really nice permission system which allows you to sandbox Deno programs to only allow them to do certain operations. For example, in Deno, by default, if you try to open a file, it'll error out and say you don't have read permissions to read this file. And then what you do is you specify --allow-read. You can either specify --allow-read and then it'll grant read access to the entire file system, or you can explicitly specify files or folders or any number of things. Same goes for write permissions, same goes for network permissions, same goes for running subprocesses, all these kind of things. And by limiting your permissions just a little bit, like, for example, by just disabling subprocesses and foreign function interface, but allowing everything else, allowing reads and allowing network access and all that kind of stuff, we can run Deno programs in a way that is significantly more cost effective to you as the end user, and we can cold start them much faster than you may be able to with a more conventional container based system. So what Deno Deploy is, is a way to run JavaScript or Deno code on our data centers all across the world with very little latency. Like, you can write some JavaScript code which serves HTTP requests, deploy that to our platform, and then we'll make sure to spin that code up all across the world and have your users be able to access it through some URL or some custom domain or something like that. And this is very similar to Cloudflare Workers, for example.
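As a rough illustration of "JavaScript code which serves HTTP requests", here is a minimal TypeScript handler written only against the web standard `Request` and `Response` types. The `handler` name, the `/hello` route, and the example.com URLs are assumptions made for this sketch; how the handler is registered with a server or a hosting platform is deliberately left out.

```typescript
// A request handler that depends only on web standard Request/Response.
// Registering it with an HTTP server is runtime-specific and omitted here.
function handler(request: Request): Response {
  const url = new URL(request.url);
  if (url.pathname === "/hello") {
    return new Response("Hello from the edge\n", {
      status: 200,
      headers: { "content-type": "text/plain; charset=utf-8" },
    });
  }
  return new Response("Not found\n", { status: 404 });
}

// Calling the handler directly, without any network involved.
const ok = handler(new Request("https://example.com/hello"));
const missing = handler(new Request("https://example.com/other"));
console.log(ok.status, missing.status);
```

Because the handler needs no file system access and no subprocesses, it fits the restricted permission set described above.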
And, like, Netlify Edge Functions is built on top of Deno Deploy, through our subhosting product. Yeah, essentially Deno Deploy is a cloud hosting service for JavaScript which allows you to execute arbitrary JavaScript. And there's a couple different directions we're going there. One is more end user focused, where you link your GitHub repository and we'll have a nice experience like you do with Netlify and Vercel, where your commits automatically get deployed and you get preview deployments and all that kind of thing. For your backend code though, rather than for your front end websites. Although you could also write front-end websites in Deno, obviously. And the other direction is more business focused. Like, you're writing a SaaS application that provides users with the ability to write their own online store, and you want to give them some ability to customize the checkout experience in some way. So you give them a little text editor that they can type some JavaScript into. And then when your SaaS application needs to hit this code path, it sends a request to us with the code, and we'll execute that code for you in a secure way, in a secure sandbox. You can tell us, this code only has access to my API server and no other networks, to prevent data exfiltration, for example. And then you can have all this super customizable code inside of your SaaS application without having to deal with any of the operational complexities of scaling arbitrary code execution, or even just doing arbitrary code execution, right? This is a very difficult problem, and you give it to someone else and we deal with it and you just get the benefits.
Yeah, that's Deno Deploy, and it's built by the same team that builds the Deno CLI. So all of your favorite Deno CLI or Deno APIs are available in there. It's just as web standard as Deno, like, you have fetch available, you have Blob available, you have Web Crypto available, that kind of thing. Yeah.

Running code in V8 isolates

[00:56:58] Jeremy: So when someone ships you their code and you run it, you mentioned that the cold start time is very low. How is the code being run? Are people getting their own process? It sounds like it's not using containers. I wonder if you could explain a little bit about how that works.

[00:57:20] Luca: Yeah, I can give a high level overview of how it works. So, the way it works is that we essentially have a pool of Deno processes ready. Well, it's not quite Deno processes, it's not the same Deno CLI that you download. It's like a modified version of the Deno CLI based on the same infrastructure, that we have spun up across all of our different regions across the world, across all of our different data centers. And then when we get a request, we'll route that request. The first time we get a request for that, we call them deployments, that code, right? We'll take one of these idle Deno processes and assign that code to run in that process, and then that process can go serve the requests. And these processes, they're isolated. It's essentially a V8 isolate. And it's a much, much, much slimmer version of the Deno CLI essentially, where the only thing it can do is JavaScript execution. It can't even execute TypeScript, for example. TypeScript, we pre-process it up front to make the cold start faster. And then what we do is if you don't get a request for some amount of time
, we'll spin down that isolate and we'll spin up a new idle one in its place. And then if you get another request, I don't know, an hour later for that same deployment, we'll assign it to a new isolate. And yeah, that's a cold start, right? If you have a deployment which receives a bunch of traffic, like let's say you receive a hundred requests per second, we can send a bunch of that traffic to the same isolate. And we'll make sure that if that one isolate isn't able to handle that load, we'll spin it out over multiple isolates and we'll sort of load balance for you. And we'll make sure to always send to the point of presence that's closest to the user making the request, so they get very minimal latency. We have these layers of load balancing in place, and I'm glossing over a bunch of security related things here about how these processes are actually isolated and how we monitor to ensure that you don't break out of these processes. And for example, in Deno Deploy it looks like you have a file system, cuz you can read files from the file system. But in reality, Deno Deploy does not have a file system. Like, the file system is a global virtual file system, which is, yeah, implemented completely differently than it is in the Deno CLI. But as an end user you don't have to care about that, because the only thing you care about is that it has the exact same API as the Deno CLI, and you can run your code locally and if it works there, it's also gonna work in Deploy. Yeah, so that's kind of the high level of Deno Deploy. If any of this sounds interesting to anyone, by the way, we're very actively hiring on Deno Deploy. I happen to be the tech lead for the Deno Deploy product. So I'm always looking for engineers to join our ranks and build cool distributed systems. Deno.com/jobs.
[01:00:15] Jeremy: For people who aren't familiar with isolates, are these each run in their own processes, or do you have a single process and that has a whole bunch of isolates inside it?

[01:00:28] Luca: In the general case, you can say that we run one isolate per process. But there's many asterisks on that, because it's very complicated. I'll just say it's very complicated. In the general case though, it's one isolate per process. Yeah.

Configuring permissions

[01:00:45] Jeremy: And then you touched a little bit on the permissions system. Like, you gave the example of somebody could have a website where they let their users give them code to execute. How does it look in terms of specifying what permissions people have? Like, is that a configuration file? Are those flags you pass in? What does that look like?

[01:01:08] Luca: Yeah. So that product is called subhosting. It's slightly different from our end user platform. It's essentially a service that allows you to, like, you email us, we'll onboard you, and then what you can do is you can send HTTP requests to a certain endpoint with an authentication token and a reference to some code to execute. And then what we'll do is, when we receive that HTTP request, we'll fetch the code, spin up an isolate, and execute the code. We serve the request, return you the response, and then we'll pipe logs to you and stuff like that. And part of that is also, when we pull the code to spin up the isolate, that code doesn't just include the code that we're executing, but also includes things like permissions and various other, we call this isolate configuration. You can inspect, this is all public. We have public docs for this at Deno.com/subhosting, I think. Yes, Deno.com/subhosting.
[01:02:08] Jeremy: And is that built on top of something that's a part of the public Deno project, the open source part? Or is this specific to this sub hosting

PodRocket - A web development podcast from LogRocket
CockroachDB with Aydrian Howard

Feb 21, 2023 (27:26)


DevRel at CockroachDB, Aydrian Howard, joins us to talk about how CockroachDB gives all of your apps effortless scale, bulletproof resilience, and more.

Links
https://twitter.com/itsaydrian
https://beacons.ai/aydrian
https://www.twitch.tv/itsaydrian
https://twitter.com/cockroachdb
https://www.cockroachlabs.com

Tell us what you think of PodRocket
We want to hear from you! We want to know what you love and hate about the podcast. What do you want to hear more about? Who do you want to see on the show? Our producers want to know, and if you talk with us, we'll send you a $25 gift card! If you're interested, schedule a call with us (https://podrocket.logrocket.com/contact-us) or you can email producer Kate Trahan at kate@logrocket.com.

Follow us. Get free stickers.
Follow us on Apple Podcasts, fill out this form (https://podrocket.logrocket.com/get-podrocket-stickers), and we'll send you free PodRocket stickers!

What does LogRocket do?
LogRocket combines frontend monitoring, product analytics, and session replay to help software teams deliver the ideal product experience. Try LogRocket for free today. (https://logrocket.com/signup/?pdr)

Special Guest: Aydrian Howard.

Don’t Break the Bank: Run IT, Change IT
Building for Indestructible, with Spencer Kimball

Don’t Break the Bank: Run IT, Change IT

Play Episode Listen Later Feb 2, 2023 56:37


Spencer talks with us about building a next generation database with Cockroach Labs. He discusses why distributed databases matter for making effective and efficient use of the commodity hardware found in the cloud, which can't be specialized but can scale rapidly. Spencer explains the new realities of doing business across multiple regions with diverse databases, and what is being done to protect information and maintain continuity. He also sheds light on how CockroachDB can help organizations push the envelope.

3 Takeaways:
In the modern era, certainly since the advent of the web, the scale that many applications have to grapple with rapidly exceeds what used to be considered enterprise scale.
The use cases that demand distributed database capabilities are the norm today, as opposed to a newly ascendant luxury kind of capability.
Consistent replication is something that you really need, and the ultimate differentiator comes in if you care about multi-region operation.

Key Quotes:
“What really defines the sort of flavor of distributed database, which Cockroach is, is a need to really use effectively and efficiently the sort of commodity hardware that you find in the cloud. That's the critical thing, so it can't be specialized hardware. The distributed database is just sort of the natural evolution of any of these architectures when you realize that you can only scale vertically so far. And, as you scale vertically, you're really limited to a single location where that hardware sits, and it might be a big, supercomputer, very expensive, but it's still only one location.”
“Ultimately you do have to distribute, if you want to have the kind of business continuity that's required in most mission critical applications, because a data center can go down. So, that means you have to be replicating your data to another location. So, you immediately have redundancy if you've got a failover or a distributed, you know, just a de facto distributed database.”
“When you're talking about just scale, and data intensivity scale, and how much data is under management, and how quickly is it growing? It's big tech. I mean, some of these use cases are mind blowing with the total amount of data that they're starting to write every day. And, it's accelerating in many cases.”
“What we've seen is that you can pretty much do anything, but you're going to have to make some trade-off. And, those trade-offs often are quite reasonable. So, that's sort of the art really, of composing these systems.”

Best Career Advice:
Start off at a company where there's a lot for you to learn, at the best company that you can find.
-------
Bio:
Spencer Kimball
Co-Founder and CEO of Cockroach Labs

Spencer Kimball is the co-founder and CEO of Cockroach Labs, where he maintains a delicate balance between a love for programming, distributed systems, and the excitement of helping the company grow smoothly. While attending the University of California at Berkeley, he was one of the original authors of the GNU Image Manipulation Program (GIMP). He worked on databases during the heyday of the dot com era, and worked for Google during much of their biggest growth and impact.

After college he co-founded WeGo, a company providing tools for building web communities, and served as the company's co-CTO. In 2000, he created a web-based version of GIMP, OnlinePhotoLab.com, with the technology subsequently folded into Ofoto's online image manipulation tools. Kimball started work with Google in 2002, where he helped spearhead Colossus, a new version of the Google File System. He also worked on the Google Servlet Engine. In January 2012, Kimball launched the company Viewfinder, which developed an app that allowed social media users to share photos, chat privately, and search photo history without leaving the app. The company was acquired by Square, Inc. in December 2013. He then formed CockroachDB, an open source project he started on GitHub in February 2014.
-------
For more information:
https://www.linkedin.com/in/spencerwkimball/
-------
About the Hosts
Matthew O'Neill describes himself as husband, dad, geek and IT Exec. He is an Industry Managing Director within VMware's Strategic Ecosystem & Industry Solutions (SEIS) team. You can find Matthew on LinkedIn and Twitter.
Brian Hayes is an audiophile, dad, builder of sheds, maker of mirth, world traveler and Financial Services Industry Lead at VMware. You can find Brian on LinkedIn.

The New Stack Podcast
What LaunchDarkly Learned from 'Eating Its Own Dog Food'

The New Stack Podcast

Play Episode Listen Later Jan 4, 2023 28:37


Feature flags (the on/off toggles, written in conditional statements, that allow organizations greater control over the user experience once code has been deployed) are proliferating and growing more complex, and demand robust feature management, said Karishma Irani, head of product at LaunchDarkly, in this episode of The New Stack Makers.

In a November survey by LaunchDarkly, which queried more than 1,000 DevOps professionals, 69% of participants said that feature flags are “must-have, mission-critical and/or high priority” for their organizations. “Feature management, we believe, is a modern practice that's becoming more and more common with companies that want to deploy more frequently, innovate faster, and just keep a healthy engineering team,” Irani said. The idea of feature management, Irani said, is to “maximize value while minimizing risk.”

LaunchDarkly uses its own software, she said, and eating its own dog food, as the saying goes, has paid off in gaining insights into user needs. As part of LaunchDarkly's virtual conference Trajectory in November, Irani joined Heather Joslyn, features editor of The New Stack, for a wide-ranging conversation about the latest developments in feature management. This episode of Makers was sponsored by LaunchDarkly.

Automating Approvals

As an example of the benefits of having first-hand knowledge of how their company's products are used, Irani pointed to an internal project in mid-2022. When the company migrated from MongoDB to CockroachDB, it used new capabilities in its Feature Workflows product, which allow users to define a workflow that can schedule the gradual release of a feature flag for a future date and time, and automate approval requests. “All of these async processes around approvals schedules, they're critical to releasing software, but they do slow you down and add more potential for manual error or human error,” Irani said. “And so our goal with Feature Workflows was to essentially automate the entire process of a feature release.”

Overhauling Experimentation

This past June, the company also revised its Experimentation offering, she said. Led by James Frost, LaunchDarkly's head of experimentation, the team did “a complete overhaul of our stats engine, they enhanced the integration path of our customers' existing data sets and metrics,” Irani said. “They redesigned our UX and they codified model and experimentation best practices into the product itself.”

For instance, a new metric import API helps prevent the problem of multiple teams or users within a company using different tools for A/B and other experiments. It “significantly cuts down on manual duplicate work when importing metrics for experimentation,” said Irani. “So you can get set up faster.” Another addition to the Experimentation product is a sample ratio mismatch test, she said, so “you can be confident that all of your experiments are correctly allocating traffic to each variant.”

These innovations, along with new capabilities in the company's Core Flagging Platform, are in general availability. On the horizon, and now available through LaunchDarkly's early access program, is Accelerate, which lets users track and visualize key engineering metrics, such as deployment frequency, release frequency, lead time for code changes, and flag coverage. “I'm sure you've caught on already,” Irani said, “but a few of these are DORA metrics, which obviously are extremely critical to our users.”

Check out the entire episode for more details on what's new from LaunchDarkly and the problems that innovators in the feature management space still need to solve.
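
For readers new to the idea, the "on/off toggles, written in conditional statements" and the gradual release described above can be illustrated with a minimal Python sketch. This is not LaunchDarkly's SDK or API: the function names, the flag key `use-new-database`, and the hash-based bucketing scheme are all assumptions chosen for illustration.

```python
import hashlib


def bucket(flag_key: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99.

    Hypothetical helper: hashing keeps a given user in the same bucket
    on every call, so raising the rollout percentage only adds users.
    """
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_key: str, user_id: str, rollout_percent: int) -> bool:
    """Return True if the user falls inside the current rollout percentage."""
    return bucket(flag_key, user_id) < rollout_percent


def read_backend(user_id: str, rollout_percent: int) -> str:
    """Pick a code path; the flag itself is just a conditional."""
    # "use-new-database" is an illustrative flag key, not a real one.
    if is_enabled("use-new-database", user_id, rollout_percent):
        return "new"
    return "old"
```

With `rollout_percent` at 0 every user stays on the old path, and at 100 every user moves to the new one. Intermediate values shift a stable subset of users, which is one simple way a migration like the one described could be released gradually.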

Code Comments
Ben Darnell, Cockroach Labs: Avoiding Failure In Distributed Databases

Code Comments

Play Episode Listen Later Dec 13, 2022 28:34


Ever been so frustrated with the options available that you build your own? Ben Darnell, Chief Architect and Co-Founder of Cockroach Labs, shares how his dissatisfaction with distributed databases led to the creation of CockroachDB. To build a distributed database that not only plans for but expects failures, they needed to implement the Raft consensus algorithm. Getting it up and running was a tough technical challenge. But the result was an incredibly resilient database.

Find out why Netflix uses CockroachDB for their databases. Can you have access to a globally available database at the speed of a regional one? Check out how Cockroach Labs accomplishes this with global tables.

programmier.bar – der Podcast für App- und Webentwicklung
Deep Dive 111 – CockroachDB mit Patrick Schulz

programmier.bar – der Podcast für App- und Webentwicklung

Play Episode Listen Later Oct 21, 2022 67:25


CockroachDB: yet another NoSQL database? All of CockroachDB's advantages, such as its scalability, might suggest so, but CockroachDB is in fact a SQL database! We talk with Patrick Schulz, Sales Engineer at Cockroach Labs, about CockroachDB's compatibility with PostgreSQL, how Cockroach nevertheless manages to scale your database across the globe, and whether there is anything to watch out for when setting up a database so that it stays performant.

At the end of the episode, Patrick gives us a great mnemonic that makes it easy to remember what CockroachDB is all about. You only need to remember the company's name: CockRoach LabS (CRLS).

C = Consistency
R = Resiliency
L = Locality
S = Scalability

After this talk, Jojo and Fabi are itching to set up a new project in which they can use CockroachDB. We have definitely found a hot database candidate for your next project!

Picks of the Day:
Jojo: TablePlus – A database client for your Mac that supports many different databases. Jojo's absolute favorite among database clients!
Patrick: Molt – CockroachDB Migration – Our speaker Patrick brought this pick along and showed us a simple tool for migrating existing databases to CockroachDB.
Fabi: Gitignore.io – A small web tool that helps you create your .gitignore file. But beware: Jojo says that GitHub Copilot supports this too, which would probably make this pick obsolete.

GOTO - Today, Tomorrow and the Future
CockroachDB: The Definitive Guide • Ben Darnell & Guy Harrison

GOTO - Today, Tomorrow and the Future

Play Episode Listen Later Aug 19, 2022 51:46 Transcription Available


This interview was recorded for the GOTO Book Club.
gotopia.tech/bookclub

Ben Darnell - Co-Author of "CockroachDB: The Definitive Guide" and CTO at Cockroach Labs
Guy Harrison - Co-Author of "CockroachDB: The Definitive Guide" and CEO at alwaysNFT.cloud, CTO at ProvenDB

DESCRIPTION
How do modern data platforms integrate into today's world? Join Guy Harrison and Ben Darnell, the authors of "CockroachDB: The Definitive Guide", to learn about the different use cases and unique functions of CockroachDB. Take a deep dive into the migration to the cloud and the different requirements for analytical and transactional data platforms.

The interview is based on Ben & Guy's book "CockroachDB: The Definitive Guide".

RECOMMENDED BOOKS
Darnell, Harrison & Seldess • CockroachDB: The Definitive Guide
Guy Harrison • Next Generation Databases
Guy Harrison & Steven Feuerstein • MySQL Stored Procedure Programming
Guy Harrison & Michael Harrison • MongoDB Performance Tuning
Kishen Das Kondabagilu Rajanna • Getting Started with CockroachDB
Regina Obe & Leo Hsu • PostgreSQL
Simon Riggs & Gianni Ciolli • PostgreSQL 14 Administration Cookbook

Looking for a unique learning experience?
Attend the next GOTO conference near you! Get your ticket at gotopia.tech

SUBSCRIBE TO OUR YOUTUBE CHANNEL - new videos posted almost daily.

The Stack Overflow Podcast
A conversation with Spencer Kimball, creator of GIMP and CockroachDB

The Stack Overflow Podcast

Play Episode Listen Later Aug 12, 2022 30:01


Spencer was one of the original creators of open-source, cross-platform image editing software GIMP (GNU Image Manipulation Program), authored while he was still in college. He went on to spend a decade at Google, plus two years as CTO of Viewfinder, later acquired by Square.

In 2014, he cofounded Cockroach Labs to back his creation CockroachDB, a cloud-native distributed SQL database.

Database sharding is essential for CockroachDB: “a critical part of how Cockroach achieves virtually everything,” says Spencer. Read up on how sharding a database can make it faster.

Like many engineers who find themselves in the C-suite, Spencer went from full-time programmer to full-time CEO. He says it's been a “relatively gentle” evolution, but he can always go back.

Like lots of you out there, Spencer started programming on a TI-99/4, the world's first 16-bit home computer.

Connect with Spencer on LinkedIn or learn more about him.

Today's Lifeboat badge goes to user Hughes M. for their answer to the question Multiple keys pointing to a single value in Redis (Cache) with Java.

Misreading Chat
#95: CockroachDB: The Resilient Geo-Distributed SQL Database

Misreading Chat

Play Episode Listen Later Jul 10, 2022 46:33


Mukai read the paper about a distributed RDB that doesn't die easily.

Break Things On Purpose
Exploration and Resiliency with Mauricio Galdieri

Break Things On Purpose

Play Episode Listen Later Jun 28, 2022 30:42


In this episode, we cover:
Mauricio talks about his background and his role at Pismo (1:14)
Jason and Mauricio discuss tech and reliability with regards to financial institutions (5:59)
Mauricio talks about the work he has done in Chaos Engineering with reliability (10:36)
Mauricio discusses things he and his team have done to maximize success (19:44)
Mauricio talks about new technologies his team has been utilizing (22:59)

Links Referenced:
Pismo: https://pismo.io/
LinkedIn: https://www.linkedin.com/company/pismo/

Transcript

Mauricio: That's why the name Cockroach, I guess, if there's a [laugh] a world nuclear war here, all that will survive would be cockroaches and our clients' data. [laugh]. So, I guess that's the gist of it.

Jason: Welcome to Break Things on Purpose, a podcast about Chaos Engineering and reliability. In this episode, we chat with Mauricio Galdieri, a staff engineer at Pismo, about testing versus exploration, reliability and resiliency, and the challenges of bringing new technologies to the financial sector.

Jason: Welcome to the show.

Mauricio: Hey, thank you. Welcome. Thanks for having me here, Jason.

Jason: Yeah. So, Mauricio, you and I have chatted before in the past. We were at Chaos Conf, and you were part of a panel. So, I'm curious, I guess to kick things off, can you tell folks a little bit more about yourself and what you do at Pismo? And then we can maybe pick up from our conversations previously?

Mauricio: Okay, awesome. I work as a staff engineer here at Pismo. I work in a squad called the staff engineering squad, so we're a bunch of, five squad engineers there. And we're mostly responsible for coming up with new ways of using the existing technology, new technologies for us to have, and also standardizing things like how we use those technologies here. How does it fit the whole processes we have here?
And how does it fit in the pipelines we have here, also?

And so, we do lots of documentation, lots of POCs, and try different things, and we talk to different people from different companies and see how they're solving problems that we also have. So, this is basically our day-to-day activities here. Before that, well, I have a kind of a different story, I guess. Most people that work in this field have a degree in something like a technical degree or something like that. But I actually graduated as an architect in urban planning, so I came from a completely different field.

But I've always worked as a software developer since a long time ago, more than I'm [laugh] willing to disclose. So, at that time when I started working with software development, I like to say that startups were called dotcoms back then, so, [laugh] there were lots of job opportunities back then, so I worked as a software developer at that time. And things evolved. I grew less and less as an architect and more as an engineer, so after I graduated, I started to look for a second degree, but at a more technical college, so I went to an engineering college and graduated as a system analyst.

So, from then on, I've always worked as a software developer and never, never have done any house planning or house project or something like that. And I really doubt if I could do that right now [laugh] so I may be a lousy architect [in that sense 00:03:32]. But anyway, I've worked in different companies, in both the private and public sectors. And I've worked with consultancy firms and so on. But just before I came to Pismo, I went to work with a FinTech.

So, this is where I had my first contact with the world of finance in a software context. Since then, I've dug deep into this industry, and here I am now working at Pismo, for almost five years now.

Jason: Wow. That's quite a journey.
And although it's a unique journey, it's also one that I feel like a lot of folks in tech come from different backgrounds and maybe haven't gone down the traditional computer science route. With that said, you know, one of the things you mentioned was FinTech. Can you give us a little bit of a description of Pismo, just so folks understand the company that you're working at now?

Mauricio: Oh, yeah. Well, Pismo, it's a company that's about six years old now. And we provide infrastructure for financial services. So, we're not banks ourselves, but we provide the infrastructure for banks to build their financial projects with this. So basically, what we do is we manage accounts, we manage those accounts' balances, we have connections with credit card networks, so we process, we're also a credit card processor.

We issue cards, although we're not the issuer in the strict sense, but we issue cards here and manage all the lifecycle of those cards. And basically, that's it. But we have a very broad offering of products, from account management to accounting management, and transactions management, and spending control limits and stuff. So, we have a very broad product portfolio. But basically, what we do is provide infrastructure for financial services.

Jason: That's fascinating to me. So, if I were to sum that up, would it be accurate to say that you're basically like Software as a Service for financial institutions? You do all the heavy lifting?

Mauricio: Yeah, yeah. I could say that, yeah.

Jason: It's interesting to me because, you know, traditionally, we always think of banks, because they need to be regulated and there needs to be a whole lot more security and reliability around finances, we always think of banks as being very slow when it comes to technology.
And so, I think it's interesting that, in essence, what you've said with trying the latest technology and getting to play around with new technology and how it applies, especially within your staff engineering group, it's almost the exact opposite. You're sort of this forefront, this leading edge within the world of finance and technology.

Mauricio: Yeah. And that actually is, it's something that, it's the most difficult part of selling banks on signing up with us, you know? Because they have those ancient systems running on-premises and most likely running on top of COBOL programs and so on. But at the same time, it's highly, highly reliable. They've been running those systems for, like, 40 years, even more than that, so it's very highly reliable.

And as you said, it's a very regulated industry, so it's very hard to sell them this kind of new approach to banking. And actually, we consider this as almost an innovation for them. And it's a little bit strange to talk about innovation in a sense that we're proposing other companies to run in the cloud. This doesn't sound innovative at all nowadays. So, every company runs their systems in the cloud nowadays, so it's difficult to [laugh] realize that this is actually innovation in the banking system because they're not used to running those things.

And as you said, they're slow in adopting new technologies because of security concerns, and so on. So, we're trying to bring these new things to the table and prove them. And we had to prove to banks and other financial institutions that it is possible to run a banking system a hundred percent in the cloud while maintaining security standards and security compliance and governance compliance and all that stuff. It's very hard to do so and we have a very stringent process to evaluate and assess new technologies because we have to make sure it complies with those standards and all those certifications that we need to have in order to operate in this industry.
So, it's very hard, but at the same time, we have lots of new technologies and different ways we can provide the same services to those banks.

And then I think the most difficult part in this is to map what traditional banks were doing into this new way of doing things in the cloud. So, this mapping, sometimes it gets a little confusing and we have to be very patient and very clear with our clients about what they should expect from us and how we will provide the same services they already have now, but using different technologies and different ways. For instance, for communications between different services, they're used to things like webhooks. But webhooks are not reliable; they can fail, and if they fail, you lose that connection, you lose connectivity, and you may lose data and you may have things out of sync using webhooks. So, now we have things like event streaming, or queues and other stuff that you can use to [replay 00:09:47] things and not lose any data.

But at the same time, you have to process this offline, in an asynchronous manner. So, you have to map those synchronous things that they did before to this asynchronous world, this world where we have eventual consistency. It's very difficult, but at the same time, it's a very fascinating industry.

Jason: Yeah, that is fascinating. But I do love how you mentioned taking the idea of the new technology and what it does, and really trying to map that back to previously, you know, those previous practices that they had. And so, along with that, for folks who are listening, again, Mauricio and I had a chat during Chaos Conf a while back, and he was sharing some of the practices that Pismo has done for Chaos Engineering.
And I always liken that back to, you know, Chaos Engineering really is very similar to traditional disaster recovery testing, in many ways, other than oftentimes, your disaster recovery would never actually, you know, take things down. Mauricio, I'm curious, can you share a little bit more about what you've been doing with Chaos Engineering and, in general, with reliability? Are there any new programs or processes that you've worked on within Pismo around Chaos Engineering and reliability?

Mauricio: Well, I think that the first thing to realize, and I think this is the most important point that you need to have very clear in your mind when we're talking about Chaos Engineering, is that we're not testing something when we're doing Chaos Engineering; we're experimenting with something. And there's a subtle but very important distinction between those two concepts. When you test for something, you're testing for something where you know what will happen; you have an idea of how it should behave. You're asserting a certain behavior. You know how the system must behave and you assert that, and you make sure the system doesn't deviate from that by having an automated test, for instance, a unit or integration test, or even functional tests and such.

But Chaos Engineering is more about experimenting. So, it's designed for the unknowns. You don't know what will happen. You're basically experimenting. It's like a lab, you're working in a laboratory, you're trying different stuff to see what happens. You have an idea of what should happen, and we call this a hypothesis, but you're not sure if that is how it will behave.

And actually, it doesn't matter if it complies with your expectations. Even if it doesn't behave the way you expect it to behave or the way you want it to behave, you're still gaining knowledge about your system. So, it's much more about experimenting with new things instead of actually testing for something that you know about.
And our journey here into Chaos Engineering at Pismo, it all began about a year-and-a-half ago when we got a very huge outage on one of our major cloud providers here. And we went down with them; they were out for almost an hour.

And not only were we affected by it, but other digital banks here in Brazil, and also many other services like Slack, Datadog, and other observability tools that were running on that cloud provider at the time, went down together with them. So, it was a major, major outage here. And we were actually caught off guard on this, because we have lots of different ways to make sure the system doesn't go down if something bad happens. But that was so bad that we went down and we couldn't do anything. We were desperate because we couldn't do anything. And also we couldn't even communicate properly because we use Slack as our communication hub, and Slack was down at that time, also, so we could not communicate properly through our official channels.

Also, Datadog, which we were using at the time, also went down and we couldn't even see what was happening in the system because we didn't have any observability running at the time. So, that was a major, major outage we had there. So, we started thinking about ways we could experiment with those major outages and see how we could find ways of still operating at least partially and not go down entirely, or at least have ways to see what was happening even in the face of a major disaster. And those traditional disaster recovery measures that were valid at the time, even those couldn't cope with the kind of outages we were facing at that time.
So, we were trying to look for different ways that we can improve the reliability of our services as a whole.

So, I guess that's when we started looking into Chaos Engineering and started looking for different tools to make that work, and different partnerships we could find, and even different ways we could experiment with this on our existing technology and platform.

Jason: I really love how you characterized that difference between testing and Chaos Engineering. And I think the idea of being more experimental puts you into a mindset of having this concept of, you know, kind of blamelessness, right, around failure. The idea that, like, failure is going to happen and we want to be open to seeing that and to learning from it. More so than a test, right? When we test things, then there's the notion of a pass-fail and fails are bad, whereas with an experiment, that learning is, if it didn't happen the way you expect, there's learning around that and that's a good thing rather than a bad thing, such as failing a test.

Mauricio: Yeah, and that works in a higher framework, I guess, which is resilience itself. So, I guess, chaos experiments, chaos engineering, and all that stuff, it's an important part of a bigger whole that we call resilience. And I guess a key to understanding resilience is this point exactly: the systems never work in unexpected ways. They always behave the way they are expected to behave. They're deterministic in nature. So, we're talking about machines here, computers. We told them what we want them to do.

And even if we have complexity and randomness involved, say if a network connection goes down, it still will behave the way we programmed them to behave. So, every failure should be expected. What we have here is that sometimes they behave in ways we don't want them to behave. And sometimes they behave in ways we want them to behave. So, it's more of a matter of desire, you know?
You want something, you want the system to behave a certain way.

So, in that sense, success should be measured as a performance variability, you know? So, sometimes it will work the way you want and sometimes it will work in ways that you don't want it to behave. And I guess, realizing that, it's key also to understand another point, that is, in that sense, success is the flip side of failure. So, either it works the way you want it or it works the way you don't want it. And what we can do to move the scale towards a more successful operation, the ways you can do this, you must first realize also that, let's go back a little bit then, say, if you have a failure and you look at why it happened, almost never is it the result of one single thing.

Sometimes it is, but this is very rare. Most of the failures, and mainly when we're talking about major failures, they're most likely the result of a context of things that happened that led to this failure. And you can see that the same thing is valid for successes. When you have a success at one point, it's almost never the result of one thing that you did that led to a successful scenario. Most of the time it's a context of different things you did that maximizes your chances of success.

So, to turn this scale towards success, you should create an environment of several things, of a context of things. And this could be tooling, this could be your organizational culture and stuff, all of those things that you do in your company to maximize your chances of success. It's not, you cannot plan for success in that sense because planning is one thing you can do, and planning doesn't involve strategy, for instance. Because planning should be done thinking about things you can do, tasks you can perform, while strategy, you should be turning tables to [laugh] think in terms of strategy.
So, you have to put all of this in the same way on a table and try to organize your company and your culture, your tools and your technology in ways that maximize your chances of success and minimize your chances of failures.

Jason: That's such an interesting insight. So, I'm curious, can you dive into some of the things that you and your team have done to maximize your chances of success?

Mauricio: Okay. When we started working with Chaos Engineering, it was in this sense of trying to do one more thing to maximize our chances of success. And we partnered up with Gremlin and we saw that working with Chaos Engineering, using Gremlin mainly, it's so easy that it's also easy to lose track of what you're doing. It's easy for you to go just for the fun of it and break things down and have fun with it and stuff. So, we had to come up with a way to bring structure to this process.

And by doing so, we should also not be too bureaucratic in the sense of creating a set of steps you should take in order to run a chaos session. So, one way we thought about was to come up with a document. That is the bureaucratic part, so this was a step you should take in order to plan for your chaos session, but there is one part of it, and I think it's one of the most important parts of this chaos session planning, which is that you should describe what you're going to test, but more importantly, why you're going to test this. And this is one of the most important questions because this is a fundamental question: why you're doing this kind of experiment. And to answer that, you have to think about all the things in context.

What are the technologies you're using? Why does it fail in the first place? Are the failures that I expect to see actually failures, or is it just different ways of behaving? And sometimes we consider a failure to be a business rule that was not complied with, that was not met. So, this is an opportunity to think about, are those business rules correct? Should we make it more flexible?
Should we change those business logic?So, when you start asking why you're doing something, you're asking fundamental questions, and I think that puts you in context. And this is one of the major starting points to maximize our chances of success because it makes every engineer involved in running a chaos session, think about their role in the whole process and the role of their services in the whole company. So, I think this is one powerful question to ask before starting any chaos session, and I think this contributes a lot to a successful outcome.Jason: Yeah, I think that's a really great perspective on how to approach Chaos Engineering. Beyond the Chaos Engineering, you mentioned that the staff engineering group that you're part of that Prismo is really responsible for seeing new technologies and new trends and really trying to bring those in and see how they can be used and applied within the financial services sector. Are there any new technologies that you've used recently or that you're looking at right now that has really been fruitful or really applied to finding more success as you've mentioned?Mauricio: Yeah, there are some things we're researching. One of those already went past research and we're already using it in production, which is data—cloud-based, multi-region databases and multi-cloud—also—databases. And we're working with CockroachDB as one of our new database technologies we use. And it's a database built from the ground up to be ultra resilient. And that's why the name Cockroach, I guess, if there's a [laugh] a world nuclear war here, all that will survive would be cockroaches in our client's data. [laugh]. So, I guess that's the gist of it.And we have to think about that in different ways of how we approach this because we're talking about multi-cloud data stores and multi-region and how we deal with data in different regions. And should we replicate all the data between regions and how we do partition data. 
So, we have to think in different ways, how we approach data modeling with those new cloud-based and multi-region and globally distributed databases. Another one that we're—this is more like of a research, is having a sharded processing. And that is, how we can deal with, how we group different parts of the data to be processed separately but using the same logic.And this is a way to scale processing in ways that horizontal scaling in a more traditional way doesn't solve in some instances. Like, when we have—for instance, let me describe one scenario that we have that we're exploring things along those lines. We have a system here called ‘The Ledger,' which keeps track of all of the accounts' balances. And for this system, if we have multiple requests or lots of requests for different accounts, there's no problem because we're updating balances for different accounts, and that works fine. And we can deal with lots and lots of requests. We have a very good performance on that.But when we have lots of requests coming in from one particular accounts, and they're all grouped for this particular account, then we cannot—there's no way around locking at some place. So, you have to lock it either at the database level, or at a distributed locking mechanism level, or at the business logic layer. At some point, you have to lock the access to this account balance. So, this degrades performance because you have to wait for this processing to finish and start another. And how can we deal with that without using locks?And this was the challenge we put that to ourselves. And we're exploring different ways, lots of different ways, and different approaches to that. And we have lots of restrictions on that because this system has to respond quickly, has to respond online, and cannot be in an asynchronous process; it has to be synchronous. So, we have very little space for double-checking it and stuff. 
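As a rough illustration of the sharded-processing idea raised here (group requests so that everything for one account is handled in one place, in order), the following is a minimal Python sketch. It is not Pismo's design, which the conversation describes as still at the proof-of-concept stage; the `ShardedLedger` class, its method names, and the choice of CRC32 as the routing hash are all assumptions made for this example.

```python
import zlib
from collections import deque


class ShardedLedger:
    """Hypothetical sketch: route every update for an account to one shard queue."""

    def __init__(self, num_shards=4):
        # One FIFO queue per shard; each queue is meant to have a single consumer.
        self.queues = [deque() for _ in range(num_shards)]
        self.balances = {}  # account_id -> balance in cents

    def shard_for(self, account_id):
        # crc32 is stable across processes, unlike Python's built-in hash() for str.
        return zlib.crc32(account_id.encode("utf-8")) % len(self.queues)

    def submit(self, account_id, amount_cents):
        # The same account always lands on the same shard, so its updates stay ordered.
        shard = self.shard_for(account_id)
        self.queues[shard].append((account_id, amount_cents))
        return shard

    def drain(self, shard):
        # A single consumer applies its queue in arrival order, so no lock on the
        # balance is needed as long as nothing else writes to this shard's accounts.
        queue = self.queues[shard]
        while queue:
            account_id, amount_cents = queue.popleft()
            self.balances[account_id] = self.balances.get(account_id, 0) + amount_cents
```

The trade-off in a sketch like this is that a very busy account is still processed serially; what changes is that the serialization comes from queue order on one consumer rather than from a lock held at the database or in a distributed lock service.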
So, we're exploring a sharded processing for this one in which we can have a small subset of accounts being routed to one specific consumer to process this transaction, and by doing so, we may have things like a queue of ordered transactions so we can give up locking at the database and maybe improve on performance. But we're still on the POC on that, so let's see what we come up with [laugh] in the next few months.
Jason: I think that's really fascinating. Both from a, you know, having been there, having worked on systems where, you know, very transaction-driven, and having locks be an issue. And so, you know, back in my day of doing this, you know, was traditionally MySQL or Postgres, trying to figure out, like, how do you structure the database. So, I think it's interesting that you're sort of tackling this in two ways, right? You've got CockroachDB, which is more oriented towards reliability, but a lot of the things that you're doing there around, you know, sharding and multi-cloud also have effects for this new work that you're doing on how do you eliminate that locking and try to do sharded processes as well. So, that's all super fascinating to me.
Mauricio: Exactly. Yeah, yeah. This is one of the things that makes you do better at the end of the day, you know? [laugh].
Jason: Yeah, definitely. As an engineer, you know, if anybody's listening and you're thinking of, “Wow, this all sounds fascinating and really cool stuff,” right, “Really cool technologies to be working with and really interesting challenges to solve,” I know, Mauricio, you said that Pismo is hiring. Do you want to share a little bit more about ways that folks can engage with you? Or maybe even join your team?
Mauricio: Yeah, sure. We're hiring; we have lots of jobs open for application. You can go to pismo.io and we have a section for that.
And also, you can find us on LinkedIn; just search for Pismo and then find us there.
And I think if you're an engineer and looking for some cool challenges on that, be sure to check our open positions because we do have lots and lots of cool stuff going on here. And since we're growing global, you have a chance to work from wherever you are. And this also imposes some major challenges for [laugh] for new technologies and making our products, our existing products, work in a globally distributed banking system. So, be sure to check out our channels there.
Jason: Fantastic. Before we wrap up, is there anything else that you'd like to promote or share?
Mauricio: Oh no, I think those are the main channels. You can find us: LinkedIn and our own website, pismo.io. Also, you can find us in some GopherCon conferences, KubeCon, and other—Money20/20; we're attending all of those conferences, be it in the software industry or in the financial industry. You can find us there with a booth there or just visiting or participating in some conferences and so on. So, be sure to check that out there also. I guess that's it.
Jason: Very cool. Well, thanks, Mauricio, for joining us. It's been a pleasure to chat with you again.
Mauricio: Thank you, Jason. And thanks for having me here.
Jason: For links to all the information mentioned, visit our website at gremlin.com/podcast. If you liked this episode, subscribe to the Break Things on Purpose podcast on Spotify, Apple Podcasts, or your favorite podcast platform. Our theme song is called “Battle of Pogs” by Komiku, and it's available on loyaltyfreakmusic.com.

Message à caractère informatique
#72 – La bienveillance des nombres typés est impossible

Jun 24, 2022 67:52


In this episode we hunt trolls with 1xEngineer before making the impossible possible, then do some typing with Rust. We also talk about CockroachDB, heartbeats, circuit breakers, packet transformation, and Pixelation, before finishing like hooligans with music.
00:00:00 Introduction
00:03:15 1xEngineer (Yannick) https://1x.engineer/ Some good kindness
00:09:00 "Making Impossible States Impossible" by Richard Feldman (Hubert) https://www.youtube.com/watch?v=IcgmSRJHu_8
00:22:35 Typing in Rust (Yannick) https://fasterthanli.me/articles/the-curse-of-strong-typing
00:34:50 Enabling the Next Generation of Multi-Region Applications with CockroachDB (PZ) https://www.cockroachlabs.com/blog/sigmod-2022-cockroachdb-multi-region-paper/
00:42:50 Phi φ Accrual Failure Detection (François) https://medium.com/@arpitbhayani/phi-%CF%86-accrual-failure-detection-79c21ce53a7a https://www.researchgate.net/profile/Xavier-Defago/publication/29682135_The_ph_accrual_failure_detector/links/0a85e53ce412e3b069000000/The-ph-accrual-failure-detector.pdf
00:47:20 Will circuit breakers solve my problems? (François) https://brooker.co.za/blog/2022/02/16/circuit-breakers.html https://radlab.cs.berkeley.edu/people/fox/static/pubs/pdf/c18.pdf
00:50:47 How to turn a bunch of bytes into a double (Yannick) https://blog.m-ou.se/floats/
01:00:00 Never, Ever, Ever Use Pixelation for Redacting Text (Hubert) https://bishopfox.com/blog/unredacter-tool-never-pixelation
01:05:44 Closing music: Rogue Legacy 2 OST - Axis Mundi 1 https://www.youtube.com/watch?v=f_QIN57e48A

Thinking Elixir Podcast
103: Vaxine.io and CRDT DBs with James Arthur

Jun 14, 2022 54:27


James Arthur shares his project Vaxine.io, an Elixir layer built on top of a CRDT based distributed Erlang database called Antidote DB. We cover what CRDTs are and introduce how they work. We learn more about Antidote DB, the CURE protocol and especially the Vaxine.io project that adds Ecto types and makes it more approachable to Elixir applications. As applications become more global, the need for strongly consistent distributed writes becomes much more important.
Show Notes online - http://podcast.thinkingelixir.com/103
Elixir Community News
- https://www.elixirconf.eu/talks/typecheck-effortless-runtime-type-checking/ – Marten shared an update on the TypeCheck project from ElixirConf.EU (June 9-10)
- https://podcast.thinkingelixir.com/72 – Episode with Marten about TypeCheck
- https://twitter.com/elixirphoenix/status/1532707770415325185
- https://twitter.com/wojtekmach/status/1532662628077785088 – Screenshot showing the single-file LiveView page
- https://github.com/wojtekmach/mix_install_examples/blob/main/phoenix_live_view.exs – Mix Install Examples - Phoenix LiveView app in ~70 LOC
- https://twitter.com/polvalente/status/1532439823964946432 – New Nx library called nx-signal was shared by the author, Paulo Valente
- https://github.com/polvalente/nx-signal
- https://twitter.com/josevalim/status/1533136904736198656 – José's cryptic tweet about Torchvision, ONNX, and a LiveView app
- https://pytorch.org/vision/stable/index.html – Torchvision docs
- https://onnx.ai/ – ONNX, a format for transporting trained machine learning models
- https://github.com/thehaigo/live_onnx – LiveOnnx project that combines the previous things with Axon and LiveView
- https://github.com/oestrich/aino – Aino released 0.5
- https://twitter.com/ericoestrich/status/1533995968793919488 – Eric explained v0.5 Aino changes
- https://twitter.com/josevalim/status/1533907809942880261 – José Valim tweeted a new graphic, teasing something new in Nx land.
- https://twitter.com/josevalim/status/1534120503182602240 – José mentioned that there are 3 major announcements this month starting at ElixirConfEU. Stay tuned!
Do you have some Elixir news to share? Tell us at @ThinkingElixir (https://twitter.com/ThinkingElixir) or email at show@thinkingelixir.com
Discussion Resources
- https://vaxine.io – Vaxine.io website
- https://antidotedb.eu – Antidote DB website
- https://crdt.tech – CRDT information website
- https://vaxine.io/tech/how-it-works
- https://github.com/vaxine-io
- https://github.com/AntidoteDB/antidote – Erlang project by a different group "A planet scale, highly available, transactional database built on CRDT technology"
- https://www.antidotedb.eu/
- https://github.com/vaxine-io/vaxine
- https://github.com/vaxine-io/vax – Data access library and Ecto integration
- https://github.com/vaxine-io/examples – Example and demo apps
- https://www.foundationdb.org/
- https://riak.com/index.html
- https://www.cockroachlabs.com/
- https://en.wikipedia.org/wiki/CockroachDB
- https://supabase.com/
- https://lunar.vc/
Guest Information
- https://twitter.com/VaxineIO – Vaxine.io on Twitter
- https://github.com/vaxine-io/ – Vaxine Github Organization
- https://vaxine.io – Vaxine.io website
- https://vaxine.io/blog – Blog
Find us online
- Message the show - @ThinkingElixir (https://twitter.com/ThinkingElixir)
- Email the show - show@thinkingelixir.com
- Mark Ericksen - @brainlid (https://twitter.com/brainlid)
- David Bernheisel - @bernheisel (https://twitter.com/bernheisel)
- Cade Ward - @cadebward (https://twitter.com/cadebward)

Techzine Talks
Kubernetes wordt volwassen, maar is nog altijd complex

May 23, 2022 36:59


We attended KubeCon + CloudNativeCon Europe in Valencia last week to learn more about the latest developments around Kubernetes, the best-known project of the Cloud Native Computing Foundation (CNCF). In this podcast you'll hear what we took away from it.
When you talk about cloud-native application development, you very quickly arrive at Kubernetes. This management layer on top of container-based applications has taken off since its inception some eight years ago. Kubernetes in general and containers in particular, however, still remain quite complex. There is not much wrong with that in itself, but you do need to know how to handle it: you should not step in without prior knowledge, and you should understand what is involved.
To get more insight into the developments around Kubernetes, we traveled to Valencia to speak with a good number of people during KubeCon. Not only from the CNCF, but above all from the vendors in the market that offer all kinds of additional services for setting up Kubernetes properly.

In Depth
Building a highly-technical enterprise product? Essential advice for product leaders — Nate Stewart of Cockroach Labs

May 19, 2022 58:08


Today's episode is with Nate Stewart, CPO of Cockroach Labs, the creator of database product CockroachDB. In today's conversation, we cover his essential advice for building a highly-technical product. He sketches out how the Cockroach team decided on the specific use case for its database product. Nate explains the steps the team took to reach conviction on their go-forward plan — which meant saying no to a lot of customers who didn't align with the product roadmap. Nate dives into the tactical ways to avoid taking on too many customer commitments, which he calls tech debt for product teams. Next, Nate dives into his advice for approaching design partnerships, especially when handling more conservative enterprise clients. He explains the different types of design partners, and why you should have all of those represented in the early days of your startup. Finally, we wrap up with his advice for other product leaders, including how to create a rock-solid partnership with a CEO as the first head of product, and how he solicits honest feedback across the executive team. You can follow Nate on Twitter at @Nate_Stewart You can email us questions directly at review@firstround.com or follow us on Twitter @ twitter.com/firstround and twitter.com/brettberson

Red Hat X Podcast Series
The Evolution of Serverless Databases

May 3, 2022 32:52


In this conversation, Jim Walker (@jaymce, Principal Product Evangelist at Cockroach Labs) discusses how serverless has moved from compute to backing data services, and focuses on improving application developer productivity. Plus, we address why developers love consuming a serverless SQL database, how CockroachDB thinks about serverless, and what the future of application development is going to look like with a serverless SQL database.

Kubernetes Bytes
Intro to distributed databases on Kubernetes

Jan 20, 2022 37:00


In today's episode of KubernetesBytes, hosts Ryan Wallner and Bhavin Shah discuss the basics of running distributed databases like Apache Cassandra and Kafka along with Mongo, CockroachDB and others on Kubernetes. There are various capabilities of Kubernetes that were designed for these types of data services and this podcast should help you get a basic understanding of the landscape as well as WHY you may want to run them on Kubernetes.
Show Links:
- https://thenewstack.io/new-tools-for-optimizing-data-resilience-in-kubernetes/
- https://awesome-kubernetes.readthedocs.io/ / https://nubenetes.com/
- https://www.containiq.com/post/should-you-run-a-database-on-kubernetes
- Log4j recap - https://blog.aquasec.com/log4j-vulnerabilities-overview
- IPv6 support for EKS - https://aws.amazon.com/blogs/aws/amazon-elastic-kubernetes-service-adds-ipv6-networking/
- https://thenewstack.io/testkube-a-new-approach-to-cloud-native-testing/
- GigaOM DP report 2 - https://gigaom.com/report/gigaom-radar-for-kubernetes-data-protection-2/
- https://portworx.com/blog/kubernetes-failover-mongodb/
- https://thenewstack.io/the-perfect-pair-kubernetes-and-distributed-sql/
- https://www.purestorage.com/docs.html?item=/type/pdf/subtype/doc/path/content/dam/pdf/en/white-papers/wp-kafka-on-kubernetes-with-portworx.pdf
- https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/dml/dmlAboutDataConsistency.html
- https://developer.ibm.com/tutorials/ba-multi-data-center-cassandra-cluster-kubernetes-platform/
- https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/

The Cloudcast
2021 in Review, 2022 Predictions

Dec 22, 2021 59:54


Aaron (@aarondelp) and Brian (@bgracely) discuss the biggest trends from 2021, and make bold cloud computing predictions in 2022.
SHOW: 577
CLOUD NEWS OF THE WEEK - http://bit.ly/cloudcast-cnotw
CHECK OUT OUR NEW PODCAST - "CLOUDCAST BASICS"
SHOW SPONSORS:
- CBT Nuggets: Expert IT Training for individuals and teams
- Sign up for a CBT Nuggets Free Learner account
- Megaport - Network as a Service Platform
- Try Megaport - Cloud Connectivity Simplified
SHOW NOTES:
PODCAST BUSINESS:
- Crossed our 11yr anniversary
- We crossed 500 shows (March '21)
- Most listens in history, up about 20% from last year
- We launched Cloudcast Basics (4 seasons)
- We launched the Sunday Perspectives shows
- We got named #1 Cloud podcast
- We got named top 20 security podcasts
- Listeners in 130 countries | 4200 Cities
- IPOs from our guests - $2.678B
- VC Funding for our guests - $2.516B
TRENDS and MAJOR STORIES from 2021:
- COVID pandemic continued, although some parts of many businesses opened up as vaccines became available. Working from Home seems to be a very real, long-term possibility for many in Tech. 25% of workers changed jobs (via LinkedIn)
- AWS - $60B, Azure - $68B, GCP - $15B
- AWS has new leadership. re:Invent felt very different.
- ARM is making a big push in the cloud (and Mac M1)
- This idea of “supercloud” or “overlay cloud services” is gaining traction - companies like Red Hat, Snowflake, MongoDB, Confluent, CockroachDB, etc. are growing quickly as SaaS services, even when the cloud has a native service.
- Cloudflare is making a move to chip away at AWS' profits (egress networking)
- Digital Ocean is making a bigger push around SMB cloud and developers
- VMware became independent again (from Dell)
- Cloud providers still haven't acquired legacy software companies to get into the on-premises data centers. They keep adjusting their offerings (Outposts, Arc, Anthos)
- Kubernetes keeps growing, but the hype has slowed down and moved to other areas adjacent to Kubernetes (Service Mesh, eBPF, etc.)
- Software-Supply-Chains and DevSecOps “shift left security” are now heavily funded industry segments.
- The metaverse, Web3, Crypto, NFTs are all starting to get a lot of hype (and confusion)
2022 PREDICTIONS:
- Our 2020 Predictions from last year
- Our 2021 Predictions from last year
AARON's PREDICTIONS
- Zero Trust Models (again…) - Also security has been/will be the hardest part of cloud and hot job market will continue
- Microsoft will become top public cloud worldwide, AWS will fall to #2
- Google will settle into 3rd, 4th, even 5th spot…
BRIAN's PREDICTIONS
- Alphabet/Google decides if they still believe they can get to #2 by 2023
- We'll start seeing the first generation of ex-AWS people starting new companies
FEEDBACK?
- Email: show at the cloudcast dot net
- Twitter: @thecloudcastnet

Screaming in the Cloud
“Liqui”fying the Database Bottleneck with Robert Reeves

Dec 16, 2021 50:45


About RobertR2 advocates for Liquibase customers and provides technical architecture leadership. Prior to co-founding Datical (now Liquibase), Robert was a Director at the Austin Technology Incubator. Robert co-founded Phurnace Software in 2005. He invented and created the flagship product, Phurnace Deliver, which provides middleware infrastructure management to multiple Fortune 500 companies.Links: Liquibase: https://www.liquibase.com Liquibase Community: https://www.liquibase.org Liquibase AWS Marketplace: https://aws.amazon.com/marketplace/seller-profile?id=7e70900d-dcb2-4ef6-adab-f64590f4a967 Github: https://github.com/liquibase Twitter: https://twitter.com/liquibase TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: It seems like there is a new security breach every day. Are you confident that an old SSH key, or a shared admin account, isn't going to come back and bite you? If not, check out Teleport. Teleport is the easiest, most secure way to access all of your infrastructure. The open source Teleport Access Plane consolidates everything you need for secure access to your Linux and Windows servers—and I assure you there is no third option there. Kubernetes clusters, databases, and internal applications like AWS Management Console, Yankins, GitLab, Grafana, Jupyter Notebooks, and more. Teleport's unique approach is not only more secure, it also improves developer productivity. To learn more visit: goteleport.com. And not, that is not me telling you to go away, it is: goteleport.com. Corey: You know how Git works right?Announcer: Sorta, kinda, not really. Please ask someone else.Corey: That's all of us. 
Git is how we build things, and Netlify is one of the best ways I've found to build those things quickly for the web. Netlify's Git-based workflows mean you don't have to play slap-and-tickle with integrating arcane nonsense and web hooks, which are themselves about as well understood as Git. Give them a try and see what folks ranging from my fake Twitter for Pets startup, to global Fortune 2000 companies are raving about. If you end up talking to them—because you don't have to; they get why self-service is important—but if you do, be sure to tell them that I sent you and watch all of the blood drain from their faces instantly. You can find them in the AWS marketplace or at www.netlify.com. N-E-T-L-I-F-Y dot com.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This is a promoted episode. What does that mean in practice? Well, it means the company who provides the guest has paid to turn this into a discussion that's much more aligned with the company than it is the individual.Sometimes it works, Sometimes it doesn't, but the key part of that story is I get paid. Why am I bringing this up? Because today's guest is someone I met in person at Monktoberfest, which is the RedMonk conference in Portland, Maine, one of the only reasons to go to Maine, speaking as someone who grew up there. And I spoke there, I met my guest today, and eventually it turned into this, proving that I am the envy of developer advocates everywhere because now I can directly tie me attending one conference to making a fixed sum of money, and right now they're all screaming and tearing off their headphones and closing this episode. But for those of you who are sticking around, thank you. My guest today is the CTO and co-founder of Liquibase. Please welcome Robert Reeves. Robert, thank you for joining me, and suffering the slings and arrows I'm about to hurled directly into your arse, as a warning shot.Robert: [laugh]. Man. Thanks for having me. 
Corey, I've been looking forward to this for a while. I love hanging out with you.Corey: One of the things I love about the Monktoberfest conference, and frankly, anything that RedMonk gets up to is, forget what's on stage, which is uniformly excellent; forget the people at RedMonk who are wonderful and I aspire to do more work with them in different ways; they're great, but the people that they attract are invariably interesting, they are invariably incredibly diverse in terms of not just demographics, but interests and proclivities. It's just a wonderful group of people, and every time I get the opportunity to spend time with those folks I do, and I've never once regretted it because I get to meet people like you. Snark and cynicism about sponsoring this nonsense aside—for which I do thank you—you've been a fascinating person to talk to you because you're better at a lot of the database-facing things than I am, so I shortcut to instead of forming my own opinions, I just skate off of yours in some cases. You're going to get letters now.Robert: Well, look, it's an occupational hazard, right? Releasing software, it's hard so you have to learn these platforms, and part of it includes the database. But I tell you, you're spot on about Monktoberfest. I left that conference so motivated. Really opened my eyes, certainly injecting empathy into what I do on a day-to-day basis, but it spurred me to action.And there's a lot of programs that we've started at Liquibase that the germination for that seed came from Monktoberfest. And certainly, you know, we were bummed out that it's been canceled two years in a row, but we can't wait to get back and sponsor it. No end of love and affection for that team. They're also really smart and right about a hundred percent of the time.Corey: That's the most amazing part is that they have opinions that generally tend to mirror my own—which, you know—Robert: [laugh].Corey: —confirmation bias is awesome, but they almost never get it wrong. 
And that is one of the impressive things is when I do it, I'm shooting from the hip and I already have an apology half-written and ready to go, whereas when dealing with them, they do research on this and they don't have the ‘I'm a loud, abrasive shitpostter on Twitter' defense to fall back on to defend opinions. And if they do, I've never seen them do it. They're right, and the fact that I am as aligned with them as I am, you'd think that one of us was cribbing from the other. I assure you that's not the case.But every time Steve O'Grady or Rachel Stephens, or Kelly—I forget her last name; my apologies is all Twitter, but she studied medieval history, I remember that—or James Governor writes something, I'm uniformly looking at this and I feel a sense of dismay, been, “Dammit. I should have written this. It's so well written and it makes such a salient point.” I really envy their ability to be so consistently on point.Robert: Well, they're the only analysts we pay money to. So, we vote with our dollars with that one. [laugh].Corey: Yeah. I'm only an analyst when people have analyst budget. Other than that, I'm whatever the hell you describe me. So, let's talk about that thing you're here to show. You know, that little side project thing you found and are the CTO of.I wasn't super familiar with what Liquibase does until I looked into it and then had this—I got to say, it really pissed me off because I'm looking at it, and it's how did I not know that this existed back when the exact problems that you solve are the things I was careening headlong into? I was actively annoyed. You're also an open-source project, which means that you're effectively making all of your money by giving things away and hoping for gratitude to come back on you in the fullness of time, right?Robert: Well, yeah. There's two things there. They're open-source component, but also, where was this when I was struggling with this problem? 
So, for the folks that don't know, what Liquibase does is automate database schema change. So, if you need to update a database—I don't care what it is—as part of your application deployment, we can help.Instead of writing a ticket or manually executing a SQL script, or generating a bunch of docs in a NoSQL database, you can have Liquibase help you out with that. And so I was at a conference years ago, at the booth, doing my booth thing, and a managing director of a very large bank came to me, like, “Hey, what do you do?” And saw what we did and got angry, started yelling at me. “Where were you three years ago when I was struggling with this problem?” Like, spitting mad. [laugh]. And I was like, “Dude, we just started”—this was a while ago—it was like, “We just started the company two years ago. We got here as soon as we could.”But I struggled with this problem when I was a release manager. And so I've been doing this for years and years and years—I don't even want to talk about how long—getting bits from dev to test to production, and the database was always, always, always the bottleneck, whether it was things didn't run the same in test as they did, eventually in production, environments weren't in sync. It's just really hard. And we've automated so much stuff, we've automated application deployment, lowercase a compiled bits; we're building things with containers, so everything's in that container. It's not a J2EE app anymore—yay—but we haven't done a damn thing for the database.And what this means is that we have a whole part of our industry, all of our database professionals, that are frankly struggling. I always say we don't sell software Liquibase. We sell piano recitals, date nights, happy hours, all the stuff you want to do but you can't because you're stuck dealing with the database. And that's what we do at Liquibase.Corey: Well, you're talking about database people. That's not how I even do it. 
I would never call myself that, for very good reason because you know, Route 53 remains the only database I use. But the problem I always had was that, “Great. I'm doing a deployment. Oh, I'm going to put out some changes to some web servers. Okay, what's my rollback?” “Well, we have this other commit we can use.” “Oh, we're going to be making a database schema change. What's your rollback strategy?” “Oh, I've updated my resume and made sure that any personal files I had on my work laptop have been backed up somewhere else for when I immediately leave the company when we can't roll back.” Because there's not really going to be a company anymore at that point.

It's one of those things where everyone sort of holds their breath and winces when it comes to anything that resembles a schema change—or an ALTER TABLE as we used to call it—because that is ‘the mistakes will show’ territory and you can hope and plan for things in pre-prod environments, but it's always scary. It's always terrifying because production is not like other things. That's why I always call my staging environment ‘theory’ because things work in theory but not in production. So, it's how do you avoid the mess of winding up just creating disasters when you're dealing with the reality of your production environments? So, let's back up here. How do you do it? Because it sounds like something people would love to sell me but doesn't exist.

Robert: [laugh]. Well, it's real simple. We have a file, we call it the change log. And this is a ledger. So, databases need to be evolved. You can't drop everything and recreate it from scratch, so you have to apply changes sequentially.

And so what Liquibase will do is it connects to the database, and it says, “Hey, what version are you?” It looks at the change log and sees, “There's ten change sets”—that's what the components of a change log are called, change sets—“There's ten change sets in there and the database is telling me that only five have been executed.” “Oh, great. 
Well, I'll execute these other five.” Or it asks the database, “Hey, how many have been executed?” And it says, “Ten.”

And we've got a couple of meta tables that we have in the database, real simple, ANSI SQL compliant, that store the changes that happen to the database. So, if it's a net new database, say you're running a Docker container with the database in it on your local machine, it's empty, you would run Liquibase, and it says, “Oh, hey. It's got that, you know, new database smell. I can run everything.”

And so the interesting thing happens when you start pointing it at an environment that you haven't updated in a while. So, dev and test typically are going to have a lot of releases. And so there's going to be little tiny incremental changes, but when it's time to go to production, Liquibase will catch it up. And so we speak SQL to the database, if it's a NoSQL database, we'll speak their API and make the changes requested. And that's it. It's very simple in how it works.

The real complex stuff is when we go a couple of inches deeper, when we start doing things like, well, reverse engineering of your database. How can I get a change log of an existing database? Because nobody starts out using Liquibase for a project. You always do it later.

Corey: No, no. It's one of those things where when you're doing a project to see if it works, it's one of those, “Great, I'll run a database in some local Docker container or something just to prove that it works.” And, “Todo: fix this later.” And yeah, that todo becomes load-bearing.

Robert: [laugh]. That's scary. And so, you know, we can help, like, reverse engineering an entire database schema, no problem. We also have things called quality checks. So sure, you can test your Liquibase change against an empty database and it will tell you if it's syntactically correct—you'll get an error if you need to fix something—but it doesn't enforce things like corporate standards. 
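The mechanism Robert describes above, an ordered change log plus tracking tables that record which change sets have already run, can be sketched in a few lines of Python. This is only an illustration of the idea, not Liquibase's implementation: the `applied_changes` table, the `update` function, and the sample change sets are all hypothetical names, and SQLite stands in for a real target database.

```python
import sqlite3

# Hypothetical change log: an ordered ledger of (id, SQL) change sets.
CHANGELOG = [
    ("1-create-customer", "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)"),
    ("2-add-email", "ALTER TABLE customer ADD COLUMN email TEXT"),
    ("3-index-email", "CREATE INDEX idx_customer_email ON customer (email)"),
]


def update(conn, changelog):
    """Apply, in order, every change set not yet recorded in the tracking table."""
    # Hypothetical tracking table; it plays the role of the "meta tables".
    conn.execute("CREATE TABLE IF NOT EXISTS applied_changes (id TEXT PRIMARY KEY)")
    done = {row[0] for row in conn.execute("SELECT id FROM applied_changes")}

    applied = []
    for change_id, sql in changelog:
        if change_id in done:
            continue  # already executed against this database
        conn.execute(sql)
        conn.execute("INSERT INTO applied_changes (id) VALUES (?)", (change_id,))
        applied.append(change_id)
    conn.commit()
    return applied


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # an empty, "net new" database
    print(update(conn, CHANGELOG[:2]))  # an environment that is behind
    print(update(conn, CHANGELOG))      # catch-up runs only what is missing
```

Running the same change log against a database that is behind applies only the missing change sets, which is the "catch it up" behavior described for production.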
“Tables start with T underscore.” “Do not create a foreign key unless those columns have an ID already applied.” And that's what our quality checks do. We used to call it rules, but nobody likes rules, so we call it quality checks now.

Corey: How do you avoid the trap of enumerating all the bad things you've seen happen because at some point, it feels like that's what leads to process ossification at large companies where, “Oh, we had this bad thing happen once, like, a disk filled up, so now we have a check that makes sure that all the disks are at least 20 percent empty.” Et cetera. Great. But you keep stacking those and you have thousands and thousands and thousands of those, and even a one-line code change then has to pass through so many different tests to validate that this isn't going to cause the failure mode that happened that one time in a unicorn circumstance. How do you avoid the bloat and the creep of stuff like that?

Robert: Well, let's look at what we've learned from automated testing. We certainly want more and more tests. Look, the DevOps algorithm is, “All right, we had a problem here.” [laugh]. Or SRE algorithm, I should say. “We had a problem here. What happened? What are we going to change in the future to make sure this doesn't happen?” Typically, that involves a new standard.

Now, ossification occurs when a person has to enforce that standard. And what we should do is seek to have automation, have the machine do it for us. Have the humans come up and identify the problem, find a creative way to look for the issue, and then let the machine enforce it. Ossification happens in large organizations when it's people that are responsible, not the machine. The machines are great at running these things over and over again, and they're never hung over, day after Super Bowl Sunday, their kid doesn't get sick, they don't get sick. But we want humans to look at the things that we need that creative energy, that brain power on. 
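A standard like the “tables start with T underscore” rule mentioned above is the kind of check a machine can enforce on every change. The following is a rough sketch only, not how Liquibase's quality checks are implemented: the `check_table_prefix` function and its regular expression are assumptions made for illustration, and a real check would parse SQL properly rather than pattern-match.

```python
import re

# Matches the table name in a simple CREATE TABLE statement (illustrative only).
CREATE_TABLE = re.compile(
    r"^\s*CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?([A-Za-z_][A-Za-z0-9_]*)",
    re.IGNORECASE,
)


def check_table_prefix(statements, prefix="T_"):
    """Return the names of created tables that do not start with the prefix."""
    violations = []
    for sql in statements:
        match = CREATE_TABLE.match(sql)
        # Statements that do not create a table are ignored by this rule.
        if match and not match.group(1).upper().startswith(prefix.upper()):
            violations.append(match.group(1))
    return violations


if __name__ == "__main__":
    proposed = [
        "CREATE TABLE T_ORDERS (id INT)",
        "create table customer (id int)",
        "ALTER TABLE T_ORDERS ADD COLUMN note TEXT",
    ]
    print(check_table_prefix(proposed))
```

The person decides what the standard is once; the function then runs unchanged against every proposed change, which is the division of labor argued for here.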
And then the rote drudgery, hand that off to the machine.

Corey: Drudgery seems like sort of a job description for a lot of us who spend time doing operation stuff.

Robert: [laugh].

Corey: It's drudgery and it's boring, punctuated by moments of sheer terror. On some level, you're more or less taking some of the adrenaline high of this job away from people. And you know, when it comes to databases, I'm kind of okay with that as it turns out.

Robert: Yeah. Oh, yeah, we want no surprises in database-land. And that is why over the past several decades—can I say several decades since 1979?

Corey: Oh, you can s—it's many decades, I'm sorry to burst your bubble on that.

Robert: [laugh]. Thank you, Corey. Thank you.

Corey: Five, if we're being honest. Go ahead.

Robert: So, it has evolved over these many decades where change is the enemy of stability. And so we don't want change, and we want to lock these things down. And our database professionals have changed from sentinels of data into traffic cops and TSA. And as we all know, some things slip through those. Sometimes we speed, sometimes things get snuck through TSA.

And so what we need to do is create a system where it's not the people that are in charge of that; that we can set these policies and have our database professionals do more valuable things, instead of that adrenaline rush of, “Oh, my God,” how about we get the rush of solving a problem and saving the company millions of dollars? How about that rush? How about the rush of taking our old, busted on-prem databases and figuring out a way to scale these up in the cloud, and also provide quick dev and test environments for our developer and test friends? These are exciting things. These are more fun, I would argue.

Corey: You have a list of reference customers on your website that are awesome. In fact, we share a reference customer in the form of Ticketmaster. 
And I don't think that they will get too upset if I mention that based upon my work with them, at no point was I left with the impression that they played fast and loose with databases. This was something that they take very seriously because for any company that, you know, sells tickets to things you kind of need an authoritative record of who's bought what, or suddenly you don't really have a ticket-selling business anymore. You also reference customers in the form of UPS, which is important; banks in a variety of different places.Yeah, this is stuff that matters. And you support—from the looks of it—every database people can name except for Route 53. You've got RDS, you've got Redshift, you've got Postgres-squeal, you've got Oracle, Snowflake, Google's Cloud Spanner—lest people think that it winds up being just something from a legacy perspective—Cassandra, et cetera, et cetera, et cetera, CockroachDB. I could go on because you have multiple pages of these things, SAP HANA—whatever the hell that's supposed to be—Yugabyte, and so on, and so forth. And it's like, some of these, like, ‘now you're just making up animals' territory.Robert: Well, that goes back to open-source, you know, you were talking about that earlier. There is no way in hell we could have brought out support for all these database platforms without us being open-source. That is where the community aligns their goals and works to a common end. So, I'll give you an example. So, case in point, recently, let me see Yugabyte, CockroachDB, AWS Redshift, and Google Cloud Spanner.So, these are four folks that reached out to us and said, either A) “Hey, we want Liquibase to support our database,” or B) “We want you to improve the support that's already there.” And so we have what we call—which is a super creative name—the Liquibase test harness, which is just genius because it's an automated way of running a whole suite of tests against an arbitrary database. 
And that helped us partner with these database vendors very quickly and to identify gaps. And so there's certain things that AWS Redshift—certain objects—that AWS Redshift doesn't support, for all the right reasons. Because it's data warehouse.Okay, great. And so we didn't have to run those tests. But there were other tests that we had to run, so we create a new test for them. They actually wrote some of those tests. Our friends at Yugabyte, CockroachDB, Cloud Spanner, they wrote these extensions and they came to us and partnered with us.The only way this works is with open-source, by being open, by being transparent, and aligning what we want out of life. And so what our friends—our database friends—wanted was they wanted more tooling for their platform. We wanted to support their platform. So, by teaming up, we help the most important person, [laugh] the most important person, and that's the customer. That's it. It was not about, “Oh, money,” and all this other stuff. It was, “This makes our customers' lives easier. So, let's do it. Oop, no brainer.”Corey: There's something to be said for making people's lives easier. I do want to talk about that open-source versus commercial divide. If I Google Liquibase—which, you know, I don't know how typing addresses in browsers works anymore because search engines are so fast—I just type in Liquibase. And the first thing it spits me out to is liquibase.org, which is the Community open-source version. And there's a link there to the Pro paid version and whatnot. And I was just scrolling idly through the comparison chart to see, “Oh, so ‘Community' is just code for shitty and you're holding back advanced features.” But it really doesn't look that way. What's the deal here?Robert: Oh, no. So, Liquibase open-source project started in 2006 and Liquibase the company, the commercial entity, started after that, 2012; 2014, first deal. And so, for—Nathan Voxland started this, and Nathan was struggling. 
He was working at a company, and he had to have his application—of course—you know, early 2000s, J2EE—support SQL Server and Oracle and he was struggling with it. And so he open-sourced it and added more and more databases.

Certainly, as open-source databases grew, obviously he added those: MySQL, Postgres. But we're never going to undo that stuff. There's rollback for free in Liquibase, we're not going to be jerks [laugh] and either A) pull features out or, B) even worse, make Stephen O'Grady's life awful by changing the license [laugh] so he has to write about it. He loves writing about open-source license changes. We're Apache 2.0 and so you can do whatever you want with it.

And we believe that the things that make sense for a paying customer, which is database-specific objects, that makes sense. But Liquibase Community, the open-source stuff, that is built so you can go to any database. So, if you have a change log that runs against Oracle, it should be able to run against SQL Server, or MySQL, or Postgres, as long as you don't use platform-specific data types and those sorts of things. And so that's what Community is about. Community is about being able to support any database with the same change log. Pro is about helping you get to that next level of DevOps Nirvana, of reaching those four metrics that Dr. Forsgren tells us are really important.

Corey: Oh, yes. You can argue with Nicole Forsgren, but then you're wrong. So, why would you ever do that?

Robert: Yeah. Yeah. [laugh]. It's just—it's a sucker's bet. Don't do it. There's a reason why she's got a PhD in CS.

Corey: She has been a recurring guest on this show, and I only wish she would come back more often. You and I are fun to talk to, don't get me wrong. We want unbridled intellect that is couched in just a scintillating wit, and someone who is great to talk to. Sorry, we're both outclassed.

Robert: Yeah, you get entertained with us; you learn with her.

Corey: Exactly. 
And you're still entertained while doing it is the best part.Robert: [laugh]. That's the difference between Community and Pro. Look, at the end of the day, if you're an individual developer just trying to solve a problem and get done and away from the computer and go spend time with your friends and family, yeah, go use Liquibase Community. If it's something that you think can improve the rest of the organization by teaming up and taking advantage of the collaboration features? Yes, sure, let us know. We're happy to help.Corey: Now, if people wanted to become an attorney, but law school was too expensive, out of reach, too much time, et cetera, but they did have a Twitter account, very often, they'll find that they can scratch that itch by arguing online about open-source licenses. So, I want to be very clear—because those people are odious when they email me—that you are licensed under the Apache License. That is a bonafide OSI approved open-source license. It is not everyone except big cloud companies, or service providers, which basically are people dancing around—they mean Amazon. So, let's be clear. One, are you worried about Amazon launching a competitive service with a dumb name? And/or have you really been validated as a product if AWS hasn't attempted and failed to launch a competitor?Robert: [laugh]. Well, I mean, we do have a very large corporation that has embedded Liquibase into one of their flagship products, and that is Oracle. They have embedded Liquibase in SQLcl. We're tickled pink because that means that, one, yes, it does validate Liquibase is the right way to do it, but it also means more people are getting help. Now, for Oracle users, if you're just an Oracle shop, great, have fun. We think it's a great solution. 
But there's not a lot of those.

And so we believe that if you have Liquibase, whether it's open-source or the Pro version, then you're going to be able to support all the databases, and I think that's more important than being tied to a single cloud. Also—this is just my opinion and take it for what it's worth—but if Amazon wanted to do this, well, they're not the only game in town. So, somebody else is going to want to do it, too. And, you know, I would argue even with Amazon's backing that Liquibase is a little stronger brand than anything they would come out with.

Corey: This episode is sponsored by our friends at Oracle. HeatWave is a new high-performance accelerator for the Oracle MySQL Database Service. Although I insist on calling it “my squirrel.” While MySQL has long been the world's most popular open source database, shifting from transacting to analytics required way too much overhead and, ya know, work. With HeatWave you can run your OLTP and OLAP, don't ask me to ever say those acronyms again, workloads directly from your MySQL database and eliminate the time consuming data movement and integration work, while also performing 1100X faster than Amazon Aurora, and 2.5X faster than Amazon Redshift, at a third of the cost. My thanks again to Oracle Cloud for sponsoring this ridiculous nonsense.

Corey: So, I want to call out though, that on some level, they have already competed with you because one database that you do not support is DynamoDB. Let's ignore the Route 53 stuff because, okay. But the reason behind that, having worked with it myself, is that, “Oh, how do you do a schema change in DynamoDB?” The answer is that you don't because it doesn't do schemas, for one—it is schemaless, which is kind of the point of it—as well as oh, you want to change the primary, or the partition, or the sort key index? Great. 
You need a new table because those things are immutable.

So, they've solved this Gordian Knot just like Alexander the Great did by cutting through it. Like, “Oh, how do you wind up doing this?” “You don't do this. The end.” And that is certainly an approach, but there are scenarios where, first, NoSQL is not an acceptable answer for some workloads.

I know Rick [Horahan 00:26:16] is going to yell at me for that as soon as he hears me, but okay. But there are some for which a relational database is kind of a thing, and you need that. So, Dynamo isn't fit for everything. But there are other workloads where, okay, I'm going to just switch over. I'm going to basically dump all the data and add it to a new table. I can't necessarily afford to do that with anything less than maybe, you know, 20 milliseconds of downtime between table one and table two. And there are obnoxious and difficult ways to do it, but for everything else, you do kind of need to make ALTER TABLE changes from time to time as you go through the build and release process.

Robert: Yeah. Well, we certainly have plans for DynamoDB support. We are working our way through all the NoSQLs. Started with Mongo, and—

Corey: Well, back that out a second then for me because there's something I'm clearly not grasping because it's my understanding, DynamoDB is schemaless. You can put whatever you want into various arbitrary fields. How would Liquibase work with something like that?

Robert: Well, that's something I struggled with. I had the same question. Like, “Dude, really, we're a schema change tool. Why would we work with a schemaless database?” And so what happened was a soon-to-be friend of ours in Europe had reached out to me and said, “I built an extension for MongoDB in Liquibase. Can we open-source this, and can y'all take care of the care and feeding of this?” And I said, “Absolutely. 
What does it do?” [laugh].

And so I looked at it and it turns out that it focuses on collections and generating data for test. So, you're right about schemaless because these are just documents and we're not going to go through every single document and change the structure, we're just going to have the application create a new doc in the new format. Maybe there's conversion logic built into the app, who knows. But it's the database professionals that have to apply these collections—you know, indices; that's what they call them in Mongo-land: collections. And so being able to apply these across all environments—dev, test, production—and have consistency, that's important.

Now, what was really interesting is that this came from MasterCard. So, this engineer had a consulting business and worked for MasterCard. And they had a problem, and they said, “Hey, can you fix this with Liquibase?” And he said, “Sure, no problem.” And he built it.

So, that's why if you go to the MongoDB—the liquibase-mongodb repository in our Liquibase org, you'll see that MasterCard has the copyright on all that code. Still Apache 2.0. But for me, that was the validation we needed to start expanding to other things: Dynamo, Couch. And same—

Corey: Oh, yeah. For a lot of contributors, there's a contributor license process you can go through, assign copyright. For everything else, there's MasterCard.

Robert: Yeah. Well, we don't do that. Look, you know, we certainly have a code of conduct with our community, but we don't have assigning copyright and that kind of stuff. Because that's baked into Apache 2.0. So, why would I want to take somebody's ability to get credit and magical internet points and increase their rep by taking that away? That's just rude.

Corey: The problem I keep smacking myself into is just looking at how the entire database space across the board goes, it feels like it's built on lock-in, it's super finicky to work with, and it generally feels like, okay, great. 
You take something like Postgres-squeal or whatever it is you want to run your database on, yeah, you could theoretically move it a bunch of other places, but moving databases is really hard. Back when I was at my last “real job,” quote-unquote, years ago, we were late to the game; we migrated the entire site from EC2 Classic into a VPC, and the biggest pain in the ass with all of that was the RDS instance. Because we had to quiesce the database so it would stop taking writes; we would then snapshot it, shut it down, and then restore a new database from that RDS snapshot.

How long does it take, at least in those days? That is left as an experiment for the reader. So, we booked a four hour maintenance window under the fear that would not be enough. It completed in 45 minutes. So okay, there's that. Sparked the thing up and everything else was tested and good to go. And yay. Okay.

It took a tremendous amount of planning, a tremendous amount of work, and that wasn't moving it very far. It is the only time I've done a late-night deploy, where not a single thing went wrong. Until I was on the way home and the Uber driver sideswiped a city vehicle. So, there we go—

Robert: [laugh].

Corey: —that's the one. But everything else was flawless on this because we planned these things out. But imagine moving to a different provider. Oh, forget it. Or imagine moving to a different database engine? That's good. Tell another one.

Robert: Well, those are the problems that we want our database professionals to solve. We do not want them to be like janitors at an elementary school, cleaning up developer throw-up with sawdust. The issue that you're describing, that's a one time event. This is something that doesn't happen very often. 
You need hands on the keyboard, you want people there to look for problems.If you can take these database releases away from those folks and automate them safely—you can have safety and speed—then that frees up their time to do these other herculean tasks, these other feats of strength that they're far better at. There is no silver bullet panacea for database issues. All we're trying to do is take about 70% of DBAs time and free it up to do the fun stuff that you described. There are people that really enjoy that, and we want to free up their time so they can do that. Moving to another platform, going from the data center to the cloud, these sorts of things, this is what we want a human on; we don't want them updating a column three times in a row because dev couldn't get it right. Let's just give them the keys and make sure they stay in their lane.Corey: There's something glorious about being able to do that. I wish that there were more commonly appreciated ways of addressing those pains, rather than, “Oh, we're going to sell you something big and enterprise-y and it's going to add a bunch of process and not work out super well for you.” You integrate with existing CI/CD systems reasonably well, as best I can tell because the nice thing about CI/CD—and by nice I mean awful—is that there is no consensus. Every pipeline you see, in a release engineering process inherently becomes this beautiful bespoke unicorn.Robert: Mm-hm. Yeah. And we have to. We have to integrate with whatever CI/CD they have in place. And we do not want customers to just run Liquibase by itself. We want them to integrate it with whatever is driving that application deployment.We're Switzerland when it comes to databases, and CI/CD. And I certainly have my favorite of those, and it's primarily based on who bought me drinks at the last conference, but we cannot go into somebody's house and start rearranging the furniture. That's just rude. 
If they're deploying the app a certain way, what we tell that customer is, “Hey, we're just going to have that CI/CD tool call Liquibase to update the database. This should be an atomic unit of deployment.” And it should be hidden from the person that pushes that shiny button or the automation that does it.Corey: I wish that one day that you could automate all of the button pushing, but the thing that always annoyed me in release engineering was the, “Oh, and here's where we stop to have a human press the button.” And I get it. That stuff's scary for some folks, but at the same time, this is the nature of reality. So, you're not going to be able to technology your way around people. At least not successfully and not for very long.Robert: It's about trust. You have to earn that database professional's trust because if something goes wrong, blaming Liquibase doesn't go very far. In that company, they're going to want a person [laugh] who has a badge to—with a throat to choke. And so I've seen this pattern over and over again.And this happened at our first customer. Major, major, big, big, big bank, and this was on the consumer side. They were doing their first production push, and they wanted us ready. Not on the call, but ready if there was an issue they needed to escalate and get us to help them out. And so my VP of Engineering and me, we took it. Great. Got VP of engineering and CTO. Right on.And so Kevin and I, we stayed home, stayed sober [laugh], you know—a lot of places to party in Austin; we fought that temptation—and so we stayed and I'm texting with Kevin, back and forth. “Did you get a call?” “No, I didn't get a call.” It was Friday night. Saturday rolls around. Sunday. “Did you get a—what's going on?” [laugh].Monday, we're like, “Hey. Everything, okay? Did you push to the next weekend?” They're like, “Oh, no. We did. It went great. We forgot to tell you.” [laugh]. But here's what happened. 
The DBAs push the Liquibase ‘make it go’ button, and then they said, “Uh-Oh.” And we're like, “What do you mean, uh-oh?” They said, “Well, something went wrong.” “Well, what went wrong?” “Well, it was too fast.” [laugh]. Something—no way. And so they went through the whole thing—

Corey: That was my downtime when I was supposed to be compiling.

Robert: Yeah. So, they went through the whole thing to verify every single change set. Okay, so that was weekend one. And then they go to weekend two, they do the same thing. All right, all right. Building trust.

By week four, they called a meeting with the release team. And they said, “Hey, process change. We're no longer going to be on these calls. You are going to push the Liquibase button. Now, if you want to integrate it with your CI/CD, go right ahead, but that's not my problem. Dev—or, the release team is tier one; dev is tier two; we—DBAs—are tier three support, but we'll call you because we'll know something went wrong.” And to this day, it's all automated.

And so you have to earn trust to get people to give that up. Once they have trust and you really—it's based on empathy. You have to understand how terrible [laugh] they are sometimes treated, and to actively take care of them, realize the problems they're struggling with, and when you earn that trust, then and only then will they allow automation. But it's hard, but it's something you got to do.

Corey: You mentioned something a minute ago that I want to focus on a little bit more closely, specifically that you're in Austin. Seems like that's a popular choice lately. You've got companies that are relocating their headquarters there, presumably for tax purposes. Oracle's there, Tesla's there. Great. I mean, from my perspective, terrific because it gets a number of notably annoying CEOs out of my backyard. But what's going on? Why is Austin on this meteoric rise and how'd it get there?

Robert: Well, a lot of folks—overnight success, 40 years in the making, I guess. 
But what a lot of people don't realize is that, one, we had a pretty vibrant tech hub prior to all this. It all started with MCC, Microcomputer Consortium, which in the '80s, we were afraid of the Japanese taking over and so we decided to get a bunch of companies together, and Admiral Bobby Inman who was director planted it in Austin. And that's where it started. You certainly have other folks that have a huge impact, obviously, Michael Dell, Austin Ventures, a whole host of folks that have really leaned in on tech in Austin, but it actually started before that.So, there was a time where Willie Nelson was in Nashville and was just fed up with RCA Records. They would not release his albums because he wanted to change his sound. And so he had some nice friends at Atlantic Records that said, “Willie, we got this. Go to New York, use our studio, cut an album, we'll fix it up.” And so he cut an album called Shotgun Willie, famous for having “Whiskey River” which is what he uses to open and close every show.But that album sucked as far as sales. It's a good album, I like it. But it didn't sell except for one place in America: in Austin, Texas. It sold more copies in Austin than anywhere else. And so Willie was like, “I need to go check this out.”And so he shows up in Austin and sees a bunch of rednecks and hippies hanging out together, really geeking out on music. It was a great vibe. And then he calls, you know, Kris, and Waylon, and Merle, and say, “Come on down.” And so what happened here was a bunch of people really wanted to geek out on this new type of country music, outlaw country. And it started a pattern where people just geek out on stuff they really like.So, same thing with Austin film. You got Robert Rodriguez, you got Richard Linklater, and Slackers, his first movie, that's why I moved to Austin. And I got a job at Les Amis—a coffee shop that's closed—because it had three scenes in that. 
There was a whole scene of people that just really wanted to make different types of films. And we see that with software, we see that with film, we see it with fashion.

And it just seems that Austin is the place where if you're really into something, you're going to find somebody here that really wants to get into it with you, whether it's board gaming, D&D, noise punk, whatever. And that's really comforting. I think it's the community that's just welcoming. And I just hope that we can continue that creativity, that sense of community, and that we don't have large corporations that are coming in and just taking from the system. I hope they inject more.

I think Oracle's done a really good job; their new headquarters is gorgeous, they've done some really good things with the city, doing a land swap, I think it was forty acres for nine acres. They coughed up forty for nine. And it was nine acres the city wasn't even using. Great. So, I think they're being good citizens. I think Tesla's been pretty cool with building that factory where it is. I hope more come. I hope they catch whatever is in the water and the breakfast tacos in Austin.

Corey: [laugh]. I certainly look forward to this pandemic ending; I can come over and find out for myself. I'm looking forward to it. I always enjoyed my time there, I just wish I got to spend more of it.

Robert: How many folks from Duckbill Group are in Austin now?

Corey: One at the moment. Tim Banks. And the challenge, of course, is that if you look across the board, there really aren't that many places that have more than one employee. For example, our operations person, Megan, is here in San Francisco and so is Jesse DeRose, our manager of cloud economics. But my business partner is in Portland; we have people scattered all over the country.

It's kind of fun having a fully-distributed company. We started this way, back when that was easy. And because all right, travel is easy; we'll just go and visit whenever we need to. 
But there's no central office, which I think is sort of the dangerous part of full remote because then you have this idea of second-class citizens hanging out in one part of the country and then they go out to lunch together and that's where the real decisions get made. And then you get caught up to speed. It definitely fosters a writing culture.

Robert: Yeah. When we went to remote work, our lease was up. We just didn't renew. And now we have expanded hiring outside of Austin, we have folks in the Ukraine, Poland, Brazil, more and more coming. We even have folks that are moving out of Austin to places like Minnesota and Virginia, moving back home where their family is located.

And that is wonderful. But we are getting together as a company in January. We're also going to, instead of having an office, we're calling it a ‘Liquibase Lounge.' So, there's a number of retail places that didn't survive, and so we're going to take one of those spots and just make a little hangout place so that people can come in. And we also want to open it up for the community as well.

But it's very important, and we learned this from our friends at GitLab and their culture. We really studied how they do it, how they've been successful, and it is an awareness of those lunch meetings where the decisions are made. And it is saying, “Nope, this is great we've had this conversation. We need to have this conversation again. Let's bring other people in.” And that's how we're doing at Liquibase, and so far it seems to work.

Corey: I'm looking forward to seeing what happens, once this whole pandemic ends, and how things continue to thrive. We're long past due for a startup center that isn't San Francisco. The whole thing is based on the idea of disruption.
“Oh, we're disruptive.” “Yes, we're so disruptive, we've taken a job that can be done from literally anywhere with internet access and created a land crunch in eight square miles, located in an earthquake zone.” Genius, simply genius.

Robert: It's a shame that we had to have such a tragedy to happen to fix that.

Corey: Isn't that the truth?

Robert: It really is. But the toothpaste is out of the tube. You ain't putting that back in. But my bet on the next Tech Hub: Kansas City. That town is cool, it has one hundred percent Google Fiber all throughout, great university. Kauffman Fellows, I believe, is based there, so VC folks are trained there. I believe so; I hope I'm not wrong with that. I know Kauffman Foundation is there. But look, there's something happening in that town. And so if you're a buy low, sell high kind of person, come check us out in Austin. I'm not trying to dissuade anybody from moving to Austin; I'm not one of those people. But if the housing prices [laugh] you don't like them, check out Kansas City, and get that two-gig fiber for peanuts. Well, $75 worth of peanuts.

Corey: Robert, I want to thank you for taking the time to speak with me so extensively about Liquibase, about how awesome RedMonk is, about Austin and so many other topics. If people want to learn more, where can they find you?

Robert: Well, I think the best place to find us right now is in AWS Marketplace. So...

Corey: Now, hang on a second. When you say the best place for anything being the AWS Marketplace, I'm naturally a little suspicious. Tell me more.

Robert: [laugh]. Well, best is, you know, it's... [laugh].

Corey: It is a place that is there and people can find you through it. All right, then.

Robert: I have a list. I have a list. But the first one I'm going to mention is AWS Marketplace. And so that's a really easy way, especially if you're taking advantage of the EDP, Enterprise Discount Program. That's helpful. Burn down those dollars, get a discount, et cetera, et cetera.
Now, of course, you can go to liquibase.com, download a trial. Or you can find us on Github, github.com/liquibase. Of course, talking smack to us on Twitter is always appreciated.

Corey: And we will, of course, include links to that in the show notes. Robert Reeves, CTO and co-founder of Liquibase. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment complaining about how Liquibase doesn't support your database engine of choice, which will quickly be rendered obsolete by the open-source community.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.

Talk Python To Me - Python conversations for passionate developers
#345: 10 Tips and Tools for Developer Productivity

Talk Python To Me - Python conversations for passionate developers

Play Episode Listen Later Dec 15, 2021 76:39


You know that feeling when one of your developer friends or colleagues tells you about some amazing tool, library, or shell environment that you never heard of that you just have to run out and try right away? This episode is jam-packed full of those moments. We welcome back Jay Miller to discuss tools and tips for developer productivity. The title says 10 tips, but we actually veer into many more along the way. I think you'll really enjoy this useful and light-hearted episode.

Links from the show
Jay on Twitter: @kjaymiller
More Oh my ZSH plugins: github.com
exa: the.exa.website
bat: github.com
ripgrep/amber: github.com
Neovim: neovim.io
RUMPS macOS Framework: github.com
Black: github.com
pypi-changes package: readthedocs.io
asdf-python: github.com
WAVE Web Accessibility Evaluation Tool: wave.webaim.org
Google PageSpeed: pagespeed.web.dev
XKCD Commit messages: xkcd.com
secure package: github.com
OWASP Top 10: owasp.org
ngrok: ngrok.com
starship: starship.rs
Homebrew: brew.sh
Chocolatey: chocolatey.org
pip-tools: github.com
Let's Encrypt: letsencrypt.org
Sourcetree Git App: sourcetreeapp.com
Oh my ZSH: ohmyz.sh
nerd fonts: nerdfonts.com
Oh my Posh: ohmyposh.dev
Windows Terminal: microsoft.com
McFly shell history: github.com
Fig IO enhanced shell: fig.io
Conduit podcast: relay.fm
htmx course at Talk Python: talkpython.fm/htmx
Watch this episode on YouTube: youtube.com
Episode transcripts: talkpython.fm

--- Stay in touch with us ---
Subscribe on YouTube: youtube.com
Follow Talk Python on Twitter: @talkpython
Follow Michael on Twitter: @mkennedy

Sponsors
Coiled
CockroachDB
AssemblyAI
Talk Python Training

The Cloudcast
The Evolution of Serverless Databases

The Cloudcast

Play Episode Listen Later Nov 24, 2021 33:12


Jim Walker (@jaymce, Principal Product Evangelist at @CockroachDB) talks about how serverless has moved from compute to backing data services, and focuses on improving application developer productivity.

SHOW: 569

CLOUD NEWS OF THE WEEK - http://bit.ly/cloudcast-cnotw

CHECK OUT OUR NEW PODCAST - "CLOUDCAST BASICS"

SHOW SPONSORS:
Megaport - Network as a Service Platform
Try Megaport - Cloud Connectivity Simplified
CBT Nuggets: Expert IT Training for individuals and teams
Sign up for a CBT Nuggets Free Learner account

SHOW NOTES:
Cockroach Labs
CockroachDB Serverless
(Eps.438) Scalable Databases on Kubernetes

Topic 1 - Welcome to the show. Let's start by talking a little bit about your background, and where you focus your attention at Cockroach Labs. Give us a quick overview of CockroachDB.

Topic 2 - We've covered “Serverless” quite a bit on the show, mostly focused around compute services, but there seems to be a growing trend around serverless as it relates to data services. Can you give us some background on what's driving these new capabilities?

Topic 3 - Implementations of Database-as-a-Service have been around for quite a while now. What's new or different around service database offerings?

Topic 4 - CockroachDB is fairly unique in its ability to span locations (e.g. in essence be multi-cloud). Help us connect the dots between the core elements of CockroachDB, and where the new serverless capabilities enhance that (for an application, for a DB team, for an Operations team, etc.)

Topic 5 - What are some of the new use-cases that can potentially be unlocked with the new serverless offering?

Topic 6 - How do you expect the new serverless offering to change the way that application teams and operations teams interact with systems going forward?

FEEDBACK?
Email: show at the cloudcast dot net
Twitter: @thecloudcastnet

Cloud Posse DevOps
Cloud Posse DevOps "Office Hours" (2021-11-17)

Cloud Posse DevOps "Office Hours" Podcast

Play Episode Listen Later Nov 17, 2021 62:30


Cloud Posse holds public "Office Hours" every Wednesday at 11:30am PST to answer questions on all things related to DevOps, Terraform, Kubernetes, CICD. Basically, it's like an interactive "Lunch & Learn" session where we get together for about an hour and talk shop. These are totally free and just an opportunity to ask us (or our community of experts) any questions you may have. You can register here: https://cloudposse.com/office-hours

Join the conversation: https://slack.cloudposse.com/

Find out how we can help your company:
https://cloudposse.com/quiz
https://cloudposse.com/accelerate/

Learn more about Cloud Posse:
https://cloudposse.com
https://github.com/cloudposse
https://sweetops.com/
https://newsletter.cloudposse.com
https://podcast.cloudposse.com/

[00:00:00] Intro
[00:01:09] American spy hacked booking.com, company stayed silent
https://www.nrc.nl/nieuws/2021/11/10/american-spy-hacked-bookingcom-company-stayed-silent-a4065086
[00:02:51] Fake emails sent from infrastructure owned by the FBI/DHS (the LEEP portal)
https://twitter.com/spamhaus/status/1459450061696417792?s=21
[00:06:35] Argo CD v2.2 release candidate
https://blog.argoproj.io/argo-cd-v2-2-release-candidate-4e16e985b486
[00:10:00] Resource Factories: A descriptive approach to Terraform
https://medium.com/google-cloud/resource-factories-a-descriptive-approach-to-terraform-581b3ebb59c
[00:36:15] Terraform Module Versions Cli
https://github.com/keilerkonzept/terraform-module-versions
[00:39:48] Are there any SQL database (e.g., CockroachDB, Percona) solutions which run in AWS (EC2 or EKS), and outperform AWS Aurora or any proxy recommendations to put in front of Aurora that provide query priority, better replication etc?
[00:47:56] Moving from Terragrunt into native Terraform, what are good resources to learn how to split Terraform workspaces for infrastructure?
[00:52:13] How many dev teams are using conventional commits?
[01:00:25] Outro

#officehours,#cloudposse,#sweetops,#devops,#sre,#terraform,#kubernetes,#aws

Support the show (https://cloudposse.com/office-hours/)

Les Cast Codeurs Podcast
LCC 267 - Lagom efface sa dette technique

Les Cast Codeurs Podcast

Play Episode Listen Later Nov 15, 2021 76:33


Antonio and Emmanuel discuss Microsoft and Java, Cryostat, Java 17, Micronaut, Quarkus, Play Framework, Lagom, Amazon, CORS, CSS (yes, really), Hibernate Reactive, AtomicJar, canary releases, and algorithmic amplification.

Recorded on November 12, 2021
Episode download: LesCastCodeurs-Episode-267.mp3

News

Languages

Blog post on code snippets in Javadoc (Oct 18, 2021)
- Nicer to use than pre tags: no escaping needed (for < and >), and leading whitespace is normalized
- Some portions can be highlighted, and some parts can be replaced using a regular expression
- The snippet's source can also be externalized: instead of putting the code in the Javadoc, you can reference a region of your real code
- So at least you know the code is valid and compiles
- Gunnar explains how to reuse code from test classes so that it shows up in the Javadoc, creating truly "executable" documentation

Compressed class space (Mar 27, 2019)
- Compressed object or class pointers: 64-bit pointers stored in 32 bits via a relative address
- Because of the relative addressing, the Klass structures in metaspace must sit in contiguous memory that is pre-allocated up front (otherwise reallocation could fail if the free memory is not contiguous)
- So metaspace is split into a class part and a non-class part. The Klass region is 32G max and contiguous, and the class part is called the compressed class space
- 1G by default, configurable up to 3G. It is virtual memory, just a reservation
- About 1K per class, so roughly 1,000,000 classes max
- Only when compressed oops are used
- Only for Java heap sizes up to 32G

Cryostat 2.0 (Oct 18, 2021)
- Provides a secure API to profile and monitor Java apps in containers with Java Flight Recorder
- Cryostat can retrieve, store, and analyze Flight Recorder recordings from containers
- The recordings are then consumed by Grafana or the JDK Mission Control desktop app
- By default the recording file stays local to the container, which is not practical
- A direct JMX connection is neither practical nor secure by default
- Cryostat retrieves the recordings over HTTPS
- Has a Kubernetes operator
- Etc.

Microsoft increases its investments in Java (Nov 4, 2021)
- Microsoft joins the JCP
- Works on VS Code for Java with Red Hat
- Is fine with the LTS cadence moving to 2 years and will help support these more frequent releases

Libraries

Micronaut 3.1 (Oct 11, 2021)
- Support for applications using JDK 17
- Dependency injection improvements (repeatable scopes, primitive beans, etc.)
- Generated classes are smaller, and memory consumption under GraalVM is improved
- HTTP routes by regexp
- Random port binding (to avoid conflicts in tests)
- TLS certificates can be changed via refresh without stopping the server
- Kotlin coroutines supported in Micronaut Data
- Broader JPA support coverage (e.g. attribute converter)
- Support for Kubernetes informers via the Kubernetes SDK
- Oracle Coherence integration is out of preview

Quarkus 2.4 (Oct 27, 2021)
- Hibernate Reactive 1.0.0.Final
- Introducing Kafka Streams DevUI (nice for developing with it and seeing what is going on)
- Support continuous testing for multi module projects
- Support AWT image resize via new AWT extension

Lightbend drops Play Framework (Oct 20, 2021)
- Lightbend was built on Scala, Akka, and Play Framework
- This comes at the time of the 2.0, I believe
- But with the cloud, they want to focus on distributed systems
- Akka Open Source and Akka Serverless (their PaaS)
- Play is left to the community and Lightbend stops investing in it
- In a separate organization
- Needs sponsors and contributors
- Question: hadn't they already stopped Scala?

Lightbend divests from Lagom too (Oct 27, 2021)
- Lagom is superseded by Akka Platform and Akka Serverless
- Too many limiting constraints in the framework
- Lightbend customers remain supported on Lagom, but without new features

Infrastructure

Installing and using podman-machine on macOS (Oct 19, 2021)
- Virtualization relies on QEMU and sets up a VM in which the pods run
- Podman Machine installs a Linux VM with the tools
- Also works on Linux, for those that do not support Podman or for sandboxing
- Works on M1
- Homebrew for the installation
- Roughly like Docker Machine before it
- There is also a nice presentation from Devoxx France

Cloud

Amazon declares war on Microsoft using the "proprietary" argument (Oct 28, 2021)
- Aurora has a front end that speaks the SQL Server protocol (Babelfish for Aurora PostgreSQL)
- And it converts T-SQL
- The T-SQL to Postgres part is open source (debug). Under the ASL license
- Not everything is open-sourced yet

Web

CORS explained (Oct 12, 2021)
- Including images from other sites is where it all started
- Cookies, credentials, etc. were sent along
- Yahoo Mail could leak users' credentials
- An iframe could read the contents of another iframe (Netscape introduced cross-frame scripting)
- Access-Control-Allow-Origin: * is OK if there is no private data

Making a raw HTML page look nice in 100 characters of CSS (Oct 16, 2021)
- Basic, but explained line by line
- E.g. 60-80 characters per line for readability
- And 100 more bytes to improve it further

Data

elasticsearch 8.0 will require java 17 (Nov 3, 2021)
- definitely easier for something standalone than a library or anything that needs to share the JDK with all its apps
- GitHub PR

Hibernate Reactive 1.0.0, is it worth it? (Oct 27, 2021)
- PostgreSQL, MySQL, MariaDB, Db2, SQL Server, and CockroachDB
- Databases designed for classic interactions
- So high-level constructs tend to be limited by the underlying protocol, which was barely noticeable, if at all, with JDBC
- Use HR if your app is already reactive at its core (e.g. RESTEasy Reactive in Quarkus or a Vert.x app)
- Performance comparison with TechEmpower, but from the angle of latency at a given volume rather than max throughput
- 20 queries in a row: 20k requests/s -> 35k under 10 ms of latency. The relative value is what matters
- One query plus processing to render to the client: little difference
- Throughput tends to be better
- Reactive has improved over the past year
- A video cast on the topic

Tooling

AtomicJar launches a cloud offering (Nov 4, 2021)
- Testcontainers containers no longer run locally
- They run in AtomicJar's cloud instead
- Has more resources than a typical local machine (2 cores and 8 GB RAM for the Docker machine)
- You can keep using your machine while the tests run
- For CI environments that are limited with respect to containers, or for cloud IDEs, to avoid spending too much
- No problem with M1
- A small binary to install (e.g. via curl)
- Testcontainers and Quarkus: Testcontainers Cloud works with Dev Services (containers started and configured automatically)
- Still under development (private beta; you can request an invitation)

Methodologies

Canary releases or having testers (Nov 4, 2021)
- A canary release is a production release to a small subset of users
- It can help see whether a new feature interests users before committing to it long term
- Rollback is always an option
- So can internal testing be reduced?
- Reputation risk or user churn (acquisition and retention are expensive)
- Comprehensive automated tests make the canary risk acceptable
- Exploratory tests complement the automated tests

Law, society and organization

The right to decompile to fix errors confirmed as legal (Oct 21, 2021)
- Ruling of October 6, 2021
- To fix an error affecting operation, including by disabling a function that affects the proper functioning of the application

Influence of algorithmic amplification on political content (Oct 21, 2021)
- Do algorithmic recommendations amplify political content?
- In the case of algorithmically ordered timelines rather than reverse-chronological ones
- Does it vary between political parties or political groups?
- Are some news sources amplified more than others?
- Elected officials are amplified more than general political content
- No particular amplification of some individuals over others within the same party ????
- The right tends to get more amplification than the left
- Right-leaning news sources are also amplified more than left-leaning ones
- The methodology is detailed, for example on what counts as a right-leaning newspaper
- Why amplification differs is a harder question to answer
- Amplification is not bad by default, but it is if it leads to preferential treatment caused by the algorithm (vs. how people interact on the platform)
- The PDF of the full study

Conferences
DevFest Lille on November 19, 2021
Devoxx France from April 20 to 22, 2022
SunnyTech on June 30 and July 1, 2022 in Montpellier

Contact us
Support Les Cast Codeurs on Patreon https://www.patreon.com/LesCastCodeurs
Do a crowdcast or a crowdquestion
Contact us via Twitter https://twitter.com/lescastcodeurs, on the Google group https://groups.google.com/group/lescastcodeurs, or on the website https://lescastcodeurs.com/
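
The compressed class space figures in the notes above (a reservation of 1G by default, up to 3G, and roughly 1K per class) can be turned into a quick back-of-the-envelope check. This is only a sketch: the function name is made up for illustration, the 1K average comes straight from the notes, and real Klass structures vary in size, so treat the result as an order of magnitude.

```python
GIB = 1 << 30  # bytes in one gibibyte
KIB = 1 << 10  # bytes in one kibibyte


def max_loadable_classes(class_space_bytes: int = GIB, avg_klass_bytes: int = KIB) -> int:
    """Rough upper bound on loadable classes for a given class space reservation.

    Hypothetical helper: it simply divides the reserved size by an assumed
    average Klass size (about 1 KiB per class, per the show notes).
    """
    return class_space_bytes // avg_klass_bytes


if __name__ == "__main__":
    print(max_loadable_classes())         # default 1 GiB reservation
    print(max_loadable_classes(3 * GIB))  # 3 GiB ceiling mentioned in the notes
```

With the default reservation this gives about a million classes, which matches the figure quoted in the notes; the 3G ceiling gives about three times that.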

BadGeek
Les Cast Codeurs n°267 du 15/11/21 - LCC 267 - Lagom efface sa dette technique

BadGeek

Play Episode Listen Later Nov 15, 2021 76:33


Antonio and Emmanuel discuss Microsoft and Java, Cryostat, Java 17, Micronaut, Quarkus, Play Framework, Lagom, Amazon, CORS, CSS (yes, really), Hibernate Reactive, AtomicJar, canary releases, and algorithmic amplification.

Recorded on November 12, 2021

Episode download: [LesCastCodeurs-Episode-267.mp3](https://traffic.libsyn.com/lescastcodeurs/LesCastCodeurs-Episode-267.mp3)

## News

### Languages

[Blog post on code snippets in Javadoc](https://www.morling.dev/blog/executable-javadoc-code-snippets/) (Oct 18, 2021)

* Nicer to use than pre tags: no escaping needed (for < and >), and leading whitespace is normalized
* Some portions can be highlighted, and some parts can be replaced using a regular expression
* The snippet's source can also be externalized: instead of putting the code in the Javadoc, you can reference a region of your real code
* So at least you know the code is valid and compiles
* Gunnar explains how to reuse code from test classes so that it shows up in the Javadoc, creating truly "executable" documentation

[Compressed class space](https://stuefe.de/posts/metaspace/what-is-compressed-class-space/) (Mar 27, 2019)

* Compressed object or class pointers: 64-bit pointers stored in 32 bits via a relative address
* Because of the relative addressing, the Klass structures in metaspace must sit in contiguous memory that is pre-allocated up front (otherwise reallocation could fail if the free memory is not contiguous)
* So metaspace is split into a class part and a non-class part. The Klass region is 32G max and contiguous, and the class part is called the compressed class space
* 1G by default, configurable up to 3G. It is virtual memory, just a reservation
* About 1K per class, so roughly 1,000,000 classes max
* Only when compressed oops are used
* Only for Java heap sizes up to 32G

[Cryostat 2.0](https://developers.redhat.com/articles/2021/10/18/announcing-cryostat-20-jdk-flight-recorder-containers) (Oct 18, 2021)

* Provides a secure API to profile and monitor Java apps in containers with Java Flight Recorder
* Cryostat can retrieve, store, and analyze Flight Recorder recordings from containers
* The recordings are then consumed by Grafana or the JDK Mission Control desktop app
* By default the recording file stays local to the container, which is not practical
* A direct JMX connection is neither practical nor secure by default
* Cryostat retrieves the recordings over HTTPS
* Has a Kubernetes operator
* Etc.

[Microsoft increases its investments in Java](https://devblogs.microsoft.com/java/microsoft-deepens-its-investments-in-java/) (Nov 4, 2021)

* Microsoft joins the [JCP](https://jcp.org/)
* Works on VS Code for Java with Red Hat
* Is fine with the LTS cadence moving to 2 years and will help support these more frequent releases

### Libraries

[Micronaut 3.1](https://micronaut.io/2021/10/11/micronaut-framework-released/) (Oct 11, 2021)

* Support for applications using JDK 17
* Dependency injection improvements (repeatable scopes, primitive beans, etc.)
* Generated classes are smaller, and memory consumption under GraalVM is improved
* HTTP routes by regexp
* Random port binding (to avoid conflicts in tests)
* TLS certificates can be changed via refresh without stopping the server
* Kotlin coroutines supported in Micronaut Data
* Broader JPA support coverage (e.g. attribute converter)
* Support for Kubernetes informers via the Kubernetes SDK
* Oracle Coherence integration is out of preview

[Quarkus 2.4](https://quarkus.io/blog/quarkus-2-4-0-final-released/) (Oct 27, 2021)

* Hibernate Reactive 1.0.0.Final
* Introducing Kafka Streams DevUI (nice for developing with it and seeing what is going on)
* Support continuous testing for multi module projects
* Support AWT image resize via new AWT extension

[Lightbend drops Play Framework](https://www.lightbend.com/blog/on-the-future-of-play-framework) (Oct 20, 2021)

* Lightbend was built on Scala, Akka, and Play Framework
* This comes at the time of the 2.0, I believe
* But with the cloud, they want to focus on distributed systems
* Akka Open Source and Akka Serverless (their PaaS)
* Play is left to the community and Lightbend stops investing in it
* In a separate organization
* Needs sponsors and contributors
* Question: hadn't they already stopped Scala?

[Lightbend divests from Lagom too](https://discuss.lightbend.com/t/the-future-of-lagom/8962) (Oct 27, 2021)

* Lagom is superseded by Akka Platform and Akka Serverless
* Too many limiting constraints in the framework
* Lightbend customers remain supported on Lagom, but without new features

### Infrastructure

[Installing and using podman-machine on macOS](https://blog.while-true-do.io/podman-machine/) (Oct 19, 2021)

* Virtualization relies on QEMU and sets up a VM in which the pods run
* Podman Machine installs a Linux VM with the tools
* Also works on Linux, for those that do not support Podman or for sandboxing
* Works on M1
* Homebrew for the installation
* Roughly like Docker Machine before it
* [There is also a nice presentation from Devoxx France](https://www.youtube.com/watch?v=pUFIG2AMDhg)

### Cloud

[Amazon declares war on Microsoft using the "proprietary" argument](https://aws.amazon.com/blogs/aws/goodbye-microsoft-sql-server-hello-babelfish/) (Oct 28, 2021)

* Aurora has a front end that speaks the SQL Server protocol ([Babelfish for Aurora PostgreSQL](https://aws.amazon.com/fr/rds/aurora/babelfish/))
* And it converts T-SQL
* The T-SQL to Postgres part is open source (debug). Under the ASL license
* Not everything is open-sourced yet

### Web

[CORS explained](https://jakearchibald.com/2021/cors/) (Oct 12, 2021)

* Including images from other sites is where it all started
* Cookies, credentials, etc. were sent along
* Yahoo Mail could leak users' credentials
* An iframe could read the contents of another iframe (Netscape introduced cross-frame scripting)
* `Access-Control-Allow-Origin: *` is OK if there is no private data

[Making a raw HTML page look nice in 100 characters of CSS](https://www.swyx.io/css-100-bytes) (Oct 16, 2021)

* Basic, but explained line by line
* E.g. 60-80 characters per line for readability
* And 100 more bytes to improve it further

### Data

[elasticsearch 8.0 will require java 17](https://twitter.com/xeraa/status/1455980076001071106) (Nov 3, 2021)

* definitely easier for something standalone than a library or anything that needs to share the JDK with all its apps
* [GitHub PR](https://github.com/elastic/elasticsearch/pull/79873)

[Hibernate Reactive 1.0.0, is it worth it?](https://in.relation.to/2021/10/27/hibernate-reactive-performance/) (Oct 27, 2021)

* PostgreSQL, MySQL, MariaDB, Db2, SQL Server, and CockroachDB
* Databases designed for classic interactions
* So high-level constructs tend to be limited by the underlying protocol, which was barely noticeable, if at all, with JDBC
* Use HR if your app is already reactive at its core (e.g. RESTEasy Reactive in Quarkus or a Vert.x app)
* Performance comparison with TechEmpower, but from the angle of latency at a given volume rather than max throughput
* 20 queries in a row: 20k requests/s -> 35k under 10 ms of latency. The relative value is what matters
* One query plus processing to render to the client: little difference
* Throughput tends to be better
* Reactive has improved over the past year
* [A video cast on the topic](https://youtu.be/VGAnVX1lCxg)

### Tooling

[AtomicJar launches a cloud offering](https://www.atomicjar.com/2021/11/announcing-testcontainers-cloud/) (Nov 4, 2021)

* Testcontainers containers no longer run locally
* They run in AtomicJar's cloud instead
* Has more resources than a typical local machine (2 cores and 8 GB RAM for the Docker machine)
* You can keep using your machine while the tests run
* For CI environments that are limited with respect to containers, or for cloud IDEs, to avoid spending too much
* No problem with M1
* A small binary to install (e.g. via curl)
* Testcontainers and Quarkus: Testcontainers Cloud works with Dev Services (containers started and configured automatically)
* Still under development (private beta; you can request an invitation)

### Methodologies

[Canary releases or having testers](https://www.infoq.com/articles/canary-releases-testing/) (Nov 4, 2021)

* A canary release is a production release to a small subset of users
* It can help see whether a new feature interests users before committing to it long term
* Rollback is always an option
* So can internal testing be reduced?
* Reputation risk or user churn (acquisition and retention are expensive)
* Comprehensive automated tests make the canary risk acceptable
* Exploratory tests complement the automated tests

### Law, society and organization

[The right to decompile to fix errors confirmed as legal](https://www.legalis.net/actualite/le-droit-a-decompiler-un-logiciel-pour-corriger-des-erreurs-confirme-par-la-cjue/) (Oct 21, 2021)

* Ruling of October 6, 2021
* To fix an error affecting operation, including by disabling a function that affects the proper functioning of the application

[Influence of algorithmic amplification on political content](https://blog.twitter.com/en_us/topics/company/2021/rml-politicalcontent) (Oct 21, 2021)

* Do algorithmic recommendations amplify political content?
* In the case of algorithmically ordered timelines rather than reverse-chronological ones
* Does it vary between political parties or political groups?
* Are some news sources amplified more than others?
* Elected officials are amplified more than general political content
* No particular amplification of some individuals over others within the same party ????
* The right tends to get more amplification than the left
* Right-leaning news sources are also amplified more than left-leaning ones
* The methodology is detailed, for example on what counts as a right-leaning newspaper
* Why amplification differs is a harder question to answer
* Amplification is not bad by default, but it is if it leads to preferential treatment caused by the algorithm (vs. how people interact on the platform)
* [The PDF of the full study](https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/official/en_us/company/2021/rml/Algorithmic-Amplification-of-Politics-on-Twitter.pdf)

## Conferences

[DevFest Lille on November 19, 2021](https://devfest.gdglille.org/)

[Devoxx France from April 20 to 22, 2022](https://www.devoxx.fr/)

[SunnyTech on June 30 and July 1, 2022 in Montpellier](https://sunny-tech.io/)

## Contact us

Support Les Cast Codeurs on Patreon

[Do a crowdcast or a crowdquestion](https://lescastcodeurs.com/crowdcasting/)

Contact us via Twitter, on the Google group, or on the website
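
The CORS item in the notes above says a wildcard `Access-Control-Allow-Origin` is fine when no private data is involved. A minimal sketch of that rule follows, assuming a hypothetical server-side helper (`cors_headers`, `public`, and `trusted` are illustrative names, not part of any framework) that picks the response headers for one request. For private, credentialed responses it names the exact origin instead of using the wildcard, since browsers do not accept `*` together with credentials.

```python
from typing import Dict, Optional, Set


def cors_headers(origin: Optional[str], public: bool, trusted: Set[str]) -> Dict[str, str]:
    """Choose CORS response headers for one request (illustrative policy only)."""
    if public:
        # No private data in the response: any origin may read it.
        return {"Access-Control-Allow-Origin": "*"}
    if origin is not None and origin in trusted:
        # Private data: name the exact origin, allow credentials,
        # and tell caches that the response depends on the Origin header.
        return {
            "Access-Control-Allow-Origin": origin,
            "Access-Control-Allow-Credentials": "true",
            "Vary": "Origin",
        }
    # Unknown origin: send no CORS headers, so the browser blocks cross-origin reads.
    return {}


if __name__ == "__main__":
    print(cors_headers("https://other.example", public=True, trusted=set()))
    print(cors_headers("https://app.example", public=False, trusted={"https://app.example"}))
```

The allow-list approach is one common design choice; the article linked in the notes covers the underlying rules in far more detail.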

The Swyx Mixtape
Journey to CockroachDB [Spencer Kimball]

The Swyx Mixtape

Play Episode Listen Later Sep 2, 2021 10:19


Listen to the full episode on SERadio: https://www.se-radio.net/2020/06/episode-413-spencer-kimball-on-cockroachdb/

Previous two-parter on Spencer:
https://swyx.transistor.fm/episodes/consistent-synchronous-replication
https://swyx.transistor.fm/episodes/spencer-kimball-pt-2-competing-with-big-clouds

Unboxing Startups Podcasts
News of the day | Cockroach Labs | Unboxing Startups

Unboxing Startups Podcasts

Play Episode Listen Later Jul 30, 2021 2:06


Cockroach Labs is the creator of CockroachDB, the most highly evolved cloud-native, distributed SQL database on the planet.

dot tech Podcast by Form3
Ep 13 .tech - CockroachDB - a Cloud Native Global Database

dot tech Podcast by Form3

Play Episode Listen Later Jun 1, 2021 28:11


Our new .tech series invites guests from inside and outside Form3 to discuss current trends in the engineering world and shed light on some of the engineering practices here at Form3.
Interested in joining Form3? - https://hubs.li/H0zlrGd0
Cockroach website: https://www.cockroachlabs.com/

The Data Stack Show
35: The Future of Development is Distributed with Jim Walker of Cockroach Labs

The Data Stack Show

Play Episode Listen Later May 12, 2021 54:27


This week on The Data Stack Show, Eric and Kostas talk with Jim Walker, the VP of product marketing at Cockroach Labs, about distributed systems, competing against the speed of light, and making data easy.
Highlights from this week's episode include:
Jim's background of translating deep technical concepts into understandable English and his work at Cockroach Labs (2:23)
The origin of Cockroach Labs and distributed SQL (6:10)
Living without atomic clocks (10:10)
Having the speed of light as the ultimate competitor (13:49)
CockroachDB's users (19:35)
Figuring out big data for transactions (25:14)
Dealing with failure (35:04)
Open source code, community, and consumption (39:26)
Making data easy, and what's next for Cockroach (43:12)
Bringing programming into marketing (46:18)
Mentioned Links:
Spanner White Paper
Raft & Paxos
Michael Stonebraker
The Data Stack Show is a weekly podcast powered by RudderStack. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

TechCrunch Startups – Spoken Edition
Cockroach Labs scores $160M Series E on $2B valuation

TechCrunch Startups – Spoken Edition

Play Episode Listen Later Jan 13, 2021 4:11


Cockroach Labs, makers of CockroachDB, have been on a fundraising roll for the last couple of years. Today the company announced a $160 million Series E on a fat $2 billion valuation. The round comes just eight months after the startup raised an $86.6 million Series D. The latest investment was led by Altimeter Capital […]

The Podlets - A Cloud Native Podcast
Stateful and Stateless Workloads (Ep 9)

The Podlets - A Cloud Native Podcast

Play Episode Listen Later Dec 23, 2019 43:34


This week on The Podlets Cloud Native Podcast we have Josh, Carlisia, Duffie, and Nick on the show, and are also happy to be joined by a newcomer, Bryan Liles, who is a senior staff engineer at VMware! The purpose of today’s show is to come to a deeper understanding of the meaning of ‘stateful’ versus ‘stateless’ apps, and how they relate to the cloud native environment. We cover some definitions of ‘state’ initially and then move to consider how ideas of data persistence and co-ordination across apps complicate or elucidate understandings of ‘stateful’ and ‘stateless’. We then think about the challenging practice of running databases within Kubernetes clusters, which effectively results in an ephemeral system becoming stateful. You’ll then hear some clarifications of the meaning of operators and controllers, the role they play in mediating and regulating states, and also how important they are in a rapidly evolving but skills-scarce environment. Another important theme in this conversation is the CAP theorem, or the impossibility of having consistency, availability and partition tolerance all at once, and the way different databases allow for different combinations of two out of the three. We then move on to chat about the fundamental connection between workloads and state and then end off with a quick consideration of how ideas of stateful and stateless play out in the context of networks. Today’s show is a real deep dive offering perspectives from some of the most knowledgeable people in the cloud native space, so make sure to tune in!
Follow us: https://twitter.com/thepodlets
Website: https://thepodlets.io
Feedback: info@thepodlets.io
https://github.com/vmware-tanzu/thepodlets/issues
Hosts: Carlisia Campos, Duffie Cooley, Bryan Liles, Josh Rosso, Nicholas Lane
Key Points From This Episode:
• What ‘stateful’ means in comparison to ‘stateless’.
• Understanding ‘state’ as a term referring to data which must persist.
• Examples of stateful apps such as databases or apps that revolve around databases.
• The idea that ‘persistence’ is debatable, which then problematizes the definition of ‘state’.
• Considerations of the push for cloud native to run stateless apps.
• How inter-app coordination relates to definitions of stateful and stateless applications.
• Considering stateful data as data outside of a stateless cloud native environment.
• Why it is challenging to run databases in Kubernetes clusters.
• The role of operators in running stateful databases in clusters.
• Understanding CRDs and controllers, and how they relate to operators.
• Controllers mediate between actual and desired states.
• Operators are codified system administrators.
• The importance of operators as the number of apps grows in a skill-scarce environment.
• Mechanisms around stateful apps are important because they ensure data integrity.
• The CAP theorem: the impossibility of having consistency, availability, and partition tolerance all at once.
• Why different databases allow for different iterations of the CAP theorem.
• When partition tolerance can and can’t get sacrificed.
• Recommendations on when to run stateful or stateless apps through Kubernetes.
• The importance of considering models when thinking about how to run a stateful app.
• Varying definitions of workloads.
• Pods can run multiple workloads.
• Workloads create states, so you can’t have one without the other.
• The term ‘workloads’ can refer to multiple processes running at once.
• Why the ephemerality of Kubernetes systems makes it hard to run stateful applications.
• Ideas of stateful and stateless concerning networks.
• The shift from server to browser in hosting stateful sessions.
Quotes:
“When I started envisioning this world of stateless apps, to me it was like, ‘Why do we even call them apps? Why don’t we just call them a process?’” – @carlisia [0:02:60]
“‘State’ really is just that data which must persist.” – @joshrosso [0:04:26]
“From the best that I can surmise, the operator pattern is the combination of a CRD plus a controller that will operate on events from the Kubernetes API based on that CRD’s configuration.” – @bryanl [0:17:00]
“Once again, don’t let developers name them anything.” – @bryanl [0:17:35]
“Data integrity is so important” – @apinick [0:22:31]
“You have to really be careful about the different models that you’re evaluating when trying to think about how to manage a stateful application like a database.” – @mauilion [0:31:34]
Links Mentioned in Today’s Episode:
KubeCon+CloudNativeCon – https://events19.linuxfoundation.org/events/kubecon-cloudnativecon-north-america-2019/
Google Spanner – https://cloud.google.com/spanner/
CockroachDB – https://www.cockroachlabs.com/
CoreOS – https://coreos.com/
Red Hat – https://www.redhat.com/en
Metacontroller – https://metacontroller.app/
Brandon Philips – https://www.redhat.com/en/blog/authors/brandon-phillips
MySQL – https://www.mysql.com/
Transcript:
EPISODE 009
[INTRODUCTION]
[0:00:08.7] ANNOUNCER: Welcome to The Podlets Podcast, a weekly show that explores Cloud Native one buzzword at a time. Each week, experts in the field will discuss and contrast distributed systems concepts, practices, tradeoffs and lessons learned to help you on your cloud native journey. This space moves fast and we shouldn’t reinvent the wheel. If you’re an engineer, operator or technically minded decision maker, this podcast is for you.
[INTERVIEW]
[00:00:41] JR: All right! Hello, everybody, and welcome to episode 6 of The Podlets Podcast.
Today we are going to be discussing the concept of stateful and stateless and what that means in this crazy cloud native landscape that we all work in. I am Josh Rosso. Joining me today is Carlisia.
[00:00:59] CC: Hi, everybody.
[00:01:01] JR: We also have Duffie.
[00:01:03] D: Hey, everybody.
[00:01:04] JR: Nicholas.
[00:01:05] NL: Yo!
[00:01:07] JR: And a newcomer to the podcast, we also have Bryan. Bryan, you want to give us a little intro about yourself?
[00:01:12] BL: Hi! I’m Bryan. I work at VMware. I do lots of community stuff, including chairing KubeCon+CloudNativeCon.
[00:01:22] JR: Awesome! Cool. All right. We’ve got a pretty good cast this week. So let’s dive right into it. I think one of the first things that we’ve been talking a bit about is the concept of what makes an application stateful? And of course in reverse, what makes an application stateless? Maybe we could try to start by discerning those two. Maybe starting with stateless if that makes sense? Does someone want to take that on?
[00:01:45] CC: Well, I’m going to jump right in. I have always been a developer, as opposed to some of you, or all of you, who have system admin backgrounds. The first time that I heard the term stateless app, I was like, “What?” That wasn’t recent, okay? It was a long time ago, but that was a knot in my head. Why would you have a stateless app? If you have an app, you’re going to need state. I couldn’t imagine what that was. But of course it makes a lot of sense now. That was also when we were more in the monolithic world.
[00:02:18] BL: Actually that’s a good point. Before you go into that, it’s a great point. Whenever we start with apps or we start developing apps, we think of an application. An application does everything. It takes input and it does stuff and it gives output. But now in this new world where we have lots of apps, big apps, small apps, we start finding that there’s apps that only talk and coordinate with other apps. They don’t do anything else.
They don’t save any data. They don’t do anything. That’s what – where we get into this thing called stateless apps. Apps that don’t have any type of data that they store locally.
[00:02:53] CC: Yeah. It’s more like what I envision in my head. You said it brilliantly, Bryan. It’s almost like a process. When I started envisioning this world of stateless apps, to me it was like, “Why do we even call them apps? Why don’t we just call them a process?” They’re just shifting data back and forth but they’re not – To me, at the beginning, apps were always stateful. They went together.
[00:03:17] D: I think, frequently, people think of applications that have only locally relevant stuff that is actually not going to persist to disk, but maybe held in memory or maybe only relevant to the type of connection that’s coming through that application also as stateless, which is interesting, because there’s still some state there, but the premise is that you could lose that state and not lose the functionality of that code.
[00:03:42] NL: Something that we might want to dive into really quickly when talking about stateless and stateful apps. What do we mean by the word state? When I first learned about these things, that was what always screwed me up. I’m like, “What do you mean state? Like Washington? Yeah. We got it over here.”
[00:03:57] JR: Oh! State. That’s that word. State is one of those words that we use to sound smarter than we actually are 95% of the time, and that’s a number I just made up. When people are talking about state, they mean databases. Yeah. But there are other types of state as well. If you maintain local cache that needs to be persistent, if you have local files that you’re dealing with, like you’re opening files. That’s still state. State really is just that data which must persist.
[00:04:32] D: I agree with that definition.
I think that state, whether persisted to memory or persisted to disk or persisted to some external system, that’s still what we refer to as state.
[00:04:41] JR: All right. Makes sense and sounds about like what I got from it as well.
[00:04:45] CC: All right. So now we have this world where we talk about stateless apps and stateful apps. Are there even stateful apps? Do we call a database an app? If we have a distributed system where we have one stateless app over here, another stateless app over there and then we have the database that’s connected to the two of them, are we calling the database a stateful app or is that whole thing – What do we call this?
[00:05:15] NL: Yeah. The database is very much an app with state. I’m very much –
[00:05:19] D: That’s a close definition. Yeah.
[00:05:21] NL: Yeah. Literally, it’s the epitome of a stateful app. But then you also have these apps that talk to databases as well and they might have local data, like data that – they start a transaction and then complete it or they have a long distributed type transaction. Any apps that revolve around a database, if they store local data, whether it’s within a transaction or something else, they’re still stateful apps.
[00:05:46] D: Yup. I think anything that can take in and modify data, or modify state that has to be persisted in some way, is a stateful app, even though I do think it’s confusing because of what – As I said before, I think that there are a bunch of applications that we think of, like not everybody considers Spark jobs to be stateful. Spark jobs, for example, are something that would bring data in, mutate that data in some way, produce some output and go away. The definition there is that Spark would generally push the resulting data into some other external system.
It’s interesting, because in that model, Spark is not considered to be a stateful app because the Spark job could fail, go away, get recreated, pick up the pieces where it left off or just redo that work until all of the work is done. In many cases, people consider that to be a stateless application. That, I think, is like the crux – In my opinion, the crux of the confusion around what a stateful and stateless application is, is that people frequently – I think it’s more about where you store – what you mean by persistence and how that is actually realized in your application. If you’re pushing your state to an external database, is your application still stateful?
[00:06:58] NL: I think it’s a good question, or if you are gathering data from an external source and mutating it in some way, but you don’t need data to be present when you start up, is that a stateful app or a stateless app? Even though you are taking in data, modifying it and checking it, sending it out to some other mechanism or serving it in your own way, does that become like a stateless app? If that app gets killed and it comes back and it’s able to recover, is it stateful or stateless? That’s a bit of a gray area, I think.
[00:07:26] JR: Yeah. I feel like a lot of the customers I work with, if the application can get killed even if it has some type of local state, they still refer to it as stateless usually, to me at least, when we talk about it because they think, “I can kind of restart this application and I’m not too worried about losing whatever it may have had.” Let’s say a cache, for simplicity, right? I think that kind of leads us into an interesting question. We’ve talked a lot on this podcast about cloud native infrastructure and cloud native applications and it seems like since the inception of cloud native, there’s always been this push that a stateless app is the best candidate to run or the easiest candidate to run. I’m just curious if we could dive into that for a moment.
Why in the cloud native infrastructure area has there always been this push for running stateless applications? Why is it simpler? Those kinds of things.
[00:08:15] BL: Before we dive into that, we have to realize – And this is just a problem of our whole ecosystem, this whole cloud native. We’re very hand-wavy in our descriptions for things. There are a lot of ambiguous descriptions, and state is one of those. Just keep that in mind, that when we’re talking today, we’re really just talking about these things that store data, and that’s the state. Just keep that in mind as you’re listening to this. But when it comes to distributed systems in general, the easiest system is a system that doesn’t need coordination with any other system. If it happens to die, that’s okay. We can just restart it. People like to start there. It’s the easiest thing to start.
[00:08:58] NL: Yeah, that was basically what I was going to say. If your application needs to tie into other applications, it becomes significantly more complicated to implement it, at least for your first time and in your system. These small applications that only – They don’t care about anybody else, they just take in data or not, they just do whatever. Those are super easy to start with because they’re just like, “Here. Start this up. Who cares? Whatever happens, it happens.”
[00:09:21] CC: That could be a good boundary to define – I don’t want to jump back too far, but to define where the stateless app is. To me, it is part of a system, and you just ask what it depends on to come back up. Does it depend on something else that has state?
[00:09:39] BL: I’ll give you an example. I can give you a good example of a stateless app that we use every day, every single one of us, none of us on this call, but when you search Google.
You go to google.com and you go to the bar and you type in a search, what’s happening is there is a service at the beginning that collects that search and it federates the search over many different, probably, clusters of computers so they can actually do the search concurrently. That app that actually coordinates all that work is a stateless app most likely. All it does is just split it up and allow more CPUs to do the work. If that goes away, it’s probably not a problem. You probably have 10 more of them. That’s what I consider stateless. It doesn’t really own any of the data. It’s the coordinator.
[00:10:25] CC: Yeah. If it goes down, it comes back up. It doesn’t need to reset itself to the state where it was before. It can truly be considered stateless because it can just, “Okay. I reset. I’m starting from the beginning from this clear state.”
[00:10:43] BL: Yes. That’s a good summary of that.
[00:10:45] CC: Because another way to think about stateless – What makes an app a stateful app, does it have to be combined or like deployed and shipped together with the part that maintains the state? That’s a more clear cut definition. Then that app is definitely a stateful app.
[00:11:05] D: What we frequently talk about in like the cloud native space is like you know that you have a stateless app if you can just create 20 of them and not have to worry about the coordination of them. They are all workers. They are all going to take input. You could spread the load across those 20 in an identical way and not worry about which one you landed on. That’s a stateless application. A stateful application is a very different thing. You have to have some coordination. You have to say how many databases can you have on a backend? Because you’re persisting data there, you have to be really careful that you only write to the master database, or to the writing database, and you can read off any other members of that database cluster, that sort of stuff.
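Duffie's distinction can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the names `stateless_handler`, `Database`, and `ReplicatedStore` are hypothetical and do not come from the episode, and replication is assumed to be synchronous and always successful. Any of the 20 identical stateless workers gives the same answer, while the stateful store routes every write through a single primary and lets reads land on any member.

```python
import random
from typing import Dict, List


def stateless_handler(request: str) -> str:
    """A stateless worker: the output depends only on the input."""
    return request.upper()


class Database:
    """One member of a hypothetical database cluster."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.rows: Dict[str, str] = {}


class ReplicatedStore:
    """Stateful side: writes go to the primary, reads to any member."""

    def __init__(self, members: List[Database]) -> None:
        self.primary = members[0]
        self.members = members

    def write(self, key: str, value: str) -> None:
        self.primary.rows[key] = value
        # Assumption: replication is synchronous and always succeeds.
        for member in self.members[1:]:
            member.rows[key] = value

    def read(self, key: str) -> str:
        # Any member can serve a read once the write has been replicated.
        return random.choice(self.members).rows[key]


# Twenty identical stateless workers: it does not matter which one we land on.
workers = [stateless_handler] * 20
answers = {worker("hello") for worker in workers}

store = ReplicatedStore([Database("db-0"), Database("db-1"), Database("db-2")])
store.write("order-1", "paid")
```

The coordination cost lives entirely in `ReplicatedStore`: the workers need no knowledge of each other, whereas the store has to know which member accepts writes.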
[00:11:44] CC: It might seem that we are going so deep into this differentiating between stateful and stateless, but this is so important because clusters are usually designed to be ephemeral. Ephemeral means obviously they die down, they are brought back up, the nodes, and you should worry as little as possible about the state of things. Then going back to what Joshua is saying, when we are in this cloud native world, usually we are talking about stateless apps, stateless workloads and then we’re going to just talk about what workload means. But then if that’s the case, where are the stateful apps? It’s like we have this vision that the stateful apps live outside the cloud native world? How does it work? But it’s supposed to work.
[00:12:36] BL: Yup. This is the question that keeps a lot of people employed. Making sure my state is available when I need it. You know what? I’m not going to even use that word state. Making sure my data is available wherever I need it and when I need it. I don’t want to go too deep into it right now, but this is actually a huge problem in the Kubernetes community in general, and we see it because there’s been lots of advice given, “Don’t run things like databases in your clusters.” This is why we see people taking the ideas of Google Spanner and like CockroachDB and actually going through a lot of work to make sure that you can run databases in Kubernetes clusters. The interesting piece about this is that we’re actually at the point where we can run these types of workloads in our clusters, but with a caveat, big star at the end, it’s very difficult and you have to know what you’re doing.
[00:13:34] JR: Yeah. I want to dovetail on that, Bryan, because it’s something that we see all the time. I feel like when we first started setting up, let’s call them clusters, but in our case it was Kubernetes, right? We always saw that data level always being delegated to like if you’re in Amazon, some service that they hosted and so on.
But now I think more and more of the customers, at least the ones I’m seeing, and I’m sure Nicholas and Duffie too, are interested in doing exactly what you just described. Cockroach is an example I literally just worked with recently, and it’s just interesting how much more thoughtful they have to be about their cluster operations. Going back to what you said Carlisia, it’s not as easy as just like trashing a cluster and instantiating a new one anymore, like they’re used to. They need to be more thoughtful about keeping that data integrity intact through things like upgrades and disaster recovery.
[00:14:18] D: Another interesting point, kind of to your point, Bryan, is that like, frequently, people are starting to have conversations and concerns around data gravity, which means that I have a whole bunch of data that I need to work with, like with a Spark job, which I mentioned earlier. I need to basically put my compute where that data is. The way that I store that data inside the cluster and use Kubernetes to manage it or whether I just have to make sure that I have some way of bringing up compute workloads close to that data. It’s actually kind of introducing a whole new layer to this whole thing.
[00:14:48] BL: Yeah! Whole new layer of work and a whole new layer of complexity, because that’s actually – The crux of all this is like where we slide the complexity to, but this is interesting, and I don’t want to go too far into this one definitely. This is why we’re seeing more people creating operators around managing data. I’ve seen operators that are bringing databases up inside of Kubernetes. I’ve seen operators that actually can bring up resources outside of Kubernetes using the Kubernetes API. The interesting thing about this is that I looked at both solutions and I said, “I still don’t know what the answer is,” and that’s great. That means that we have a lot to learn about the problem, and at least we have some paths for it.
[00:15:29] NL: Actually, that kind of reminds me of the first time I ever heard the word stateful or stateless – I’m an infrastructure guy. It was around the discussion of operators, which was only a couple of years ago when operators were first introduced at CoreOS and some people were like, “Oh! Well, this is how you now operate a stateful mechanism inside of Kubernetes. This is the way forward that we want to propose.” I was just like, “Cool! What is that? What’s state? What do you mean stateful and stateless?” I had no idea. Josh, you were there. You’re like, “Your frontend doesn’t care about state and your backend does.” I’m like, “Does it? I don’t know. I’m not a developer.”
[00:16:10] JR: Let’s talk about exactly that, because I think these patterns we’re starting to see are coming out of the needs that we’re all talking about, right? We’ve seen at least in the Kubernetes community a lot of push for these different constructs, like something called a stateful [inaudible 00:16:21], which isn’t that important right now, but then also like an operator. Maybe we can start by defining what is an operator? What is that pattern and why does it relate to stateful apps?
[00:16:31] CC: I think that would be great. I am not clear what an operator is. I know there’s going to be a controller involved. I know it’s not a CRD. I am not clear on that at all, because I only work with CRDs and we don’t define – like the project I worked on with Velero, we don’t categorize it as an operator. I guess an operator uses a specific framework that exists out there. Is it a Kubernetes library? I have no idea.
[00:16:56] BL: We did it to ourselves again. We’re all doing this to ourselves. From the best that I can surmise, the operator pattern is the combination of a CRD plus a controller that will operate on events from the Kubernetes API based on that CRD’s configuration. That’s what an operator is.
[00:17:17] NL: That’s exactly right.
[00:17:18] BL: To conflate this, Red Hat created the Operator SDK, and then you have [inaudible 00:17:23] and you have Metacontroller, which can help you build operators. Then we actually sometimes conflate and call CRDs operators, and that’s pretty confusing for everyone. Once again, don’t let developers name anything.
[00:17:41] CC: Wait. So let’s back up a little. Okay. There is an actual library that’s called an operator.
[00:17:46] BL: Yes. There’s an Operator SDK.
[00:17:47] CC: Referred to as an operator. I heard that. Okay. Great. But let me back up a little because –
[00:17:49] D: The word operator can –
[00:17:50] CC: Because if you are developing an app for Kubernetes, if you’re extending Kubernetes, you are – Okay, you might not use CRDs, but if you are using CRDs, you need a controller, right? Because how will you do actions? Then every app that has a CRD – because the alternative to having CRDs is just using the API directly without creating CRDs to reflect resources. If you’re creating CRDs to reflect resources, you need controllers. All of those apps that have CRDs are operators.
[00:18:24] D: Yup [inaudible 00:18:25] is an operator.
[00:18:26] CC: [inaudible 00:18:26] not an operator. How can you extend Kubernetes and not be qualified [inaudible 00:18:31] operator?
[00:18:32] BL: Well, there’s a way. There is a way. You can actually just create a CRD and use a CRD for data storage, you know, store states, and you can actually query the Kubernetes API for that information. You don’t need a controller, but we couple them with controllers a lot to perform action based on that state we’ve saved to etcd.
[00:18:50] CC: Duffie.
[00:18:51] D: I want to back up just for a moment and talk about the controller pattern and what it is and then go from there to operators, because I think it makes it easier to get it in your head.
The controller pattern is effectively a way to understand desired state and real state and provide some logic or business code that will allow you to converge those two states, your actual state and your desired state. This is a pattern that we see used in almost everything within a distributed system. It’s like within Kubernetes, within most of the kind of more interesting systems that are out there. This controller pattern describes a pretty good way of actually managing application flow across distributed systems. Now, operators, when they were initially introduced, we were talking about that this is a slightly different thing. Operators, when we introduced the idea, came more from like the operational burden of these stateful applications, things like databases and those sorts of stuff. With the database, etcd for example, you have a whole bunch of operational and runtime concerns around managing the lifecycle of that system. How do I add a new member to the cluster? What do I do when a member dies? How do I take action? Right now, that’s somebody like myself waking up at 2 in the morning and working through a runbook to basically make sure that that service remains operational through the night. But the idea of an operator was to take that controller pattern that we described earlier and make it wake up at 2 in the morning to fix this stuff. We’re going to actually codify the operational knowledge of managing the burden of these stateful applications so that we don’t have to wake up at 2 in the morning and do it anymore. Nobody wants to do that.
[00:20:32] BL: Yeah. That makes sense. I remember back at KubeCon years ago, I know it was the one in Seattle where Brandon Philips was on stage talking about operators. He basically was saying if we think about SysOps, system operators, it was a way to basically automate or capture the knowledge of our system administrators in scripts or in a process or in code a la operators.
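The controller pattern Duffie describes, comparing desired state with actual state and acting to converge them, can be sketched as follows. This is a hypothetical, in-memory illustration only: `reconcile`, `control_loop`, the member names, and the "add"/"remove" actions are assumptions made up for the example, and the code does not use the real Kubernetes API or the Operator SDK.

```python
from typing import List, Set, Tuple


def reconcile(desired: Set[str], actual: Set[str]) -> List[Tuple[str, str]]:
    """Return the actions needed to move `actual` toward `desired`."""
    actions = []
    for member in sorted(desired - actual):
        actions.append(("add", member))     # e.g. replace a member that died
    for member in sorted(actual - desired):
        actions.append(("remove", member))  # e.g. drop an unexpected member
    return actions


def control_loop(desired: Set[str], actual: Set[str], max_rounds: int = 10) -> Set[str]:
    """Apply reconcile() repeatedly until the two states converge."""
    actual = set(actual)  # work on a copy of the observed state
    for _ in range(max_rounds):
        actions = reconcile(desired, actual)
        if not actions:
            break  # converged: nothing to do until the next change
        for verb, member in actions:
            if verb == "add":
                actual.add(member)
            else:
                actual.discard(member)
    return actual


# Desired state, as it might be declared in a custom resource (hypothetical).
desired_members = {"etcd-0", "etcd-1", "etcd-2"}
# Actual state: one member has died and a stray one is running.
observed_members = {"etcd-0", "etcd-2", "etcd-9"}

plan = reconcile(desired_members, observed_members)
final_members = control_loop(desired_members, observed_members)
```

In this sketch the "runbook" is the body of `reconcile`: the decisions a person would otherwise make at 2 in the morning are written down once as code and applied on every pass of the loop.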
[00:20:57] D: The last part that I’ll add to this thing, which I think is actually what really describes the value of this idea to me, is that there are only so many people on the planet that do what the people in this blog post do. Maybe you’re one of them, listening to this podcast. People who are operating software or operating infrastructure at scale, there just aren’t that many of us on the planet. So as we add more applications, as more people adopt the cloud native regime or start coming to a place where they can crank out more applications more quickly, we’re going to have to get to a place where we are able to automate the burden of managing those applications, because there just aren’t enough of us to be able to support the load that is coming. There just aren’t enough people on the planet that do this to be able to support that. That’s the thing that excites me most about the operator pattern, is that it gives us a place to start. It gives us a place to actually start thinking about managing that burden over time, because if we don’t start changing the way we think about managing that burden, we’re going to run out of people. We’re not going to be able to do it.
[00:22:05] NL: Yeah. It’s interesting. With stateful apps, we keep kind of bringing them – coming back to stateful apps, because stateful apps are hard and stateless apps are easy, and we’ve created all these mechanisms around operating things with state because of just how complicated it is to make sure that your data is ready, accessible and has integrity. That’s the big one that I keep not thinking about as a SysOps person coming into the Dev world. Data integrity is so important, and making sure that your data is exactly what it needs to be, and what it was the last time you checked it, is super important. It’s only something I’m really starting to grasp.
That’s why these things, like operators and all these mechanisms that we keep creating and recreating and recreating, keep coming about, because making sure that your stateful apps have the right data at the right time is so important.
[00:22:55] BL: Since you brought this up, and we just talked about why state is so hard, I want to introduce a new term to this conversation, the whole CAP theorem, where data would typically be – in a distributed system at least, your data will be consistent or your data can be available, or if your distributed system splits into multiple parts, you can have partition tolerance. This is one of those computer science things where you can actually pick two. You can have it be available and have partition tolerance, but your data won’t be consistent, or you can have consistency and availability, but you won’t have partition tolerance. If your cluster splits into two for some reason, the data will be bad. This is why it’s hard, this is why people have written basically lots of PhD dissertations on this subject, and this is why we are talking about this here today, is because managing state, and particularly managing distributed state, is actually a very, very hard problem. But there’s software out there that will help us, and Kubernetes is definitely part of that and StatefulSets are definitely part of that as well.
[00:24:05] JR: I was just going to say on those three points, consistency, availability and partition tolerance. Obviously, we’d want all three if we could have them. Is there one that we most commonly tradeoff and give up or does it go case-by-case?
[00:24:17] BL: Actually, it’s been proven. You can’t have all three. It’s literally impossible. It depends. If you have a MySQL server and you’re using MySQL to actually serve data out of this, you’re going to most likely get consistency and availability. If you have it replicated, you might not have partition tolerance.
That’s something to think about, and there are different databases and this is actually one of the reasons why there are different databases. This is why people use things like relational databases and they use key value stores not because we really like the interfaces, but because they have different properties around the data. [00:24:55] NL: That’s an interesting point and something that I had recently just been thinking about, like why are there so many different types of databases. I just didn’t know. I only recently heard of the CAP theorem as well, just before you mentioned it. I’m like, “Wow! That’s so fascinating.” The whole thing where you only pick two. You can’t get three. Josh, to kind of go back to your question really quickly, I think that partition tolerance is the one that we throw away the most. We’re willing to not be able to segregate our database as much as possible because C and A are just too important, I think. At least that’s what I’m saying, like I am wearing an [inaudible 00:25:26] shirt and [inaudible 00:25:27] is not partition tolerant. It’s bad at it. [00:25:31] BL: This is why Google introduced Spanner, and Spanner in some situations can get three with tradeoffs and a lot of really, really smart stuff, but most people can’t run at this scale. But we do need to think about partition tolerance, especially with data whenever – Let’s say you run a store and you have multiple instances across the world and someone buys something from inventory, what does your inventory look like at any particular point? You don’t have to answer my question, of course, but think about that. These are still very important problems if fiber gets cut across the Atlantic and now I’ve sold more things than I have. Carlisia, speaking to you as someone who’s only been a developer, have you moved your thoughts on state any further? [00:26:19] CC: Well, I feel that I’m clear on – Well, I think you need to clarify your question better for me. 
If you’re asking if I understand what it means, I understand what it means. But I actually was thinking to ask this question to all of you, because I don’t know the answer, if that’s the question you’re asking me. I want to put that to the group. Do you recommend people, as in like now-ish, to run stateful workloads? We need to talk about what workloads mean. Run stateful apps or databases inside, if they’re running a Kubernetes cluster or if they’re planning for that, do you all as experts recommend that they should already be looking into doing that or they should be running for now their stateful apps or databases outside of the cloud native ecosystem and just connecting the two? Because if that’s what your question was, I don’t know. [00:27:21] BL: Well, I’ll take this first. I think that we should be spending a lot more time than we are right now in coming up with community-tested solutions around using stateful sets to their best ability. What that means is let’s say if you’re running a database inside of Kubernetes and you’re using a stateful set to manage this, what we do need to figure out is what happens when my database goes down? The pod just gets killed? When I bring up a new version, I need to make sure that I have the correct software to verify integrity, rebuild things, so that when it comes back up, it comes back up correctly. That’s what I think we should be doing. [00:27:59] JR: For me, I think working with customers, at least Kubernetes-oriented folks, when they’re trying to introduce Kubernetes as their orchestration part of their overall platform, I’m usually just trying to kind of meet them where they’re at. If they’re new to Kubernetes and distributed systems as a whole, if we have stateless, let’s call them maybe simpler applications to start with, I generally have them lean into that first, because we already have so much in front of us to learn about. I think it was either Brian or Duffie, you said it introduces a whole bunch more complexity. 
You have to know what you’re doing. You have to know how to operate these things. If they’re new to Kubernetes, I generally will advise start with stateless still. But that being said, so many of our customers that we work with are very interested in running stateful workloads on Kubernetes. [00:28:42] CC: But just to clarify what you said, Josh, because you spoke like an expert, but I still have beginner’s ears. You said something that sounded to me like you recommend that you go stateless. It sounded to me like that. What you really say is that they take out the stateless part of what they have, which they might already have or they might have to change and put the stateless. You’re not suggesting that, “Oh! You can’t do stateful anymore. You need to just do everything stateless.” What you’re saying is take the stateless part of your system, put that in Kubernetes, because that is really well-tested and keep the stateful outside of that ecosystem. Is that right? [00:29:27] JR: I think that’s a better way to put it. Again, it’s not that Kubernetes can’t do stateful. It’s more of a concept of biting off more than you can chew. We still work with a lot of people who are very new to these distributed systems concepts, and to take on running stateful workloads, if we could just delegate that to some other layer, like outside of the cluster, that could be a better place to start, at least in my experience. Nicholas and Duff might have different – [00:29:51] NL: Josh, you basically nailed it like what I was going to say, where it’s like if the team that I’m working with is interested in taking on the complexity of maintaining their databases, their stateful sets and making sure that they have data integrity and availability, then I’m all for them using Kubernetes for a stateful set. Kubernetes can run stateful applications, but there is all this complexity that we keep talking about and maintaining data and all that. 
If they’re willing to take on their complexity, great, it’s there for you. If they’re not, if they’re a little bit kind of behind as – Not behind, but if they’re kind of starting out their Kubernetes journey or their distributed systems journey, I would recommend them to move that complexity to somebody else and start with something a little bit easier, like a stateless application. There are a lot of good services that provide data as a service, right? You’ve got dataview as RDS is great for creating stateful application. You can leverage it anytime and you’ve got like dedicated wires too. I would point them to there first if they don’t want to take on like complexity. [00:30:51] D: I completely agree with that. An important thing I would add, which is in response to the stateful set piece here, is that as we’ve already described, managing a stateful application like a database does come with some complexity. So you should really carefully look at just what these different models provide you. Whether that model is making use of a stateful set, which provides you like ordinality, ensuring that things start up in a particular order and some of the other capabilities around that stuff. But it won’t, for example, manage some of the complexity. A stateful set won’t, for example, try and issue a command to the new member to make sure that it’s part of an existing database cluster. It won’t manage that kind of stuff. So you have to really be careful about the different models that you’re evaluating when trying to think about how to manage a stateful application like a database. I think because it’s actually why the topic of an operator came up kind of earlier, which was that like there are a lot of primitives within Kubernetes in general that provide you a lot of capability for managing things like stateful applications, but they may not entirely suit your needs. 
Because of the complexity with stateful applications, you have to really kind of be really careful about what you adopt and where you jump in. [00:32:04] CC: Yeah. I know just from working with Velero, which is a tool for doing backup and recovery migration of Kubernetes clusters. I know that we backup volumes. So if you have something mounted on a volume, we can back that up. I know for a fact that people are using that to backup stateful workloads. We need to talk about workloads. But at any case, one thing to – I think one of you mentioned is that you definitely also need to look at a backup and recovery strategy, which is ever more important if you’re doing stateful workloads. [00:32:46] NL: That’s the only time it’s important. If you’re doing stateless, who cares? [00:32:49] BL: Have we defined what a workload is? [00:32:50] CC: Yeah. But let me say something. Yeah, I think we should do an episode on that maybe, maybe not. We should do an episode on GitOps type of thing for related things, because even though you – Things are stateless, but I don’t want to get into it. Your cluster will change state. You can recover in stuff from like a fresh version. But as it goes through a lifecycle, it will change state and you might want to keep that state. I don’t know. I’m not the expert in that area, but let’s talk about workloads, Brian. Okay. Let me start talking about workloads. I never heard the term workload until I came into the cloud native world, and that was about a year ago or when they started looking in this space more closely. Maybe a little bit before a year ago. It took me forever to understand what a workload was. Now I understand, especially today, we’re talking about a little bit before we started recording. Let me hear from you all what it means to you. [00:34:00] BL: This is one of those terms, and I’m sure like the last any ex-Googlers about this, they’ll probably agree. 
This is a Google term that we actually have zero context about why it’s a term. I’m sure we could ask somebody and they would tell us, but workloads to me personally are anything that ultimately creates a pod. Deployments create replica sets, create pods. That whole thing is a workload. That’s how I look at it. [00:34:29] CC: Before there were pods, were there workloads, or is a workload a new thing that came along with pods? [00:34:35] BL: Once again, these words don’t make any sense to us, because they’re Google terms. I think that a pod is a part of a workload, like a deployment is a part of a workload, like a replica set is part of a workload. Workload is the term that encompasses an entire set of objects. [00:34:52] D: I think of a workload as a subset of an application. When I think of an application or a set of microservices, I might think of each of the services that make up that entire application as a workload. I think of it that way because that’s generally how I would divide it up to Brian’s point into different deployments or different stateful sets or different – That sort of stuff. Thinking of them each as their own autonomous piece, and altogether they form an application. That’s how I think of it. [00:35:20] CC: To connect to what Brian said, a deployment will always run in pods, which is super confusing if you’re not looking at these things, just so people understand, because it took me forever to understand that. The connection between a workload, a deployment and a pod. Pods contain – If you have a deployment that you’re going to ship to Kubernetes – I don’t know if ship is the right word. You’re going to need to run on Kubernetes. That deployment needs to run somewhere, in some artifact, and that artifact is called a pod. [00:35:56] NL: Yeah. Going back to what Duffie said really quickly. 
A workload to me was always a process, kind of like not just a pod necessarily, but like whatever it is that if you’re like, “I just need to get this to run,” whatever that is. To me that was always a workload, but I think I’m wrong. I think I’m oversimplifying it. I’m just like whatever your process is. [00:36:16] BL: Yeah. I would give you – The reason why I would not say that is because a pod can run multiple containers at once, which ergo is multiple processes. That’s why I say it that way. [00:36:29] NL: Oh! You changed my mind. [00:36:33] BL: The reason I bring this up, and this is probably a great idea for a future show, is about all the jargon and terminology that we use in this land that we just take as everyone knows it, but we don’t all know it, and it should be a great conversation to have around that. But the reason I always bring up the whole workload thing is because when we think about workloads and then you can’t have state without workloads, really. I just wanted to make sure that we tied those two things together. [00:36:58] CC: Why can you not have state without workloads? What does that mean? [00:37:01] BL: Well, the reason you can’t have state without workloads is because something is going to have to create that state, whether that workload is running in or out of a cluster. Something is going to have to create it. It just doesn’t come out of nowhere. [00:37:11] CC: That goes back to what Nick was saying, that he thinks a workload is a process. Was that what you said, Nick? [00:37:18] NL: It is, yeah, but I’m reneging on that. [00:37:23] CC: At least I could see why you said that. Sorry, Brian. I cut you off. [00:37:28] BL: What I was saying is a workload ultimately is one or more processes. It’s not just a process. It’s not a single process. It could be 10, it could be 1. [00:37:39] JR: I have one final question, and we can bail on this and edit it out if it’s not a good one to end with. 
I hope it’s not too big, but I think maybe one thing we overlooked is just why it’s hard to run stateful workloads in these new systems like Kubernetes. We talked about how there’s more complexity and stuff, but there might be some room to talk about – People have been spinning up an EC2 server, a server on the web and running MySQL on it forever. Why in like the Kubernetes world of like pods and things is it a little bit harder to run, say, MySQL just [inaudible 00:38:10]. Is that something worth diving into? [00:38:13] NL: Yeah, I think so. I would say that for things like, say, applications, like databases particularly, they are less resilient to outages. While Kubernetes itself – Or most container orchestrators, but Kubernetes specifically – is dedicated to running your pods continuously for as long as it can, it is still somewhat of a shifting landscape. You do have priority and preemption. If you don’t set those things up properly or if there’s just like a total failure of your system at large, your stateful application can just go down at any time. Then how do you reconcile the outage in data, whatever data that might have gotten lost? Those sorts of things become significantly more complicated in an environment like Kubernetes where you don’t necessarily have access to a command line to run the commands to recover as easily. You may not, but it’s the same. [00:39:01] BL: Yes. You got to understand what databases do. Disk is slow, whether you have spinning disk or you have disk on chip, like SSD. What databases do in a lot of cases is they store things in memory. So if it goes away, it didn’t get stored. In other cases, what databases do is they have these huge transactional logs, maybe they write them out in files and then they process the transaction log whenever they have CPU time. If a database dies just suddenly, maybe its state is inconsistent because it had items that were to be processed in a queue that haven’t been processed. 
Now it doesn’t know what’s going on, which is why – [00:39:39] NL: That’s interesting. I didn’t know that. [00:39:40] BL: If you kill MySQL, like kill mysqld with a -9, it might not come back up. [00:39:46] JR: Yeah. Going back to Kubernetes as an example, we are living in this newer world where things can get rescheduled and moved around and killed and their IPs changed and things. It seems like this environment is, should I say, more ephemeral, and those types of considerations become more complex. [00:40:04] NL: I think that really nails it. Yeah. I didn’t know that there were transactional logs in databases. I should, I feel like, have known that but I just had no idea. [00:40:11] D: There’s one more part to the whole stateful, stateless thing that I think is important to cover, but I don’t know if we’ll be able to cover it entirely in the time that we have left, and that is from the network perspective. If you think about the types of connections coming into an application, we refer to some of those connections as stateful and stateless. I think that’s something we could tackle in our remaining time, or what’s everybody’s thought? [00:40:33] JR: Why don’t you try giving us maybe a quick summary of it, Duffie, and then we can end on that. [00:40:36] CC: Yeah. I think it’s a good idea to talk about network and then address that in the context of network. I’m just thinking an idea for an episode. But give us like a quick rundown. [00:40:45] D: Sure. 
A lot of the kind of older monolithic applications, the way that you would scale these things is you would have multiple of them and then you would have some intelligence in the way that you’re routing connections down to those applications that would describe the ability to ensure that when Bob accesses a website and he authenticates, he’s going to authenticate to one specific instance of this application and the intelligence up in the frontend is going to handle the routing to make sure that Bob’s connection always comes back to that same instance. This is an older pattern. It’s been around for a very long time and it’s certainly the way that we first kind of learned to scale applications before we decided to break into microservices and kind of handle a lot of this routing in a more resilient way. That was kind of one of the early versions of how we do this, and that is a pretty good example of a stateful session, in that there is actually some – Perhaps Bob has authenticated and he has a cookie that allows him, that when he comes back to that particular application, a lot of the settings, his browser settings, whether he’s using the dark theme or the light theme, that sort of stuff, is persisted on the server side rather than on the client side. That’s kind of what I mean by stateful sessions. Stateless sessions mean it doesn’t really matter that the user is terminating to the same endpoint, because we’ve managed to keep the state with the client. We’re handling state on the browser side of things rather than on the server side of things. So you’re not necessarily gaining anything by pushing that connection back to the same specific instance, but just to a service that is more widely available. There are lots of examples of this. I mean, Brian’s example of Google earlier. Obviously, when I come back to Google, there are some things I want it to remember. I want it to remember that I’m logged in as myself. 
I want it to remember that I’ve used a particular – I want it to remember my history. I want it to remember that kind of stuff so that I could go back and find things that I looked at before. There are a ton of examples of this when we think about it. [00:42:40] JR: Awesome! All right, everyone. Thank you for joining us in episode 6, Stateful and Stateless. Signing off. I’m Josh Rosso, and going across the line, thank you Nicholas Lane. [00:42:54] NL: Thank you so much. This was really informative for me. [00:42:56] JR: Carlisia Campos. [00:42:57] CC: This was a great conversation. Bye, everybody. [00:42:59] JR: Our newcomer, Brian Liles. [00:43:01] BL: Until next time. [00:43:03] JR: And Duffie Cooley. [00:43:05] D: Thank you so much, everybody. [00:43:06] JR: Thanks all. [00:43:07] CC: Bye! [END OF EPISODE] [0:50:00.3] ANNOUNCER: Thank you for listening to The Podlets Cloud Native Podcast. Find us on Twitter at https://twitter.com/ThePodlets and on the http://thepodlets.io/ website, where you'll find transcripts and show notes. We'll be back next week. Stay tuned by subscribing. [END]
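As a rough illustration of the stateful versus stateless sessions described at the end of this episode, here is a minimal Python sketch. The backend names and both function names are hypothetical, and hashing a session ID is only one of several ways a frontend could pin a user to an instance; it is not meant to describe how any particular load balancer works.

```python
import hashlib
from itertools import cycle

# Hypothetical application instances sitting behind a frontend.
BACKENDS = ["app-0", "app-1", "app-2"]


def sticky_backend(session_id, backends=BACKENDS):
    """Stateful session: the same session ID always maps to the same instance,
    because that instance holds the user's state on the server side."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


_round_robin = cycle(BACKENDS)


def next_backend():
    """Stateless session: any instance will do, because the client carries
    its own state (for example in a cookie), so requests are simply spread out."""
    return next(_round_robin)
```

With the sticky version, losing an instance also loses whatever state it held for its users, which is the tradeoff the stateless approach avoids.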

The Podlets - A Cloud Native Podcast
Disaster and Recovery (Ep 8)

Dec 16, 2019 42:07


In this episode of The Podlets Podcast, we are talking about the very important topic of recovery from a disaster! A disaster can take many forms, from errors in software and hardware to natural disasters and acts of God. That being said, there are better and worse ways of preparing for and preventing the inevitable problems that arise with your data. The message here is that issues will arise, but through careful precaution and the right kind of infrastructure, the damage to your business can be minimal. We discuss some of the different ways that people are backing things up to suit their individual needs, recovery time objectives and recovery point objectives, what high availability can offer your system and more! The team offers a bunch of great safety tips to keep things from falling through the cracks, and we get into keeping things simple, avoiding too much mutation of infrastructure and why testing your backups can make all the difference. We naturally look at this question with an added focus on Kubernetes and go through a few tools that are currently available. So for anyone wanting to ensure safe data and a safe business, this episode is for you!

Follow us: https://twitter.com/thepodlets
Website: https://thepodlets.io
Feedback: info@thepodlets.io, https://github.com/vmware-tanzu/thepodlets/issues

Hosts:
• https://twitter.com/carlisia
• https://twitter.com/bryanl
• https://twitter.com/joshrosso
• https://twitter.com/opowero

Key Points From This Episode:
• A little introduction to Olive and her background in engineering, architecture, and science.
• Disaster recovery strategies and the portion of customers who are prepared.
• What is a disaster? What is recovery? The fundamentals of the terms we are using.
• The physicality of disasters; replication of storage for recovery.
• The simplicity of recovery and keeping things manageable for safety.
• What high availability offers in terms of failsafes and disaster avoidance.
• Disaster recovery for Kubernetes; safety on declarative systems.
• The state of the infrastructure and its interaction with good and bad code.
• Mutating infrastructure and the complications in terms of recovery and recreation.
• Plug-ins and tools for Kubernetes such as Velero.
• Fire drills, testing backups and validating your data before a disaster!
• The future of backups and considering what disasters might look like.

Quotes:
“It is an exciting space, to see how different people are figuring out how to back up distributed systems in a reliable manner.” – @opowero [0:06:01]
“I can assure you, careers and fortunes have been made on helping people get this right!” – @bryanl [0:07:31]
“Things break all the time, it is how that affects you and how quickly you can recover.” – @opowero [0:23:57]
“We do everything through the Kubernetes API, that's one reason why we can do selective backups and restores.” – @carlisia [0:32:41]

Links Mentioned in Today’s Episode:
• The Podlets: https://thepodlets.io/
• The Podlets on Twitter: https://twitter.com/thepodlets
• VMware: https://www.vmware.com/
• Olive Power: https://uk.linkedin.com/in/olive-power-488870138
• Kubernetes: https://kubernetes.io/
• PostgreSQL: https://www.postgresql.org/
• AWS: https://aws.amazon.com/
• Azure: https://azure.microsoft.com/
• Google Cloud: https://cloud.google.com/
• Digital Ocean: https://www.digitalocean.com/
• SoftLayer: https://www.ibm.com/cloud
• Oracle: https://www.oracle.com/
• HackIT: https://hackit.org.uk/
• Red Hat: https://www.redhat.com/
• Velero: https://blog.kubernauts.io/backup-and-restore-of-kubernetes-applications-using-heptios-velero-with-restic-and-rook-ceph-as-2e8df15b1487
• CockroachDB: https://www.cockroachlabs.com/
• Cloud Spanner: https://cloud.google.com/spanner/

Transcript:
EPISODE 08
[INTRODUCTION]
[0:00:08.7] ANNOUNCER: Welcome to The Podlets Podcast, a weekly show that explores Cloud Native one buzzword at a time. Each week, experts in the field will discuss and contrast distributed systems concepts, practices, tradeoffs and lessons learned to help you on your cloud native journey. This space moves fast and we shouldn’t reinvent the wheel. If you’re an engineer, operator or technically minded decision maker, this podcast is for you.
[EPISODE]
[00:00:41] CC: Hi, everybody. We are back. This is episode number 8. Today we have on the show myself, Carlisia Campos and Josh. [00:00:51] JR: Hello, everyone. [00:00:52] CC: That was Josh Rosso. And Olive Power. [00:00:55] OP: Hello. [00:00:57] CC: And also Brian Liles. [00:00:59] BL: Hello. [00:00:59] CC: Olive, this is your first time, and I didn’t even give you a heads-up. But tell us a little bit about your background. [00:01:06] OP: Yeah, sure. I’m based in the UK. I joined VMware as part of the Heptio acquisition. I joined Heptio way back last year in October. The acquisition happened pretty quickly for me. Before that, I was at Red Hat working on some of their cloud management tooling and a bit of OpenShift as well. Before that, I worked with HP and Fujitsu. I kind of work in enterprise management a lot, so things like desired state and automation are kind of things that have followed me around through most of my career. Coming in here to VMware, working in the cloud native applications business unit is kind of a good fit for me. I’m a mom of two and I’m based in the UK, which I have to point out, is currently undergoing a heat wave. We’ve had about like 3 weeks of 25 to 30 degrees, which is warm, very warm for us. Everybody is in a great mood. [00:01:54] CC: You have a science background, right? [00:01:57] OP: Yeah, I studied chemistry in university and then I went on to do a PhD in cancer research. 
I was trying to figure out ways where we could predict how different people were going to respond to radiation treatments, with a view to tailoring everybody’s treatment to make it unique for them rather than giving the same treatment to different people who present with the same disease but whose responses were very, very different. Yeah, that was really, really interesting. [00:02:22] CC: What is your role at VMware? [00:02:23] OP: I’m a cloud native architect. I help customers predominantly focus on their Kubernetes platforms and how to build them either from scratch or help them get more production-ready depending on where they are in their Kubernetes journey. It’s been a really exciting part of being part of Heptio and following through into the VMware acquisition. We get to speak to customers a lot at very exciting times for them. They’re kind of embarking on their Kubernetes journey, a lot of them. We’re with them from the start and every step of the way. That’s really rewarding and exciting. [00:02:54] CC: Let me pick up on that thread actually, because one thing that I love about this group for me, because I don’t get to do that. You all meet customers and you know what they are doing. Get that knowledge first-hand. What would you say is the percentage of the clients that you see who have a disaster recovery strategy, which by the way is the topic of today’s show? [00:03:19] OP: I speak to customers a lot. As I mentioned earlier, a lot of them are like in different stages of their journey in terms of automation, in terms of infrastructure as code, in terms of where they want to go for their next platform. But there is generally in the room a team that is responsible for backup and recovery, and that generally sort of leads into the storage team really because you’re trying to back up state predominantly. When we’re speaking to customers, we’ll have the automation people in the room. 
We’ll have the developers in the room and we’ll have the storage people in the room, and they are the ones that are primarily – Out of those three sort of folks I’ve mentioned, they’re the ones that are primarily concerned about backup. How to back up their data. How to restore it in a way that satisfies the SLAs or the time to get your systems back online in a timely manner. They are the folks concerned with that. [00:04:10] JR: I think it’s interesting, because it’s almost scary how many of our customers don’t actually have a disaster recovery strategy of any sort. I think it’s oftentimes just based on the maturity of the platform. A lot of the applications and such, they’re worried about downtime, but not necessarily like it’s going to devastate the business in a lot of these apps. I’m not trying to say that people don’t run mission critical apps on things like Kubernetes. It’s just a lot of people are very new and they’re just kind of ramping up. It’s a really complicated thing that we work with our customers on, and there’re so many like layers to this. I’m sure layers that we’ll get into. There are things like disaster recovery of the actual platform. If Kubernetes, as an example, goes down. Getting it back up, backing up its data store that we call etcd. There’s obviously like the applications disaster recovery. If a cluster of some sort goes down, be it Kubernetes or otherwise, shifting some CI system and redeploying that into some B cluster to bring it back up. Then to Olive’s point, what she said, it all comes back to storage. Yeah. I mean, that’s where it gets extremely complicated. Well, at least in my mind, it’s complicated for me, I should say. When you’re thinking about, “Okay, I’m running this Postgres as a service thing on this cluster.” It’s not that simple to just move the app from cluster A to cluster B anymore. I have to consider what do I do with the data? How do I make sure I don’t lose it out? 
Then that’s a pretty complicated question to answer. [00:05:32] OP: I think a lot of the storage providers, vendors playing in that storage space are kind of looking at novel ways to solve that and have adapted their current thinking maybe that was maybe slightly older thinking to new ways of interacting with Kubernetes cluster to provide that ongoing replication of data around different systems outside of the Kubernetes and then allowing it to be ported back in when a Kubernetes cluster – If we’re talking about Kubernetes in this instance as a platform, porting that data back in. There’re a lot of vendors playing in that space. It’s kind of an exciting space really to see how different people are figuring out how to back up distributed systems in reliable manner, because different people want different levels of backup. Because of the microservices nature of the cloud native architectures that we predominantly deal with, your application is not just one thing anymore. Certain parts of that application need to be recovered fairly quickly, and other parts don’t need to recover that quickly. It’s all about functionality ultimately that your end customers or your end users see. If you think about visually as like a banking application, for example, where if you’re looking at things like – The customer is interacting with that and they can check their financial details and they can check the current stages of their account, then they are two different services. But the actual service to transfer money into their account is down. It’s still a pretty functional system to the end user. But in the background, all those great systems are in place to recover that transfer of money functionality, but it’s not detrimental to your business if that’s down. There’ll be different SLAs and different objectives in terms of recovery, in terms of the amount of time that it takes for you to restore. 
All of that has to be factored into disaster recovery plans and it’s up to the company and we can help as much as possible for them to figure out which parts of the applications and which parts of your business need to conform to certain SLAs in terms of recovery, because different parts will have different standards and different times in and around that space. It’s a complicated thing. It definitely is. [00:07:29] BL: I want to take a step back and unpack this term, disaster recovery, because I can assure you, careers and fortunes have been made on helping people get this right. Before we get super deep into this, what’s a disaster and then what’s a recovery for that? Have you thought about that at a fundamental level? [00:07:45] OP: Just for me, if we would kind of take it at face value. Disasters, they could be physical ones or software-based ones. Physical ones can be like earthquakes or floods, fires, things like that that are happening either in your region or can be fairly widespread across the area that you’re in, or software, cyber attacks that are perhaps to your own internal systems, like your system has been compromised. That’s fairly local to you. There are two different design strategies there. Physical disaster, you have to have a recovery plan that is outside of that physical boundary so that you can recover your system from somewhere that’s not affected by that physical disaster. For the recovery in terms of software, in terms of your system has been compromised, then the recovery from that is different. I’m not an expert on cyber attacks and vulnerabilities, but the recovery from there for companies trying to recover from that, they plan for it as much as possible. So they down their systems and try and get patches and fixes to them as quickly as possible and spin the systems back up. [00:08:49] BL: I’m understanding what you’re saying. I’m trying to unpack it for those of us listening who don’t really understand it. 
I’m going to go through what you said and we’ll unpack it a little bit. Physical, from my assumption, is we’re running workloads. Let’s say we’re just going to say in a cloud, not on-premise. We’re running workloads in, let’s say, AWS, and in the United States we can take care of local diversity by running in East and West regions. Also, we can take care of local diversity by running in availability zones within a region, because AWS guarantees that AZ1 and AZ3 have different network connections, are not in the same building, and things like that. Would you agree? Do you see that? I mean, this is for everyone out there. I’m going to go from super high-level down to more specific. [00:09:39] OP: I personally wouldn’t argue that, except not everybody is on AWS. [00:09:43] BL: Okay. AWS, or Azure, or Google Cloud, DigitalOcean, or SoftLayer, or Oracle, or Packet. If I thought about this, probably we could do 20 more. [00:09:55] JR: IBM. [00:09:56] BL: IBM. That’s why I said SoftLayer. They all practice physical diversity. They all have different regions that you can deploy software to, whether it’s for data locality or for data protection. If you’re thinking about creating a plan for this, this would be something you could think about. Where does my data rest? What could happen to that data? A building could actually just fall over on to itself. All the hard drives are gone. What do I do? [00:10:21] OP: You’re saying that replication is a form of backup? [00:10:26] BL: I’m actually saying way more than that. Before you even think about things when it comes to disaster recovery, you’ve got to define what a disaster is. Some applications can actually run out of multiple physical locations. Let’s go back to my AWS example, because it’s everywhere and everyone understands how AWS works at a high level. Sometimes people are running things out of US-East-1 and US-West-2, and they can run the application in both. 
The reason they can do that is because the individual transactions of whatever they’re doing don’t need to talk to one another. They can just run websites out of both places. To your point, when you talk about now you have the issue where maybe you’re doing inventory management, because you have a large store and you’re running it out of multiple countries. You’re in the EU and you’re somewhere in APAC as well. What do you do about that? Well, there are a couple of ways that I could think about how we would do that. We could actually just have all the database connections go back to one single main service. Then what we could do with that main service is that we could have it replicated in its local place and then we can replicate it in a remote place too. If the local place goes down, at least you can point all the other sites back to the remote one. That’s the simplest way. The reason I wanted to bring this up is because I don’t like acronyms all that much, but disaster recovery has two of my favorite ones and they’re called RPO and RTO. Really, what it comes down to is you need to think about when you have a disaster, no matter what that disaster is or how you define it, you have RTO. Basically, it’s the time that you can be down before there’s a huge issue. Then you have something called RPO, which, without going into all the names, is how far you can go since your last backup before you have business problems. Just thinking about those things is how we should think about our backup and disaster recovery, and it’s all based on how your business works or how your project works and how long you can be down and how much data you have. [00:12:27] CC: Which goes to what Olive was saying. Please spell out to us what RTO and RPO stand for. [00:12:35] BL: I’m going to look them up real quick, because I literally pushed those acronym meanings out. I just know what they mean. [00:12:40] OP: I think it’s recovery time objective and recovery data objective. [00:12:45] BL: Yeah. 
I don’t know what the P stands for, but it is for data. [00:12:49] OP: Recovery. [00:12:51] BL: It’s the recovery points. Yeah. That’s what it is. It is the recovery point objective, RPO; and recovery time objective, RTO. You could tell that I’ve spent a lot of time in enterprise, because we don’t even define words. The acronym means what it is. Do you know what the acronym stands for anymore? [00:13:09] OP: How far back in terms of data can we go that was still okay? How far back in time can we be down, basically, until we’re okay? [00:13:17] CC: It is true though, and as Josh was saying, some teams or companies or products, especially companies that are starting their journey, their cloud native journey. They don’t have a backup, because there are many complicated things to deal with, and backup is super complicated, I mean, the disaster recovery strategy. Doing that is not trivial. But shouldn’t you start with that or at least because it is so complex? It’s funny to me when people say I don’t have that kind of a strategy. Maybe just like what Bryan said why utilizing, spreading out your data through regions, that is a strategy in itself, and there’s more to it. [00:14:00] JR: Yeah. I think I oversimplified too much. Disaster recovery could theoretically be anything I suppose. Going back to what you were saying, Brian, the recovery aspect of it. Recovery for some of the customers I work with is literally to stand on a brand-new cluster, whatever that cluster is, a cluster, that is their platform. Then redeploy all the applications on top of it. That is a recovery strategy. It might not be the most elegant and it might make assumptions about the apps that run on it, but it is a recovery strategy that somewhat simple, simple to kind of conceptualize and get started with. 
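The RPO and RTO definitions given here can be sketched as a small check. This is only an illustration under stated assumptions: the function names, the timestamps, and the one-hour and 30-minute objectives are all hypothetical and do not come from the episode.

```python
from datetime import datetime, timedelta

def data_loss_window(last_backup: datetime, disaster: datetime) -> timedelta:
    """Data written after the last good backup is what a restore would lose."""
    return disaster - last_backup

def meets_objectives(last_backup, disaster, restored_at, rpo, rto):
    """Return (rpo_ok, rto_ok) for a single incident."""
    rpo_ok = data_loss_window(last_backup, disaster) <= rpo  # recovery point objective
    rto_ok = (restored_at - disaster) <= rto                 # recovery time objective
    return rpo_ok, rto_ok

# Hypothetical incident: last backup at 10:00, outage at 10:40, service back at 11:25.
last_backup = datetime(2024, 1, 1, 10, 0)
disaster = datetime(2024, 1, 1, 10, 40)
restored_at = datetime(2024, 1, 1, 11, 25)

result = meets_objectives(last_backup, disaster, restored_at,
                          rpo=timedelta(hours=1), rto=timedelta(minutes=30))
print(result)  # (True, False): 40 minutes of data lost is within RPO, 45 minutes down misses RTO
```

In this made-up case the backup cadence is good enough, but the restore is too slow, which is the kind of gap the two numbers are meant to expose separately.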
I think a lot of the customers that I work with when they’re first getting their bearings with distributed system of sorts, they’re a lot more concerned about solving for high availability, which is what you just said, Carlisia, where we’re spreading across maybe multiple sites. There’s the notion of different parts of the world, but there’s also the idea of like what I think Amazon has coined availability zones. Making sure if there is a disaster, you’re somewhat resilient to that disaster like Brian was saying with moving connections over and so on. Then once we’ve done high-availability somewhat well, depending on the workloads that are running, we might try to get a more fancy recovery solution in place. One that’s not just rebuild everything and redeploy, because the downtime might not be acceptable. [00:15:19] BL: I’m actually going to give some advice to all the people out there who might be listening to this and thinking about disaster recovery. First of all, all that complex stuff, that book you read, forget about it. Not because you don’t need to know. It’s because you should only think about what’s in scope at any given time. When you’re starting an application, let’s say I’m actually making a huge assumption that you’re using someone else’s cloud. You’re using public cloud. Whenever you’re in your data center, there’s a different problem. Whenever you’re using public cloud, think about what you already have. All the major public clouds had a durable object storage. Many 9s of durability and then fewer 9s, but still a lot of 9s of availability too. The canonical example there is S3. When you’re designing your applications and you know that you’re going to have disaster issues, realize that S3 is almost always going to be there, unless it was 2017 and it goes down, or the other two failures that it had. Pretty much, it will be there. Think about how do I get that data into S3. I’m just saying, you can use it for storage. 
It’s fairly cheap for how much storage you can get. You can make sure it’s encrypted, and using IAM, you can definitely make sure that only people who have the right privileges can see it. The same goes with Azure and the same goes with Google. That’s the first phase. The second phase is that now you’re going to say, “Well, what about a relational database?” Once again, use your cloud provider. All the major cloud providers have great relational databases, and actually key value stores as well. The neat thing about them is you can actually set them up sometimes to run in a whole region. You can set them up to do automated backups. At the very minimum, you actually use your cloud provider for what it’s valuable for. Now, if you’re not using a cloud provider and you’re doing it on-premise, I’m going to tell you, the simple answer is I hope you have a little bit of money, because you’re going to have to pay somebody, either a Kubernetes architect or somebody else, to do it. There’s no easy button for this kind of solution. Just for this little mini-rant, I’m going to leave everyone with the biggest piece of advice, the best piece of advice that I can ever leave you if you’re running relational databases. If you are running a relational database, whether it be PostgreSQL, MySQL, or Aurora, have it replicated. But here’s the kicker: have another replica that you delay, and make it delay 10 minutes, 15 minutes, not much longer than that. Because what’s going to happen, especially in a young company, especially if you’re using Rails or something like that, is you’re going to have somebody who has access to production, because you’re a small company and you haven’t really federated this out yet, who’s going to drop your main database table. They’re just going to do it and it’s going to happen and you’re going to panic. 
If you have it in a replica, that drop goes to the replica too. With a 10-minute delayed replica, you have 10 minutes to figure it out before the world ends. Hopefully, if someone deletes the master database, you’re going to know pretty quickly and you can just cut that replica out and pull that other one over. I’m not going to say where I learned this trick. We had to employ it multiple times, and it saved our butts multiple times. That’s my favorite thing to share. [00:18:24] OP: Is that replica on a separate system? [00:18:26] BL: It was on a separate system. I actually won’t say, because it would be telling on who did it. Let’s say that it was physically separate from the other one, in a different location as well. [00:18:37] OP: I think we’ve all been there. We’ve all deleted something that maybe – [00:18:41] CC: I’m going to tell who did it. It was me. [00:18:45] BL: Oh no! It definitely wasn’t me. [00:18:46] OP: We mentioned HA. Does the panel think that there’s now a slightly inverse relationship between the amount of HA that you architect for versus the disaster recovery plan that you have implemented on the back of that? The more you’re architecting around HA, the less you architect or plan for DR. Not eliminating either of them. [00:19:08] BL: I see it more. I mean, it used to be, 15 years ago. [00:19:11] CC: Sorry. HA, we’re talking about high availability. [00:19:15] BL: When you think about high availability, a lot of sites were hosted. This is really before you had public cloud and a lot of people were hosting things on a web host or they were hosting themselves. Even if you were a company who had like a big Equinix or Level 3 presence, you probably didn’t have two facilities at two different Equinix or Level 3 sites; you probably had one big cage and you just had diversity in the systems in there. We found people had these huge tape backups and were very diligent about swapping their tapes out. 
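The delayed-replica trick described above comes down to a race between noticing the bad change and the replica applying it. A minimal sketch of that timing logic follows, assuming a 10-minute delay; the names `REPLICA_DELAY`, `replica_still_clean`, and `time_left` are hypothetical and model the idea only, not any database's API. Databases typically expose the delay itself as a setting, for example `recovery_min_apply_delay` on a PostgreSQL standby.

```python
from datetime import datetime, timedelta

# Assumed delay, following the 10 to 15 minute suggestion in the conversation.
REPLICA_DELAY = timedelta(minutes=10)

def replica_still_clean(bad_change_at: datetime, now: datetime,
                        delay: timedelta = REPLICA_DELAY) -> bool:
    """True if the delayed replica has not yet applied the bad change."""
    return now - bad_change_at < delay

def time_left(bad_change_at: datetime, now: datetime,
              delay: timedelta = REPLICA_DELAY) -> timedelta:
    """How long operators have to stop replication before the change lands."""
    return max(delay - (now - bad_change_at), timedelta(0))

# Hypothetical timeline: a table is dropped at 14:00 and someone notices at 14:04.
dropped_at = datetime(2024, 1, 1, 14, 0)
noticed_at = datetime(2024, 1, 1, 14, 4)

print(replica_still_clean(dropped_at, noticed_at))  # True
print(time_left(dropped_at, noticed_at))            # 0:06:00
```

The design trade-off is visible in the one constant: a longer delay buys more reaction time, but the delayed replica is that much further behind if it has to be promoted.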
One thing we did was we made sure that – I mean, we had lots of practice at bringing this huge system down, because we assumed that the database would die and we would just spend a few hours bringing it back up, or days. Now with high availability, we can architect systems where that is less of a problem, because we can run more things that manage our data. Then we can also do high availability in the backend on the database side too. We can do things like multi-writes and multi-reads. We can actually write our data in multiple places. What we find when we do this is that the loss of a single database or a slice of processing or web hosts just means that our service is degraded, which means we don’t really have a disaster at this point, and we’re trying to avoid disasters. [00:20:28] JR: I think on that point, the way I’ve always thought about it, and I’ll admit this is super overly simplified, but like successful high availability or HA could make your need to perform disaster recovery less likely. Can, maybe, right? It’s possible. [00:20:45] BL: Also realize that not everybody is running in public cloud. In that case, well, you can still back your stuff up to public cloud even if you’re not running in public cloud. There are still people out there who are running big tape arrays, and I’ve seen them. I’ve seen tape arrays that are wider than the 80-inch-wide table I’m sitting at, bigger than this table, with robotic arms that take the tapes, and you had to make sure that you got the tapes right for that particular day during your implementation. I guess what I’m saying is that there is a balance. HA, high availability, if you’re doing it in a truly highly available way, you can miss whole classes of disaster. But I’m not saying that you will not have disasters, because if that was the case, we wouldn’t be having this discussion right now. I’d like to move the conversation just a little bit to more cloud native. 
If you’re running on Kubernetes, what should you think about for disaster recovery? What are the types of disasters we could have? How could we recover from them? [00:21:39] JR: Yeah. I think one thing that comes to mind, I was actually reading the Kubernetes Best Practices book last night, because I just got an O’Reilly membership. Awesome. Really cool book. One of the things that they had recommended early on, which I thought was a really good call out, is that since Kubernetes is a declarative system where we write these manifests to describe the desired state of our application and how it should run, we should make sure to keep that declarative state in source control, just like we would our code, so that if something were to go wrong, it is somewhat more trivial to redeploy the application should we need to recover. That does assume we’re not worried about like data and things like that, but it is a good call out I think. I think the book made a good call out. [00:22:22] OP: Having a declarative system and being able to bring your systems back up exactly the way they were before in itself adds comfort to the whole notion that there could be a disaster. If there was, we can spin back up relatively quickly. That goes back to the days of automation where the guys originally – I came from Red Hat, so the folks at Ansible were kind of trying to do the infrastructure as code, being able to deploy, redeploy, redeploy in the same manner as the previous installation, because I’ve been in this game a long time now and I’ve spent a lot of time working with processes in and around building physical servers. That process would get handed over to lots of different teams. It was a huge thing to build these things, to get one of these things built and signed off, because it literally had to pass through the different teams to do their own different bits of things. 
The idea that you would get a language that had the functionality that suited the needs of all those different teams, of the store team, could automate their piece, which they were doing. They just wasn’t interactive with any of the other teams. The network people would automate theirs and the application install people would do their bit. The server OS people would do their bit. Having a process that could tie those teams together in terms of a language, so Ansible, Puppet, Chef, those kinds of things try to unite those teams and it can all do your automation, but we have a tool that can take that code and run it as one system end-to-end. At the end of that, you get an up and running system. If you run it again, you get all the systems exactly the same as the previous one. If you run it again, you get another one. Reducing the time to build these things plays very importantly into this space. Disaster is only disaster in terms of time, because things break all the time. How that affects you and how quickly you can recover. If you can recover in like seconds, in minutes and it hasn’t affected your business at all, then it wasn’t really a disaster. The time it takes you to recover, to build your things back is key. All that automation and then leading on to Kubernetes, which is the next step, I think, this whole declarative, self-healing and implementing the desired state on a regular basis really plays well into this space. [00:24:25] CC: That makes me think, I don’t completely understand because I’m not out there architecting people’s systems. The one thing that I do is building this backup tool, which happens to be for Kubernetes. I don’t completely get the limitations and use cases, but my question is, is it enough to have the declarations of how your infrastructure should be in source control? Because what if you’re running applications on the platform and your applications are interacting with a platform, change in the state of the platform. 
Is that not something that happens? Of course, ideally, having those declarations in source control is a great backup, but don’t you also want to back up the changes to state as they keep happening? [00:25:14] BL: Yeah, of course. That has been used for a long time. That’s how replication works. Literally, you take the change and you push it over the wire and it gets applied to the remote system. The problem is that there isn’t just one way to do this, because if you do it only transaction-based, if you only do the changes, you need a good base to start with, because you have to apply those changes to something. How do you get that piece? I’m not asking you to answer that. It’s just something to think about. [00:25:44] JR: I think you’ve hit a fatal flaw too, Carlisia, in where that simplified just-have-it-in-source-control model kind of falls over. I think having that declarative kind of stamped out, this-is-the-ideal-nature-of-the-world deployment in source control has benefits beyond just that of a disaster recovery scenario, right? For stateless applications especially, like we talked about in the previous podcast, it can actually be all you need, potentially, which is so great. Move your CI system over to cluster B. Boom! You’re back up and running. That’s really neat. A lot of the customers we work with, once we get them to a point where they’re at that stage, they then go, “Well, what about all these persistent volumes?”, which by the way is a volume on a computer; it’s a Kubernetes term. But like, what about all these parts on disk that I don’t want to lose if I lose my cluster? That totally feeds into why tools like the one you work on are so helpful. Maybe I don’t know if now would be a good time. But maybe, Carlisia, you could expand on that tool. What does it try to solve for? [00:26:41] CC: I want to back up a little though. Let’s put aside stateful workloads and volumes and databases. 
I was talking about the infrastructure itself, the state of the infrastructure. I mean, isn’t that common? I don’t know the answer to this. I might be completely off. Isn’t that common for you to develop a cloud native application that is changing the state of the infrastructure, or is this something that’s not good to do? [00:27:05] JR: It’s possible that you can write applications that can change infrastructure, but think about that. What happens when you have bad code? We all have bad code. Our people like to separate those two things. You can still have infrastructure as code, but it’s separated from the application itself, and that’s just to protect your app people from your not app people and vice versa. A lot of that is being handled through systems that people are writing right now. You have Ansible from IBM. You have things like HashiCorp and all the things that they’re doing. They have their hosted thing. They have their own premise thing. They have their local thing. People are looking at that problem. The good thing is that that problem hasn’t been solved. I guess good and bad at the same time, because it hasn’t been solved. So someone can solve it better. But the bad thing is that if we’re looking for good infrastructure as code software, that has not been solved yet. [00:27:57] OP: I think if we’re talking about containerized applications, I think if there was systems that interacted or affected or changed the infrastructure, they would be separate from the applications. As you were saying, Brian, you just expanded a little bit [inaudible 00:28:11] containerized or sandboxed, processes that were running separate to the main application. You’re separating out what’s actually running and doing function in terms of application versus systems that have to edit that infrastructure first before that main application runs. They’re two separate things. 
If you had to restore the infrastructure back to the way it was without rebuilding it, but perhaps have a system whereby if you have something editing the infrastructure, you would always have something that would edit it back. If you have the process that runs to stop something, you’d also have a process that start at something. If you’re trying to [inaudible 00:28:45] your applications and if it needs to interact with other things, then that application design should include the consideration of what do I need to do to interact with the infrastructure. If I’m doing something left-wise, I have to do the opposite in equal reaction right-wise to have an effectively clean application. That’s the kind of stuff I’ve seen anyway. [00:29:04] JR: I think it maybe even fold into a whole other topic that we could even cover on another podcast, which is like the notion of the concern of mutating infrastructure. If you have a ton of hands in those cookie jars and they’re like changing things all over the place, you’re losing that potential single source of declarative truth even, right? It just could become very complicated. I think maybe to the crux of your original point, Carlisia. Hopefully I’m not super off. If that is happening a lot, I think it could actually make recover more complicated, or maybe recovery is not the way to put it, but recreating the infrastructure, if that makes sense. [00:29:36] BL: Your infrastructure should be deterministic, and that’s why I said you could. I know we talked about this before about having applications modify infrastructure. Think about that. Can and should are two different things. If you have it happen within your application due to input of any kind, then you’re no longer deterministic, unless you can figure out what that input is going to be. Be very careful about that. That’s why people split infrastructure as code from their other code. 
You could still have CI, continuous integration and continuous delivery/deployment for both, but they’re on different pipelines with different release metrics and different monitoring and different validation to make sure they work correctly. [00:30:18] OP: Application design plays a very important role now, especially in terms of cloud native architecture. We’re talking a lot about microservices. A lot of companies are looking to re-architect their applications. Maybe mistakes that were made in the past, or maybe not mistakes. It’s perhaps a strong word. But maybe things that were allowed in the past perhaps are now best practices going forward. If we’re looking to be able to run things independently of each other, and by definition, applications independent of the infrastructure, that should be factored into the architecture of those applications going forward. [00:30:50] CC: Josh asked me to talk a little bit about Velero. I will touch on it quickly. First of all, we’d love to have a whole show just about infrastructure as code, GitOps. Maybe that would be two episodes. Velero doesn’t do any backup of the infrastructure itself. It works at the Kubernetes level. We back up the Kubernetes clusters including the volumes. If you have any sort of stateful app attached to a pod, that can get backed up as well. If you want to restore that to even a different service provider than the one you backed up from, we have a restic plugin that you can use. It’s embedded in the Velero tool. So you can do that using this plugin. There are a few things that I find really cool about Velero. One, you can do selective backups, which we really, really don’t recommend. We recommend you always back up everything, but you can do selective restores. That would be – If you don’t need to restore a whole cluster, why would you do it? You can just do parts of it. It’s super simple to use. Why would you not have a backup? Because this is ridiculously simple. 
You do it through a command line, and we have a scheduler. You can just put your backup on scheduler. Determine the expiration date of each backup. A lot of neat simple features and we are actively developing things all the time. Velero is not the only one. It’d be fair to mention, and I’m not a super well versed on the tools out there, but etcd itself has a backup tool. I’m not familiar with any of these other tools. One thing to highlight is that we do everything through the Kubernetes API. That’s for example one reason why we can do selective backup or restores. Yes, you can backup etcd completely yourself, but you have to back up the whole thing. If you’re on a managed service, you wouldn’t be able to do that, because you just wouldn’t have access. All the tools like we use to back up to the etcd offers or a service provider. PX-motion. I’m not sure what this is. I’m reading the documentation here. There is this K10 from [inaudible 00:33:13] Canister. I haven’t used any of these tools. [inaudible 00:33:16]. [00:33:17] OP: I just want to say, Velero, the last customer I worked on, they wanted to use Velero in its capacity to be able to back up a whole cluster and then restore that whole cluster on a different cloud provider, as you mentioned. They weren’t thoroughly using it as – Well, they were using it as backup, but their primary function was that they wanted to populate the cluster as it was on a brand-new cloud provider. [00:33:38] CC: Yeah. It’s a migration. One thing that, like I said, Velero does, is back up the cluster, like all the Kubernetes objects, because why would we want to do that? Because if you’re declaring – Someone explain to everybody who’s listening, including myself. Some people bring this up and they say, “Well, I don’t need to back up the Kubernetes objects if all of that is declared and I have the declaration is source control. If something happens, I can just do it again. 
[00:34:10] BL: Untrue, because just for any given Kubernetes object, there is a configuration that you created. Let’s say if you’re creating an appointment, you need spec replicas, you need the spec templates, you need labels and selectors. But if you actually go and pull down that object afterwards, what you’ll see is there is other things inside of that object. If you didn’t specify any replicas, you get the defaults or other things that you should get defaults for. You don’t want to have a lousy backup and restore, because then you get yourself into a place where if I go back this thing up and then I restore it to a different cluster to actually test it out to see if it works, it will be different. Just keep that in mind when you’re doing that. [00:34:51] JR: I think it just comes down to knowing exactly what Brian just said, because there certainly are times where when I’m working with a customer, there’s just such a simple use case at the notion of redeploying the application and potentially losing some of those factors that may have mutated overtime. They just shrug to it and go, “Whatever.” It is so awesome that tools like Velero and other tools are bridging that gap, and I think to a point that Olive made, not only just backing that stuff up and capturing it state as it was in the cluster, but providing us with a good way to section out one namespace or one group of applications and just move those potentially over and so on. Yeah, it just kind of comes to knowing what exactly are you going to have to solve for and how complex your solution should be. [00:35:32] BL: Yeah. We’re getting towards the end, and I wanted to make sure that we talked about testing your backup, because that’s a popular thing here. People take backups. 
Say I’ve done my backups, whether I dump to S3, or I have Velero dumping to S3, or I have some other method. That is an invalid backup. It’s not valid until someone comes and takes that backup, restores it somewhere and actually verifies that it works, because there’ll be nothing worse than having yourself in a situation where you need a backup and you’re in some kind of disaster, whether small or large, and going to find out that, “Oh my gosh! We didn’t even back up the important thing.” [00:36:11] CC: That is so true. I have only been in this backup world for a minute, but I mean I’ve needed to back up things before. I don’t think I’ve learned this concept after coming here. I think I’ve known this concept. It just became stronger in my mind, so I always tell people, if you haven’t done that restore, you don’t have a backup. [00:36:29] JR: One thing I’d love to add on to that concept too is having my customers run like fire drills if they’re open to it. Effectively, having a list of potential terrible things that can happen, from losing a cluster to just like losing an important component. And, like, one person on the team, let’s say once a week or once a month, depending on their tolerance, just chooses something from that list and does it, not in production, but does it. It gives you the opportunity to test everything end-to-end. Did your alerting fire off? When you did the restore to your restore point, was the backup valid? Did the application come back online? There are kind of a lot of like semi-fun, using the word fun loosely there, ways that you can approach it, and it really is a good way to kind of stress test. [00:37:09] BL: I do have one small follow up on that. You’re doing backups, and no matter how you’re doing them, think about your strategy and then how long to keep data. I mean, whether it’s due to regulation or just physical space and it costs money. You just don’t backup yesterday and then you’d backup again. 
Backup every day and keep the last 8 days and then, like old school, would actually then have a full backup and keep that for a while just in case, because you never know. [00:37:37] CC: Good point too. Yeah. I think a lot of what we said goes to what – It was Olive I think who said it first. You have to understand your needs. [00:37:46] OP: Yeah, just which bits have different varying degrees of importance in terms of application functionality for your end user. Which bits are absolutely critical and which bits can buy you a little bit more time to recover. [00:37:58] CC: Yeah. That would definitely vary from product to product. As we are getting into this idea of ephemeral clusters and automation and we get really good at automating things and bringing things back up, is it possible that we get to a point where we don’t even talk about disasters anymore, or you just have to grow, bring this up cluster or this system, and does it even matter why [inaudible 00:38:25]. We’re not going to talk about this aspect, because what I’m thinking is in the past, in a long, long time ago, or maybe not so long time ago. When I was working with application, and that was a disaster, it was a disaster, because it felt like a disaster. Somebody had to go in manually and find out what happened and what to fix and fix it manually. It was complete chaos and stress. Now if they just like keep rolling and automate it, something goes down, you bring it back up. Do you know what I mean? It won’t matter why. Are we going to talk about this in terms of it was a disaster? Does it even matter what caused it? Maybe it was a – Recovery from a disaster wouldn’t look any different than a planned update, for example. [00:39:12] BL: I think we’re getting to a place – And I don’t know whether we’re 5 years away or 10 years away or 20 years away, a place where we won’t have the same class of disaster that we have now. Think about where we’ve come over the past 20 years. 
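The retention idea mentioned here (daily backups kept for about 8 days, plus a full backup kept longer) can be sketched as a pruning rule. The specifics are assumptions made for illustration: treating the backup taken on the first of each month as the long-lived full backup, the 180-day window, and the function name `backups_to_keep` are all hypothetical.

```python
from datetime import date, timedelta

def backups_to_keep(backup_dates, today, daily_days=8, full_keep_days=180):
    """Return the sorted backup dates that the retention rule keeps."""
    keep = set()
    for d in backup_dates:
        age = (today - d).days
        if 0 <= age < daily_days:
            keep.add(d)  # recent daily backup
        elif d.day == 1 and 0 <= age < full_keep_days:
            keep.add(d)  # assumed monthly full backup, kept for longer
    return sorted(keep)

# Hypothetical history: one backup per day from 2024-01-01 through 2024-03-10.
start, today = date(2024, 1, 1), date(2024, 3, 10)
history = [start + timedelta(days=i) for i in range((today - start).days + 1)]

kept = backups_to_keep(history, today)
print(len(history), len(kept))  # 70 11
print(kept[:3])                 # the three first-of-month full backups
```

Both windows would in practice be driven by the regulatory and storage-cost constraints raised in the conversation rather than by these placeholder numbers.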
Over the past 20 years, we’ve basically come to look at hardware in a rack as replaceable. I can think about 1988, 1999 and 2000. We’d rack a whole bunch of servers, and each server would be special. Now, at these scales, we don’t care about that anymore. When a server goes away, we have 50 more just like it. The reason we were able to do that across large platforms is because of Linux. Now with Kubernetes, if Kubernetes keeps on going in the same trajectory, we’re going to basically codify these patterns that make hardware loss not a thing. We don’t really care if we lose a server. You have 50 more nodes that look just like it. We’re going to start having the software – The software is always available. Think about Google Spanner. Google Spanner is multi-location, and it can lose nodes and it doesn’t lose data, and it’s relational as well. That’s what CockroachDB is about as well, it’s modeled on Spanner, and we’re going into the place where this kind of technology is available for anyone, and we’re going to see that we’re not going to have these kinds of disasters that we’re having now. I think what we’ll have now is bigger distributed systems things where we have timing issues and things like that and leader election issues. But I think that cool stuff can’t be phased out, at least over the next computing generation.
[00:40:39] OP: It’s maybe more around architectures these days, and application designers and infrastructure architects in the container space, with Kubernetes orchestrating and maintaining your desired state. You’re thinking that things will fail, and that’s okay, because it will go back to the way it was before. The concept of something stopping in mid-run is not so scary anymore, because it will get put back to its state. Maybe you might need to investigate if it keeps stopping and starting and Kubernetes keeps bringing it back. The system is actually still fully functional in terms of end users. You as the operator might need to investigate why that’s so.
But the actual endpoint is still that your application is still up and running. Things fail and it’s okay. That’s maybe a thing that’s changed from maybe 5 years ago, 10 years ago.
[00:41:25] CC: This is a great conversation. I want to thank everybody, Olive Power, Josh Rosso, Brian Lyles. I’m Carlisia Campos signing off. Make sure to subscribe. This was Episode 8. We’ll be back next week. See you.
[END OF EPISODE]
[0:50:00.3] KN: Thank you for listening to The Podlets Cloud Native Podcast. Find us on Twitter at https://twitter.com/ThePodlets and on the http://thepodlets.io/ website, where you'll find transcripts and show notes. We'll be back next week. Stay tuned by subscribing.
[END]
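The retention idea raised in the episode above (back up every day, keep the last 8 days, and hold on to a full backup for longer) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Backup` record, the `"daily"` and `"full"` labels, and the 90-day window for full backups are hypothetical choices, not something the speakers specified, and the sketch says nothing about how the backups are taken or stored.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Backup:
    taken_on: date
    kind: str  # "daily" or "full" (hypothetical labels for this sketch)


def backups_to_keep(backups, today, daily_days=8, full_days=90):
    """Return the backups a simple age-based retention policy would keep.

    daily_days=8 follows the "keep the last 8 days" remark in the episode;
    full_days=90 is an assumed value standing in for "keep that for a while".
    """
    keep = []
    for backup in backups:
        age_in_days = (today - backup.taken_on).days
        if backup.kind == "daily" and age_in_days < daily_days:
            keep.append(backup)
        elif backup.kind == "full" and age_in_days < full_days:
            keep.append(backup)
    # Oldest first, so the result reads like a timeline.
    return sorted(keep, key=lambda backup: backup.taken_on)


if __name__ == "__main__":
    today = date(2019, 10, 1)
    inventory = [Backup(today - timedelta(days=n), "daily") for n in range(12)]
    inventory.append(Backup(today - timedelta(days=30), "full"))
    inventory.append(Backup(today - timedelta(days=120), "full"))
    for backup in backups_to_keep(inventory, today):
        print(backup.taken_on.isoformat(), backup.kind)
```

A policy like this only decides what to retain. As the hosts stress, a retained backup still counts for nothing until someone has restored it somewhere and verified that it works.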

Pivotal Insights
Next-generation SQL (with Peter Mattis)

Oct 1, 2019 (21:04)


Learn more: Cockroach Labs; CockroachDB; Distributed SQL: An Evolution of the Database; Spring Data and YugaByte DB: A Developer's Dream
Follow everyone on Twitter: Intersect; Peter Mattis; Derrick Harris; Cockroach Labs; Pivotal

TechCrunch Startups – Spoken Edition
Cockroach Labs announces $55M Series C to battle industry giants

Aug 9, 2019 (3:41)


Cockroach Labs, makers of CockroachDB, sits in a tough position in the database market. On one side, it has traditional database vendors like Oracle, and on the other there's AWS and its family of databases. It takes some good technology and serious dollars to compete with those companies. Cockroach took care of the latter with a $55 million Series C round today. The round was led by Altimeter Capital and Tiger Global along with existing investor GV.

BSD Now
151: Fuzzy Auditing

Jul 20, 2016 (69:55)


This week on BSDNow, we have all sorts of interesting news, including a kernel fuzzing audit done for OpenBSD.
Headlines
Multiple Bugs in OpenBSD Kernel (http://marc.info/?l=oss-security&m=146853062403622&w=2)
It's patch Wednesday! (Or last Thursday, if you were watching the mailing lists.) Jesse Hertz and Tim Newsham (part of the NCC Group, calling themselves Project Triforce) have been working with the OpenBSD team to fix some newly discovered bugs in the kernel using fuzzing. Specifically, they were able to track down several potential methods to corrupt memory or panic the kernel:
  • mmap_panic: Malicious calls to mmap() can trigger an allocation panic or trigger memory corruption.
  • kevent_panic: Any user can panic the kernel with the kevent system call.
  • thrsleep_panic: Any user can panic the kernel with the __thrsleep system call.
  • thrsigdivert_panic: Any user can panic the kernel with the __thrsigdivert system call.
  • ufs_getdents_panic: Any user can panic the kernel with the getdents system call.
  • mount_panic: Root users, or users on systems with kern.usermount set to true, can trigger a kernel panic when mounting a tmpfs filesystem.
  • unmount_panic: Root users, or users on systems with kern.usermount set to true, can trigger a kernel panic when unmounting a filesystem.
  • tmpfs_mknod_panic: Root can panic the kernel with mknod on a tmpfs filesystem.
This was a great find, and we have a link to more of the results, if you would like to explore them in more detail.
NCC Group OpenBSD Kernel fuzzing results (http://www.openwall.com/lists/oss-security/2016/07/14/5)
We would like to see more work like this done in all of the BSDs.
Running CockroachDB in a FreeBSD Jail (https://www.cockroachlabs.com/blog/critters-in-a-jar-running-cockroachdb-in-a-freebsd-jail/)
The developers behind CockroachDB have written up a nice walkthrough of getting their software to run inside FreeBSD jails.
“Manually encapsulating CockroachDB using Linux cgroups is no easy task, which is why tools like Docker exist in the first place. By comparison, running server processes natively in FreeBSD jails is straightforward and robust.”
The walkthrough begins with compiling CockroachDB straight from source (a port is pending), which is pretty easy, relying upon bash, git, gmake and Go. With the compile finished, the next step is mounting linprocfs, although that may be going away in the future: “(Note: Linux compatibility files / packages / libraries are not needed further. CockroachDB uses Linux's procfs to inspect system properties via gosigar. If/when gosigar evolves to read FreeBSD properties natively, CockroachDB will not need linprocfs any more.)”
With the initial setup complete, the walkthrough then takes us through the process of creating the rc.d script (which should be included with the port) and ultimately setting up ezjail and deploying CockroachDB within. With the word getting out about jails and their functionality, we hope to see more projects also provide walkthroughs and FreeBSD support natively. Kudos to the CockroachDB team!
Usermount bugs (https://marc.info/?l=openbsd-announce&m=146854517406640&w=2)
kern.usermount (vfs.usermount on FreeBSD) is a sysctl that can be enabled to allow an unprivileged user to mount filesystems. It is very useful for allowing non-root users to mount a USB stick or other external media. It is not without its dangers though:
“kern.usermount=1 is unsafe for everyone, since it allows any non-pledged program to call the mount/umount system calls. There is no way any user can be expected to keep their system safe / reliable with this feature.
Ignore setting to =1, and after release we'll delete the sysctl entirely.”
In OpenBSD 6.0 and forward, the setting will no longer work, and root privileges will be required to mount a filesystem. If there is a bug in the filesystem driver, the user could potentially exploit that and root the system.
“In addition to the patched bugs, several panics were discovered by NCC that can be triggered by root or users with the usermount option set. These bugs are not getting patched because we believe they are only the tip of the iceberg. The mount system call exposes too much code to userland to be considered secure.”
This is a very pragmatic way of dealing with these issues, as it is not really possible to be sure that EVERY bug has been fixed and that this feature is no longer an exploit vector.
usermount being removed from OpenBSD (http://undeadly.org/cgi?action=article&sid=20160715125022)
I use this facility in FreeBSD extensively, combined with ZFS permission delegation, to allow non-root users to create and mount new ZFS datasets, and to do replication without requiring any root access. There are some safety belts, for instance: the user must own the directory that the new filesystem will be mounted to, so they can't mount to /etc and replace the password file with their own.
Let's Encrypt client from BSD in C (https://kristaps.bsd.lv/letskencrypt/)
File this one under the category of “It's about time!”: Kristaps (whom we've interviewed in the past) has released some new software for interacting with Let's Encrypt. The header for the project site sums it up nicely: “Be up-front about security: OpenSSL is known to have issues, you can't trust what comes down the pipe, and your private key's integrity is a hard requirement. Not a situation where you can be careless. letskencrypt is a client for Let's Encrypt users, but one designed for security. No Python. No Ruby.
No Bash. A straightforward, open source implementation in C that isolates each step of the sequence.”
What specifically does it isolate, you ask? Right now it is broken down into 6 steps:
  • read and parse an account and domain private key
  • authenticate with the Let's Encrypt server
  • authorise each domain listed for the certificate
  • submit the X509 request
  • receive and serialise the signed X509 certificate
  • request, receive, and serialise the certificate chain from the issuer
I don't know about all of you, but I'm going to be switching over one of my systems this weekend.
News Roundup
Videos from the FOSDEM BSD Dev room are now online (https://video.fosdem.org/2016/k4601/)
The videos from the BSD Dev room at FOSDEM have been stealthily posted online at some point since I last checked. The videos are individually linked from the talks on the Schedule (https://archive.fosdem.org/2016/schedule/track/bsd/). The talk pages also include the slides, which can help you to follow along.
FreeBSD on Jetson TK1 (http://kernelnomicon.org/?p=628)
The nVidia Jetson TK1 is a medium-sized ARM device that is a bit more than your standard Raspberry Pi. The device has:
  • NVIDIA 4-Plus-1™ Quad-Core ARM® Cortex™-A15 CPU (2.3 GHz)
  • NVIDIA Kepler GPU with 192 CUDA Cores
  • 2 GB DDR3L x16 Memory with 64-bit Width
  • 16 GB 4.51 eMMC Memory
  • 1 Half Mini-PCIE Slot
  • 1 Full-Size SD/MMC Connector
  • 1 Full-Size HDMI Port
  • 1 USB 2.0 Port, Micro AB
  • 1 USB 3.0 Port, A
  • 1 RS232 Serial Port
  • 1 ALC5639 Realtek Audio Codec with Mic In and Line Out
  • 1 RTL8111GS Realtek GigE LAN
  • 1 SATA Data Port
  • SPI 4 MByte Boot Flash
The following signals are available through an expansion port: DP/LVDS, Touch SPI, 1x4 + 1x1 CSI-2, GPIOs, UART, HSIC, i2c.
The device costs $192 USD from nVidia or Amazon.
Oleksandr Tymoshenko (gonzo@freebsd.org) has a post describing what it takes to get FreeBSD running on the Jetson TK1:
“First of all – my TK1 didn't have U-Boot. Type of bootloader depends on the version of Linux4Tegra TK1 comes with.
Mine had L4T R19, with some kind of “not u-boot” bootloader.”
They tried using the provided tool, compiled on FreeBSD since it uses libusb, but it gave an error. Falling back to trying from Ubuntu, they got the same error. They then flashed the TK1 with newer firmware, and suddenly, U-Boot was available. The post then walks through PXE booting FreeBSD on the TK1, and then through replacing U-Boot with a version compatible with UBLDR, for more features. We'll have to wait for another post to get FreeBSD burned onto the device, but at this point, you can reliably boot it without any user interaction. I have one of these devices, so I am very interested in this work.
Why we use OpenBSD at VidiGuard (https://blog.vidiguard.com/why-we-use-openbsd-at-vidiguard-4521f217b2b7#.9r86v742v)
VidiGuard (which makes autonomous drone solutions for security monitoring) has posted an interesting write-up on why they use OpenBSD. They start by mentioning that, while they are in business to provide physical security, they value data security just as much, especially for their customer data. They name 4 specific features that matter to them, starting with Uncompromising Quality and Security:
“Over the past 20 years, OpenBSD's focus on uncompromising quality and code correctness has yielded an operating system second-to-none. Code auditing and review is core to the project's development process. The team's focus on security includes integrated cryptography, new security mitigation techniques, and an optional-security-is-no-security stance, making it arguably the most secure operating system available today. This approach pays off in the form of only a few security updates for a given release, compared to other operating systems that might release a handful of updates every week.”
High praise indeed! They also mention the sane defaults, the documentation and, last but not least, the license as winning factors in making OpenBSD their operating system of choice.
Thanks to VidiGuard for publicly detailing their use of BSD, and we hope to see other businesses follow suit!
"You can (and should) slow down and learn how things work" – Interview with Dru Lavigne (https://bsdmag.org/dru_lavigne/)
If you've been around the BSD community for any length of time, you no doubt have heard of Dru Lavigne (or perhaps own one of her books!). She was recently interviewed by Luca Ferrari for BSD Magazine, and you may find it a fascinating read. The second question asked sounded a lot like our opener to an interview (How did you get into BSD?):
“In the mid 90s, I went back to school to learn network and system administration. As graduation grew near and I started looking for work, I noticed that all the interesting jobs wanted Unix skills. Wanting to increase my skills, and not having any money, I did an Internet search for “Free Unix”. The first hit was freebsd.org. I went to the website and started reading the Handbook and thought “I can do this”. Since I only had access to one computer and wanted to ramp up my skills quickly, I printed out the installation and networking chapters of the Handbook. I replaced the current operating system with FreeBSD and forced myself to learn how to do everything I needed to do on that computer in FreeBSD. It was a painful (and scary) few weeks as I figured out how to transition the family's workflow to FreeBSD, but it was also exhilarating to learn that “yes, I can do this!” Since then, I've had the opportunity to try out or administer the other BSDs, several Linux distros, SCO, and Solaris.
I found that the layout, logic, and release engineering process of the BSDs makes the most sense to me and I'm happiest when on a BSD system.” When asked, Dru also had a good response to what challenges potential new UNIX or BSD users may face: “Students who haven't been exposed to open source before are used to thinking of technology in terms of a purchasable brand consisting of “black boxes” that are supposed to “just work”, without having to think about how they work. You can (and should) slow down and learn how things work. It can be a mind shift to learn that the freedom to use and change how something works does exist, and isn't considered stealing. And that learning how something works, while hard, can be fun. BSD culture, in particular, is well suited for those who have the time and temperament to dive into how things work. With over 40 years of freely available source and commit messages, you can dive as deep as you want into learning how things came to be, how they evolved over the years, how they work now, and how they can be improved. There is a diverse range of stuff to choose from: from user tools to networking to memory management to hardware drivers to security mechanisms and so on. There is also a culture of sharing and learning and encouragement for users who demonstrate that they have done their homework and have their own ideas to contribute.” The interview is quite long, and Dru provides fantastic insights into more aspects of BSD in general. Well worth your time to read! 
Beastie Bits
  • Ed Maste is seeking testing 'without_gpl_dtc' (https://twitter.com/ed_maste/status/755474764479672321)
  • “PAM Mastery” tech reviewers wanted (http://blather.michaelwlucas.com/archives/2717)
  • OPNsense 16.7 RC2 (https://opnsense.org/opnsense-16-7-rc2-released/)
  • Jupyter Notebook for bootstrapping Arduino on FreeBSD (https://nbviewer.jupyter.org/github/DadAtH-me/Projects/blob/master/arduino-on-nix.ipynb)
  • The Design and Implementation of the Anykernel and Rump Kernels (second edition) (http://www.fixup.fi/misc/rumpkernel-book/)
  • Complete desktop synchronisation with Unison and FreeBSD jails (xjails) (https://github.com/kbs1/freebsd-synced-xjails)
Feedback/Questions
  • Eric - List most popular files (http://pastebin.com/S7u0VeVi)
  • Robroy - ZFS Write Cache (http://pastebin.com/81Zmj0cX)
  • Luis - FreeNAS HW Setup (http://pastebin.com/SfeKR7v2)
  • Emett - Python Followup (http://pastebin.com/wy4ar0YH)
  • Peter - Multicast + Jails (http://pastebin.com/zd2QAu25)