Podcasts about Postgres

Free and open-source relational database management system

  • 301 podcasts
  • 1,103 episodes
  • 41m average duration
  • 5 new episodes per week
  • Latest episode: Jun 20, 2025

POPULARITY (2017–2024)



Latest podcast episodes about Postgres

Postgres FM
Multi-tenant options

Jun 20, 2025 • 50:18


Nikolay and Michael are joined by Gwen Shapira to discuss multi-tenant architectures — the high level options, the pros and cons of each, and how they're trying to help with Nile. Here are some links to things they mentioned:

Gwen Shapira: https://postgres.fm/people/gwen-shapira
Nile: https://www.thenile.dev
SaaS Tenant Isolation Strategies (AWS whitepaper): https://docs.aws.amazon.com/whitepapers/latest/saas-tenant-isolation-strategies/saas-tenant-isolation-strategies.html
Row Level Security: https://www.postgresql.org/docs/current/ddl-rowsecurity.html
Citus: https://github.com/citusdata/citus
Postgres.AI Bot: https://postgres.ai/blog/20240127-postgres-ai-bot
RLS Performance and Best Practices: https://supabase.com/docs/guides/troubleshooting/rls-performance-and-best-practices-Z5Jjwv
Case Gwen mentioned about the planner thinking an optimisation was unsafe
Re-engineering Postgres for Millions of Tenants (Gwen's recent talk at PGConf.dev): https://www.youtube.com/watch?v=EfAStGb4s88
Multi-tenant database the good, the bad, the ugly (talk by Pierre Ducroquet at PgDay Paris): https://www.youtube.com/watch?v=4uxuPfSvTGU

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai
With special thanks to:
Jessie Draws for the elephant artwork
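One of the options linked above is Row Level Security. As a rough illustration of the idea only (not something taken from the episode), here is a minimal Python sketch using psycopg2; the invoices table and the app.tenant_id setting are made-up names:

import psycopg2  # assumes psycopg2 is installed; table and setting names are hypothetical

conn = psycopg2.connect("dbname=app user=app_user")
conn.autocommit = True

with conn.cursor() as cur:
    # One-time setup: enable RLS on a hypothetical "invoices" table and key the
    # policy on a custom session setting. Note RLS does not apply to the table
    # owner or superusers unless FORCE ROW LEVEL SECURITY is also set.
    cur.execute("""
        ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;
        CREATE POLICY tenant_isolation ON invoices
            USING (tenant_id = current_setting('app.tenant_id')::int);
    """)

    # Per request: pin the session to one tenant, then query as usual.
    cur.execute("SELECT set_config('app.tenant_id', %s, false)", ("42",))
    cur.execute("SELECT count(*) FROM invoices")  # sees only tenant 42's rows
    print(cur.fetchone()[0])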

Path To Citus Con, for developers who love Postgres
12 years of Postgres Weekly with Peter Cooper

Jun 20, 2025 • 76:54


What drives someone to publish 600+ issues of a Postgres newsletter for over a decade? In Episode 28 of Talking Postgres with Claire Giordano, Peter Cooper—creator of Postgres Weekly—shares how his days of rustic programming and QBASIC fanzines on Usenet led to a newsletter empire that now reaches nearly half a million developers each week. We dig into the BBC's "big tent" editorial influence, an accidental business model that just worked, and the perils of "temporary" hacks. Plus: spam filters, a Photoshop addiction, and one very cheesy story (dairy-free).

Links mentioned in this episode:
Newsletter: Postgres Weekly
Cooperpress: List of newsletters
Newsletter: Latest issue of Postgres Weekly on Jun 19, 2025
Newsletter: Postgres Weekly issue with horrible graphic
Newsletter: Very first issue of Postgres Weekly on Mar 13, 2013
Newsletter: Ruby Weekly, the first Cooperpress newsletter
Book: Beginning Ruby Third Edition, by Peter Cooper
Podcast episode: How I got started as a developer (& in Postgres) with David Rowley
Feed reader: Feedbin
GitHub repo: feedbin/feedbin
Feed reader: Feeder
Email testing software: Litmus
GitHub repo: MGML markup language for email
Paper: The Design of Postgres
GitHub repo: PGRX for building Postgres extensions in Rust
Podcast news: Podnews.net for daily briefings about podcasts
Wikipedia page: BBC Micro
Wikipedia page: ZX Spectrum
Cal invite: LIVE recording of Ep29 of Talking Postgres to happen on Wed Jul 9, 2025

Software Engineering Daily
SED News: Corporate Spies, Postgres, and the Weird Life of Devs Right Now

Jun 17, 2025 • 43:39


Welcome back to SED News, a podcast series from Software Engineering Daily where hosts Gregor Vand and Sean Falconer break down the latest stories in software engineering, Silicon Valley, and the wider tech world. In this episode, Gregor and Sean unpack what's going on with Deel and Rippling and explore why Databricks and Snowflake are making big bets…

The Changelog
Stop uploading your data to Google (News)

Jun 16, 2025 • 8:19


Lukas Mathis tells us to stop uploading our data to Google, Robert Vitonsky wants web devs to not guess his language using his IP, Tom from GameTorch reminds us that software talent is gold right now, Austin Parker from Honeycomb describes how LLMs are upending the observability industry, and Vitess co-creator, Sugu Sougoumarane, joins Supabase to lead their Multigres effort to bring Vitess to Postgres.

Postgres FM
Mean vs p99

Jun 13, 2025 • 38:51


Nikolay and Michael discuss looking at queries by mean time — when it makes sense, why ordering by a percentile (like p99) might be better, and the merits of approximating percentiles in pg_stat_statements using the standard deviation column. Here are some links to things they mentioned:

Approximate the p99 of a query with pg_stat_statements (blog post by Michael): https://www.pgmustard.com/blog/approximate-the-p99-of-a-query-with-pgstatstatements
pg_stat_statements: https://www.postgresql.org/docs/current/pgstatstatements.html
Our episode about track_planning: https://postgres.fm/episodes/pg-stat-statements-track-planning
pg_stat_monitor: https://github.com/percona/pg_stat_monitor
statement_timeout: https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-STATEMENT-TIMEOUT

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai
With credit to:
Jessie Draws for the elephant artwork
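As a rough sketch of the general idea (the exact formula in Michael's post may differ), one way to rank queries by an approximate high percentile from pg_stat_statements' mean and standard deviation columns, in Python with psycopg2:

import psycopg2  # assumes the pg_stat_statements extension is installed and enabled

# Rough p99 proxy: mean + 2.33 * stddev, where 2.33 is roughly the 99th-percentile
# z-score of a normal distribution. Real latency distributions are skewed, so treat
# this as a way to rank queries, not as an exact p99.
SQL = """
SELECT queryid,
       calls,
       mean_exec_time,
       mean_exec_time + 2.33 * stddev_exec_time AS approx_p99_ms,
       left(query, 60) AS query_start
FROM pg_stat_statements
WHERE calls > 100
ORDER BY approx_p99_ms DESC
LIMIT 10;
"""

with psycopg2.connect("dbname=postgres") as conn, conn.cursor() as cur:
    cur.execute(SQL)
    for row in cur.fetchall():
        print(row)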

Oracle University Podcast
Oracle GoldenGate: Distribution Path, Target Initiated Path, Receiver Server, and Initial Load

Jun 10, 2025 • 10:43


In this episode, Lois Houston and Nikita Abraham dive into key components of Oracle GoldenGate 23ai with expert insights from Nick Wagner, Senior Director of Product Management.   They break down the Distribution Service, explaining how it moves trail files between environments, replaces the classic extract pump, and ensures secure data transfer. Nick also introduces Target Initiated Paths, a method for connecting less secure environments to more secure ones, and discusses how the Receiver Service simplifies monitoring and management. The episode wraps up with a look into Initial Load, covering different methods for syncing source and target databases without downtime.   Oracle GoldenGate 23ai: Fundamentals: https://mylearn.oracle.com/ou/course/oracle-goldengate-23ai-fundamentals/145884/237273 Oracle University Learning Community: https://education.oracle.com/ou-community LinkedIn: https://www.linkedin.com/showcase/oracle-university/ X: https://x.com/Oracle_Edu   Special thanks to Arijit Ghosh, David Wright, Kris-Ann Nansen, Radhika Banka, and the OU Studio Team for helping us create this episode. ----------------------------------------------------------------- Episode Transcript: 00:00 Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we'll bring you foundational training on the most popular Oracle technologies. Let's get started! 00:25 Nikita: Welcome to the Oracle University Podcast! I'm Nikita Abraham, Team Lead of Editorial Services with Oracle University, and with me is Lois Houston, Director of Innovation Programs.  Lois: Hey there! Last week, we spoke about the Extract process and today we're going to spend time discussing the Distribution Path, Target Initiated Path, Receiver Server, and Initial Load. These are all critical components of the GoldenGate architecture, and understanding how they work together is essential for successful data replication. 00:58 Nikita: To help us navigate these topics, we've got Nick Wagner joining us again. Nick is a Senior Director of Product Management for Oracle GoldenGate. Hi Nick! Thanks for being with us today. To kick things off, can you tell us what the distribution service is and how it works? Nick: A distribution path is used when we need to send trail files between two different GoldenGate environments. The distribution service replaces the extract pump that was used in GoldenGate classic architecture. And so the distribution service will send the trail files as they're being created to that receiver service and it will write the trail files over on the target system. The distribution service works in a kind of a streaming fashion, so it's constantly pulling the trail files that the extract is creating to see if there's any new data. As soon as it sees new data, it'll packet it up and send it across the network to the receiver service. It can use a couple of different methods to do this. The most secure and recommended method is using a WebSocket secure connection or WSS. If you're going between a microservices and a classic architecture, you can actually tell the distribution service to send it using the classic architecture method. In that case, it's the OGG option when you're configuring the distribution service. There's also some unsecured methods that would send the trail files in plain text. The receiver service is then responsible for taking that data and rewriting it into the trail file on the target site. 
02:23 Lois: Nick, what are some of the key features and responsibilities of the distribution service? Nick: It's responsible for command deployment. So any time that you're going to actually make a command to the distribution service, it gets handled there directly. It can handle multiple commands concurrently. It's going to dispatch trail files to one or more receiver servers so you can actually have a single distribution path, send trail files to multiple targets. It can provide some lightweight filtering so you can decide which tables get sent to the target system. And it also is integrated in with our data streams, our pub and subscribe model that we've added in GoldenGate 23ai. 03:01 Lois: Interesting. And are there any protocols to remember when using the distribution service? Nick: We always recommend a secure WebSocket. You also have proxy support for use within cloud environments. And then if you're going to a classic architecture GoldenGate, you would use the Oracle GoldenGate protocol. So in order to communicate with the distribution service and send it commands, you can communicate directly from any web browser, client software-- installation is not required-- or you can also do it through the admin client if necessary, but you can do it directly through browsers. 03:33 Nikita: Ok, let's move on to the target initiated path. Nick, what is it and what does it do essentially? Nick: This is used when you're communicating from a less secure environment to a more secure environment. Often, this requires going through some sort of DMZ. In these situations, a connection cannot be established from the less secure environment into the more secure environment. It actually needs to be established from the more secure environment out. And so if we need to replicate data into a more secure environment, we need to actually have the target GoldenGate environment initiate that connection so that it can be established.  And that's what a target-initiated path does. 04:12 Lois: And how do you set it up? Nick: It's pretty straightforward to set up. You actually don't even need to worry about it on the source side. You actually set it up and configure it from the target. The receiver service is responsible for receiving the trail file data and writing it to the local trail file. In this situation, we have a target-initiated path created. And so that receiver service is going to write the trail files locally and the replicat is going to apply that data into that target system. 04:37 Nikita: I also want to ask you about the Receiver service. What is it really? Nick: Receiver service is pretty straightforward. It's a centrally controlled service. It allows you to view the status of your distribution path and replaces target side collectors that were available in the classic architecture of GoldenGate. You can also get statistics about the receiver service directly from the web UI.  You can get detailed information about these paths by going into the receiver service and identifying information like network details, transfer protocols, how many bytes it's received, how many bytes it's sent out. If you need to issue commands from the admin client to the receiver service, you can use the info command to get details about it. Info all will tell you everything that's running. And you can see that your receiver service is up and running. 05:28 Are you working towards an Oracle Certification this year? Join us at one of our certification prep live events in the Oracle University Learning Community. 
Get insider tips from seasoned experts and learn from others who have already taken their certifications. Go to community.oracle.com/ou to jump-start your journey towards certification today! 05:53 Nikita: Welcome back. In the last section of today's episode, we'll cover what Initial Load is. Nick, can you break down the basics for us? Nick: So, the initial load is really used when you need to synchronize the source and target systems. Because GoldenGate is designed for 24/7 environments, we need to be able to do that initial load without taking downtime on the source. And so all the methods that we talk about do not require any downtime for that source database. 06:18 Lois: How do you do the initial load? Nick: So there's a couple of different ways to do the initial load. And it really depends on what your topology is. If I'm doing like-to-like replication in a homogeneous environment, we'll say Oracle-to-Oracle, the best options are to use something that's integrated with GoldenGate, some sort of precise instantiation method that does not require HandleCollisions. That's something like a database backup and restoring it to a specific SDN or CSN value using a Database Snapshot. Or in some cases, we can use Oracle Data Pump integration with GoldenGate. There are some less precise instantiation options, which do require HandleCollisions. We also have dissimilar initial load methods. And this is typically when you're going between heterogeneous environments. When my source and target databases don't match and there isn't any kind of fast unload or fast load utility that I could use between those two databases. In almost all cases, this does require HandleCollisions to be used. 07:16 Nikita: Got it. So, with so many options available, are there any advantages to using GoldenGate's own initial load method?  Nick: While some databases do have very good fast load and unload utilities, there are some advantages to using GoldenGate's own initial load method. One, it supports heterogeneous replication environments. So if I'm going from Postgres to Oracle, it'll do all the data type transformation, character set transformation for me. It doesn't require any downtime, if certain conditions are met.  It actually performs transformation as the data is loaded, too, as well as filtering. And so any transformation that you would be doing in your normal transaction log replication or CDC replication can also go through the same transformation for the initial load process. GoldenGate's initial load process does read directly from the source tables. And it fetches the data in arrays. It also uses parallel processing to speed up the replication. It does also handle activity on the source tables during the initial load process, so you do not need to worry about quiescing that source database. And a lot of the initial load methods directly built into GoldenGate support distributed application analytics targets, including things like Databricks, Snowflake, BigQuery. 08:28 Lois: And what about its limitations? Or to put it differently, when should users consider using different methods? Nick: So the first thing to consider is system proximity. We want to make sure that the two systems we're working with are close together. Or if not, how are we going to send the data across? One thing to keep in mind, when we do the initial load, the source database is not quiesced. So if it takes an hour to do the initial load or 10 hours, it really doesn't matter to GoldenGate. So that's something to keep in mind. 
Even though we talk about performance of this, the performance really isn't as critical as one might suspect. So the important thing about data system proximity is the proximity to the extract and replicat processes that are going to be pulling the data out and pushing it across. And then how much data is generated? Are we talking about a database that's just a couple of gigabytes? Or are we talking about a database that's hundreds of terabytes? Do we want to consider outage time? Would it be faster to take a little bit of outage and use some other method to move the data across? What kind of outage or downtime windows do we have for these environments? And then another consideration is disk space. As we're pulling the data out of that source database, we need to have somewhere to store it. And if we don't have enough disk space, we need to run to temporary space or to use multiple external drives to be able to support it. So these are all different considerations. 09:50 Nikita: I think we can wind up our episode with that. Thanks, Nick, for giving us your insights.  Lois: If you'd like to learn more about the topics we covered today, head over to mylearn.oracle.com and check out the Oracle GoldenGate 23ai: Fundamentals course. Nikita: In our next episode, Nick will take us through the Replicat process. Until then, this is Nikita Abraham… Lois: And, Lois Houston signing off! 10:14 That's all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We'd also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.

Postgres FM
What to log

Jun 6, 2025 • 48:34


Nikolay and Michael discuss logging in Postgres — mostly what to log, and why changing quite a few settings can pay off big time in the long term. Here are some links to things they mentioned:

What to log: https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT
Our episode about Auditing: https://postgres.fm/episodes/auditing
Our episode on auto_explain: https://postgres.fm/episodes/auto_explain

Here are the parameters they mentioned changing:
log_checkpoints
log_autovacuum_min_duration
log_statement
log_connections and log_disconnections
log_lock_waits
log_temp_files
log_min_duration_statement
log_min_duration_sample and log_statement_sample_rate

And finally, some very useful tools they meant to mention but forgot to!
https://pgpedia.info
https://postgresqlco.nf
https://why-upgrade.depesz.com/show?from=16.9&to=17.5

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai
With credit to:
Jessie Draws for the elephant artwork
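As a quick illustration of auditing these settings against a checklist, here is a small Python/psycopg2 sketch. The target values below are illustrative values to consider, not the hosts' specific recommendations:

import psycopg2  # any libpq-style connection string will do

# Illustrative target values only — not the hosts' exact recommendations.
suggested = {
    "log_checkpoints": "on",
    "log_autovacuum_min_duration": "0",    # log every autovacuum run
    "log_statement": "ddl",
    "log_connections": "on",
    "log_disconnections": "on",
    "log_lock_waits": "on",
    "log_temp_files": "0",                 # log every temp file spill
    "log_min_duration_statement": "1000",  # ms; adjust to your workload
}

with psycopg2.connect("dbname=postgres") as conn, conn.cursor() as cur:
    cur.execute(
        "SELECT name, setting FROM pg_settings WHERE name = ANY(%s)",
        (list(suggested),),
    )
    for name, current in cur.fetchall():
        if current != suggested[name]:
            print(f"{name}: currently {current!r}, consider {suggested[name]!r}")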

Thinking Elixir Podcast
255: OTP 28 and Vibe Coding Phoenix Apps

Jun 3, 2025 • 32:02


News includes the major OTP 28 release with priority messages functionality, ElixirConf EU 2025 videos starting to appear including Chris McCord's keynote on his new phoenix.new service and James Arthur's introduction of Phoenix Sync for real-time database synchronization, the EEF board election results and their new role as a CVE Numbering Authority for the Hex ecosystem, upcoming co-located hooks and macro components in LiveView, updates to the Elixir Lua package and MDEx with its new Markdown sigil, a new convention for AI-friendly usage_rules.md files in hex packages, and more! Show Notes online - http://podcast.thinkingelixir.com/255 (http://podcast.thinkingelixir.com/255) Elixir Community News https://www.honeybadger.io/ (https://www.honeybadger.io/?utm_source=thinkingelixir&utm_medium=podcast) – Honeybadger.io is sponsoring today's show! Keep your apps healthy and your customers happy with Honeybadger! It's free to get started, and setup takes less than five minutes. https://www.erlang.org/news/180 (https://www.erlang.org/news/180?utm_source=thinkingelixir&utm_medium=shownotes) – OTP 28 release announcement with new priority messages functionality and SBOM support https://www.erlang.org/eeps/eep-0076 (https://www.erlang.org/eeps/eep-0076?utm_source=thinkingelixir&utm_medium=shownotes) – EEP 76 specification for priority messages in OTP 28 https://www.youtube.com/playlist?list=PLvL2NEhYV4Zu421KzHuLICUqieJXI2o_Z (https://www.youtube.com/playlist?list=PLvL2NEhYV4Zu421KzHuLICUqieJXI2o_Z?utm_source=thinkingelixir&utm_medium=shownotes) – ElixirConf EU 2025 YouTube playlist with conference videos https://www.youtube.com/watch?v=ojLVHc4gLk&list=PLvL2NEhYV4Zu421KzHuLICUqieJXI2oZ&index=3 (https://www.youtube.com/watch?v=ojL_VHc4gLk&list=PLvL2NEhYV4Zu421KzHuLICUqieJXI2o_Z&index=3?utm_source=thinkingelixir&utm_medium=shownotes) – Chris McCord's keynote "Code Generators are Dead. 
Long Live Code Generators" https://x.com/chris_mccord/status/1923417060593356889 (https://x.com/chris_mccord/status/1923417060593356889?utm_source=thinkingelixir&utm_medium=shownotes) – Chris McCord's announcement about phoenix.new paid service https://phoenix.new/ (https://phoenix.new/?utm_source=thinkingelixir&utm_medium=shownotes) – Chris McCord's new phoenix.new paid service at Fly.io https://www.youtube.com/watch?v=4IWShnVuRCg&list=PLvL2NEhYV4Zu421KzHuLICUqieJXI2o_Z&index=2 (https://www.youtube.com/watch?v=4IWShnVuRCg&list=PLvL2NEhYV4Zu421KzHuLICUqieJXI2o_Z&index=2?utm_source=thinkingelixir&utm_medium=shownotes) – James Arthur's keynote "Introducing Phoenix Sync" from ElixirConf EU https://github.com/electric-sql/phoenix_sync/ (https://github.com/electric-sql/phoenix_sync/?utm_source=thinkingelixir&utm_medium=shownotes) – Phoenix Sync GitHub repository for real-time sync to Postgres-backed Phoenix apps https://hexdocs.pm/phoenix_sync/readme.html (https://hexdocs.pm/phoenix_sync/readme.html?utm_source=thinkingelixir&utm_medium=shownotes) – Phoenix Sync documentation on HexDocs https://github.com/josevalim/sync (https://github.com/josevalim/sync?utm_source=thinkingelixir&utm_medium=shownotes) – José Valim's sync project that inspired Phoenix Sync https://erlef.org/blog/eef/election-2025-results (https://erlef.org/blog/eef/election-2025-results?utm_source=thinkingelixir&utm_medium=shownotes) – EEF board election results for Cohort C https://x.com/TheErlef/status/1924531926008004633 (https://x.com/TheErlef/status/1924531926008004633?utm_source=thinkingelixir&utm_medium=shownotes) – EEF Twitter announcement of election results https://erlef.org/blog/eef/election-2025-candidates (https://erlef.org/blog/eef/election-2025-candidates?utm_source=thinkingelixir&utm_medium=shownotes) – Information about the EEF election candidates https://erlef.org/blog/security/eef-cna-announcement (https://erlef.org/blog/security/eef-cna-announcement?utm_source=thinkingelixir&utm_medium=shownotes) – EEF becomes CVE Numbering Authority for Hex and BEAM ecosystem https://github.com/erlef-cna (https://github.com/erlef-cna?utm_source=thinkingelixir&utm_medium=shownotes) – EEF CNA GitHub organization https://cna.erlef.org/ (https://cna.erlef.org/?utm_source=thinkingelixir&utm_medium=shownotes) – EEF CNA website https://github.com/surface-ui/surface (https://github.com/surface-ui/surface?utm_source=thinkingelixir&utm_medium=shownotes) – Surface UI project for server-side rendering components https://github.com/phoenixframework/phoenixliveview/pull/3810 (https://github.com/phoenixframework/phoenix_live_view/pull/3810?utm_source=thinkingelixir&utm_medium=shownotes) – Draft PR for co-located hooks and macro components in LiveView https://github.com/tv-labs/lua (https://github.com/tv-labs/lua?utm_source=thinkingelixir&utm_medium=shownotes) – Elixir Lua package v0.2.x release by TvLabs https://x.com/davydog187/status/1925186045156463034 (https://x.com/davydog187/status/1925186045156463034?utm_source=thinkingelixir&utm_medium=shownotes) – Dave's tweet about ElixirConf EU Luerl talk https://www.youtube.com/watch?v=4YBBoXXH_98 (https://www.youtube.com/watch?v=4YBBoXXH_98?utm_source=thinkingelixir&utm_medium=shownotes) – "Lua on the BEAM" talk by Dave Lucia & Robert Virding https://discord.gg/6Ukp9vpj (https://discord.gg/6Ukp9vpj?utm_source=thinkingelixir&utm_medium=shownotes) – Discord link for Lua community https://x.com/germsvel/status/1922602086065148093 
(https://x.com/germsvel/status/1922602086065148093?utm_source=thinkingelixir&utm_medium=shownotes) – German Velasco's video highlighting LiveDebugger tool https://bsky.app/profile/germsvel.com/post/3lp4snnkpj225 (https://bsky.app/profile/germsvel.com/post/3lp4snnkpj225?utm_source=thinkingelixir&utm_medium=shownotes) – German Velasco's BlueSky post about LiveDebugger https://podcast.thinkingelixir.com/249 (https://podcast.thinkingelixir.com/249?utm_source=thinkingelixir&utm_medium=shownotes) – Thinking Elixir episode 249 featuring LiveDebugger discussion https://hexdocs.pm/mdex/MDEx.Sigil.html (https://hexdocs.pm/mdex/MDEx.Sigil.html?utm_source=thinkingelixir&utm_medium=shownotes) – MDEx v0.7 documentation for new ~MD sigil https://hexdocs.pm/autumn (https://hexdocs.pm/autumn?utm_source=thinkingelixir&utm_medium=shownotes) – Autumn syntax highlighter package that works with MDEx https://github.com/leandrocp/mdex_mermaid (https://github.com/leandrocp/mdex_mermaid?utm_source=thinkingelixir&utm_medium=shownotes) – MDEx Mermaid plugin for adding mermaid support to Markdown https://bsky.app/profile/zachdaniel.dev/post/3lpofyykwds2i (https://bsky.app/profile/zachdaniel.dev/post/3lpofyykwds2i?utm_source=thinkingelixir&utm_medium=shownotes) – Zach Daniel's BlueSky post about usage_rules.md convention https://hexdocs.pm/usage_rules (https://hexdocs.pm/usage_rules?utm_source=thinkingelixir&utm_medium=shownotes) – Usage rules package documentation https://github.com/ash-project/usage_rules/ (https://github.com/ash-project/usage_rules/?utm_source=thinkingelixir&utm_medium=shownotes) – Usage rules GitHub repository https://blogs.windows.com/windowsdeveloper/2025/05/19/the-windows-subsystem-for-linux-is-now-open-source/ (https://blogs.windows.com/windowsdeveloper/2025/05/19/the-windows-subsystem-for-linux-is-now-open-source/?utm_source=thinkingelixir&utm_medium=shownotes) – Microsoft announcement about Windows Subsystem for Linux going open source https://www.zdnet.com/article/believe-it-or-not-microsoft-just-announced-a-linux-distribution-service-heres-why/ (https://www.zdnet.com/article/believe-it-or-not-microsoft-just-announced-a-linux-distribution-service-heres-why/?utm_source=thinkingelixir&utm_medium=shownotes) – ZDNet article explaining Microsoft's Linux strategy and Azure statistics Do you have some Elixir news to share? Tell us at @ThinkingElixir (https://twitter.com/ThinkingElixir) or email at show@thinkingelixir.com (mailto:show@thinkingelixir.com) Find us online - Message the show - Bluesky (https://bsky.app/profile/thinkingelixir.com) - Message the show - X (https://x.com/ThinkingElixir) - Message the show on Fediverse - @ThinkingElixir@genserver.social (https://genserver.social/ThinkingElixir) - Email the show - show@thinkingelixir.com (mailto:show@thinkingelixir.com) - Mark Ericksen on X - @brainlid (https://x.com/brainlid) - Mark Ericksen on Bluesky - @brainlid.bsky.social (https://bsky.app/profile/brainlid.bsky.social) - Mark Ericksen on Fediverse - @brainlid@genserver.social (https://genserver.social/brainlid) - Dave Lucia - @davydog187 (https://x.com/davydog187)

Postgres FM
How to move off RDS

May 30, 2025 • 47:33


Nikolay and Michael discuss moving off managed services — when and why you might want to, and some tips on how for very large databases. Here are some links to things they mentioned:

Patroni: https://github.com/patroni/patroni
pgBackRest: https://github.com/pgbackrest/pgbackrest
WAL-G: https://github.com/wal-g/wal-g
Hetzner Cloud: https://www.hetzner.com/cloud
Postgres Extensions Day: https://pgext.day
pg_wait_sampling: https://github.com/postgrespro/pg_wait_sampling
pg_stat_kcache: https://github.com/powa-team/pg_stat_kcache
auto_explain: https://www.postgresql.org/docs/current/auto-explain.html
Fivetran: https://www.fivetran.com
pgcopydb: https://github.com/dimitri/pgcopydb
Kafka: https://kafka.apache.org
Debezium: https://debezium.io
max_slot_wal_keep_size: https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-MAX-SLOT-WAL-KEEP-SIZE
log_statement DDL: https://www.postgresql.org/docs/current/runtime-config-logging.html#GUC-LOG-STATEMENT
PgBouncer pause/resume: https://www.pgbouncer.org/usage.html#pause-db

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai
With credit to:
Jessie Draws for the elephant artwork
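One concrete thing to watch during a logical-replication migration like this is how much WAL each replication slot is retaining on the source, which is what max_slot_wal_keep_size caps. A hedged Python/psycopg2 sketch (the connection string is a placeholder, not from the episode):

import psycopg2  # run this against the source (e.g. RDS) instance

# How far behind is each replication slot, in bytes of retained WAL?
SQL = """
SELECT slot_name,
       active,
       pg_size_pretty(
           pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
       ) AS retained_wal
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;
"""

with psycopg2.connect("host=source.example.com dbname=app") as conn:
    with conn.cursor() as cur:
        cur.execute(SQL)
        for slot, active, retained in cur.fetchall():
            print(f"{slot}: active={active}, retained WAL={retained}")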

Remote Ruby
Bites and Bytes – Cheesesteaks and One Month Rails

May 30, 2025 • 38:10


In this episode of Remote Ruby, Chris and Andrew catch up on recent travels and food experiences, including the best Philly cheesesteaks they've ever had. The conversation shifts towards development topics, particularly testing challenges and solutions in Ruby on Rails, featuring discussions about emoji pickers, asset pipelines, and the prawn library. Chris shares updates on acquiring an old Rails app, One Month, and future plans for this project. They also explore various development hiccups and solutions, including using libraries for faster system tests and streamlining asset pipelines. The episode wraps up with insights into new tools like an official Postgres extension for VS Code and plans for future video content on their platform.

Links:
Judoscale - Remote Ruby listener gift
One Month
Running Rails System Tests With Playwright Instead of Selenium by Justin Searls
Announcing a new IDE for PostgreSQL in VS Code from Microsoft
Lou Malnati's Pizzeria
Chris Oliver X/Twitter
Andrew Mason X/Twitter
Jason Charnes X/Twitter

Postgres FM
Locks

May 23, 2025 • 38:53


Nikolay and Michael discuss heavyweight locks in Postgres — how to think about them, why you can't avoid them, and some tips for minimising issues. Here are some links to things they mentioned:

Locking (docs): https://www.postgresql.org/docs/current/explicit-locking.html
Postgres rocks, except when it blocks (blog post by Marco Slot): https://www.citusdata.com/blog/2018/02/15/when-postgresql-blocks/
Lock Conflicts (tool by Hussein Nasser): https://pglocks.org/
log_lock_waits (docs): https://www.postgresql.org/docs/current/runtime-config-logging.html#GUC-LOG-LOCK-WAITS
How to analyze heavyweight lock trees (guide by Nikolay): https://gitlab.com/postgres-ai/postgresql-consulting/postgres-howtos/-/blob/main/0042_how_to_analyze_heavyweight_locks_part_2.md
Lock management (docs): https://www.postgresql.org/docs/current/runtime-config-locks.html
Our episode on zero-downtime migrations: https://postgres.fm/episodes/zero-downtime-migrations

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai
With credit to:
Jessie Draws for the elephant artwork
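For a concrete starting point when a lock queue builds up, here is a hedged Python/psycopg2 sketch (not from the episode) that lists blocked sessions and the backends blocking them, using pg_blocking_pids() and pg_stat_activity:

import psycopg2

# Who is waiting on a heavyweight lock, and which backend is blocking them?
SQL = """
SELECT blocked.pid     AS blocked_pid,
       blocked.query   AS blocked_query,
       blocking.pid    AS blocking_pid,
       blocking.query  AS blocking_query
FROM pg_stat_activity AS blocked
JOIN pg_stat_activity AS blocking
  ON blocking.pid = ANY(pg_blocking_pids(blocked.pid))
WHERE cardinality(pg_blocking_pids(blocked.pid)) > 0;
"""

with psycopg2.connect("dbname=postgres") as conn, conn.cursor() as cur:
    cur.execute(SQL)
    for blocked_pid, blocked_q, blocking_pid, blocking_q in cur.fetchall():
        print(f"pid {blocked_pid} is blocked by pid {blocking_pid}")
        print(f"  waiting: {blocked_q[:80]}")
        print(f"  holder:  {blocking_q[:80]}")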

The Data Stack Show
245: The Future of Data: Postgres, Iceberg, and Operational Analytics with Pranav Aurora of Mooncake Labs

May 22, 2025 • 44:05


Highlights from this week's conversation include:
Pranav's Background and Journey in Data (1:10)
Backstory of Mooncake Labs (2:05)
PostgreSQL as a Force (4:47)
Curiosity in Product Management (7:33)
Challenges with Iceberg (11:12)
Go-to-Market Strategy (13:52)
Building Community Engagement (15:56)
Importance of Feedback (18:26)
AI Integration in Mooncake Labs (21:29)
Innovation in data interaction (23:49)
PostgreSQL and startup growth (28:41)
Core component of business strategy (31:20)
The Origin of the name Mooncake Labs (34:12)
Upcoming Product Release (38:40)
Connecting with Mooncake Labs and Parting Thoughts (42:49)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

The Data Stack Show
244: Postgres to ClickHouse: Simplifying the Modern Data Stack with Aaron Katz & Sai Krishna Srirampur

May 20, 2025 • 34:51


Highlights from this week's conversation include:
Background of ClickHouse (1:14)
PostgreSQL Data Replication Tool (3:19)
Emerging Technologies Observations (7:25)
Observability and Market Dynamics (11:26)
Product Development Challenges (12:39)
Challenges with PostgreSQL Performance (15:30)
Philosophy of Open Source (18:01)
Open Source Advantages (22:56)
Simplified Stack Vision (24:48)
End-to-End Use Cases (28:13)
Migration Strategies (30:21)
Final Thoughts and Takeaways (33:29)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

The Data Stack Show
The PRQL: Data Migration Made Easy: Postgres, ClickHouse, and the Future of Analytics with Aaron Katz and Sai Krishna Srirampur

May 19, 2025 • 5:47


The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

Equity
$1 Billion a lot of money these days?

May 16, 2025 • 23:53


Databricks just snatched up another AI company. This week, the data analytics giant announced a $1 billion acquisition of Neon, a startup building an open-source alternative to AWS Aurora Postgres. It's the latest in a spree of high-profile buys, joining MosaicML and Tabular, as Databricks positions itself as the place to build, deploy, and scale AI-native applications. Today, on TechCrunch's Equity podcast, hosts Kirsten Korosec, Max Zeff, and Anthony Ha unpack the Databricks–Neon deal, where Neon's serverless Postgres tech fits into the larger vision, and whether $1 billion still counts as "a lot of money" these days (spoiler: Kirsten and Anthony are on the fence).

Listen to the full episode to hear about:
Chime's long-awaited IPO plans and what the neobank's S-1 did (and didn't) reveal.
AWS entering a 'strategic partnership' that could shake up cloud infrastructure, especially as the Middle East ramps up its AI ambitions.
The return of the web series. Yes, really. Short-form scripted content is back, and investors are placing big bets on this nostalgic trend.

Equity will be back next week, so don't miss it! Equity is TechCrunch's flagship podcast, produced by Theresa Loconsolo, and posts every Wednesday and Friday. Subscribe to us on Apple Podcasts, Overcast, Spotify and all the casts. You also can follow Equity on X and Threads, at @EquityPod. For the full episode transcript, for those who prefer reading over listening, check out our full archive of episodes here. Credits: Equity is produced by Theresa Loconsolo with editing by Kell. We'd also like to thank TechCrunch's audience development team. Thank you so much for listening, and we'll talk to you next time. Learn more about your ad choices. Visit megaphone.fm/adchoices

This Week in Startups
Chime's IPO, Databricks' $1B Acquisition & Dave Rubin's Media Empire | E2126

May 15, 2025 • 66:37


Today's show: Chime is finally going public with strong financials and a shot at matching its $25B 2021 valuation, signaling real momentum in the IPO market. Databricks just made a $1B bet on agentic AI by acquiring Neon, a Postgres-as-a-service startup riding the new database wave. Then, Dave Rubin joins to share how he built and sold Locals, his uncancellable creator platform, all while navigating the intense media landscape.Timestamps:(0:00) Episode Teaser(1:14) Jason and Alex open the show(1:42) Why Chime's IPO is such a promising sign(7:22) Chime's financials and valuation(10:12) Squarespace - Use offer code TWIST to save 10% off your first purchase of a website or domain at https://www.Squarespace.com/TWIST(12:17) So why did Databricks buy Neon?(13:22) Where is the AI Integration Desktop App?(17:29) Jason's plan to bring Americans back to the movies(20:10) Northwest Registered Agent. Form your entire business identity in just 10 clicks and 10 minutes. Get more privacy, more options, and more done—visit northwestregisteredagent.com/twist today!(21:58) Make movies All-you-can-eat!(26:26) Special Guest: Dave Rubin(30:39) Lemon.io - Get 15% off your first 4 weeks of developer time at https://Lemon.io/twist(31:41) Why Dave Rubin goes phone free for weeks at a time(45:24) Why Identity politics is killing business and sports(49:23) Can Locals reinvent subscription models?Subscribe to the TWiST500 newsletter: https://ticker.thisweekinstartups.comCheck out the TWIST500: https://www.twist500.comSubscribe to This Week in Startups on Apple: https://rb.gy/v19fcpLinks from episode:Rubin Report on Locals: https://rubinreport.locals.com/Follow Dave:X: https://x.com/RubinReportYouTube: https://www.youtube.com/channel/UCJdKr0Bgd_5saZYqLCa9mngFollow Lon:X: https://x.com/lonsFollow Alex:X: https://x.com/alexLinkedIn: ⁠https://www.linkedin.com/in/alexwilhelmFollow Jason:X: https://twitter.com/JasonLinkedIn: https://www.linkedin.com/in/jasoncalacanisThank you to our partners:(10:12) Squarespace - Use offer code TWIST to save 10% off your first purchase of a website or domain at https://www.Squarespace.com/TWIST(20:10) Northwest Registered Agent. Form your entire business identity in just 10 clicks and 10 minutes. Get more privacy, more options, and more done—visit northwestregisteredagent.com/twist today!(30:39) Lemon.io - Get 15% off your first 4 weeks of developer time at https://Lemon.io/twistGreat TWIST interviews: Will Guidara, Eoghan McCabe, Steve Huffman, Brian Chesky, Bob Moesta, Aaron Levie, Sophia Amoruso, Reid Hoffman, Frank Slootman, Billy McFarlandCheck out Jason's suite of newsletters: https://substack.com/@calacanisFollow TWiST:Twitter: https://twitter.com/TWiStartupsYouTube: https://www.youtube.com/thisweekinInstagram: https://www.instagram.com/thisweekinstartupsTikTok: https://www.tiktok.com/@thisweekinstartupsSubstack: https://twistartups.substack.comSubscribe to the Founder University Podcast: https://www.youtube.com/@founderuniversity1916

Postgres FM
Top ten dangerous issues

May 9, 2025 • 46:28


Nikolay and Michael discuss ten dangerous Postgres-related issues — ones that might be painful enough to get onto the CTO's and even the CEO's desk, and then what you can do proactively. The ten issues discussed are:
Heavy lock contention
Bloat control and index maintenance
Lightweight lock contention
Transaction ID wraparound
4-byte integer PKs hitting the limit
Replication limits
Hard limits
Data loss
Poor HA choice (split brain)
Corruption of various kinds

Some previous episodes they mentioned that cover the issues in more detail:
PgDog: https://postgres.fm/episodes/pgdog
Performance cliffs: https://postgres.fm/episodes/performance-cliffs
Zero-downtime migrations: https://postgres.fm/episodes/zero-downtime-migrations
Queues in Postgres: https://postgres.fm/episodes/queues-in-postgres
Bloat: https://postgres.fm/episodes/bloat
Index maintenance: https://postgres.fm/episodes/index-maintenance
Subtransactions: https://postgres.fm/episodes/subtransactions
Four million TPS: https://postgres.fm/episodes/four-million-tps
Transaction ID wraparound: https://postgres.fm/episodes/transaction-id-wraparound
pg_squeeze: https://postgres.fm/episodes/pg_squeeze
synchronous_commit: https://postgres.fm/episodes/synchronous_commit
Managed service support: https://postgres.fm/episodes/managed-service-support

And finally, some other things they mentioned:
A great recent SQL Server-related podcast episode on tuning techniques: https://kendralittle.com/2024/05/20/erik-darling-and-kendra-little-rate-sql-server-performance-tuning-techniques/
Postgres Indexes, Partitioning and LWLock:LockManager Scalability (blog post by Jeremy Schneider): https://ardentperf.com/2024/03/03/postgres-indexes-partitioning-and-lwlocklockmanager-scalability/
Do you vacuum everyday? (talk by Hannu Krosing): https://www.youtube.com/watch?v=JcRi8Z7rkPg
pg_stat_wal: https://pgpedia.info/p/pg_stat_wal.html
The benefit of lz4 and zstd for Postgres WAL compression (Small Datum blog, Mark Callaghan): https://smalldatum.blogspot.com/2022/05/the-benefit-of-lz4-and-zstd-for.html
Split-brain in case of network partition (CloudNativePG issue/discussion): https://github.com/cloudnative-pg/cloudnative-pg/discussions/7462

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai
With credit to:
Jessie Draws for the elephant artwork
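Two of the issues above — transaction ID wraparound and 4-byte integer PKs hitting the limit — are straightforward to monitor. A hedged Python/psycopg2 sketch, with a placeholder orders.id column standing in for any int4 primary key:

import psycopg2

# 1. How close is each database to transaction ID wraparound?
XID_SQL = """
SELECT datname,
       age(datfrozenxid) AS xid_age,
       round(100.0 * age(datfrozenxid) / 2000000000, 1) AS pct_towards_limit
FROM pg_database
ORDER BY xid_age DESC;
"""

# 2. How full is a 4-byte integer primary key? ("orders.id" is a placeholder.)
PK_SQL = """
SELECT max(id)::bigint                        AS current_max,
       round(100.0 * max(id) / 2147483647, 1) AS pct_of_int4_range
FROM orders;
"""

with psycopg2.connect("dbname=app") as conn, conn.cursor() as cur:
    cur.execute(XID_SQL)
    for db, xid_age, pct in cur.fetchall():
        print(f"{db}: xid age {xid_age} ({pct}% of the ~2B hard limit)")

    cur.execute(PK_SQL)
    current_max, pct = cur.fetchone()
    print(f"orders.id at {current_max} ({pct}% of the int4 maximum)")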

Path To Citus Con, for developers who love Postgres
How I got started with FerretDB (& why we chose Postgres) with Peter Farkas

May 9, 2025 • 89:53


How does a trek to K2 base camp in the Himalayas spark the idea for a database company? In Episode 27 of Talking Postgres with Claire Giordano, guest Peter Farkas—CEO and co-founder of FerretDB—shares the origin story of this open source MongoDB alternative. (Spoiler: "Ferret" wasn't the original name). We dig into why Postgres was the obvious choice, what "true open source" means to Peter, and how FerretDB is now powered by the open source DocumentDB extension from Microsoft. Plus, why Hungarian Trappist cheese might deserve a footnote in database history.

Links mentioned in this episode:
GitHub: FerretDB/FerretDB repo
Blog: FerretDB 2.0 GA: Open Source MongoDB alternative, ready for production
ACM SIGMOD: The Design of Postgres, published 15 June 1986
Postgres Weekly: Issue 591 featuring FerretDB
GitHub: Microsoft/DocumentDB open source repo
Conference talk: From MongoDB to Postgres: Building an Open Standard for Document Databases at POSETTE 2025
OSI Blog: The SSL is Not an Open Source License
RedMonk Blog: OSS: Two Steps Forward, One Step Back, by Stephen O'Grady
Talking Postgres Ep18: How I got started as a developer (& in Postgres) with David Rowley
OpenDocDB: initiative to define an open standard
Wikipedia: K2 (yes, the mountain)
Go Blog: The Go Gopher
xkcd: webcomic 927 on Standards
Wikipedia: Trappista cheese
Cal invite: LIVE recording of Ep28 of Talking Postgres to happen on Wed Jun 18, 2025

No Hacks Marketing
[SOLO] Magician vs. Conductor: How to Build (and Fix) Products with AI in 2025

May 8, 2025 • 13:15


Heads-up: I use some salty language. Nothing hateful, just passionate about this topic. Skip if that's not your vibe.

It's 2 AM, your side-project just went viral, and the signup flow is on fire. Do you keep "vibe-coding" blind prompts, or step up as the conductor who actually knows the score? In this first-ever solo episode, I unpack why "anyone can code with AI" is 2025's biggest myth and show you how to turn large language models into the ultimate co-pilot instead of a ticking time-bomb.

Key Takeaways:
Illusions break at scale. Vibe-coding can get you an MVP, but you'll pay interest when production fires start.
Your new super-power isn't "no knowledge," it's "faster knowledge." LLMs shrink the gap between "I don't know" and "I can ship."
Learning beats prompting. Prompting is great, prompt-and-probe is better. Use back-and-forth to understand, not just generate.
Career moat = curiosity. The people who thrive next year aren't the ones with the fanciest prompts; they're the ones who ask better questions and close their gaps daily.

7-Day Knowledge-Gap Challenge:
Pick one concept you avoid (CSS Flexbox? Indexing in Postgres?).
Spend 15 min/day grilling an LLM: "Explain it like I'm 7… now show real-world code… now debug this snippet…"
Log what surprised you, then share your aha moments.

Call to Action:
Try the challenge. Tag me with your progress by next week.
Rate & Review. If this episode saved you from a 2 AM meltdown, drop a ⭐⭐⭐⭐⭐ on your favorite app.
Share. Forward the LinkedIn post or the episode link to one builder who still thinks vibe-coding is a strategy.

If you enjoyed the episode, please share it with a friend!
No Hacks website
YouTube
LinkedIn
Instagram

The Data Stack Show
240: Data Council Insights from a Waymo: Postgres and the Future of the Data Stack

May 7, 2025 • 18:20


Highlights from this week's conversation include:
Recording from a Waymo (0:54)
Future of Data Technology (2:45)
AI Integration in Data Work (4:20)
Speeding Up Data Experiences (5:29)
Snapshot Conversations with Founders (9:52)
Diversity of Perspectives on Postgres (12:37)
Cultural Significance of Database Mascots (14:09)
Incubation and Success of Open Source Projects (16:43)
Final Thoughts and Takeaways (17:34)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

The Data Engineering Show
How Rising Wave Is Redefining Real-Time Data with Postgres Power

May 7, 2025 • 31:36


In this episode of The Data Engineering Show, the bros sit with Yingjun Wu, founder and CEO of Rising Wave, to explore the innovative world of stream processing systems. Yingjun shares his journey from academic research to creating a Postgres-compatible streaming system that drastically reduces resource usage. They discuss how Rising Wave's S3-based architecture and Postgres compatibility provide advantages over traditional systems like Flink, and explore the increasing role of Apache Iceberg in data pipelines.

Oracle University Podcast
Oracle GoldenGate 23ai: New Features & Product Family

May 6, 2025 • 17:39


In this episode, Lois Houston and Nikita Abraham continue their deep dive into Oracle GoldenGate 23ai, focusing on its evolution and the extensive features it offers. They are joined once again by Nick Wagner, who provides valuable insights into the product's journey.   Nick talks about the various iterations of Oracle GoldenGate, highlighting the significant advancements from version 12c to the latest 23ai release. The discussion then shifts to the extensive new features in 23ai, including AI-related capabilities, UI enhancements, and database function integration.   Oracle GoldenGate 23ai: Fundamentals: https://mylearn.oracle.com/ou/course/oracle-goldengate-23ai-fundamentals/145884/237273 Oracle University Learning Community: https://education.oracle.com/ou-community LinkedIn: https://www.linkedin.com/showcase/oracle-university/ X: https://x.com/Oracle_Edu   Special thanks to Arijit Ghosh, David Wright, Kris-Ann Nansen, Radhika Banka, and the OU Studio Team for helping us create this episode.   -----------------------------------------------------------------   Episode Transcript: 00:00 Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we'll bring you foundational training on the most popular Oracle technologies. Let's get started! 00:25 Lois: Hello and welcome to the Oracle University Podcast! I'm Lois Houston, Director of Innovation Programs with Oracle University, and with me is Nikita Abraham, Team Lead: Editorial Services.  Nikita: Hi everyone! Last week, we introduced Oracle GoldenGate and its capabilities, and also spoke about GoldenGate 23ai. In today's episode, we'll talk about the various iterations of Oracle GoldenGate since its inception. And we'll also take a look at some new features and the Oracle GoldenGate product family. 00:57 Lois: And we have Nick Wagner back with us. Nick is a Senior Director of Product Management for GoldenGate at Oracle. Hi Nick! I think the last time we had an Oracle University course was when Oracle GoldenGate 12c was out. I'm sure there's been a lot of advancements since then. Can you walk us through those? Nick: GoldenGate 12.3 introduced the microservices architecture. GoldenGate 18c introduced support for Oracle Autonomous Data Warehouse and Autonomous Transaction Processing Databases. In GoldenGate 19c, we added the ability to do cross endian remote capture for Oracle, making it easier to set up the GoldenGate OCI service to capture from environments like Solaris, Spark, and HP-UX and replicate into the Cloud. Also, GoldenGate 19c introduced a simpler process for upgrades and installation of GoldenGate where we released something called a unified build. This means that when you install GoldenGate for a particular database, you don't need to worry about the database version when you install GoldenGate. Prior to this, you would have to install a version-specific and database-specific version of GoldenGate. So this really simplified that whole process. In GoldenGate 23ai, which is where we are now, this really is a huge release.  02:16 Nikita: Yeah, we covered some of the distributed AI features and high availability environments in our last episode. But can you give us an overview of everything that's in the 23ai release? I know there's a lot to get into but maybe you could highlight just the major ones? Nick: Within the AI and streaming environments, we've got interoperability for database vector types, heterogeneous capture and apply as well. 
Again, this is not just replication between Oracle-to-Oracle vector or Postgres to Postgres vector, it is heterogeneous just like the rest of GoldenGate. The entire UI has been redesigned and optimized for high speed. And so we have a lot of customers that have dozens and dozens of extracts and replicats and processes running and it was taking a long time for the UI to refresh those and to show what's going on within those systems. So the UI has been optimized to be able to handle those environments much better. We now have the ability to call database functions directly from call map. And so when you do transformation with GoldenGate, we have about 50 or 60 built-in transformation routines for string conversion, arithmetic operation, date manipulation. But we never had the ability to directly call a database function. 03:28 Lois: And now we do? Nick: So now you can actually call that database function, database stored procedure, database package, return a value and that can be used for transformation within GoldenGate. We have integration with identity providers, being able to use token-based authentication and integrate in with things like Azure Active Directory and your other single sign-on for the GoldenGate product itself. Within Oracle 23ai, there's a number of new features. One of those cool features is something called lock-free reservation columns. So this allows you to have a row, a single row within a table and you can identify a column within that row that's like an inventory column. And you can have multiple different users and multiple different transactions all updating that column within that same exact row at that same time. So you no longer have row-level locking for these reservation columns. And it allows you to do things like shopping carts very easily. If I have 500 widgets to sell, I'm going to let any number of transactions come in and subtract from that inventory column. And then once it gets below a certain point, then I'll start enforcing that row-level locking. 04:43 Lois: That's really cool… Nick: The one key thing that I wanted to mention here is that because of the way that the lock-free reservations work, you can have multiple transactions open on the same row. This is only supported for Oracle to Oracle. You need to have that same lock-free reservation data type and availability on that target system if GoldenGate is going to replicate into it. 05:05 Nikita: Are there any new features related to the diagnosability and observability of GoldenGate?  Nick: We've improved the AWR reports in Oracle 23ai. There's now seven sections that are specific to Oracle GoldenGate to allow you to really go in and see exactly what the GoldenGate processes are doing and how they're behaving inside the database itself. And there's a Replication Performance Advisor package inside that database, and that's been integrated into the Web UI as well. So now you can actually get information out of the replication advisor package in Oracle directly from the UI without having to log into the database and try to run any database procedures to get it. We've also added the ability to support a per-PDB Extract.  So in the past, when GoldenGate would run on a multitenant database, a multitenant database in Oracle, all the redo data from any pluggable database gets sent to that one redo stream. And so you would have to configure GoldenGate at the container or root level and it would be able to access anything at any PDB. 
Now, there's better security and better performance by doing what we call per-PDB Extract. And this means that for a single pluggable database, I can have an extract that runs at that database level that's going to capture information just from that pluggable database. 06:22 Lois And what about non-Oracle environments, Nick? Nick: We've also enhanced the non-Oracle environments as well. For example, in Postgres, we've added support for precise instantiation using Postgres snapshots. This eliminates the need to handle collisions when you're doing Postgres to Postgres replication and initial instantiation. On the GoldenGate for big data side, we've renamed that product more aptly to distributed applications in analytics, which is really what it does, and we've added a whole bunch of new features here too. The ability to move data into Databricks, doing Google Pub/Sub delivery. We now have support for XAG within the GoldenGate for distributed applications and analytics. What that means is that now you can follow all of our MAA best practices for GoldenGate for Oracle, but it also works for the DAA product as well, meaning that if it's running on one node of a cluster and that node fails, it'll restart itself on another node in the cluster. We've also added the ability to deliver data to Redis, Google BigQuery, stage and merge functionality for better performance into the BigQuery product. And then we've added a completely new feature, and this is something called streaming data and apps and we're calling it AsyncAPI and CloudEvent data streaming. It's a long name, but what that means is that we now have the ability to publish changes from a GoldenGate trail file out to end users. And so this allows through the Web UI or through the REST API, you can now come into GoldenGate and through the distributed applications and analytics product, actually set up a subscription to a GoldenGate trail file. And so this allows us to push data into messaging environments, or you can simply subscribe to changes and it doesn't have to be the whole trail file, it can just be a subset. You can specify exactly which tables and you can put filters on that. You can also set up your topologies as well. So, it's a really cool feature that we've added here. 08:26 Nikita: Ok, you've given us a lot of updates about what GoldenGate can support. But can we also get some specifics? Nick: So as far as what we have, on the Oracle Database side, there's a ton of different Oracle databases we support, including the Autonomous Databases and all the different flavors of them, your Oracle Database Appliance, your Base Database Service within OCI, your of course, Standard and Enterprise Edition, as well as all the different flavors of Exadata, are all supported with GoldenGate. This is all for capture and delivery. And this is all versions as well. GoldenGate supports Oracle 23ai and below. We also have a ton of non-Oracle databases in different Cloud stores. On an non-Oracle side, we support everything from application-specific databases like FairCom DB, all the way to more advanced applications like Snowflake, which there's a vast user base for that. We also support a lot of different cloud stores and these again, are non-Oracle, nonrelational systems, or they can be relational databases. We also support a lot of big data platforms and this is part of the distributed applications and analytics side of things where you have the ability to replicate to different Apache environments, different Cloudera environments. 
We also support a number of open-source systems, including things like Apache Cassandra, MySQL Community Edition, a lot of different Postgres open source databases along with MariaDB. And then we have a bunch of streaming event products, NoSQL data stores, and even Oracle applications that we support. So there's absolutely a ton of different environments that GoldenGate supports. There are additional Oracle databases that we support and this includes the Oracle Metadata Service, as well as Oracle MySQL, including MySQL HeatWave. Oracle also has Oracle NoSQL Spatial and Graph and times 10 products, which again are all supported by GoldenGate. 10:23 Lois: Wow, that's a lot of information! Nick: One of the things that we didn't really cover was the different SaaS applications, which we've got like Cerner, Fusion Cloud, Hospitality, Retail, MICROS, Oracle Transportation, JD Edwards, Siebel, and on and on and on.  And again, because of the nature of GoldenGate, it's heterogeneous. Any source can talk to any target. And so it doesn't have to be, oh, I'm pulling from Oracle Fusion Cloud, that means I have to go to an Oracle Database on the target, not necessarily.  10:51 Lois: So, there's really a massive amount of flexibility built into the system.  11:00 Unlock the power of AI Vector Search with our new course and certification. Get more accurate search results, handle complex datasets easily, and supercharge your data-driven decisions. From now through May 15, 2025, we are waiving the certification exam fee (valued at $245). Visit mylearn.oracle.com to enroll. 11:26 Nikita: Welcome back! Now that we've gone through the base product, what other features or products are in the GoldenGate family itself, Nick? Nick: So we have quite a few. We've kind of touched already on GoldenGate for Oracle databases and non-Oracle databases. We also have something called GoldenGate for Mainframe, which right now is covered under the GoldenGate for non-Oracle, but there is a licensing difference there. So that's something to be aware of. We also have the OCI GoldenGate product. We are announcing and we have announced that OCI GoldenGate will also be made available as part of the Oracle Database@Azure and Oracle Database@ Google Cloud partnerships.  And then you'll be able to use that vendor's cloud credits to actually pay for the OCI GoldenGate product. One of the cool things about this is it will have full feature parity with OCI GoldenGate running in OCI. So all the same features, all the same sources and targets, all the same topologies be able to migrate data in and out of those clouds at will, just like you do with OCI GoldenGate today running in OCI.  We have Oracle GoldenGate Free.  This is a completely free edition of GoldenGate to use. It is limited on the number of platforms that it supports as far as sources and targets and the size of the database.  12:45 Lois: But it's a great way for developers to really experience GoldenGate without worrying about a license, right? What's next, Nick? Nick: We have GoldenGate for Distributed Applications and Analytics, which was formerly called GoldenGate for big data, and that allows us to do all the streaming. That's also where the GoldenGate AsyncAPI integration is done. So in order to publish the GoldenGate trail files or allow people to subscribe to them, it would be covered under the Oracle GoldenGate Distributed Applications and Analytics license. 
We also have OCI GoldenGate Marketplace, which allows you to run essentially the on-premises version of GoldenGate but within OCI. So a little bit more flexibility there. It also has a hub architecture. So if you need that 99.99% availability, you can get it within the OCI Marketplace environment. We have GoldenGate for Oracle Enterprise Manager Cloud Control, which used to be called Oracle Enterprise Manager. And this allows you to use Enterprise Manager Cloud Control to get all the statistics and details about GoldenGate. So all the reporting information, all the analytics, all the statistics, how fast GoldenGate is replicating, what's the lag, what's the performance of each of the processes, how much data am I sending across a network. All that's available within the plug-in. We also have Oracle GoldenGate Veridata. This is a nice utility and tool that allows you to compare two databases, whether or not GoldenGate is running between them and actually tell you, hey, these two systems are out of sync. And if they are out of sync, it actually allows you to repair the data too. 14:25 Nikita: That's really valuable…. Nick: And it does this comparison without locking the source or the target tables. The other really cool thing about Veridata is it does this while there's data in flight. So let's say that the GoldenGate lag is 15 or 20 seconds and I want to compare this table that has 10 million rows in it. The Veridata product will go out, run its comparison once. Once that comparison is done the first time, it's then going to have a list of rows that are potentially out of sync. Well, some of those rows could have been moved over or could have been modified during that 10 to 15 second window. And so the next time you run Veridata, it's actually going to go through. It's going to check just those rows that were potentially out of sync to see if they're really out of sync or not. And if it comes back and says, hey, out of those potential rows, there's two out of sync, it'll actually produce a script that allows you to resynchronize those systems and repair them. So it's a very cool product.  15:19 Nikita: What about GoldenGate Stream Analytics? I know you mentioned it in the last episode, but in the context of this discussion, can you tell us a little more about it?  Nick: This is the ability to essentially stream data from a GoldenGate trail file, and they do a real time analytics on it. And also things like geofencing or real-time series analysis of it.  15:40 Lois: Could you give us an example of this? Nick: If I'm working in tracking stock market information and stocks, it's not really that important on how much or how far down a stock goes. What's really important is how quickly did that stock rise or how quickly did that stock fall. And that's something that GoldenGate Stream Analytics product can do. Another thing that it's very valuable for is the geofencing. I can have an application on my phone and I can track where the user is based on that application and all that information goes into a database. I can then use the geofencing tool to say that, hey, if one of those users on that app gets within a certain distance of one of my brick-and-mortar stores, I can actually send them a push notification to say, hey, come on in and you can order your favorite drink just by clicking Yes, and we'll have it ready for you. And so there's a lot of things that you can do there to help upsell your customers and to get more revenue just through GoldenGate itself. 
And then we also have a GoldenGate Migration Utility, which allows customers to migrate from the classic architecture into the microservices architecture. 16:44 Nikita: Thanks Nick for that comprehensive overview.  Lois: In our next episode, we'll have Nick back with us to talk about commonly used terminology and the GoldenGate architecture. And if you want to learn more about what we discussed today, visit mylearn.oracle.com and take a look at the Oracle GoldenGate 23ai Fundamentals course. Until next time, this is Lois Houston… Nikita: And Nikita Abraham, signing off! 17:10 That's all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We'd also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.
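For readers who want to try the lock-free reservation columns described in this transcript, here is a minimal sketch, assuming Oracle Database 23ai's RESERVABLE column syntax and the python-oracledb driver; the table, credentials, and connection string are hypothetical placeholders, not anything from the episode.

```python
import oracledb

# Hypothetical connection details for an Oracle Database 23ai instance.
conn = oracledb.connect(user="demo", password="demo", dsn="localhost/FREEPDB1")
cur = conn.cursor()

# A RESERVABLE column lets many concurrent transactions adjust the same row's
# inventory without taking a row-level lock, as long as the CHECK constraint
# (stock never going negative) still holds at commit time.
cur.execute("""
    CREATE TABLE widgets (
        widget_id NUMBER PRIMARY KEY,
        on_hand   NUMBER RESERVABLE CONSTRAINT on_hand_nonneg CHECK (on_hand >= 0)
    )
""")
cur.execute("INSERT INTO widgets VALUES (1, 500)")
conn.commit()

# Reservable updates are written as relative changes (+/-) so the database can
# journal the reservation instead of locking the row until commit.
cur.execute("UPDATE widgets SET on_hand = on_hand - 1 WHERE widget_id = 1")
conn.commit()
```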

The New Stack Podcast
Prequel: Software Errors Be Gone

The New Stack Podcast

Play Episode Listen Later May 5, 2025 5:13


Prequel is launching a new developer-focused service aimed at democratizing software error detection—an area typically dominated by large cloud providers. Co-founded by Lyndon Brown and Tony Meehan, both former NSA engineers, Prequel introduces a community-driven observability approach centered on Common Reliability Enumerations (CREs). CREs categorize recurring production issues, helping engineers detect, understand, and communicate problems without reinventing solutions or working in isolation. Their open-source tools, cre and prereq, allow teams to build and share detectors that catch bugs and anti-patterns in real time—without exposing sensitive data, thanks to edge processing using WebAssembly.
The urgency behind Prequel's mission stems from the rapid pace of AI-driven development, increased third-party code usage, and rising infrastructure costs. Traditional observability tools may surface symptoms, but Prequel aims to provide precise problem definitions and actionable insights. While observability giants like Datadog and Splunk dominate the market, Brown and Meehan argue that engineers still feel overwhelmed by data and underpowered in diagnostics—something they believe CREs can finally change.
Learn more from The New Stack about the latest Observability insights:
Why Consolidating Observability Tools Is a Smart Move
Building an Observability Culture: Getting Everyone Onboard
Join our community of newsletter subscribers to stay on top of the news and at the top of your game.

Postgres FM
synchronous_commit

Postgres FM

Play Episode Listen Later May 2, 2025 50:53


Nikolay and Michael discuss synchronous_commit — what it means on single node setups, for synchronous replication setups, and the pros and cons of the different options for each. Here are some links to things they mentioned:
synchronous_commit https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-SYNCHRONOUS-COMMIT
synchronous_commit history on pgPedia https://pgpedia.info/s/synchronous_commit.html
Patroni's maximum_lag_on_failover setting https://patroni.readthedocs.io/en/master/replication_modes.html#asynchronous-mode-durability
wal_writer_delay https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-WAL-WRITER-DELAY
Selective asynchronous commits in PostgreSQL - balancing durability and performance (blog post by Shayon Mukherjee) https://www.shayon.dev/post/2025/75/selective-asynchronous-commits-in-postgresql-balancing-durability-and-performance/
Asynchronous Commit https://www.postgresql.org/docs/current/wal-async-commit.html
synchronous_standby_names https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-SYNCHRONOUS-STANDBY-NAMES
Jepsen article about Amazon RDS multi-AZ clusters (by Kyle Kingsbury, aka "Aphyr") https://jepsen.io/analyses/amazon-rds-for-postgresql-17.4
~~~
What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!
~~~
Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai
With credit to:
Jessie Draws for the elephant artwork
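As a rough illustration of the per-transaction trade-off discussed in this episode, here is a minimal sketch assuming psycopg 3; the connection string and table are hypothetical placeholders.

```python
import psycopg

# Hypothetical connection string and table.
with psycopg.connect("dbname=app") as conn:
    with conn.cursor() as cur:
        # Only this transaction skips waiting for the WAL flush (and for any
        # synchronous standbys); the setting resets at commit or rollback.
        cur.execute("SET LOCAL synchronous_commit = off")
        cur.execute("INSERT INTO page_views (path) VALUES (%s)", ("/home",))
    # The connection context manager commits here; a crash immediately after
    # could lose this row, which is often acceptable for low-value writes.
```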

Oracle University Podcast
What is Oracle GoldenGate 23ai?

Oracle University Podcast

Play Episode Listen Later Apr 29, 2025 18:03


In a new season of the Oracle University Podcast, Lois Houston and Nikita Abraham dive into the world of Oracle GoldenGate 23ai, a cutting-edge software solution for data management. They are joined by Nick Wagner, a seasoned expert in database replication, who provides a comprehensive overview of this powerful tool.   Nick highlights GoldenGate's ability to ensure continuous operations by efficiently moving data between databases and platforms with minimal overhead. He emphasizes its role in enabling real-time analytics, enhancing data security, and reducing costs by offloading data to low-cost hardware. The discussion also covers GoldenGate's role in facilitating data sharing, improving operational efficiency, and reducing downtime during outages.   Oracle GoldenGate 23ai: Fundamentals: https://mylearn.oracle.com/ou/course/oracle-goldengate-23ai-fundamentals/145884/237273 Oracle University Learning Community: https://education.oracle.com/ou-community LinkedIn: https://www.linkedin.com/showcase/oracle-university/ X: https://x.com/Oracle_Edu   Special thanks to Arijit Ghosh, David Wright, Kris-Ann Nansen, Radhika Banka, and the OU Studio Team for helping us create this episode. ---------------------------------------------------------------   Episode Transcript: 00:00 Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we'll bring you foundational training on the most popular Oracle technologies. Let's get started! 00:25 Nikita: Welcome to the Oracle University Podcast! I'm Nikita Abraham, Team Lead: Editorial Services with Oracle University, and with me is Lois Houston: Director of Innovation Programs. Lois: Hi everyone! Welcome to a new season of the podcast. This time, we're focusing on the fundamentals of Oracle GoldenGate. Oracle GoldenGate helps organizations manage and synchronize their data across diverse systems and databases in real time.  And with the new Oracle GoldenGate 23ai release, we'll uncover the latest innovations and features that empower businesses to make the most of their data. Nikita: Taking us through this is Nick Wagner, Senior Director of Product Management for Oracle GoldenGate. He's been doing database replication for about 25 years and has been focused on GoldenGate on and off for about 20 of those years.  01:18 Lois: In today's episode, we'll ask Nick to give us a general overview of the product, along with some use cases and benefits. Hi Nick! To start with, why do customers need GoldenGate? Nick: Well, it delivers continuous operations, being able to continuously move data from one database to another database or data platform in efficiently and a high-speed manner, and it does this with very low overhead. Almost all the GoldenGate environments use transaction logs to pull the data out of the system, so we're not creating any additional triggers or very little overhead on that source system. GoldenGate can also enable real-time analytics, being able to pull data from all these different databases and move them into your analytics system in real time can improve the value that those analytics systems provide. Being able to do real-time statistics and analysis of that data within those high-performance custom environments is really important. 02:13 Nikita: Does it offer any benefits in terms of cost?  Nick: GoldenGate can also lower IT costs. A lot of times people run these massive OLTP databases, and they are running reporting in those same systems. 
With GoldenGate, you can offload some of the data or all the data to a low-cost commodity hardware where you can then run the reports on that other system. So, this way, you can get back that performance on the OLTP system, while at the same time optimizing your reporting environment for those long running reports. You can improve efficiencies and reduce risks. Being able to reduce the amount of downtime during planned and unplanned outages can really make a big benefit to the overall operational efficiencies of your company.  02:54 Nikita: What about when it comes to data sharing and data security? Nick: You can also reduce barriers to data sharing. Being able to pull subsets of data, or just specific pieces of data out of a production database and move it to the team or to the group that needs that information in real time is very important. And it also protects the security of your data by only moving in the information that they need and not the entire database. It also provides extensibility and flexibility, being able to support multiple different replication topologies and architectures. 03:24 Lois: Can you tell us about some of the use cases of GoldenGate? Where does GoldenGate truly shine?  Nick: Some of the more traditional use cases of GoldenGate include use within the multicloud fabric. Within a multicloud fabric, this essentially means that GoldenGate can replicate data between on-premise environments, within cloud environments, or hybrid, cloud to on-premise, on-premise to cloud, or even within multiple clouds. So, you can move data from AWS to Azure to OCI. You can also move between the systems themselves, so you don't have to use the same database in all the different clouds. For example, if you wanted to move data from AWS Postgres into Oracle running in OCI, you can do that using Oracle GoldenGate. We also support maximum availability architectures. And so, there's a lot of different use cases here, but primarily geared around reducing your recovery point objective and recovery time objective. 04:20 Lois: Ah, reducing RPO and RTO. That must have a significant advantage for the customer, right? Nick: So, reducing your RPO and RTO allows you to take advantage of some of the benefits of GoldenGate, being able to do active-active replication, being able to set up GoldenGate for high availability, real-time failover, and it can augment your active Data Guard and Data Guard configuration. So, a lot of times GoldenGate is used within Oracle's maximum availability architecture platinum tier level of replication, which means that at that point you've got lots of different capabilities within the Oracle Database itself. But to help eke out that last little bit of high availability, you want to set up an active-active environment with GoldenGate to really get true zero RPO and RTO. GoldenGate can also be used for data offloading and data hubs. Being able to pull data from one or more source systems and move it into a data hub, or into a data warehouse for your operational reporting. This could also be your analytics environment too. 05:22 Nikita: Does GoldenGate support online migrations? Nick: In fact, a lot of companies actually get started in GoldenGate by doing a migration from one platform to another. Now, these don't even have to be something as complex as going from one database like a DB2 on-premise into an Oracle on OCI, it could even be simple migrations. 
A lot of times doing something like a major application or a major database version upgrade is going to take downtime on that production system. You can use GoldenGate to eliminate that downtime. So this could be going from Oracle 19c to Oracle 23ai, or going from application version 1.0 to application version 2.0, because GoldenGate can do the transformation between the different application schemas. You can use GoldenGate to migrate your database from on premise into the cloud with no downtime as well. We also support real-time analytic feeds, being able to go from multiple databases, not only those on premise, but being able to pull information from different SaaS applications inside of OCI and move it to your different analytic systems. And then, of course, we also have the ability to stream events and analytics within GoldenGate itself.  06:34 Lois: Let's move on to the various topologies supported by GoldenGate. I know GoldenGate supports many different platforms and can be used with just about any database. Nick: This first layer of topologies is what we usually consider relational database topologies. And so this would be moving data from Oracle to Oracle, Postgres to Oracle, Sybase to SQL Server, a lot of different types of databases. So the first architecture would be unidirectional. This is replicating from one source to one target. You can do this for reporting. If I wanted to offload some reports into another server, I can go ahead and do that using GoldenGate. I can replicate the entire database or just a subset of tables. I can also set up GoldenGate for bidirectional, and this is what I want to set up GoldenGate for something like high availability. So in the event that one of the servers crashes, I can almost immediately reconnect my users to the other system. And that almost immediately depends on the amount of latency that GoldenGate has at that time. So a typical latency is anywhere from 3 to 6 seconds. So after that primary system fails, I can reconnect my users to the other system in 3 to 6 seconds. And I can do that because as GoldenGate's applying data into that target database, that target system is already open for read and write activity. GoldenGate is just another user connecting in issuing DML operations, and so it makes that failover time very low. 07:59 Nikita: Ok…If you can get it down to 3 to 6 seconds, can you bring it down to zero? Like zero failover time?   Nick: That's the next topology, which is active-active. And in this scenario, all servers are read/write all at the same time and all available for user activity. And you can do multiple topologies with this as well. You can do a mesh architecture, which is where every server talks to every other server. This works really well for 2, 3, 4, maybe even 5 environments, but when you get beyond that, having every server communicate with every other server can get a little complex. And so at that point we start looking at doing what we call a hub and spoke architecture, where we have lots of different spokes. At the end of each spoke is a read/write database, and then those communicate with a hub. So any change that happens on one spoke gets sent into the hub, and then from the hub it gets sent out to all the other spokes. And through that architecture, it allows you to really scale up your environments. We have customers that are doing up to 150 spokes within that hub architecture. 
Within active-active replication as well, we can do conflict detection and resolution, which means that if two users modify the same row on two different systems, GoldenGate can actually determine that there was an issue with that and determine what user wins or which row change wins, which is extremely important when doing active-active replication. And this means that if one of those systems fails, there is no downtime when you switch your users to another active system because it's already available for activity and ready to go. 09:35 Lois: Wow, that's fantastic. Ok, tell us more about the topologies. Nick: GoldenGate can do other things like broadcast, sending data from one system to multiple systems, or many to one as far as consolidation. We can also do cascading replication, so when data moves from one environment that GoldenGate is replicating into another environment that GoldenGate is replicating. By default, we ignore all of our own transactions. But there's actually a toggle switch that you can flip that says, hey, GoldenGate, even though you wrote that data into that database, still push it on to the next system. And then of course, we can also do distribution of data, and this is more like moving data from a relational database into something like a Kafka topic or a JMS queue or into some messaging service. 10:24 Raise your game with the Oracle Cloud Applications skills challenge. Get free training on Oracle Fusion Cloud Applications, Oracle Modern Best Practice, and Oracle Cloud Success Navigator. Pass the free Oracle Fusion Cloud Foundations Associate exam to earn a Foundations Associate certification. Plus, there's a chance to win awards and prizes throughout the challenge! What are you waiting for? Join the challenge today by visiting visit oracle.com/education. 10:58 Nikita: Welcome back! Nick, does GoldenGate also have nonrelational capabilities?  Nick: We have a number of nonrelational replication events in topologies as well. This includes things like data lake ingestion and streaming ingestion, being able to move data and data objects from these different relational database platforms into data lakes and into these streaming systems where you can run analytics on them and run reports. We can also do cloud ingestion, being able to move data from these databases into different cloud environments. And this is not only just moving it into relational databases with those clouds, but also their data lakes and data fabrics. 11:38 Lois: You mentioned a messaging service earlier. Can you tell us more about that? Nick: Messaging replication is also possible. So we can actually capture from things like messaging systems like Kafka Connect and JMS, replicate that into a relational data, or simply stream it into another environment. We also support NoSQL replication, being able to capture from MongoDB and replicate it onto another MongoDB for high availability or disaster recovery, or simply into any other system. 12:06 Nikita: I see. And is there any integration with a customer's SaaS applications? Nick: GoldenGate also supports a number of different OCI SaaS applications. And so a lot of these different applications like Oracle Financials Fusion, Oracle Transportation Management, they all have GoldenGate built under the covers and can be enabled with a flag that you can actually have that data sent out to your other GoldenGate environment. So you can actually subscribe to changes that are happening in these other systems with very little overhead. 
And then of course, we have event processing and analytics, and this is the final topology or flexibility within GoldenGate itself. And this is being able to push data through data pipelines, doing data transformations. GoldenGate is not an ETL tool, but it can do row-level transformation and row-level filtering.  12:55 Lois: Are there integrations offered by Oracle GoldenGate in automation and artificial intelligence? Nick: We can do time series analysis and geofencing using the GoldenGate Stream Analytics product. It allows you to actually do real time analysis and time series analysis on data as it flows through the GoldenGate trails. And then that same product, the GoldenGate Stream Analytics, can then take the data and move it to predictive analytics, where you can run MML on it, or ONNX or other Spark-type technologies and do real-time analysis and AI on that information as it's flowing through.  13:29 Nikita: So, GoldenGate is extremely flexible. And given Oracle's focus on integrating AI into its product portfolio, what about GoldenGate? Does it offer any AI-related features, especially since the product name has “23ai” in it? Nick: With the advent of Oracle GoldenGate 23ai, it's one of the two products at this point that has the AI moniker at Oracle. Oracle Database 23ai also has it, and that means that we actually do stuff with AI. So the Oracle GoldenGate product can actually capture vectors from databases like MySQL HeatWave, Postgres using pgvector, which includes things like AlloyDB, Amazon RDS Postgres, Aurora Postgres. We can also replicate data into Elasticsearch and OpenSearch, or if the data is using vectors within OCI or the Oracle Database itself. So GoldenGate can be used for a number of things here. The first one is being able to migrate vectors into the Oracle Database. So if you're using something like Postgres, MySQL, and you want to migrate the vector information into the Oracle Database, you can. Now one thing to keep in mind here is a vector is oftentimes like a GPS coordinate. So if I need to know the GPS coordinates of Austin, Texas, I can put in a latitude and longitude and it will give me the GPS coordinates of a building within that city. But if I also need to know the altitude of that same building, well, that's going to be a different algorithm. And GoldenGate and replicating vectors is the same way. When you create a vector, it's essentially just creating a bunch of numbers under the screen, kind of like those same GPS coordinates. The dimension and the algorithm that you use to generate that vector can be different across different databases, but the actual meaning of that data will change. And so GoldenGate can replicate the vector data as long as the algorithm and the dimensions are the same. If the algorithm and the dimensions are not the same between the source and the target, then you'll actually want GoldenGate to replicate the base data that created that vector. And then once GoldenGate replicates the base data, it'll actually call the vector embedding technology to re-embed that data and produce that numerical formatting for you.  15:42 Lois: So, there are some nuances there… Nick: GoldenGate can also replicate and consolidate vector changes or even do the embedding API calls itself. This is really nice because it means that we can take changes from multiple systems and consolidate them into a single one. We can also do the reverse of that too. A lot of customers are still trying to find out which algorithms work best for them. 
How many dimensions? What's the optimal use? Well, you can now run those in different servers without impacting your actual AI system. Once you've identified which algorithm and dimension is going to be best for your data, you can then have GoldenGate replicate that into your production system and we'll start using that instead. So it's a nice way to switch algorithms without taking extensive downtime. 16:29 Nikita: What about in multicloud environments?  Nick: GoldenGate can also do multicloud and N-way active-active Oracle replication between vectors. So if there's vectors in Oracle databases, in multiple clouds, or multiple on-premise databases, GoldenGate can synchronize them all up. And of course we can also stream changes from vector information, including text as well into different search engines. And that's where the integration with Elasticsearch and OpenSearch comes in. And then we can use things like NVIDIA and Cohere to actually do the AI on that data.  17:01 Lois: Using GoldenGate with AI in the database unlocks so many possibilities. Thanks for that detailed introduction to Oracle GoldenGate 23ai and its capabilities, Nick.  Nikita: We've run out of time for today, but Nick will be back next week to talk about how GoldenGate has evolved over time and its latest features. And if you liked what you heard today, head over to mylearn.oracle.com and take a look at the Oracle GoldenGate 23ai Fundamentals course to learn more. Until next time, this is Nikita Abraham… Lois: And Lois Houston, signing off! 17:33 That's all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We'd also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.
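The vector discussion above hinges on the embedding algorithm and dimensions matching between source and target. The sketch below is not GoldenGate configuration, just an illustration of that constraint using the pgvector extension with psycopg 3; the table name, connection string, and 3-dimensional vectors are hypothetical placeholders.

```python
import psycopg

# Hypothetical source database with a pgvector column.
with psycopg.connect("dbname=source_db") as conn:
    with conn.cursor() as cur:
        cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
        # vector(3) fixes the dimension; a replicated value can only be applied
        # verbatim on a target whose column has the same dimension and whose
        # embeddings come from the same model/algorithm.
        cur.execute("""
            CREATE TABLE IF NOT EXISTS docs (
                id        bigint PRIMARY KEY,
                body      text,
                embedding vector(3)
            )
        """)
        cur.execute(
            "INSERT INTO docs VALUES (%s, %s, %s::vector)",
            (1, "hello world", "[0.12,-0.48,0.33]"),
        )
        # If the target used vector(1536) or a different embedding model, you
        # would replicate only `body` and re-embed on the target side instead.
```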

Postgres FM
Managed service support

Postgres FM

Play Episode Listen Later Apr 25, 2025 35:37


Nikolay and Michael discuss managed service support — some tips on how to handle cases that aren't going well, tips for requesting features, whether to factor in support when choosing a service provider, and whether to use one at all. Here are some links to things they mentioned:
YugabyteDB's new upgrade framework https://www.yugabyte.com/blog/postgresql-upgrade-framework
Episode on Blue-green deployments https://postgres.fm/episodes/blue-green-deployments
pg_createsubscriber https://www.postgresql.org/docs/current/app-pgcreatesubscriber.html
~~~
What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!
~~~
Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai
With credit to:
Jessie Draws for the elephant artwork

Software Huddle
From ORM to Infra: Prisma Postgres with Søren Bramer Schmidt

Software Huddle

Play Episode Listen Later Apr 22, 2025 62:22


Today we have Søren from Prisma on the show. Prisma has been the most popular ORM in the TypeScript world for a while, and now they're moving more into hosted infrastructure. We spend a lot of time talking about their new offering called Prisma Postgres, which is this unikernel-based Postgres offering. It's a really unique offering from both a technical and a product perspective. On the technical side, they're doing some interesting work compared to other Postgres providers. They're running on bare metal in a colocation facility rather than the default public clouds like AWS, GCP, and Azure. Further, they're using unikernels in a Firecracker VM, giving them unique startup and security characteristics. These technical decisions give them unique economics compared to standard providers, so they're able to have a generous free tier and a unique billing model that works great for serverless applications with spiky workloads. Around all of this, it's very interesting to see a company with such a unique spread of products — a popular, mature open-source library paired with a mission-critical infrastructure service offering. We talked about the difficulties in building a company that accommodates these two very different products. Timestamps 01:51 Start 06:08 Prisma Postgres 09:10 Accelerate 11:39 Why Postgres 17:32 How Prisma Postgres Works 21:32 Colocation Facility 22:05 Unikernels 27:56 CoLo vs Public Cloud 29:11 Building the team 31:46 Missing Features that are being worked on 32:31 Use Cases 33:37 Colo Locations 34:53 Cloudflare 35:42 Biggest surprises since release 37:34 More Unikernel adoption? 39:08 Supporting Prisma ORM 46:43 Mongo 47:51 Life as A CEO 53:04 MCP 57:23 Søren Questions Alex Software Huddle ⤵︎ X: https://twitter.com/SoftwareHuddle Substack: https://softwarehuddle.substack.com

What the Dev?
305: Why PostgreSQL became the database of choice for cloud native development (with Neon's Heikki Linnakangas)

What the Dev?

Play Episode Listen Later Apr 22, 2025 11:06


In this episode, Dave Rubinstein interviews Heikki Linnakangas, co-founder of Neon, a company that provides Postgres solutions. They discuss:
The factors that have contributed to adoption of PostgreSQL
Why PostgreSQL has leapfrogged over MySQL in popularity
What to expect in PostgreSQL 18

IGeometry
Sequential Scans in Postgres just got faster

IGeometry

Play Episode Listen Later Apr 18, 2025 27:36


This new PostgreSQL 17 feature is a game changer: Postgres can now combine I/Os when performing a sequential scan. Grab my database course: https://courses.husseinnasser.com
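A minimal sketch for observing this on your own system, assuming PostgreSQL 17's io_combine_limit setting and psycopg 3; the database and table names are placeholders.

```python
import psycopg

# Hypothetical database and table; big_table should be large enough that the
# planner chooses a sequential scan.
with psycopg.connect("dbname=app") as conn:
    with conn.cursor() as cur:
        # io_combine_limit (new in v17) caps how many adjacent blocks the
        # streaming-read machinery merges into a single larger read.
        cur.execute("SHOW io_combine_limit")
        print("io_combine_limit =", cur.fetchone()[0])

        # Inspect a plain sequential scan and its buffer usage.
        cur.execute("EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM big_table")
        for (line,) in cur.fetchall():
            print(line)
```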

Syntax - Tasty Web Development Treats
893: Everyone Is Talking About MCP

Syntax - Tasty Web Development Treats

Play Episode Listen Later Apr 14, 2025 33:59


Scott and Wes break down the Model Context Protocol (MCP), a new open standard that gives AI agents secure, tool-like access to your dev environment. They cover how it works, why it's a big deal for AI coding workflows, and real-world use cases like GitHub, Sentry, and YouTube. Show Notes 00:00 Welcome to Syntax! 00:49 The lore of ICP. Wes MCP Shirt. 03:09 Brought to you by Sentry.io. 03:33 What is MCP? 05:06 The steps of AI coding. 07:11 MCP hosts. 07:28 MCP clients. 07:35 MCP servers. 08:24 Why you might want to do this. 10:39 How this works in VS Code. 14:10 Wes built an MCP server. SVGL. 14:57 Playwright. 17:24 Sentry's implementation. Building Sentry's MCP with David Cramer. 18:54 YouTube implementation. 21:19 DaVinci Resolve implementation. Smithery. 23:02 Postgres. 24:40 Transport protocols. 24:49 STDIO. 25:19 SSE. 25:32 Streaming. 26:24 Writing you own MCP server. 26:28 FastMCP. 27:00 Cloudflare. 28:01 Data validation. 28:47 Standard schema. Episode 873. 29:27 Other parts of MCP. 29:35 MCP resources. 30:37 MCP prompts. 30:48 MCP roots. Hit us up on Socials! Syntax: X Instagram Tiktok LinkedIn Threads Wes: X Instagram Tiktok LinkedIn Threads Scott: X Instagram Tiktok LinkedIn Threads Randy: X Instagram YouTube Threads
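A minimal sketch of the kind of MCP server discussed in the episode, assuming the official Python MCP SDK's FastMCP helper and the stdio transport; the tool itself is a toy placeholder.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-tools")

@mcp.tool()
def slugify(title: str) -> str:
    """Turn a post title into a URL slug."""
    return "-".join(title.lower().split())

if __name__ == "__main__":
    # stdio is the transport a local MCP host (editor or CLI agent) spawns;
    # SSE/streaming transports are the hosted alternative mentioned in the show.
    mcp.run()
```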

Machine Learning Guide
MLA 024 Code AI MCP Servers, ML Engineering

Machine Learning Guide

Play Episode Listen Later Apr 13, 2025 43:38


Tool Use and Model Context Protocol (MCP) Notes and resources at  ocdevel.com/mlg/mla-24 Try a walking desk to stay healthy while you study or work! Tool Use in Vibe Coding Agents File Operations: Agents can read, edit, and search files using sophisticated regular expressions. Executable Commands: They can recommend and perform installations like pip or npm installs, with user approval. Browser Integration: Allows agents to perform actions and verify outcomes through browser interactions. Model Context Protocol (MCP) Standardization: MCP was created by Anthropic to standardize how AI tools and agents communicate with each other and with external tools. Implementation: MCP Client: Converts AI agent requests into structured commands. MCP Server: Executes commands and sends structured responses back to the client. Local and Cloud Frameworks: Local (S-T-D-I-O MCP): Examples include utilizing Playwright for local browser automation and connecting to local databases like Postgres. Cloud (SSE MCP): SaaS providers offer cloud-hosted MCPs to enhance external integrations. Expanding AI Capabilities with MCP Servers Directories: Various directories exist listing MCP servers for diverse functions beyond programming. modelcontextprotocol/servers Use Cases: Automation Beyond Coding: Implementing MCPs that extend automation into non-programming tasks like sales, marketing, or personal project management. Creative Solutions: Encourages innovation in automating routine tasks by integrating diverse MCP functionalities. AI Tools in Machine Learning Automating ML Process: Auto ML and Feature Engineering: AI tools assist in transforming raw data, optimizing hyperparameters, and inventing new ML solutions. Pipeline Construction and Deployment: Facilitates the use of infrastructure as code for deploying ML models efficiently. Active Experimentation: Jupyter Integration Challenges: While integrations are possible, they often lag and may not support the latest models. Practical Strategies: Suggests alternating between Jupyter and traditional Python files to maximize tool efficiency. Conclusion Action Plan for ML Engineers: Setup structured folders and documentation to leverage AI tools effectively. Encourage systematic exploration of MCPs to enhance both direct programming tasks and associated workflows.
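Building on the episode's mention of stdio MCP servers that talk to a local Postgres, here is a minimal sketch assuming the Python MCP SDK and psycopg 3; the connection string and query are placeholders, and a real server would restrict what it exposes.

```python
import psycopg
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("postgres-readonly")

@mcp.tool()
def table_row_counts(schema: str = "public") -> dict:
    """Return approximate row counts for ordinary tables in a schema."""
    with psycopg.connect("dbname=app") as conn:  # hypothetical local database
        with conn.cursor() as cur:
            cur.execute(
                """
                SELECT c.relname, c.reltuples::bigint
                FROM pg_class c
                JOIN pg_namespace n ON n.oid = c.relnamespace
                WHERE n.nspname = %s AND c.relkind = 'r'
                """,
                (schema,),
            )
            return {name: count for name, count in cur.fetchall()}

if __name__ == "__main__":
    mcp.run()  # stdio transport, for a local agent/host to spawn
```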

Postgres FM
Time-series considerations

Postgres FM

Play Episode Listen Later Apr 11, 2025 42:42


Nikolay and Michael discuss time-series considerations for Postgres — including when it matters, some tips for avoiding issues, performance considerations, and more. Here are some links to things they mentioned:
Time series data https://en.wikipedia.org/wiki/Time_series
TimescaleDB https://github.com/timescale/timescaledb
13 Tips to Improve PostgreSQL Insert Performance https://www.timescale.com/blog/13-tips-to-improve-postgresql-insert-performance
Why we're leaving the cloud (37 Signals / Basecamp / David Heinemeier Hansson) https://world.hey.com/dhh/why-we-re-leaving-the-cloud-654b47e0
UUID v7 and partitioning ("how to" by Nikolay) https://gitlab.com/postgres-ai/postgresql-consulting/postgres-howtos/-/blob/main/0065_uuid_v7_and_partitioning_timescaledb.md
pg_cron https://github.com/citusdata/pg_cron
pg_partman https://github.com/pgpartman/pg_partman
Our episode on BRIN indexes https://postgres.fm/episodes/brin-indexes
Tutorial from Citus (Andres Freund and Marco Slot) including rollups https://www.youtube.com/watch?v=0ybz6zuXCPo
IoT with PostgreSQL (talk by Chris Ellis) https://youtube.com/watch?v=KnUoDBGv4aw&t=58
pg_timeseries https://github.com/tembo-io/pg_timeseries
DuckDB https://duckdb.org
~~~
What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!
~~~
Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai
With credit to:
Jessie Draws for the elephant artwork
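A minimal sketch of two techniques mentioned in this episode, native range partitioning by time plus a BRIN index on the timestamp column, assuming psycopg 3; the table and partition bounds are placeholders, and pg_partman with pg_cron would normally automate partition maintenance.

```python
import psycopg

# Hypothetical metrics database; new monthly partitions would usually be
# created automatically (e.g. by pg_partman scheduled via pg_cron).
with psycopg.connect("dbname=metrics") as conn:
    conn.execute("""
        CREATE TABLE IF NOT EXISTS readings (
            device_id   bigint      NOT NULL,
            recorded_at timestamptz NOT NULL,
            value       double precision
        ) PARTITION BY RANGE (recorded_at)
    """)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS readings_2025_04
            PARTITION OF readings
            FOR VALUES FROM ('2025-04-01') TO ('2025-05-01')
    """)
    # BRIN suits append-mostly time-series data: recorded_at correlates with
    # physical row order, and the index stays tiny compared to a btree.
    conn.execute("""
        CREATE INDEX IF NOT EXISTS readings_2025_04_brin
            ON readings_2025_04 USING brin (recorded_at)
    """)
```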

The EdUp Experience
LIVE from Ellucian LIVE 2025 - with Mike Stone⁠, Systems Manager, ⁠Lois Kellermann⁠, Programmer/Analyst, & ⁠George Kriss⁠, CIO, ⁠Kaskaskia College⁠

The EdUp Experience

Play Episode Listen Later Apr 9, 2025 15:07


It's YOUR time to #EdUp
In this episode, recorded LIVE from Ellucian LIVE 2025 in Orlando, Florida, YOUR guests are Mike Stone, Systems Manager, Lois Kellermann, Programmer/Analyst, & George Kriss, CIO, Kaskaskia College
YOUR host is Dr. Joe Sallustio
How did Kaskaskia College complete their SaaS modernization in just 12 months?
What challenges did they face during their on-premise to SaaS transition?
How did they manage change across the institution during implementation?
What benefits have they seen from their Ellucian Colleague modernization?
Why was their financial aid implementation surprisingly smooth?
Topics include:
Winning the Ellucian Impact Award for SaaS modernization
Moving from SQL to Postgres database seamlessly
Creating an inclusive college-wide project rather than just an IT initiative
Using effective communication strategies for change management
Freeing up IT resources to focus on strategic student-facing initiatives
Listen in to #EdUp
Do YOU want to accelerate YOUR professional development? Do YOU want to get exclusive early access to ad-free episodes, extended episodes, bonus episodes, original content, invites to special events, & more? Then BECOME AN #EdUp PREMIUM SUBSCRIBER TODAY - $19.99/month or $199.99/year (Save 17%)!
Want YOUR org to cover costs? Email: EdUp@edupexperience.com
Thank YOU so much for tuning in. Join us on the next episode for YOUR time to EdUp!
Connect with YOUR EdUp Team - Elvin Freytes & Dr. Joe Sallustio
● Join YOUR EdUp community at The EdUp Experience!
We make education YOUR business!

Thinking Elixir Podcast
248: Security Insights with Paraxial

Thinking Elixir Podcast

Play Episode Listen Later Apr 8, 2025 57:43


News includes a new Elixir case study about Cyanview's camera shading technology used at major events like the Olympics and Super Bowl, Oban Pro 1.6 with 20x faster queue partitioning, the openid_connect package reaching version 1.0, Supabase's new Postgres Language Server for developer tooling, and ElixirEvents.net as a community resource. Plus, we interview Michael Lubas, founder of Paraxial.io, about web application security in Elixir, what's involved in a security audit, and how his Elixir-focused security company is helping teams and businesses in the community. Show Notes online - http://podcast.thinkingelixir.com/248 (http://podcast.thinkingelixir.com/248) Elixir Community News https://elixir-lang.org/blog/2025/03/25/cyanview-elixir-case/ (https://elixir-lang.org/blog/2025/03/25/cyanview-elixir-case/?utm_source=thinkingelixir&utm_medium=shownotes) – New Elixir case study about Cyanview, a Belgian company whose Remote Control Panel for camera shading is used at major events like the Olympics and Super Bowl. Their Elixir-powered solution enables remote camera control across challenging network conditions. https://oban.pro/docs/pro/1.6.0-rc.1/changelog.html (https://oban.pro/docs/pro/1.6.0-rc.1/changelog.html?utm_source=thinkingelixir&utm_medium=shownotes) – Oban Pro 1.6 released with subworkflows, improved queue partitioning (20x faster), and a new guide explaining different job composition approaches. https://oban.pro/docs/pro/1.6.0-rc.1/composition.html (https://oban.pro/docs/pro/1.6.0-rc.1/composition.html?utm_source=thinkingelixir&utm_medium=shownotes) – New Oban Pro guide explaining when to use chains, workflows, chunks, or batches for job composition. https://github.com/DockYard/openid_connect (https://github.com/DockYard/openid_connect?utm_source=thinkingelixir&utm_medium=shownotes) – The Elixir package 'openid_connect' reached version 1.0, providing client library support for working with various OpenID Connect providers like Google, Microsoft Azure AD, Auth0, and others. https://hexdocs.pm/openid_connect/readme.html (https://hexdocs.pm/openid_connect/readme.html?utm_source=thinkingelixir&utm_medium=shownotes) – Documentation for the newly released openid_connect 1.0 package. https://bsky.app/profile/davelucia.com/post/3llqwsbyutc2z (https://bsky.app/profile/davelucia.com/post/3llqwsbyutc2z?utm_source=thinkingelixir&utm_medium=shownotes) – Announcement that openid_connect is maintained by tvlabs. https://bsky.app/profile/germsvel.com/post/3llee5lyerk2b (https://bsky.app/profile/germsvel.com/post/3llee5lyerk2b?utm_source=thinkingelixir&utm_medium=shownotes) – PhoenixTest v0.6.0 has been released with significant changes, including a breaking change. https://github.com/germsvel/phoenix_test (https://github.com/germsvel/phoenix_test?utm_source=thinkingelixir&utm_medium=shownotes) – GitHub repository for PhoenixTest. https://hexdocs.pm/phoenixtest/upgradeguides.html#upgrading-to-0-6-0 (https://hexdocs.pm/phoenix_test/upgrade_guides.html#upgrading-to-0-6-0?utm_source=thinkingelixir&utm_medium=shownotes) – Upgrade guide for updating to PhoenixTest v0.6.0 with its breaking change. https://hexdocs.pm/phoenix_test/changelog.html#0-6-0 (https://hexdocs.pm/phoenix_test/changelog.html#0-6-0?utm_source=thinkingelixir&utm_medium=shownotes) – Changelog for PhoenixTest v0.6.0. 
https://supabase.com/blog/postgres-language-server (https://supabase.com/blog/postgres-language-server?utm_source=thinkingelixir&utm_medium=shownotes) – Supabase has released a new Postgres Language Server for developers, providing IDE intellisense and autocomplete for PostgreSQL. https://marketplace.visualstudio.com/items?itemName=Supabase.postgrestools (https://marketplace.visualstudio.com/items?itemName=Supabase.postgrestools?utm_source=thinkingelixir&utm_medium=shownotes) – VSCode extension for Supabase's new Postgres developer tools. https://github.com/supabase-community/postgres-language-server (https://github.com/supabase-community/postgres-language-server?utm_source=thinkingelixir&utm_medium=shownotes) – GitHub repository for Supabase's Postgres Language Server. https://pgtools.dev/ (https://pgtools.dev/?utm_source=thinkingelixir&utm_medium=shownotes) – Official website for Postgres Tools with documentation and features. https://pgtools.dev/checking_migrations/ (https://pgtools.dev/checking_migrations/?utm_source=thinkingelixir&utm_medium=shownotes) – Feature in Postgres Tools that lints database migrations to check for problematic schema changes. https://github.com/fly-apps/safe-ecto-migrations (https://github.com/fly-apps/safe-ecto-migrations?utm_source=thinkingelixir&utm_medium=shownotes) – Resource for ensuring safe Ecto migrations. https://fly.io/phoenix-files/safe-ecto-migrations/ (https://fly.io/phoenix-files/safe-ecto-migrations/?utm_source=thinkingelixir&utm_medium=shownotes) – Article about safe Ecto migrations posted on Fly.io. https://elixirevents.net/ (https://elixirevents.net/?utm_source=thinkingelixir&utm_medium=shownotes) – Community resource created by Johanna Larsson for tracking, sharing, and learning about Elixir events worldwide. https://bsky.app/profile/elixirevents.net (https://bsky.app/profile/elixirevents.net?utm_source=thinkingelixir&utm_medium=shownotes) – Bluesky account for ElixirEvents.net for following Elixir community events. Do you have some Elixir news to share? Tell us at @ThinkingElixir (https://twitter.com/ThinkingElixir) or email at show@thinkingelixir.com (mailto:show@thinkingelixir.com) Discussion Resources https://paraxial.io/ (https://paraxial.io/?utm_source=thinkingelixir&utm_medium=shownotes) https://paraxial.io/blog/index (https://paraxial.io/blog/index?utm_source=thinkingelixir&utm_medium=shownotes) – Blog with posts about security for Elixir, Rails, and the Paraxial service https://www.cnn.com/2025/03/18/tech/google-wiz-acquisition/index.html (https://www.cnn.com/2025/03/18/tech/google-wiz-acquisition/index.html?utm_source=thinkingelixir&utm_medium=shownotes) https://podcast.thinkingelixir.com/93 (https://podcast.thinkingelixir.com/93?utm_source=thinkingelixir&utm_medium=shownotes) – Our last discussion was 3 years ago in episode 93! 
Titled "Preventing Service Abuse with Michael Lubas" https://www.amazon.com/Innovators-Dilemma-Revolutionary-Change-Business/dp/0062060244 (https://www.amazon.com/Innovators-Dilemma-Revolutionary-Change-Business/dp/0062060244?utm_source=thinkingelixir&utm_medium=shownotes) https://www.merriam-webster.com/dictionary/Kafkaesque - having a nightmarishly complex, bizarre, or illogical quality (https://www.merriam-webster.com/dictionary/Kafkaesque - having a nightmarishly complex, bizarre, or illogical quality?utm_source=thinkingelixir&utm_medium=shownotes) https://paraxial.io/blog/oban-pentest (https://paraxial.io/blog/oban-pentest?utm_source=thinkingelixir&utm_medium=shownotes) – Completed a Security Audit of Oban Pro - this is after ObanPro went free and OpenSource https://paraxial.io/blog/elixir-best (https://paraxial.io/blog/elixir-best?utm_source=thinkingelixir&utm_medium=shownotes) – Elixir and Phoenix Security Checklist: 11 Best Practices https://paraxial.io/blog/rails-command-injection (https://paraxial.io/blog/rails-command-injection?utm_source=thinkingelixir&utm_medium=shownotes) – Ruby on Rails Security: Preventing Command Injection https://paraxial.io/blog/paraxial-three (https://paraxial.io/blog/paraxial-three?utm_source=thinkingelixir&utm_medium=shownotes) – Paraxial.io v3 blog post Guest Information - Michael Lubas, Paraxial.io Founder - michael@paraxial.io - https://x.com/paraxialio (https://x.com/paraxialio?utm_source=thinkingelixir&utm_medium=shownotes) – on Twitter/X - https://x.com/paraxialio (https://x.com/paraxialio?utm_source=thinkingelixir&utm_medium=shownotes) – on Twitter/X - https://github.com/paraxialio/ (https://github.com/paraxialio/?utm_source=thinkingelixir&utm_medium=shownotes) – on Github - https://www.youtube.com/@paraxial5874 (https://www.youtube.com/@paraxial5874?utm_source=thinkingelixir&utm_medium=shownotes) – Paraxial.io channel on YouTube - https://genserver.social/paraxial (https://genserver.social/paraxial?utm_source=thinkingelixir&utm_medium=shownotes) – on Fediverse - https://paraxial.io/ (https://paraxial.io/?utm_source=thinkingelixir&utm_medium=shownotes) – Blog Find us online - Message the show - Bluesky (https://bsky.app/profile/thinkingelixir.com) - Message the show - X (https://x.com/ThinkingElixir) - Message the show on Fediverse - @ThinkingElixir@genserver.social (https://genserver.social/ThinkingElixir) - Email the show - show@thinkingelixir.com (mailto:show@thinkingelixir.com) - Mark Ericksen on X - @brainlid (https://x.com/brainlid) - Mark Ericksen on Bluesky - @brainlid.bsky.social (https://bsky.app/profile/brainlid.bsky.social) - Mark Ericksen on Fediverse - @brainlid@genserver.social (https://genserver.social/brainlid) - David Bernheisel on Bluesky - @david.bernheisel.com (https://bsky.app/profile/david.bernheisel.com) - David Bernheisel on Fediverse - @dbern@genserver.social (https://genserver.social/dbern)

Software Sessions
Brandon Liu on Protomaps

Software Sessions

Play Episode Listen Later Apr 6, 2025 59:57


Brandon Liu is an open source developer and creator of the Protomaps basemap project. We talk about how static maps help developers build sites that last, the PMTiles file format, the role of OpenStreetMap, and his experience funding and running an open source project full time. Protomaps Protomaps PMTiles (File format used by Protomaps) Self-hosted slippy maps, for novices (like me) Why Deploy Protomaps on a CDN User examples Flickr Pinball Map Toilet Map Related projects OpenStreetMap (Dataset protomaps is based on) Mapzen (Former company that released details on what to display based on zoom levels) Mapbox GL JS (Mapbox developed source available map rendering library) MapLibre GL JS (Open source fork of Mapbox GL JS) Other links HTTP range requests (MDN) Hilbert curve Transcript You can help correct transcripts on GitHub. Intro [00:00:00] Jeremy: I'm talking to Brandon Liu. He's the creator of Protomaps, which is a way to easily create and host your own maps. Let's get into it. [00:00:09] Brandon: Hey, so thanks for having me on the podcast. So I'm Brandon. I work on an open source project called Protomaps. What it really is, is if you're a front end developer and you ever wanted to put maps on a website or on a mobile app, then Protomaps is sort of an open source solution for doing that that I hope is something that's way easier to use than, um, a lot of other open source projects. Why not just use Google Maps? [00:00:36] Jeremy: A lot of people are gonna be familiar with Google Maps. Why should they worry about whether something's open source? Why shouldn't they just go and use the Google maps API? [00:00:47] Brandon: So Google Maps is like an awesome thing it's an awesome product. Probably one of the best tech products ever right? And just to have a map that tells you what restaurants are open and something that I use like all the time especially like when you're traveling it has all that data. And the most amazing part is that it's free for consumers but it's not necessarily free for developers. Like if you wanted to embed that map onto your website or app, that usually has an API cost which still has a free tier and is affordable. But one motivation, one basic reason to use open source is if you have some project that doesn't really fit into that pricing model. You know like where you have to pay the cost of Google Maps, you have a side project, a nonprofit, that's one reason. But there's lots of other reasons related to flexibility or customization where you might want to use open source instead. Protomaps examples [00:01:49] Jeremy: Can you give some examples where people have used Protomaps and where that made sense for them? [00:01:56] Brandon: I follow a lot of the use cases and I also don't know about a lot of them because I don't have an API where I can track a hundred percent of the users. Some of them use the hosted version, but I would say most of them probably use it on their own infrastructure. One of the cool projects I've been seeing is called Toilet Map. And what toilet map is if you're in the UK and you want find a public restroom then it maps out, sort of crowdsourced all of the public restrooms. And that's important for like a lot of people if they have health issues, they need to find that information. And just a lot of different projects in the same vein. There's another one called Pinball Map which is sort of a hobby project to find all the pinball machines in the world. And they wanted to have a customized map that fit in with their theme of pinball. 
So these sorts of really cool indie projects are the ones I'm most excited about. Basemaps vs Overlays [00:02:57] Jeremy: And if we talk about, like the pinball map as an example, there's this concept of a basemap and then there's the things that you lay on top of it. What is a basemap and then is the pinball locations is that part of it or is that something separate? [00:03:12] Brandon: It's usually something separate. The example I usually use is if you go to a real estate site, like Zillow, you'll open up the map of Seattle and it has a bunch of pins showing all the houses, and then it has some information beneath it. That information beneath it is like labels telling, this neighborhood is Capitol Hill, or there is a park here. But all that information is common to a lot of use cases and it's not specific to real estate. So I think usually that's the distinction people use in the industry between like a base map versus your overlay. The overlay is like the data for your product or your company while the base map is something you could get from Google or from Protomaps or from Apple or from Mapbox that kind of thing. PMTiles for hosting the basemap and overlays [00:03:58] Jeremy: And so Protomaps in particular is responsible for the base map, and that information includes things like the streets and the locations of landmarks and things like that. Where is all that information coming from? [00:04:12] Brandon: So the base map information comes from a project called OpenStreetMap. And I would also, point out that for Protomaps as sort of an ecosystem. You can also put your overlay data into a format called PMTiles, which is sort of the core of what Protomaps is. So it can really do both. It can transform your data into the PMTiles format which you can host and you can also host the base map. So you kind of have both of those sides of the product in one solution. [00:04:43] Jeremy: And so when you say you have both are you saying that the PMTiles file can have, the base map in one file and then you would have the data you're laying on top in another file? Or what are you describing there? [00:04:57] Brandon: That's usually how I recommend to do it. Oftentimes there'll be sort of like, a really big basemap 'cause it has all of that data about like where the rivers are. Or while, if you want to put your map of toilets or park benches or pickleball courts on top, that's another file. But those are all just like assets you can move around like JSON or CSV files. Statically Hosted [00:05:19] Jeremy: And I think one of the things you mentioned was that your goal was to make Protomaps or the, the use of these PMTiles files easy to use. What does that look like for, for a developer? I wanna host a map. What do I actually need to, to put on my servers? [00:05:38] Brandon: So my usual pitch is that basically if you know how to use S3 or cloud storage, that you know how to deploy a map. And that, I think is the main sort of differentiation from most open source projects. Like a lot of them, they call themselves like, like some sort of self-hosted solution. But I've actually avoided using the term self-hosted because I think in most cases that implies a lot of complexity. Like you have to log into a Linux server or you have to use Kubernetes or some sort of Docker thing. What I really want to emphasize is the idea that, for Protomaps, it's self-hosted in the same way like CSS is self-hosted. So you don't really need a service from Amazon to host the JSON files or CSV files. 
It's really just a static file. [00:06:32] Jeremy: When you say static file that means you could use any static web host to host your HTML file, your JavaScript that actually renders the map. And then you have your PMTiles files, and you're not running a process or anything, you're just putting your files on a static file host. [00:06:50] Brandon: Right. So I think if you're a developer, you can also argue like a static file server is a server. It's you know, it's the cloud, it's just someone else's computer. It's really just nginx under the hood. But I think static storage is sort of special. If you look at things like static site generators, like Jekyll or Hugo, they're really popular because they're a commodity or like the storage is a commodity. And you can take your blog, make it a Jekyll blog, hosted on S3. One day, Amazon's like, we're charging three times as much so you can move it to a different cloud provider. And that's all vendor neutral. So I think that's really the special thing about static storage as a primitive on the web. Why running servers is a problem for resilience [00:07:36] Jeremy: Was there a prior experience you had? Like you've worked with maps for a very long time. Were there particular difficulties you had where you said I just gotta have something that can be statically hosted? [00:07:50] Brandon: That's sort of exactly why I got into this. I've been working sort of in and around the map space for over a decade, and Protomaps is really like me trying to solve the same problem I've had over and over again in the past, just like once and forever right? Because like once this problem is solved, like I don't need to deal with it again in the future. So I've worked at a couple of different companies before, mostly as a contractor, for like a humanitarian nonprofit for a design company doing things like, web applications to visualize climate change. Or for even like museums, like digital signage for museums. And oftentimes they had some sort of data visualization component, but always sort of the challenge of how to like, store and also distribute like that data was something that there wasn't really great open source solutions. So just for map data, that's really what motivated that design for Protomaps. [00:08:55] Jeremy: And in those, those projects in the past, were those things where you had to run your own server, run your own database, things like that? [00:09:04] Brandon: Yeah. And oftentimes we did, we would spin up an EC2 instance, for maybe one client and then we would have to host this server serving map data forever. Maybe the client goes away, or I guess it's good for business if you can sign some sort of like long-term support for that client saying, Hey, you know, like we're done with a project, but you can pay us to maintain the EC2 server for the next 10 years. And that's attractive. but it's also sort of a pain, because usually what happens is if people are given the choice, like a developer between like either I can manage the server on EC2 or on Rackspace or Hetzner or whatever, or I can go pay a SaaS to do it. In most cases, businesses will choose to pay the SaaS. So that's really like what creates a sort of lock-in is this preference for like, so I have this choice between like running the server or paying the SaaS. Like businesses will almost always go and pay the SaaS. [00:10:05] Jeremy: Yeah. 
And in this case, you either find some kind of free hosting or low-cost hosting just to host your files and you upload the files and then you're good from there. You don't need to maintain anything. [00:10:18] Brandon: Exactly, and that's really the ideal use case. so I have some users these, climate science consulting agencies, and then they might have like a one-off project where they have to generate the data once, but instead of having to maintain this server for the lifetime of that project, they just have a file on S3 and like, who cares? If that costs a couple dollars a month to run, that's fine, but it's not like S3 is gonna be deprecated, like it's gonna be on an insecure version of Ubuntu or something. So that's really the ideal, set of constraints for using Protomaps. [00:10:58] Jeremy: Yeah. Something this also makes me think about is, is like the resilience of sites like remaining online, because I, interviewed, Kyle Drake, he runs Neocities, which is like a modern version of GeoCities. And if I remember correctly, he was mentioning how a lot of old websites from that time, if they were running a server backend, like they were running PHP or something like that, if you were to try to go to those sites, now they're like pretty much all dead because there needed to be someone dedicated to running a Linux server, making sure things were patched and so on and so forth. But for static sites, like the ones that used to be hosted on GeoCities, you can go to the internet archive or other websites and they were just files, right? You can bring 'em right back up, and if anybody just puts 'em on a web server, then you're good. They're still alive. Case study of news room preferring static hosting [00:11:53] Brandon: Yeah, exactly. One place that's kind of surprising but makes sense where this comes up, is for newspapers actually. Some of the users using Protomaps are the Washington Post. And the reason they use it, is not necessarily because they don't want to pay for a SaaS like Google, but because if they make an interactive story, they have to guarantee that it still works in a couple of years. And that's like a policy decision from like the editorial board, which is like, so you can't write an article if people can't view it in five years. But if your like interactive data story is reliant on a third party, API and that third party API becomes deprecated, or it changes the pricing or it, you know, it gets acquired, then your journalism story is not gonna work anymore. So I have seen really good uptake among local news rooms and even big ones to use things like Protomaps just because it makes sense for the requirements. Working on Protomaps as an open source project for five years [00:12:49] Jeremy: How long have you been working on Protomaps and the parts that it's made up of such as PMTiles? [00:12:58] Brandon: I've been working on it for about five years, maybe a little more than that. It's sort of my pandemic era project. But the PMTiles part, which is really the heart of it only came in about halfway. Why not make a SaaS? [00:13:13] Brandon: So honestly, like when I first started it, I thought it was gonna be another SaaS and then I looked at it and looked at what the environment was around it. And I'm like, uh, so I don't really think I wanna do that. [00:13:24] Jeremy: When, when you say you looked at the environment around it what do you mean? Why did you decide not to make it a SaaS? [00:13:31] Brandon: Because there already is a lot of SaaS out there. 
And I think the opportunity of making something that is unique in terms of those use cases, like I mentioned like newsrooms, was clear. Like it was clear that there was some other solution, that could be built that would fit these needs better while if it was a SaaS, there are plenty of those out there. And I don't necessarily think that they're well differentiated. A lot of them all use OpenStreetMap data. And it seems like they mainly compete on price. It's like who can build the best three column pricing model. And then once you do that, you need to build like billing and metrics and authentication and like those problems don't really interest me. So I think, although I acknowledge sort of the indie hacker ethos now is to build a SaaS product with a monthly subscription, that's something I very much chose not to do, even though it is for sure like the best way to build a business. [00:14:29] Jeremy: Yeah, I mean, I think a lot of people can appreciate that perspective because it's, it's almost like we have SaaS overload, right? Where you have so many little bills for your project where you're like, another $5 a month, another $10 a month, or if you're a business, right? Those, you add a bunch of zeros and at some point it's just how many of these are we gonna stack on here? [00:14:53] Brandon: Yeah. And honestly. So I really think like as programmers, we're not really like great at choosing how to spend money like a $10 SaaS. That's like nothing. You know? So I can go to Starbucks and I can buy a pumpkin spice latte, and that's like $10 basically now, right? And it's like I'm able to make that consumer choice in like an instant just to spend money on that. But then if you're like, oh, like spend $10 on a SaaS that somebody put a lot of work into, then you're like, oh, that's too expensive. I could just do it myself. So I'm someone that also subscribes to a lot of SaaS products. and I think for a lot of things it's a great fit. Many open source SaaS projects are not easy to self host [00:15:37] Brandon: But there's always this tension between an open source project that you might be able to run yourself and a SaaS. And I think a lot of projects are at different parts of the spectrum. But for Protomaps, it's very much like I'm trying to move maps to being it is something that is so easy to run yourself that anyone can do it. [00:16:00] Jeremy: Yeah, and I think you can really see it with, there's a few SaaS projects that are successful and they're open source, but then you go to look at the self-hosting instructions and it's either really difficult to find and you find it, and then the instructions maybe don't work, or it's really complicated. So I think doing the opposite with Protomaps. As a user, I'm sure we're all appreciative, but I wonder in terms of trying to make money, if that's difficult. [00:16:30] Brandon: No, for sure. It is not like a good way to make money because I think like the ideal situation for an open source project that is open that wants to make money is the product itself is fundamentally complicated to where people are scared to run it themselves. Like a good example I can think of is like Supabase. Supabase is sort of like a platform as a service based on Postgres. And if you wanted to run it yourself, well you need to run Postgres and you need to handle backups and authentication and logging, and that stuff all needs to work and be production ready. So I think a lot of people, like they don't trust themselves to run database backups correctly. 
'cause if you get it wrong once, then you're kind of screwed. So I think that fundamental aspect of the product, like a database is something that is very, very ripe for being a SaaS while still being open source because it's fundamentally hard to run. Another one I can think of is like tailscale, which is, like a VPN that works end to end. That's something where, you know, it has this networking complexity where a lot of developers don't wanna deal with that. So they'd happily pay, for tailscale as a service. There is a lot of products or open source projects that eventually end up just changing to becoming like a hosted service. Businesses going from open source to closed or restricted licenses [00:17:58] Brandon: But then in that situation why would they keep it open source, right? Like, if it's easy to run yourself well, doesn't that sort of cannibalize their business model? And I think that's really the tension overall in these open source companies. So you saw it happen to things like Elasticsearch to things like Terraform where they eventually change the license to one that makes it difficult for other companies to compete with them. [00:18:23] Jeremy: Yeah, I mean there's been a number of cases like that. I mean, specifically within the mapping community, one I can think of was Mapbox's. They have Mapbox gl. Which was a JavaScript client to visualize maps and they moved from, I forget which license they picked, but they moved to a much more restrictive license. I wonder what your thoughts are on something that releases as open source, but then becomes something maybe a little more muddy. [00:18:55] Brandon: Yeah, I think it totally makes sense because if you look at their business and their funding, it seems like for Mapbox, I haven't used it in a while, but my understanding is like a lot of their business now is car companies and doing in dash navigation. And that is probably way better of a business than trying to serve like people making maps of toilets. And I think sort of the beauty of it is that, so Mapbox, the story is they had a JavaScript renderer called Mapbox GL JS. And they changed that to a source available license a couple years ago. And there's a fork of it that I'm sort of involved in called MapLibre GL. But I think the cool part is Mapbox paid employees for years, probably millions of dollars in total to work on this thing and just gave it away for free. Right? So everyone can benefit from that work they did. It's not like that code went away, like once they changed the license. Well, the old version has been forked. It's going its own way now. It's quite different than the new version of Mapbox, but I think it's extremely generous that they're able to pay people for years, you know, like a competitive salary and just give that away. [00:20:10] Jeremy: Yeah, so we should maybe look at it as, it was a gift while it was open source, and they've given it to the community and they're on continuing on their own path, but at least the community running Map Libre, they can run with it, right? It's not like it just disappeared. [00:20:29] Brandon: Yeah, exactly. And that is something that I use for Protomaps quite extensively. Like it's the primary way of showing maps on the web and I've been trying to like work on some enhancements to it to have like better internationalization for if you are in like South Asia like not show languages correctly. So I think it is being taken in a new direction. 
And I think like sort of the combination of Protomaps and MapLibre, it addresses a lot of use cases, like I mentioned earlier with like these like hobby projects, indie projects that are almost certainly not interesting to someone like Mapbox or Google as a business. But I'm happy to support as a small business myself. Financially supporting open source work (GitHub sponsors, closed source, contracts) [00:21:12] Jeremy: In my previous interview with Tom, one of the main things he mentioned was that creating a mapping business is incredibly difficult, and he said he probably wouldn't do it again. So in your case, you're building Protomaps, which you've admitted is easy to self-host. So there's not a whole lot of incentive for people to pay you. How is that working out for you? How are you supporting yourself? [00:21:40] Brandon: There's a couple of strategies that I've tried and oftentimes failed at. Just to go down the list, so I do have GitHub sponsors so I do have a hosted version of Protomaps you can use if you don't want to bother copying a big file around. But the way I do the billing for that is through GitHub sponsors. If you wanted to use this thing I provide, then just be a sponsor. And that definitely pays for itself, like the cost of running it. And that's great. GitHub sponsors is so easy to set up. It just removes you having to deal with Stripe or something. 'cause a lot of people, their credit card information is already in GitHub. GitHub sponsors I think is awesome if you want to like cover costs for a project. But I think very few people are able to make that work. A thing that's like a salary job level. It's sort of like Twitch streaming, you know, there's a handful of people that are full-time streamers and then you look down the list on Twitch and it's like a lot of people that have like 10 viewers. But some of the other things I've tried, I actually started out, publishing the base map as a closed source thing, where I would sell sort of like a data package instead of being a SaaS, I'd be like, here's a one-time download, of the premium data and you can buy it. And quite a few people bought it I just priced it at like $500 for this thing. And I thought that was an interesting experiment. The main reason it's interesting is because the people that it attracts to you in terms of like, they're curious about your products, are all people willing to pay money. While if you start out everything being open source, then the people that are gonna be try to do it are only the people that want to get something for free. So what I discovered is actually like once you transition that thing from closed source to open source, a lot of the people that used to pay you money will still keep paying you money because like, it wasn't necessarily that that closed source thing was why they wanted to pay. They just valued that thought you've put into it your expertise, for example. So I think that is one thing, that I tried at the beginning was just start out, closed source proprietary, then make it open source. That's interesting to people. Like if you release something as open source, if you go the other way, like people are really mad if you start out with something open source and then later on you're like, oh, it's some other license. Then people are like that's so rotten. But I think doing it the other way, I think is quite valuable in terms of being able to find an audience. 
[00:24:29] Jeremy: And when you said it was closed source and paid to open source, do you still sell those map exports? [00:24:39] Brandon: I don't right now. It's something that I might do in the future, you know, like have small customizations of the data that are available, uh, for a fee. still like the core OpenStreetMap based map that's like a hundred gigs you can just download. And that'll always just be like a free download just because that's already out there. All the source code to build it is open source. So even if I said, oh, you have to pay for it, then someone else can just do it right? So there's no real reason like to make that like some sort of like paywall thing. But I think like overall if the project is gonna survive in the long term it's important that I'd ideally like to be able to like grow like a team like have a small group of people that can dedicate the time to growing the project in the long term. But I'm still like trying to figure that out right now. [00:25:34] Jeremy: And when you mentioned that when you went from closed to open and people were still paying you, you don't sell a product anymore. What were they paying for? [00:25:45] Brandon: So I have some contracts with companies basically, like if they need a feature or they need a customization in this way then I am very open to those. And I sort of set it up to make it clear from the beginning that this is not just a free thing on GitHub, this is something that you could pay for if you need help with it, if you need support, if you wanted it. I'm also a little cagey about the word support because I think like it sounds a little bit too wishy-washy. Pretty much like if you need access to the developers of an open source project, I think that's something that businesses are willing to pay for. And I think like making that clear to potential users is a challenge. But I think that is one way that you might be able to make like a living out of open source. [00:26:35] Jeremy: And I think you said you'd been working on it for about five years. Has that mostly been full time? [00:26:42] Brandon: It's been on and off. it's sort of my pandemic era project. But I've spent a lot of time, most of my time working on the open source project at this point. So I have done some things that were more just like I'm doing a customization or like a private deployment for some client. But that's been a minority of the time. Yeah. [00:27:03] Jeremy: It's still impressive to have an open source project that is easy to self-host and yet is still able to support you working on it full time. I think a lot of people might make the assumption that there's nothing to sell if something is, is easy to use. But this sort of sounds like a counterpoint to that. [00:27:25] Brandon: I think I'd like it to be. So when you come back to the point of like, it being easy to self-host. Well, so again, like I think about it as like a primitive of the web. Like for example, if you wanted to start a business today as like hosted CSS files, you know, like where you upload your CSS and then you get developers to pay you a monthly subscription for how many times they fetched a CSS file. Well, I think most developers would be like, that's stupid because it's just an open specification, you just upload a static file. And really my goal is to make Protomaps the same way where it's obvious that there's not really some sort of lock-in or some sort of secret sauce in the server that does this thing. 
How PMTiles works and building a primitive of the web [00:28:16] Brandon: If you look at video for example, like a lot of the tech for how Protomaps and PMTiles works is based on parts of the HTTP spec that were made for video. And 20 years ago, if you wanted to host a video on the web, you had to have like a RealPlayer license or Flash. So you had to go license some server software from RealMedia or from Macromedia so you could stream video to a browser plugin. But now in HTML you can just embed a video file. And no one's like, oh well I need to go pay for my video serving license. I mean, there is such a thing, like YouTube doesn't really use that for DRM reasons, but people just have the assumption that video is like a primitive on the web. So if we're able to make maps sort of that same way like a primitive on the web then there isn't really some obvious business or licensing model behind how that works. Just because it's a thing and it helps a lot of people do their jobs and people are happy using it. So why bother? [00:29:26] Jeremy: You mentioned that it's a tech that was used for streaming video. What tech specifically is it? [00:29:34] Brandon: So it is byte range serving. So when you open a video file on the web, so let's say it's like a 100 megabyte video. You don't have to download the entire video before it starts playing. It streams parts out of the file based on like what frames... I mean, it's based on the frames in the video. So it can start streaming immediately because it's organized in a way to where the first few frames are at the beginning. And what PMTiles really is, is it's just like a video but in space instead of time. So it's organized in a way where these zoomed out views are at the beginning and the most zoomed in views are at the end. So when you're like panning or zooming in the map all you're really doing is fetching byte ranges out of that file the same way as a video. But it's organized in this tiled way on a space filling curve. It's a little bit complicated how it works internally and I think it's kind of cool but that's sort of like an implementation detail. [00:30:35] Jeremy: And to the person deploying it, it just looks like a single file. [00:30:40] Brandon: Exactly in the same way like an mp3 audio file is or like a JSON file is. [00:30:47] Jeremy: So with a video, I can sort of see how as someone seeks through the video, they start at the beginning and then they go to the middle if they wanna see the middle. For a map, as somebody scrolls around the map, are you seeking all over the file or does the way it's structured have a little less chaos? [00:31:09] Brandon: It's structured. And that's kind of the main technical challenge behind building PMTiles is you have to be sort of clever so you're not spraying the reads everywhere. So it uses something called a Hilbert curve, which is a mathematical concept of a space filling curve. Where it's one continuous curve that essentially lets you break 2D space into 1D space. So if you've seen some maps of IP space, it uses this crazy looking curve that hits all the points in one continuous line. And that's the same concept behind PMTiles is if you're looking at one part of the world, you're sort of guaranteed that all of those parts you're looking at are quite close to each other and the data you have to transfer is quite minimal, compared to if you just had it at random. [00:32:02] Jeremy: How big do the files get? If I have a PMTiles of the entire world, what kind of size am I looking at?
[00:32:10] Brandon: Right now, the default one I distribute is 128 gigabytes, so it's quite sizable, although you can slice parts out of it remotely. So if you just wanted. if you just wanted California or just wanted LA or just wanted only a couple of zoom levels, like from zero to 10 instead of zero to 15, there is a command line tool that's also called PMTiles that lets you do that. Issues with CDNs and range queries [00:32:35] Jeremy: And when you're working with files of this size, I mean, let's say I am working with a CDN in front of my application. I'm not typically accustomed to hosting something that's that large and something that's where you're seeking all over the file. is that, ever an issue or is that something that's just taken care of by the browser and, and taken care of by, by the hosts? [00:32:58] Brandon: That is an issue actually, so a lot of CDNs don't deal with it correctly. And my recommendation is there is a kind of proxy server or like a serverless proxy thing that I wrote. That runs on like cloudflare workers or on Docker that lets you proxy those range requests into a normal URL and then that is like a hundred percent CDN compatible. So I would say like a lot of the big commercial installations of this thing, they use that because it makes more practical sense. It's also faster. But the idea is that this solution sort of scales up and scales down. If you wanted to host just your city in like a 10 megabyte file, well you can just put that into GitHub pages and you don't have to worry about it. If you want to have a global map for your website that serves a ton of traffic then you probably want a little bit more sophisticated of a solution. It still does not require you to run a Linux server, but it might require (you) to use like Lambda or Lambda in conjunction with like a CDN. [00:34:09] Jeremy: Yeah. And that sort of ties into what you were saying at the beginning where if you can host on something like CloudFlare Workers or Lambda, there's less time you have to spend keeping these things running. [00:34:26] Brandon: Yeah, exactly. and I think also the Lambda or CloudFlare workers solution is not perfect. It's not as perfect as S3 or as just static files, but in my experience, it still is better at building something that lasts on the time span of years than being like I have a server that is on this Ubuntu version and in four years there's all these like security patches that are not being applied. So it's still sort of serverless, although not totally vendor neutral like S3. Customizing the map [00:35:03] Jeremy: We've mostly been talking about how you host the map itself, but for someone who's not familiar with these kind of tools, how would they be customizing the map? [00:35:15] Brandon: For customizing the map there is front end style customization and there's also data customization. So for the front end if you wanted to change the water from the shade of blue to another shade of blue there is a TypeScript API where you can customize it almost like a text editor color scheme. So if you're able to name a bunch of colors, well you can customize the map in that way you can change the fonts. And that's all done using MapLibre GL using a TypeScript API on top of that for customizing the data. So all the pipeline to generate this data from OpenStreetMap is open source. There is a Java program using a library called PlanetTiler which is awesome, which is this super fast multi-core way of building map tiles. 
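For readers who want to see what that front-end customization looks like in practice, here is a minimal TypeScript sketch (not from the episode) that loads a statically hosted PMTiles archive with MapLibre GL JS and tweaks the water colour; the archive URL, the "water" layer name, and the colour values are placeholder assumptions.

```typescript
import maplibregl from "maplibre-gl";
import { Protocol } from "pmtiles";

// Register the pmtiles:// protocol so MapLibre fetches tiles via HTTP
// range requests against a single static .pmtiles archive.
const protocol = new Protocol();
maplibregl.addProtocol("pmtiles", protocol.tile);

const map = new maplibregl.Map({
  container: "map", // id of a <div> on the page
  style: {
    version: 8,
    sources: {
      protomaps: {
        type: "vector",
        // Placeholder URL: any static host (S3, GitHub Pages, etc.) works.
        url: "pmtiles://https://example.com/tiles/basemap.pmtiles",
      },
    },
    layers: [
      {
        id: "water",
        type: "fill",
        source: "protomaps",
        "source-layer": "water", // layer name assumed, not confirmed in the episode
        paint: { "fill-color": "#a4cae1" },
      },
    ],
  },
  center: [-122.33, 47.61],
  zoom: 10,
});

// The "text editor color scheme" style of customization: swap one shade
// of blue for another at runtime.
map.on("load", () => {
  map.setPaintProperty("water", "fill-color", "#7fb8ec");
});
```

The data side of customization, by contrast, goes through the Java build pipeline described just above.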
And right now there isn't really great hooks to customize what data goes into that. But that's something that I do wanna work on. And finally, because the data comes from OpenStreetMap if you notice data that's missing or you wanted to correct data in OSM then you can go into osm.org. You can get involved in contributing the data to OSM and the Protomaps build is daily. So if you make a change, then within 24 hours you should see the new base map. Have that change. And of course for OSM your improvements would go into every OSM based project that is ingesting that data. So it's not a protomap specific thing. It's like this big shared data source, almost like Wikipedia. OpenStreetMap is a dataset and not a map [00:37:01] Jeremy: I think you were involved with OpenStreetMap to some extent. Can you speak a little bit to that for people who aren't familiar, what OpenStreetMap is? [00:37:11] Brandon: Right. So I've been using OSM as sort of like a tools developer for over a decade now. And one of the number one questions I get from developers about what is Protomaps is why wouldn't I just use OpenStreetMap? What's the distinction between Protomaps and OpenStreetMap? And it's sort of like this funny thing because even though OSM has map in the name it's not really a map in that you can't... In that it's mostly a data set and not a map. It does have a map that you can see that you can pan around to when you go to the website but the way that thing they show you on the website is built is not really that easily reproducible. It involves a lot of c++ software you have to run. But OpenStreetMap itself, the heart of it is almost like a big XML file that has all the data in the map and global. And it has tagged features for example. So you can go in and edit that. It has a web front end to change the data. It does not directly translate into making a map actually. Protomaps decides what shows at each zoom level [00:38:24] Brandon: So a lot of the pipeline, that Java program I mentioned for building this basemap for protomaps is doing things like you have to choose what data you show when you zoom out. You can't show all the data. For example when you're zoomed out and you're looking at all of a state like Colorado you don't see all the Chipotle when you're zoomed all the way out. That'd be weird, right? So you have to make some sort of decision in logic that says this data only shows up at this zoom level. And that's really what is the challenge in optimizing the size of that for the Protomaps map project. [00:39:03] Jeremy: Oh, so those decisions of what to show at different Zoom levels those are decisions made by you when you're creating the PMTiles file with Protomaps. [00:39:14] Brandon: Exactly. It's part of the base maps build pipeline. and those are honestly very subjective decisions. Who really decides when you're zoomed out should this hospital show up or should this museum show up nowadays in Google, I think it shows you ads. Like if someone pays for their car repair shop to show up when you're zoomed out like that that gets surfaced. But because there is no advertising auction in Protomaps that doesn't happen obviously. So we have to sort of make some reasonable choice. A lot of that right now in Protomaps actually comes from another open source project called Mapzen. So Mapzen was a company that went outta business a couple years ago. They did a lot of this work in designing which data shows up at which Zoom level and open sourced it. 
And then when they shut down, they transferred that code into the Linux Foundation. So it's this totally open source project, that like, again, sort of like Mapbox gl has this awesome legacy in that this company funded it for years for smart people to work on it and now it's just like a free thing you can use. So the logic in Protomaps is really based on mapzen. [00:40:33] Jeremy: And so the visualization of all this... I think I understand what you mean when people say oh, why not use OpenStreetMaps because it's not really clear it's hard to tell is this the tool that's visualizing the data? Is it the data itself? So in the case of using Protomaps, it sounds like Protomaps itself has all of the data from OpenStreetMap and then it has made all the decisions for you in terms of what to show at different Zoom levels and what things to have on the map at all. And then finally, you have to have a separate, UI layer and in this case, it sounds like the one that you recommend is the Map Libre library. [00:41:18] Brandon: Yeah, that's exactly right. For Protomaps, it has a portion or a subset of OSM data. It doesn't have all of it just because there's too much, like there's data in there. people have mapped out different bushes and I don't include that in Protomaps if you wanted to go in and edit like the Java code to add that you can. But really what Protomaps is positioned at is sort of a solution for developers that want to use OSM data to make a map on their app or their website. because OpenStreetMap itself is mostly a data set, it does not really go all the way to having an end-to-end solution. Financials and the idea of a project being complete [00:41:59] Jeremy: So I think it's great that somebody who wants to make a map, they have these tools available, whether it's from what was originally built by Mapbox, what's built by Open StreetMap now, the work you're doing with Protomaps. But I wonder one of the things that I talked about with Tom was he was saying he was trying to build this mapping business and based on the financials of what was coming in he was stressed, right? He was struggling a bit. And I wonder for you, you've been working on this open source project for five years. Do you have similar stressors or do you feel like I could keep going how things are now and I feel comfortable? [00:42:46] Brandon: So I wouldn't say I'm a hundred percent in one bucket or the other. I'm still seeing it play out. One thing, that I really respect in a lot of open source projects, which I'm not saying I'm gonna do for Protomaps is the idea that a project is like finished. I think that is amazing. If a software project can just be done it's sort of like a painting or a novel once you write, finish the last page, have it seen by the editor. I send it off to the press is you're done with a book. And I think one of the pains of software is so few of us can actually do that. And I don't know obviously people will say oh the map is never finished. That's more true of OSM, but I think like for Protomaps. One thing I'm thinking about is how to limit the scope to something that's quite narrow to where we could be feature complete on the core things in the near term timeframe. That means that it does not address a lot of things that people want. Like search, like if you go to Google Maps and you search for a restaurant, you will get some hits. that's like a geocoding issue. And I've already decided that's totally outta scope for Protomaps. 
So, in terms of trying to think about the future of this, I'm mostly looking for ways to cut scope if possible. There are some things like better tooling around being able to work with PMTiles that are on the roadmap. but for me, I am still enjoying working on the project. It's definitely growing. So I can see on NPM downloads I can see the growth curve of people using it and that's really cool. So I like hearing about when people are using it for cool projects. So it seems to still be going okay for now. [00:44:44] Jeremy: Yeah, that's an interesting perspective about how you were talking about projects being done. Because I think when people look at GitHub projects and they go like, oh, the last commit was X months ago. They go oh well this is dead right? But maybe that's the wrong framing. Maybe you can get a project to a point where it's like, oh, it's because it doesn't need to be updated. [00:45:07] Brandon: Exactly, yeah. Like I used to do a lot of c++ programming and the best part is when you see some LAPACK matrix math library from like 1995 that still works perfectly in c++ and you're like, this is awesome. This is the one I have to use. But if you're like trying to use some like React component library and it hasn't been updated in like a year, you're like, oh, that's a problem. So again, I think there's some middle ground between those that I'm trying to find. I do like for Protomaps, it's quite dependency light in terms of the number of hard dependencies I have in software. but I do still feel like there is a lot of work to be done in terms of project scope that needs to have stuff added. You mostly only hear about problems instead of people's wins [00:45:54] Jeremy: Having run it for this long. Do you have any thoughts on running an open source project in general? On dealing with issues or managing what to work on things like that? [00:46:07] Brandon: Yeah. So I have a lot. I think one thing people point out a lot is that especially because I don't have a direct relationship with a lot of the people using it a lot of times I don't even know that they're using it. Someone sent me a message saying hey, have you seen flickr.com, like the photo site? And I'm like, no. And I went to flickr.com/map and it has Protomaps for it. And I'm like, I had no idea. But that's cool, if they're able to use Protomaps for this giant photo sharing site that's awesome. But that also means I don't really hear about when people use it successfully because you just don't know, I guess they, NPM installed it and it works perfectly and you never hear about it. You only hear about people's negative experiences. You only hear about people that come and open GitHub issues saying this is totally broken, and why doesn't this thing exist? And I'm like, well, it's because there's an infinite amount of things that I want to do, but I have a finite amount of time and I just haven't gone into that yet. And that's honestly a lot of the things and people are like when is this thing gonna be done? So that's, that's honestly part of why I don't have a public roadmap because I want to avoid that sort of bickering about it. I would say that's one of my biggest frustrations with running an open source project is how it's self-selected to only hear the negative experiences with it. Be careful what PRs you accept [00:47:32] Brandon: 'cause you don't hear about those times where it works. 
I'd say another thing is it's changed my perspective on contributing to open source because I think when I was younger or before I had become a maintainer I would open a pull request on a project unprompted that has a hundred lines and I'd be like, Hey, just merge this thing. But I didn't realize when I was younger well if I just merge it and I disappear, then the maintainer is stuck with what I did forever. You know if I add some feature then that person that maintains the project has to do that indefinitely. And I think that's very asymmetrical and it's changed my perspective a lot on accepting open source contributions. I wanna have it be open to anyone to contribute. But there is some amount of back and forth where it's almost like the default answer for should I accept a PR is no by default because you're the one maintaining it. And do you understand the shape of that solution completely to where you're going to support it for years because the person that's contributing it is not bound to those same obligations that you are. And I think that's also one of the things where I have a lot of trepidation around open source is I used to think of it as a lot more bazaar-like in terms of anyone can just throw their thing in. But then that creates a lot of problems for the people who are expected out of social obligation to continue this thing indefinitely. [00:49:23] Jeremy: Yeah, I can totally see why that causes burnout with a lot of open source maintainers, because you probably to some extent maybe even feel some guilt right? You're like, well, somebody took the time to make this. But then like you said you have to spend a lot of time trying to figure out is this something I wanna maintain long term? And one wrong move and it's like, well, it's in here now. [00:49:53] Brandon: Exactly. To me, I think that is a very common failure mode for open source projects is they're too liberal in the things they accept. And that's a lot of why I was talking about how that choice of what features show up on the map was inherited from the MapZen projects. If I didn't have that then somebody could come in and say hey, you know, I want to show power lines on the map. And they open a PR for power lines and now everybody who's using Protomaps when they're like zoomed out they see power lines are like I didn't want that. So I think that's part of why a lot of open source projects eventually evolve into a plugin system is because there is this demand as the project grows for more and more features. But there is a limit in the maintainers. It's like the demand for features is exponential while the maintainer amount of time and effort is linear. Plugin systems might reduce need for PRs [00:50:56] Brandon: So maybe the solution to smash that exponential down to quadratic maybe is to add a plugin system. But I think that is one of the biggest tensions that only became obvious to me after working on this for a couple of years. [00:51:14] Jeremy: Is that something you're considering doing now? [00:51:18] Brandon: Is the plugin system? Yeah. I think for the data customization, I eventually wanted to have some sort of programmatic API to where you could declare a config file that says I want ski routes. It totally makes sense. 
The power lines example is maybe a little bit obscure but for example like a skiing app and you want to be able to show ski slopes when you're zoomed out well you're not gonna be able to get that from Mapbox or from Google because they have a one size fits all map that's not specialized to skiing or to golfing or to outdoors. But if you like, in theory, you could do this with Protomaps if you changed the Java code to show data at different zoom levels. And that is to me what makes the most sense for a plugin system and also makes the most product sense because it enables a lot of things you cannot do with the one size fits all map. [00:52:20] Jeremy: It might also increase the complexity of the implementation though, right? [00:52:25] Brandon: Yeah, exactly. So that's like. That's really where a lot of the terrifying thoughts come in, which is like once you create this like config file surface area, well what does that look like? Is that JSON? Is that TOML, is that some weird like everything eventually evolves into some scripting language right? Where you have logic inside of your templates and I honestly do not really know what that looks like right now. That feels like something in the medium term roadmap. [00:52:58] Jeremy: Yeah and then in terms of bug reports or issues, now it's not just your code it's this exponential combination of whatever people put into these config files. [00:53:09] Brandon: Exactly. Yeah. so again, like I really respect the projects that have done this well or that have done plugins well. I'm trying to think of some, I think obsidian has plugins, for example. And that seems to be one of the few solutions to try and satisfy the infinite desire for features with the limited amount of maintainer time. Time split between code vs triage vs talking to users [00:53:36] Jeremy: How would you say your time is split between working on the code versus issue and PR triage? [00:53:43] Brandon: Oh, it varies really. I think working on the code is like a minority of it. I think something that I actually enjoy is talking to people, talking to users, getting feedback on it. I go to quite a few conferences to talk to developers or people that are interested and figure out how to refine the message, how to make it clearer to people, like what this is for. And I would say maybe a plurality of my time is spent dealing with non-technical things that are neither code or GitHub issues. One thing I've been trying to do recently is talk to people that are not really in the mapping space. For example, people that work for newspapers like a lot of them are front end developers and if you ask them to run a Linux server they're like I have no idea. But that really is like one of the best target audiences for Protomaps. So I'd say a lot of the reality of running an open source project is a lot like a business is it has all the same challenges as a business in terms of you have to figure out what is the thing you're offering. You have to deal with people using it. You have to deal with feedback, you have to deal with managing emails and stuff. I don't think the payoff is anywhere near running a business or a startup that's backed by VC money is but it's definitely not the case that if you just want to code, you should start an open source project because I think a lot of the work for an opensource project has nothing to do with just writing the code. 
It is in my opinion as someone having done a VC backed business before, it is a lot more similar to running, a tech company than just putting some code on GitHub. Running a startup vs open source project [00:55:43] Jeremy: Well, since you've done both at a high level what did you like about running the company versus maintaining the open source project? [00:55:52] Brandon: So I have done some venture capital accelerator programs before and I think there is an element of hype and energy that you get from that that is self perpetuating. Your co-founder is gungho on like, yeah, we're gonna do this thing. And your investors are like, you guys are geniuses. You guys are gonna make a killing doing this thing. And the way it's framed is sort of obvious to everyone that it's like there's a much more traditional set of motivations behind that, that people understand while it's definitely not the case for running an open source project. Sometimes you just wake up and you're like what the hell is this thing for, it is this thing you spend a lot of time on. You don't even know who's using it. The people that use it and make a bunch of money off of it they know nothing about it. And you know, it's just like cool. And then you only hear from people that are complaining about it. And I think like that's honestly discouraging compared to the more clear energy and clearer motivation and vision behind how most people think about a company. But what I like about the open source project is just the lack of those constraints you know? Where you have a mandate that you need to have this many customers that are paying by this amount of time. There's that sort of pressure on delivering a business result instead of just making something that you're proud of that's simple to use and has like an elegant design. I think that's really a difference in motivation as well. Having control [00:57:50] Jeremy: Do you feel like you have more control? Like you mentioned how you've decided I'm not gonna make a public roadmap. I'm the sole developer. I get to decide what goes in. What doesn't. Do you feel like you have more control in your current position than you did running the startup? [00:58:10] Brandon: Definitely for sure. Like that agency is what I value the most. It is possible to go too far. Like, so I'm very wary of the BDFL title, which I think is how a lot of open source projects succeed. But I think there is some element of for a project to succeed there has to be somebody that makes those decisions. Sometimes those decisions will be wrong and then hopefully they can be rectified. But I think going back to what I was talking about with scope, I think the overall vision and the scope of the project is something that I am very opinionated about in that it should do these things. It shouldn't do these things. It should be easy to use for this audience. Is it gonna be appealing to this other audience? I don't know. And I think that is really one of the most important parts of that leadership role, is having the power to decide we're doing this, we're not doing this. I would hope other developers would be able to get on board if they're able to make good use of the project, if they use it for their company, if they use it for their business, if they just think the project is cool. So there are other contributors at this point and I want to get more involved. 
But I think being able to make those decisions to what I believe is going to be the best project is something that is very special about open source, that isn't necessarily true about running like a SaaS business. [00:59:50] Jeremy: I think that's a good spot to end it on, so if people want to learn more about Protomaps or they wanna see what you're up to, where should they head? [01:00:00] Brandon: So you can go to Protomaps.com, GitHub, or you can find me or Protomaps on bluesky or Mastodon. [01:00:09] Jeremy: All right, Brandon, thank you so much for chatting today. [01:00:12] Brandon: Great. Thank you very much.
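The byte-range serving Brandon describes in this episode is plain HTTP: the client asks for a slice of the file with a Range header and the server replies with 206 Partial Content. Here is a hedged TypeScript sketch of that primitive; the URL and the 16 KiB figure are illustrative assumptions rather than details from the interview.

```typescript
// Minimal illustration of HTTP byte-range serving, the primitive PMTiles
// reuses from video streaming: ask the server for a slice of a static file.
async function fetchRange(
  url: string,
  offset: number,
  length: number
): Promise<ArrayBuffer> {
  const resp = await fetch(url, {
    headers: { Range: `bytes=${offset}-${offset + length - 1}` },
  });
  // 206 Partial Content means the server honoured the range request.
  // Some CDNs strip the header and return 200 with the whole file,
  // which is the kind of misbehaviour the proxy mentioned earlier works around.
  if (resp.status !== 206) {
    throw new Error(`expected 206 Partial Content, got ${resp.status}`);
  }
  return resp.arrayBuffer();
}

// Usage sketch: read the first 16 KiB of an archive, which in the PMTiles
// layout is roughly where the header and root directory live (URL is a placeholder).
fetchRange("https://example.com/tiles/basemap.pmtiles", 0, 16384)
  .then((buf) => console.log(`fetched ${buf.byteLength} bytes`));
```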

Postgres FM
Performance cliffs

Postgres FM

Play Episode Listen Later Apr 4, 2025 38:08


Nikolay and Michael are joined by Tomas Vondra to discuss single query performance cliffs — what they are, why they happen, some things we can do to make them less likely or less severe, and some potential improvements to Postgres that could help. Here are some links to things they mentioned:
Tomas Vondra https://postgres.fm/people/tomas-vondra
Where do performance cliffs come from? (Talk by Tomas) https://www.youtube.com/watch?v=UzdAelm-QSY
Where do performance cliffs come from? (Slides) https://vondra.me/pdf/performance-cliffs-posette-2024.pdf
Increase the number of fast-path lock slots (committed for Postgres 18) https://www.postgresql.org/message-id/flat/E1ss4gX-000IvX-63%40gemulon.postgresql.org
San Francisco Bay Area Postgres meet-up with Tomas on 8th April (online) https://www.meetup.com/postgresql-1/events/306484787
Our episode on Extended Statistics https://postgres.fm/episodes/extended-statistics
Logging plan of the currently running query (proposed patch by Rafael Thofehrn Castro and Atsushi Torikoshi) https://commitfest.postgresql.org/patch/5330
Our episode with Peter Geoghegan on Skip Scan https://postgres.fm/episodes/skip-scan
Index Prefetching patch that Tomas is collaborating with Peter Geoghegan on https://commitfest.postgresql.org/patch/4351
A generalized join algorithm, G-Join (paper by Goetz Graefe) https://dl.gi.de/server/api/core/bitstreams/ce8e3fab-0bac-45fc-a6d4-66edaa52d574/content
Smooth Scan: Robust Access Path Selection without Cardinality Estimation (paper by R. Borovica, S. Idreos, A. Ailamaki, M. Zukowski, C. Fraser) https://stratos.seas.harvard.edu/sites/g/files/omnuum4611/files/stratos/files/smoothscan.pdf
Just-in-Time Compilation (JIT) https://www.postgresql.org/docs/current/jit.html
Notes from a pgconf.dev unconference session in 2024 about JIT (discusses issues) https://wiki.postgresql.org/wiki/PGConf.dev_2024_Developer_Unconference#JIT_compilation
Implementing an alternative JIT provider for PostgreSQL (by Xing Guo) https://higuoxing.com/archives/implementing-jit-provider-for-pgsql
Tomas' Office Hours https://vondra.me/posts/office-hours-experiment
~~~What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!~~~
Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai
With special thanks to:
Jessie Draws for the elephant artwork

Path To Citus Con, for developers who love Postgres
Open Source Leadership with Bruce Momjian

Path To Citus Con, for developers who love Postgres

Play Episode Listen Later Apr 4, 2025 108:21


What does it take to lead a global open source project like Postgres? In Episode 26 of Talking Postgres with Claire Giordano, we sit down with Bruce Momjian—co-founder and core team member of the PostgreSQL Global Development Group—to explore the art of leadership in a volunteer-run open source community. Bruce shares what “servant leadership” really means; how saying I'm sorry can help make problems go away; and how letting go of who-gets-the-credit can fuel collaboration. We also dive into Bruce's origin story, from shaping Postgres's early days to mastering the art of public speaking. Pro tip: if you see a man in a bow tie at a Postgres conference, be sure to say hello—it's probably Bruce Momjian!
Links mentioned in this episode:
Open source project website: postgresql.org
Website: Bruce Momjian
Video of talk: Building Open Source Teams at FOSDEM 2023
Slides: FOSDEM talk on Building Open Source Teams
Wikipedia: John C. Maxwell
Harry Truman quote: It is amazing what you can accomplish if you do not care who gets the credit
The New Stack: How to Generate AI From a Database
EDB Blog: Bruce Momjian's Insights from PGConf India 2025
Conference schedule: PGConf India 2025
Book: Why We Sleep by Matthew Walker
Video of talk: Why Database Teams Need Crew Resource Management by Chris Travers
Wikipedia: Anna Karenina principle
Talking Postgres podcast: Why mentor Postgres developers with Robert Haas
Discord invite: PostgreSQL Hacking server
Mailing lists: PostgreSQL mailing lists
Conference: PostgreSQL Conference Nepal 2025 happening May 5-6
Conference: PostgreSQL Conference Germany 2025 on May 8-9
Conference: POSETTE: An Event for Postgres 2025 on Jun 10-12
Upcoming POSETTE 2025 keynote: Databases in the AI Trenches by Bruce Momjian
Conference: SouthEast | LinuxFest on Jun 13-15 in Charlotte NC
Conference: Swiss PGDay 2025 happening Jun 26-27
Conference: PGDay Austria 2025 happening in Vienna on Sep 4
Conference: PGDay UK 2025 happening in London on Sep 9
Conference: PGDay Lowlands 2025 happening in Rotterdam on Sep 12
Video from PGConf.dev 2024: Making PostgreSQL Hacking More Inclusive
Talking Postgres podcast: How I got started as a developer (& in Postgres) with David Rowley
Wikipedia: O'Reilly Open Source Convention (OSCON)
Calendar invite: LIVE recording of Ep27 of Talking Postgres to happen on Wed May 07 with guest Peter Farkas. The topic: “How I got started with FerretDB (& why we chose Postgres)”

SANS Internet Stormcenter Daily Network/Cyber Security and Information Security Stormcast
SANS Stormcast Wednesday Apr 2nd: Apple Updates Everything;

SANS Internet Stormcenter Daily Network/Cyber Security and Information Security Stormcast

Play Episode Listen Later Apr 2, 2025 7:16


Apple Patches Everything
Apple released updates for all of its operating systems. Most were released on Monday, with WatchOS patches released today on Tuesday. Two already exploited vulnerabilities, which were already patched in the latest iOS and macOS versions, are now patched for older operating systems as well. A total of 145 vulnerabilities were patched.
https://isc.sans.edu/diary/Apple%20Patches%20Everything%3A%20March%2031st%202025%20Edition/31816

VMWare Workstation and Fusion update check broken
VMWare's automatic update check in its Workstation and Fusion products is currently broken due to a redirect added as part of the Broadcom transition.
https://community.broadcom.com/vmware-cloud-foundation/question/certificate-error-is-occured-during-connecting-update-server

NIM Postgres Vulnerability
NIM developers using prepared statements to send SQL queries to Postgres may expose themselves to a SQL injection vulnerability. NIM's Postgres library does not appear to use actual prepared statements; instead, it assembles the code and the user data as a string and passes them on to the database. This may lead to a SQL injection vulnerability.
https://blog.nns.ee/2025/03/28/nim-postgres-vulnerability/
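The distinction in the NIM item, SQL text glued to user data on the client versus real parameter binding, is easy to see in code. The sketch below uses TypeScript with the node-postgres (pg) driver rather than the Nim library in question, and the table and column names are made up.

```typescript
import { Pool } from "pg";

const pool = new Pool(); // connection settings come from the PG* environment variables

// Vulnerable pattern: query text and user input are assembled into one
// string on the client, so input like "x' OR '1'='1" rewrites the query.
async function findUserUnsafe(name: string) {
  return pool.query(`SELECT id, name FROM users WHERE name = '${name}'`);
}

// Parameterized pattern: the SQL text and the value are sent to Postgres
// separately, so the value can never be reinterpreted as SQL.
async function findUserSafe(name: string) {
  return pool.query("SELECT id, name FROM users WHERE name = $1", [name]);
}
```

The point of the advisory is that a driver can expose a prepared-statement-style API while still doing the string assembly shown in the first function, which is worth verifying for any client library you rely on.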

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

If you're in SF: Join us for the Claude Plays Pokemon hackathon this Sunday! If you're not: Fill out the 2025 State of AI Eng survey for $250 in Amazon cards!

We are SO excited to share our conversation with Dharmesh Shah, co-founder of HubSpot and creator of Agent.ai.

A particularly compelling concept we discussed is the idea of "hybrid teams" - the next evolution in workplace organization where human workers collaborate with AI agents as team members. Just as we previously saw hybrid teams emerge in terms of full-time vs. contract workers, or in-office vs. remote workers, Dharmesh predicts that the next frontier will be teams composed of both human and AI members. This raises interesting questions about team dynamics, trust, and how to effectively delegate tasks between human and AI team members.

The discussion of business models in AI reveals an important distinction between Work as a Service (WaaS) and Results as a Service (RaaS), something Dharmesh has written extensively about. While RaaS has gained popularity, particularly in customer support applications where outcomes are easily measurable, Dharmesh argues that this model may be over-indexed. Not all AI applications have clearly definable outcomes or consistent economic value per transaction, making WaaS more appropriate in many cases. This insight is particularly relevant for businesses considering how to monetize AI capabilities.

The technical challenges of implementing effective agent systems are also explored, particularly around memory and authentication. Shah emphasizes the importance of cross-agent memory sharing and the need for more granular control over data access. He envisions a future where users can selectively share parts of their data with different agents, similar to how OAuth works but with much finer control.
This points to significant opportunities in developing infrastructure for secure and efficient agent-to-agent communication and data sharing.
Other highlights from our conversation
* The Evolution of AI-Powered Agents – Exploring how AI agents have evolved from simple chatbots to sophisticated multi-agent systems, and the role of MCPs in enabling that.
* Hybrid Digital Teams and the Future of Work – How AI agents are becoming teammates rather than just tools, and what this means for business operations and knowledge work.
* Memory in AI Agents – The importance of persistent memory in AI systems and how shared memory across agents could enhance collaboration and efficiency.
* Business Models for AI Agents – Exploring the shift from software as a service (SaaS) to work as a service (WaaS) and results as a service (RaaS), and what this means for monetization.
* The Role of Standards Like MCP – Why MCP has been widely adopted and how it enables agent collaboration, tool use, and discovery.
* The Future of AI Code Generation and Software Engineering – How AI-assisted coding is changing the role of software engineers and what skills will matter most in the future.
* Domain Investing and Efficient Markets – Dharmesh's approach to domain investing and how inefficiencies in digital asset markets create business opportunities.
* The Philosophy of Saying No – Lessons from "Sorry, You Must Pass" and how prioritization leads to greater productivity and focus.
Timestamps
* 00:00 Introduction and Guest Welcome
* 02:29 Dharmesh Shah's Journey into AI
* 05:22 Defining AI Agents
* 06:45 The Evolution and Future of AI Agents
* 13:53 Graph Theory and Knowledge Representation
* 20:02 Engineering Practices and Overengineering
* 25:57 The Role of Junior Engineers in the AI Era
* 28:20 Multi-Agent Systems and MCP Standards
* 35:55 LinkedIn's Legal Battles and Data Scraping
* 37:32 The Future of AI and Hybrid Teams
* 39:19 Building Agent AI: A Professional Network for Agents
* 40:43 Challenges and Innovations in Agent AI
* 45:02 The Evolution of UI in AI Systems
* 01:00:25 Business Models: Work as a Service vs. Results as a Service
* 01:09:17 The Future Value of Engineers
* 01:09:51 Exploring the Role of Agents
* 01:10:28 The Importance of Memory in AI
* 01:11:02 Challenges and Opportunities in AI Memory
* 01:12:41 Selective Memory and Privacy Concerns
* 01:13:27 The Evolution of AI Tools and Platforms
* 01:18:23 Domain Names and AI Projects
* 01:32:08 Balancing Work and Personal Life
* 01:35:52 Final Thoughts and Reflections
Transcript
Alessio [00:00:04]: Hey everyone, welcome back to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Small AI.swyx [00:00:12]: Hello, and today we're super excited to have Dharmesh Shah to join us. I guess your relevant title here is founder of Agent AI.Dharmesh [00:00:20]: Yeah, that's true for this. Yeah, creator of Agent.ai and co-founder of HubSpot.swyx [00:00:25]: Co-founder of HubSpot, which I followed for many years, I think 18 years now, gonna be 19 soon. And you caught, you know, people can catch up on your HubSpot story elsewhere. I should also thank Sean Puri, who I've chatted with back and forth, who's been, I guess, getting me in touch with your people. But also, I think like, just giving us a lot of context, because obviously, My First Million joined you guys, and they've been chatting with you guys a lot. So for the business side, we can talk about that, but I kind of wanted to engage your CTO, agent, engineer side of things.
So how did you get agent religion?Dharmesh [00:01:00]: Let's see. So I've been working, I'll take like a half step back, a decade or so ago, even though actually more than that. So even before HubSpot, the company I was contemplating that I had named for was called Ingenisoft. And the idea behind Ingenisoft was a natural language interface to business software. Now realize this is 20 years ago, so that was a hard thing to do. But the actual use case that I had in mind was, you know, we had data sitting in business systems like a CRM or something like that. And my kind of what I thought clever at the time. Oh, what if we used email as the kind of interface to get to business software? And the motivation for using email is that it automatically works when you're offline. So imagine I'm getting on a plane or I'm on a plane. There was no internet on planes back then. It's like, oh, I'm going through business cards from an event I went to. I can just type things into an email just to have them all in the backlog. When it reconnects, it sends those emails to a processor that basically kind of parses effectively the commands and updates the software, sends you the file, whatever it is. And there was a handful of commands. I was a little bit ahead of the times in terms of what was actually possible. And I reattempted this natural language thing with a product called ChatSpot that I did back 20...swyx [00:02:12]: Yeah, this is your first post-ChatGPT project.Dharmesh [00:02:14]: I saw it come out. Yeah. And so I've always been kind of fascinated by this natural language interface to software. Because, you know, as software developers, myself included, we've always said, oh, we build intuitive, easy-to-use applications. And it's not intuitive at all, right? Because what we're doing is... We're taking the mental model that's in our head of what we're trying to accomplish with said piece of software and translating that into a series of touches and swipes and clicks and things like that. And there's nothing natural or intuitive about it. And so natural language interfaces, for the first time, you know, whatever the thought is you have in your head and expressed in whatever language that you normally use to talk to yourself in your head, you can just sort of emit that and have software do something. And I thought that was kind of a breakthrough, which it has been. And it's gone. So that's where I first started getting into the journey. I started because now it actually works, right? So once we got ChatGPT and you can take, even with a few-shot example, convert something into structured, even back in the ChatGP 3.5 days, it did a decent job in a few-shot example, convert something to structured text if you knew what kinds of intents you were going to have. And so that happened. And that ultimately became a HubSpot project. But then agents intrigued me because I'm like, okay, well, that's the next step here. So chat's great. Love Chat UX. But if we want to do something even more meaningful, it felt like the next kind of advancement is not this kind of, I'm chatting with some software in a kind of a synchronous back and forth model, is that software is going to do things for me in kind of a multi-step way to try and accomplish some goals. So, yeah, that's when I first got started. It's like, okay, what would that look like? Yeah. And I've been obsessed ever since, by the way.Alessio [00:03:55]: Which goes back to your first experience with it, which is like you're offline. Yeah. And you want to do a task. 
You don't need to do it right now. You just want to queue it up for somebody to do it for you. Yes. As you think about agents, like, let's start at the easy question, which is like, how do you define an agent? Maybe. You mean the hardest question in the universe? Is that what you mean?Dharmesh [00:04:12]: You said you have an irritating take. I do have an irritating take. I think, well, some number of people have been irritated, including within my own team. So I have a very broad definition for agents, which is it's AI-powered software that accomplishes a goal. Period. That's it. And what irritates people about it is like, well, that's so broad as to be completely non-useful. And I understand that. I understand the criticism. But in my mind, if you kind of fast forward months, I guess, in AI years, the implementation of it, and we're already starting to see this, and we'll talk about this, different kinds of agents, right? So I think in addition to having a usable definition, and I like yours, by the way, and we should talk more about that, that you just came out with, the classification of agents actually is also useful, which is, is it autonomous or non-autonomous? Does it have a deterministic workflow? Does it have a non-deterministic workflow? Is it working synchronously? Is it working asynchronously? Then you have the different kind of interaction modes. Is it a chat agent, kind of like a customer support agent would be? You're having this kind of back and forth. Is it a workflow agent that just does a discrete number of steps? So there's all these different flavors of agents. So if I were to draw it in a Venn diagram, I would draw a big circle that says, this is agents, and then I have a bunch of circles, some overlapping, because they're not mutually exclusive. And so I think that's what's interesting, and we're seeing development along a bunch of different paths, right? So if you look at the first implementation of agent frameworks, you look at Baby AGI and AutoGBT, I think it was, not Autogen, that's the Microsoft one. They were way ahead of their time because they assumed this level of reasoning and execution and planning capability that just did not exist, right? So it was an interesting thought experiment, which is what it was. Even the guy that, I'm an investor in Yohei's fund that did Baby AGI. It wasn't ready, but it was a sign of what was to come. And so the question then is, when is it ready? And so lots of people talk about the state of the art when it comes to agents. I'm a pragmatist, so I think of the state of the practical. It's like, okay, well, what can I actually build that has commercial value or solves actually some discrete problem with some baseline of repeatability or verifiability?swyx [00:06:22]: There was a lot, and very, very interesting. I'm not irritated by it at all. Okay. As you know, I take a... There's a lot of anthropological view or linguistics view. And in linguistics, you don't want to be prescriptive. You want to be descriptive. Yeah. So you're a goals guy. That's the key word in your thing. And other people have other definitions that might involve like delegated trust or non-deterministic work, LLM in the loop, all that stuff. The other thing I was thinking about, just the comment on Baby AGI, LGBT. Yeah. In that piece that you just read, I was able to go through our backlog and just kind of track the winter of agents and then the summer now. Yeah. And it's... We can tell the whole story as an oral history, just following that thread. 
And it's really just like, I think, I tried to explain the why now, right? Like I had, there's better models, of course. There's better tool use with like, they're just more reliable. Yep. Better tools with MCP and all that stuff. And I'm sure you have opinions on that too. Business model shift, which you like a lot. I just heard you talk about RAS with MFM guys. Yep. Cost is dropping a lot. Yep. Inference is getting faster. There's more model diversity. Yep. Yep. I think it's a subtle point. It means that like, you have different models with different perspectives. You don't get stuck in the basin of performance of a single model. Sure. You can just get out of it by just switching models. Yep. Multi-agent research and RL fine tuning. So I just wanted to let you respond to like any of that.Dharmesh [00:07:44]: Yeah. A couple of things. Connecting the dots on the kind of the definition side of it. So we'll get the irritation out of the way completely. I have one more, even more irritating leap on the agent definition thing. So here's the way I think about it. By the way, the kind of word agent, I looked it up, like the English dictionary definition. The old school agent, yeah. Is when you have someone or something that does something on your behalf, like a travel agent or a real estate agent acts on your behalf. It's like proxy, which is a nice kind of general definition. So the other direction I'm sort of headed, and it's going to tie back to tool calling and MCP and things like that, is if you, and I'm not a biologist by any stretch of the imagination, but we have these single-celled organisms, right? Like the simplest possible form of what one would call life. But it's still life. It just happens to be single-celled. And then you can combine cells and then cells become specialized over time. And you have much more sophisticated organisms, you know, kind of further down the spectrum. In my mind, at the most fundamental level, you can almost think of having atomic agents. What is the simplest possible thing that's an agent that can still be called an agent? What is the equivalent of a kind of single-celled organism? And the reason I think that's useful is right now we're headed down the road, which I think is very exciting around tool use, right? That says, okay, the LLMs now can be provided a set of tools that it calls to accomplish whatever it needs to accomplish in the kind of furtherance of whatever goal it's trying to get done. And I'm not overly bothered by it, but if you think about it, if you just squint a little bit and say, well, what if everything was an agent? And what if tools were actually just atomic agents? Because then it's turtles all the way down, right? Then it's like, oh, well, all that's really happening with tool use is that we have a network of agents that know about each other through something like an MMCP and can kind of decompose a particular problem and say, oh, I'm going to delegate this to this set of agents. And why do we need to draw this distinction between tools, which are functions most of the time? And an actual agent. And so I'm going to write this irritating LinkedIn post, you know, proposing this. It's like, okay. And I'm not suggesting we should call even functions, you know, call them agents. But there is a certain amount of elegance that happens when you say, oh, we can just reduce it down to one primitive, which is an agent that you can combine in complicated ways to kind of raise the level of abstraction and accomplish higher order goals. 
Anyway, that's my answer. I'd say that's a success. Thank you for coming to my TED Talk on agent definitions.Alessio [00:09:54]: How do you define the minimum viable agent? Do you already have a definition for, like, where you draw the line between a cell and an atom? Yeah.Dharmesh [00:10:02]: So in my mind, it has to, at some level, use AI in order for it to—otherwise, it's just software. It's like, you know, we don't need another word for that. And so that's probably where I draw the line. So then the question, you know, the counterargument would be, well, if that's true, then lots of tools themselves are actually not agents because they're just doing a database call or a REST API call or whatever it is they're doing. And that does not necessarily qualify them, which is a fair counterargument. And I accept that. It's like a good argument. I still like to think about—because we'll talk about multi-agent systems, because I think—so we've accepted, which I think is true, lots of people have said it, and you've hopefully combined some of those clips of really smart people saying this is the year of agents, and I completely agree, it is the year of agents. But then shortly after that, it's going to be the year of multi-agent systems or multi-agent networks. I think that's where it's going to be headed next year. Yeah.swyx [00:10:54]: Opening eyes already on that. Yeah. My quick philosophical engagement with you on this. I often think about kind of the other spectrum, the other end of the cell spectrum. So single cell is life, multi-cell is life, and you clump a bunch of cells together in a more complex organism, they become organs, like an eye and a liver or whatever. And then obviously we consider ourselves one life form. There's not like a lot of lives within me. I'm just one life. And now, obviously, I don't think people don't really like to anthropomorphize agents and AI. Yeah. But we are extending our consciousness and our brain and our functionality out into machines. I just saw you were a Bee. Yeah. Which is, you know, it's nice. I have a limitless pendant in my pocket.Dharmesh [00:11:37]: I got one of these boys. Yeah.swyx [00:11:39]: I'm testing it all out. You know, got to be early adopters. But like, we want to extend our personal memory into these things so that we can be good at the things that we're good at. And, you know, machines are good at it. Machines are there. So like, my definition of life is kind of like going outside of my own body now. I don't know if you've ever had like reflections on that. Like how yours. How our self is like actually being distributed outside of you. Yeah.Dharmesh [00:12:01]: I don't fancy myself a philosopher. But you went there. So yeah, I did go there. I'm fascinated by kind of graphs and graph theory and networks and have been for a long, long time. And to me, we're sort of all nodes in this kind of larger thing. It just so happens that we're looking at individual kind of life forms as they exist right now. But so the idea is when you put a podcast out there, there's these little kind of nodes you're putting out there of like, you know, conceptual ideas. Once again, you have varying kind of forms of those little nodes that are up there and are connected in varying and sundry ways. And so I just think of myself as being a node in a massive, massive network. And I'm producing more nodes as I put content or ideas. 
And, you know, you spend some portion of your life collecting dots, experiences, people, and some portion of your life then connecting dots from the ones that you've collected over time. And I found that really interesting things happen and you really can't know in advance how those dots are necessarily going to connect in the future. And that's, yeah. So that's my philosophical take. That's the, yes, exactly. Coming back.Alessio [00:13:04]: Yep. Do you like graph as an agent? Abstraction? That's been one of the hot topics with LandGraph and Pydantic and all that.Dharmesh [00:13:11]: I do. The thing I'm more interested in terms of use of graphs, and there's lots of work happening on that now, is graph data stores as an alternative in terms of knowledge stores and knowledge graphs. Yeah. Because, you know, so I've been in software now 30 plus years, right? So it's not 10,000 hours. It's like 100,000 hours that I've spent doing this stuff. And so I've grew up with, so back in the day, you know, I started on mainframes. There was a product called IMS from IBM, which is basically an index database, what we'd call like a key value store today. Then we've had relational databases, right? We have tables and columns and foreign key relationships. We all know that. We have document databases like MongoDB, which is sort of a nested structure keyed by a specific index. We have vector stores, vector embedding database. And graphs are interesting for a couple of reasons. One is, so it's not classically structured in a relational way. When you say structured database, to most people, they're thinking tables and columns and in relational database and set theory and all that. Graphs still have structure, but it's not the tables and columns structure. And you could wonder, and people have made this case, that they are a better representation of knowledge for LLMs and for AI generally than other things. So that's kind of thing number one conceptually, and that might be true, I think is possibly true. And the other thing that I really like about that in the context of, you know, I've been in the context of data stores for RAG is, you know, RAG, you say, oh, I have a million documents, I'm going to build the vector embeddings, I'm going to come back with the top X based on the semantic match, and that's fine. All that's very, very useful. But the reality is something gets lost in the chunking process and the, okay, well, those tend, you know, like, you don't really get the whole picture, so to speak, and maybe not even the right set of dimensions on the kind of broader picture. And it makes intuitive sense to me that if we did capture it properly in a graph form, that maybe that feeding into a RAG pipeline will actually yield better results for some use cases, I don't know, but yeah.Alessio [00:15:03]: And do you feel like at the core of it, there's this difference between imperative and declarative programs? Because if you think about HubSpot, it's like, you know, people and graph kind of goes hand in hand, you know, but I think maybe the software before was more like primary foreign key based relationship, versus now the models can traverse through the graph more easily.Dharmesh [00:15:22]: Yes. So I like that representation. There's something. It's just conceptually elegant about graphs and just from the representation of it, they're much more discoverable, you can kind of see it, there's observability to it, versus kind of embeddings, which you can't really do much with as a human. 
You know, once they're in there, you can't pull stuff back out. But yeah, I like that kind of idea of it. And the other thing that's kind of, because I love graphs, I've been long obsessed with PageRank from back in the early days. And, you know, one of the kind of simplest algorithms in terms of coming up, you know, with a phone, everyone's been exposed to PageRank. And the idea is that, and so I had this other idea for a project, not a company, and I have hundreds of these, called NodeRank, is to be able to take the idea of PageRank and apply it to an arbitrary graph that says, okay, I'm going to define what authority looks like and say, okay, well, that's interesting to me, because then if you say, I'm going to take my knowledge store, and maybe this person that contributed some number of chunks to the graph data store has more authority on this particular use case or prompt that's being submitted than this other one that may, or maybe this one was more. popular, or maybe this one has, whatever it is, there should be a way for us to kind of rank nodes in a graph and sort them in some, some useful way. Yeah.swyx [00:16:34]: So I think that's generally useful for, for anything. I think the, the problem, like, so even though at my conferences, GraphRag is super popular and people are getting knowledge, graph religion, and I will say like, it's getting space, getting traction in two areas, conversation memory, and then also just rag in general, like the, the, the document data. Yeah. It's like a source. Most ML practitioners would say that knowledge graph is kind of like a dirty word. The graph database, people get graph religion, everything's a graph, and then they, they go really hard into it and then they get a, they get a graph that is too complex to navigate. Yes. And so like the, the, the simple way to put it is like you at running HubSpot, you know, the power of graphs, the way that Google has pitched them for many years, but I don't suspect that HubSpot itself uses a knowledge graph. No. Yeah.Dharmesh [00:17:26]: So when is it over engineering? Basically? It's a great question. I don't know. So the question now, like in AI land, right, is the, do we necessarily need to understand? So right now, LLMs for, for the most part are somewhat black boxes, right? We sort of understand how the, you know, the algorithm itself works, but we really don't know what's going on in there and, and how things come out. So if a graph data store is able to produce the outcomes we want, it's like, here's a set of queries I want to be able to submit and then it comes out with useful content. Maybe the underlying data store is as opaque as a vector embeddings or something like that, but maybe it's fine. Maybe we don't necessarily need to understand it to get utility out of it. And so maybe if it's messy, that's okay. Um, that's, it's just another form of lossy compression. Uh, it's just lossy in a way that we just don't completely understand in terms of, because it's going to grow organically. Uh, and it's not structured. It's like, ah, we're just gonna throw a bunch of stuff in there. Let the, the equivalent of the embedding algorithm, whatever they called in graph land. Um, so the one with the best results wins. I think so. Yeah.swyx [00:18:26]: Or is this the practical side of me is like, yeah, it's, if it's useful, we don't necessarilyDharmesh [00:18:30]: need to understand it.swyx [00:18:30]: I have, I mean, I'm happy to push back as long as you want. 
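The NodeRank idea Dharmesh floats above — ordinary PageRank, but with an application-defined notion of authority over an arbitrary graph — boils down to power iteration with a personalized teleport vector. Below is a minimal, generic sketch in Python; the toy graph and the authority weights are invented purely for illustration and are not taken from agent.ai, HubSpot, or any real knowledge store.

```python
def node_rank(graph, authority=None, damping=0.85, iters=50):
    """Personalized PageRank ("NodeRank") via power iteration.

    graph:     dict mapping each node to the list of nodes it links to
    authority: optional dict of non-negative weights expressing how much
               domain-specific authority each node should attract
    """
    nodes = list(graph)
    n = len(nodes)

    # Teleport distribution: uniform unless authority weights are supplied.
    if authority:
        total = sum(authority.get(v, 0.0) for v in nodes) or 1.0
        teleport = {v: authority.get(v, 0.0) / total for v in nodes}
    else:
        teleport = {v: 1.0 / n for v in nodes}

    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new_rank = {v: (1.0 - damping) * teleport[v] for v in nodes}
        for v in nodes:
            out = graph[v]
            if not out:
                # Dangling node: redistribute its rank along the teleport vector.
                for u in nodes:
                    new_rank[u] += damping * rank[v] * teleport[u]
            else:
                share = damping * rank[v] / len(out)
                for u in out:
                    new_rank[u] += share
        rank = new_rank
    return rank


# Toy graph: nodes could be contributors or chunks; edges are references between them.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
print(node_rank(toy_graph, authority={"a": 3.0, "b": 1.0, "c": 1.0, "d": 1.0}))
```

In the knowledge-store setting described above, the authority weights might come from signals such as how often a contributor's chunks are actually cited in answers; that part is deliberately left open here.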
Uh, it's not practical to evaluate like the 10 different options out there because it takes time. It takes people, it takes, you know, resources, right? Set. That's the first thing. Second thing is your evals are typically on small things and some things only work at scale. Yup. Like graphs. Yup.Dharmesh [00:18:46]: Yup. That's, yeah, no, that's fair. And I think this is one of the challenges in terms of implementation of graph databases is that the most common approach that I've seen developers do, I've done it myself, is that, oh, I've got a Postgres database or a MySQL or whatever. I can represent a graph with a very set of tables with a parent child thing or whatever. And that sort of gives me the ability, uh, why would I need anything more than that? And the answer is, well, if you don't need anything more than that, you don't need anything more than that. But there's a high chance that you're sort of missing out on the actual value that, uh, the graph representation gives you. Which is the ability to traverse the graph, uh, efficiently in ways that kind of going through the, uh, traversal in a relational database form, even though structurally you have the data, practically you're not gonna be able to pull it out in, in useful ways. Uh, so you wouldn't like represent a social graph, uh, in, in using that kind of relational table model. It just wouldn't scale. It wouldn't work.swyx [00:19:36]: Uh, yeah. Uh, I think we want to move on to MCP. Yeah. But I just want to, like, just engineering advice. Yeah. Uh, obviously you've, you've, you've run, uh, you've, you've had to do a lot of projects and run a lot of teams. Do you have a general rule for over-engineering or, you know, engineering ahead of time? You know, like, because people, we know premature engineering is the root of all evil. Yep. But also sometimes you just have to. Yep. When do you do it? Yes.Dharmesh [00:19:59]: It's a great question. This is, uh, a question as old as time almost, which is what's the right and wrong levels of abstraction. That's effectively what, uh, we're answering when we're trying to do engineering. I tend to be a pragmatist, right? So here's the thing. Um, lots of times doing something the right way. Yeah. It's like a marginal increased cost in those cases. Just do it the right way. And this is what makes a, uh, a great engineer or a good engineer better than, uh, a not so great one. It's like, okay, all things being equal. If it's going to take you, you know, roughly close to constant time anyway, might as well do it the right way. Like, so do things well, then the question is, okay, well, am I building a framework as the reusable library? To what degree, uh, what am I anticipating in terms of what's going to need to change in this thing? Uh, you know, along what dimension? And then I think like a business person in some ways, like what's the return on calories, right? So, uh, and you look at, um, energy, the expected value of it's like, okay, here are the five possible things that could happen, uh, try to assign probabilities like, okay, well, if there's a 50% chance that we're going to go down this particular path at some day, like, or one of these five things is going to happen and it costs you 10% more to engineer for that. It's basically, it's something that yields a kind of interest compounding value. Um, as you get closer to the time of, of needing that versus having to take on debt, which is when you under engineer it, you're taking on debt. 
You're going to have to pay off when you do get to that eventuality where something happens. One thing as a pragmatist, uh, so I would rather under engineer something than over engineer it. If I were going to err on the side of something, and here's the reason is that when you under engineer it, uh, yes, you take on tech debt, uh, but the interest rate is relatively known and payoff is very, very possible, right? Which is, oh, I took a shortcut here as a result of which now this thing that should have taken me a week is now going to take me four weeks. Fine. But if that particular thing that you thought might happen, never actually, you never have that use case transpire or just doesn't, it's like, well, you just save yourself time, right? And that has value because you were able to do other things instead of, uh, kind of slightly over-engineering it away, over-engineering it. But there's no perfect answers in art form in terms of, uh, and yeah, we'll, we'll bring kind of this layers of abstraction back on the code generation conversation, which we'll, uh, I think I have later on, butAlessio [00:22:05]: I was going to ask, we can just jump ahead quickly. Yeah. Like, as you think about vibe coding and all that, how does the. Yeah. Percentage of potential usefulness change when I feel like we over-engineering a lot of times it's like the investment in syntax, it's less about the investment in like arc exacting. Yep. Yeah. How does that change your calculus?Dharmesh [00:22:22]: A couple of things, right? One is, um, so, you know, going back to that kind of ROI or a return on calories, kind of calculus or heuristic you think through, it's like, okay, well, what is it going to cost me to put this layer of abstraction above the code that I'm writing now, uh, in anticipating kind of future needs. If the cost of fixing, uh, or doing under engineering right now. Uh, we'll trend towards zero that says, okay, well, I don't have to get it right right now because even if I get it wrong, I'll run the thing for six hours instead of 60 minutes or whatever. It doesn't really matter, right? Like, because that's going to trend towards zero to be able, the ability to refactor a code. Um, and because we're going to not that long from now, we're going to have, you know, large code bases be able to exist, uh, you know, as, as context, uh, for a code generation or a code refactoring, uh, model. So I think it's going to make it, uh, make the case for under engineering, uh, even stronger. Which is why I take on that cost. You just pay the interest when you get there, it's not, um, just go on with your life vibe coded and, uh, come back when you need to. Yeah.Alessio [00:23:18]: Sometimes I feel like there's no decision-making in some things like, uh, today I built a autosave for like our internal notes platform and I literally just ask them cursor. Can you add autosave? Yeah. I don't know if it's over under engineer. Yep. I just vibe coded it. Yep. And I feel like at some point we're going to get to the point where the models kindDharmesh [00:23:36]: of decide where the right line is, but this is where the, like the, in my mind, the danger is, right? So there's two sides to this. One is the cost of kind of development and coding and things like that stuff that, you know, we talk about. 
But then like in your example, you know, one of the risks that we have is that because adding a feature, uh, like a save or whatever the feature might be to a product as that price tends towards zero, are we going to be less discriminant about what features we add as a result of making more product products more complicated, which has a negative impact on the user and navigate negative impact on the business. Um, and so that's the thing I worry about if it starts to become too easy, are we going to be. Too promiscuous in our, uh, kind of extension, adding product extensions and things like that. It's like, ah, why not add X, Y, Z or whatever back then it was like, oh, we only have so many engineering hours or story points or however you measure things. Uh, that least kept us in check a little bit. Yeah.Alessio [00:24:22]: And then over engineering, you're like, yeah, it's kind of like you're putting that on yourself. Yeah. Like now it's like the models don't understand that if they add too much complexity, it's going to come back to bite them later. Yep. So they just do whatever they want to do. Yeah. And I'm curious where in the workflow that's going to be, where it's like, Hey, this is like the amount of complexity and over-engineering you can do before you got to ask me if we should actually do it versus like do something else.Dharmesh [00:24:45]: So you know, we've already, let's like, we're leaving this, uh, in the code generation world, this kind of compressed, um, cycle time. Right. It's like, okay, we went from auto-complete, uh, in the GitHub co-pilot to like, oh, finish this particular thing and hit tab to a, oh, I sort of know your file or whatever. I can write out a full function to you to now I can like hold a bunch of the context in my head. Uh, so we can do app generation, which we have now with lovable and bolt and repletage. Yeah. Association and other things. So then the question is, okay, well, where does it naturally go from here? So we're going to generate products. Make sense. We might be able to generate platforms as though I want a platform for ERP that does this, whatever. And that includes the API's includes the product and the UI, and all the things that make for a platform. There's no nothing that says we would stop like, okay, can you generate an entire software company someday? Right. Uh, with the platform and the monetization and the go-to-market and the whatever. And you know, that that's interesting to me in terms of, uh, you know, what, when you take it to almost ludicrous levels. of abstract.swyx [00:25:39]: It's like, okay, turn it to 11. You mentioned vibe coding, so I have to, this is a blog post I haven't written, but I'm kind of exploring it. Is the junior engineer dead?Dharmesh [00:25:49]: I don't think so. I think what will happen is that the junior engineer will be able to, if all they're bringing to the table is the fact that they are a junior engineer, then yes, they're likely dead. But hopefully if they can communicate with carbon-based life forms, they can interact with product, if they're willing to talk to customers, they can take their kind of basic understanding of engineering and how kind of software works. I think that has value. So I have a 14-year-old right now who's taking Python programming class, and some people ask me, it's like, why is he learning coding? And my answer is, is because it's not about the syntax, it's not about the coding. What he's learning is like the fundamental thing of like how things work. 
And there's value in that. I think there's going to be timeless value in systems thinking and abstractions and what that means. And whether functions manifested as math, which he's going to get exposed to regardless, or there are some core primitives to the universe, I think, that the more you understand them, those are what I would kind of think of as like really large dots in your life that will have a higher gravitational pull and value to them that you'll then be able to. So I want him to collect those dots, and he's not resisting. So it's like, okay, while he's still listening to me, I'm going to have him do things that I think will be useful.swyx [00:26:59]: You know, part of one of the pitches that I evaluated for AI engineer is a term. And the term is that maybe the traditional interview path or career path of software engineer goes away, which is because what's the point of lead code? Yeah. And, you know, it actually matters more that you know how to work with AI and to implement the things that you want. Yep.Dharmesh [00:27:16]: That's one of the like interesting things that's happened with generative AI. You know, you go from machine learning and the models and just that underlying form, which is like true engineering, right? Like the actual, what I call real engineering. I don't think of myself as a real engineer, actually. I'm a developer. But now with generative AI. We call it AI and it's obviously got its roots in machine learning, but it just feels like fundamentally different to me. Like you have the vibe. It's like, okay, well, this is just a whole different approach to software development to so many different things. And so I'm wondering now, it's like an AI engineer is like, if you were like to draw the Venn diagram, it's interesting because the cross between like AI things, generative AI and what the tools are capable of, what the models do, and this whole new kind of body of knowledge that we're still building out, it's still very young, intersected with kind of classic engineering, software engineering. Yeah.swyx [00:28:04]: I just described the overlap as it separates out eventually until it's its own thing, but it's starting out as a software. Yeah.Alessio [00:28:11]: That makes sense. So to close the vibe coding loop, the other big hype now is MCPs. Obviously, I would say Cloud Desktop and Cursor are like the two main drivers of MCP usage. I would say my favorite is the Sentry MCP. I can pull in errors and then you can just put the context in Cursor. How do you think about that abstraction layer? Does it feel... Does it feel almost too magical in a way? Do you think it's like you get enough? Because you don't really see how the server itself is then kind of like repackaging theDharmesh [00:28:41]: information for you? I think MCP as a standard is one of the better things that's happened in the world of AI because a standard needed to exist and absent a standard, there was a set of things that just weren't possible. Now, we can argue whether it's the best possible manifestation of a standard or not. Does it do too much? Does it do too little? I get that, but it's just simple enough to both be useful and unobtrusive. It's understandable and adoptable by mere mortals, right? It's not overly complicated. You know, a reasonable engineer can put a stand up an MCP server relatively easily. The thing that has me excited about it is like, so I'm a big believer in multi-agent systems. And so that's going back to our kind of this idea of an atomic agent. 
So imagine the MCP server, like obviously it calls tools, but the way I think about it, so I'm working on my current passion project is agent.ai. And we'll talk more about that in a little bit. More about the, I think we should, because I think it's interesting not to promote the project at all, but there's some interesting ideas in there. One of which is around, we're going to need a mechanism for, if agents are going to collaborate and be able to delegate, there's going to need to be some form of discovery and we're going to need some standard way. It's like, okay, well, I just need to know what this thing over here is capable of. We're going to need a registry, which Anthropic's working on. I'm sure others will and have been doing directories of, and there's going to be a standard around that too. How do you build out a directory of MCP servers? I think that's going to unlock so many things just because, and we're already starting to see it. So I think MCP or something like it is going to be the next major unlock because it allows systems that don't know about each other, don't need to, it's that kind of decoupling of like Sentry and whatever tools someone else was building. And it's not just about, you know, Cloud Desktop or things like, even on the client side, I think we're going to see very interesting consumers of MCP, MCP clients versus just the chat body kind of things. Like, you know, Cloud Desktop and Cursor and things like that. But yeah, I'm very excited about MCP in that general direction.swyx [00:30:39]: I think the typical cynical developer take, it's like, we have OpenAPI. Yeah. What's the new thing? I don't know if you have a, do you have a quick MCP versus everything else? Yeah.Dharmesh [00:30:49]: So it's, so I like OpenAPI, right? So just a descriptive thing. It's OpenAPI. OpenAPI. Yes, that's what I meant. So it's basically a self-documenting thing. We can do machine-generated, lots of things from that output. It's a structured definition of an API. I get that, love it. But MCPs sort of are kind of use case specific. They're perfect for exactly what we're trying to use them for around LLMs in terms of discovery. It's like, okay, I don't necessarily need to know kind of all this detail. And so right now we have, we'll talk more about like MCP server implementations, but We will? I think, I don't know. Maybe we won't. At least it's in my head. It's like a back processor. But I do think MCP adds value above OpenAPI. It's, yeah, just because it solves this particular thing. And if we had come to the world, which we have, like, it's like, hey, we already have OpenAPI. It's like, if that were good enough for the universe, the universe would have adopted it already. There's a reason why MCP is taking office because marginally adds something that was missing before and doesn't go too far. And so that's why the kind of rate of adoption, you folks have written about this and talked about it. Yeah, why MCP won. Yeah. And it won because the universe decided that this was useful and maybe it gets supplanted by something else. Yeah. And maybe we discover, oh, maybe OpenAPI was good enough the whole time. I doubt that.swyx [00:32:09]: The meta lesson, this is, I mean, he's an investor in DevTools companies. I work in developer experience at DevRel in DevTools companies. Yep. Everyone wants to own the standard. Yeah. I'm sure you guys have tried to launch your own standards. Actually, it's Houseplant known for a standard, you know, obviously inbound marketing. 
But is there a standard or protocol that you ever tried to push? No.Dharmesh [00:32:30]: And there's a reason for this. Yeah. Is that? And I don't mean, need to mean, speak for the people of HubSpot, but I personally. You kind of do. I'm not smart enough. That's not the, like, I think I have a. You're smart. Not enough for that. I'm much better off understanding the standards that are out there. And I'm more on the composability side. Let's, like, take the pieces of technology that exist out there, combine them in creative, unique ways. And I like to consume standards. I don't like to, and that's not that I don't like to create them. I just don't think I have the, both the raw wattage or the credibility. It's like, okay, well, who the heck is Dharmesh, and why should we adopt a standard he created?swyx [00:33:07]: Yeah, I mean, there are people who don't monetize standards, like OpenTelemetry is a big standard, and LightStep never capitalized on that.Dharmesh [00:33:15]: So, okay, so if I were to do a standard, there's two things that have been in my head in the past. I was one around, a very, very basic one around, I don't even have the domain, I have a domain for everything, for open marketing. Because the issue we had in HubSpot grew up in the marketing space. There we go. There was no standard around data formats and things like that. It doesn't go anywhere. But the other one, and I did not mean to go here, but I'm going to go here. It's called OpenGraph. I know the term was already taken, but it hasn't been used for like 15 years now for its original purpose. But what I think should exist in the world is right now, our information, all of us, nodes are in the social graph at Meta or the professional graph at LinkedIn. Both of which are actually relatively closed in actually very annoying ways. Like very, very closed, right? Especially LinkedIn. Especially LinkedIn. I personally believe that if it's my data, and if I would get utility out of it being open, I should be able to make my data open or publish it in whatever forms that I choose, as long as I have control over it as opt-in. So the idea is around OpenGraph that says, here's a standard, here's a way to publish it. I should be able to go to OpenGraph.org slash Dharmesh dot JSON and get it back. And it's like, here's your stuff, right? And I can choose along the way and people can write to it and I can prove. And there can be an entire system. And if I were to do that, I would do it as a... Like a public benefit, non-profit-y kind of thing, as this is a contribution to society. I wouldn't try to commercialize that. Have you looked at AdProto? What's that? AdProto.swyx [00:34:43]: It's the protocol behind Blue Sky. Okay. My good friend, Dan Abramov, who was the face of React for many, many years, now works there. And he actually did a talk that I can send you, which basically kind of tries to articulate what you just said. But he does, he loves doing these like really great analogies, which I think you'll like. Like, you know, a lot of our data is behind a handle, behind a domain. Yep. So he's like, all right, what if we flip that? What if it was like our handle and then the domain? Yep. So, and that's really like your data should belong to you. Yep. And I should not have to wait 30 days for my Twitter data to export. Yep.Dharmesh [00:35:19]: you should be able to at least be able to automate it or do like, yes, I should be able to plug it into an agentic thing. Yeah. Yes. I think we're... Because so much of our data is... 
Locked up. I think the trick here isn't that standard. It is getting the normies to care.swyx [00:35:37]: Yeah. Because normies don't care.Dharmesh [00:35:38]: That's true. But building on that, normies don't care. So, you know, privacy is a really hot topic and an easy word to use, but it's not a binary thing. Like there are use cases where, and we make these choices all the time, that I will trade, not all privacy, but I will trade some privacy for some productivity gain or some benefit to me that says, oh, I don't care about that particular data being online if it gives me this in return, or I don't mind sharing this information with this company.Alessio [00:36:02]: If I'm getting, you know, this in return, but that sort of should be my option. I think now with computer use, you can actually automate some of the exports. Yes. Like something we've been doing internally is like everybody exports their LinkedIn connections. Yep. And then internally, we kind of merge them together to see how we can connect our companies to customers or things like that.Dharmesh [00:36:21]: And not to pick on LinkedIn, but since we're talking about it, but they feel strongly enough on the, you know, do not take LinkedIn data that they will block even browser use kind of things or whatever. They go to great, great lengths, even to see patterns of usage. And it says, oh, there's no way you could have, you know, gotten that particular thing or whatever without, and it's, so it's, there's...swyx [00:36:42]: Wasn't there a Supreme Court case that they lost? Yeah.Dharmesh [00:36:45]: So the one they lost was around someone that was scraping public data that was on the public internet. And that particular company had not signed any terms of service or whatever. It's like, oh, I'm just taking data that's on, there was no, and so that's why they won. But now, you know, the question is around, can LinkedIn... I think they can. Like, when you use, as a user, you use LinkedIn, you are signing up for their terms of service. And if they say, well, this kind of use of your LinkedIn account that violates our terms of service, they can shut your account down, right? They can. And they, yeah, so, you know, we don't need to make this a discussion. By the way, I love the company, don't get me wrong. I'm an avid user of the product. You know, I've got... Yeah, I mean, you've got over a million followers on LinkedIn, I think. Yeah, I do. And I've known people there for a long, long time, right? And I have lots of respect. And I understand even where the mindset originally came from of this kind of members-first approach to, you know, a privacy-first. I sort of get that. But sometimes you sort of have to wonder, it's like, okay, well, that was 15, 20 years ago. There's likely some controlled ways to expose some data on some member's behalf and not just completely be a binary. It's like, no, thou shalt not have the data.swyx [00:37:54]: Well, just pay for sales navigator.Alessio [00:37:57]: Before we move to the next layer of instruction, anything else on MCP you mentioned? Let's move back and then I'll tie it back to MCPs.Dharmesh [00:38:05]: So I think the... Open this with agent. Okay, so I'll start with... Here's my kind of running thesis, is that as AI and agents evolve, which they're doing very, very quickly, we're going to look at them more and more. I don't like to anthropomorphize. We'll talk about why this is not that. Less as just like raw tools and more like teammates. They'll still be software. 
They should self-disclose as being software. I'm totally cool with that. But I think what's going to happen is that in the same way you might collaborate with a team member on Slack or Teams or whatever you use, you can imagine a series of agents that do specific things just like a team member might do, that you can delegate things to. You can collaborate. You can say, hey, can you take a look at this? Can you proofread that? Can you try this? You can... Whatever it happens to be. So I think it is... I will go so far as to say it's inevitable that we're going to have hybrid teams someday. And what I mean by hybrid teams... So back in the day, hybrid teams were, oh, well, you have some full-time employees and some contractors. Then it was like hybrid teams are some people that are in the office and some that are remote. That's the kind of form of hybrid. The next form of hybrid is like the carbon-based life forms and agents and AI and some form of software. So let's say we temporarily stipulate that I'm right about that over some time horizon that eventually we're going to have these kind of digitally hybrid teams. So if that's true, then the question you sort of ask yourself is that then what needs to exist in order for us to get the full value of that new model? It's like, okay, well... You sort of need to... It's like, okay, well, how do I... If I'm building a digital team, like, how do I... Just in the same way, if I'm interviewing for an engineer or a designer or a PM, whatever, it's like, well, that's why we have professional networks, right? It's like, oh, they have a presence on likely LinkedIn. I can go through that semi-structured, structured form, and I can see the experience of whatever, you know, self-disclosed. But, okay, well, agents are going to need that someday. And so I'm like, okay, well, this seems like a thread that's worth pulling on. That says, okay. So I... So agent.ai is out there. And it's LinkedIn for agents. It's LinkedIn for agents. It's a professional network for agents. And the more I pull on that thread, it's like, okay, well, if that's true, like, what happens, right? It's like, oh, well, they have a profile just like anyone else, just like a human would. It's going to be a graph underneath, just like a professional network would be. It's just that... And you can have its, you know, connections and follows, and agents should be able to post. That's maybe how they do release notes. Like, oh, I have this new version. Whatever they decide to post, it should just be able to... Behave as a node on the network of a professional network. As it turns out, the more I think about that and pull on that thread, the more and more things, like, start to make sense to me. So it may be more than just a pure professional network. So my original thought was, okay, well, it's a professional network and agents as they exist out there, which I think there's going to be more and more of, will kind of exist on this network and have the profile. But then, and this is always dangerous, I'm like, okay, I want to see a world where thousands of agents are out there in order for the... Because those digital employees, the digital workers don't exist yet in any meaningful way. And so then I'm like, oh, can I make that easier for, like... And so I have, as one does, it's like, oh, I'll build a low-code platform for building agents. How hard could that be, right? Like, very hard, as it turns out. But it's been fun. So now, agent.ai has 1.3 million users. 
3,000 people have actually, you know, built some variation of an agent, sometimes just for their own personal productivity. About 1,000 of which have been published. And the reason this comes back to MCP for me, so imagine that and other networks, since I know agent.ai. So right now, we have an MCP server for agent.ai that exposes all the internally built agents that we have that do, like, super useful things. Like, you know, I have access to a Twitter API that I can subsidize the cost. And I can say, you know, if you're looking to build something for social media, these kinds of things, with a single API key, and it's all completely free right now, I'm funding it. That's a useful way for it to work. And then we have a developer to say, oh, I have this idea. I don't have to worry about open AI. I don't have to worry about, now, you know, this particular model is better. It has access to all the models with one key. And we proxy it kind of behind the scenes. And then expose it. So then we get this kind of community effect, right? That says, oh, well, someone else may have built an agent to do X. Like, I have an agent right now that I built for myself to do domain valuation for website domains because I'm obsessed with domains, right? And, like, there's no efficient market for domains. There's no Zillow for domains right now that tells you, oh, here are what houses in your neighborhood sold for. It's like, well, why doesn't that exist? We should be able to solve that problem. And, yes, you're still guessing. Fine. There should be some simple heuristic. So I built that. It's like, okay, well, let me go look for past transactions. You say, okay, I'm going to type in agent.ai, agent.com, whatever domain. What's it actually worth? I'm looking at buying it. It can go and say, oh, which is what it does. It's like, I'm going to go look at are there any published domain transactions recently that are similar, either use the same word, same top-level domain, whatever it is. And it comes back with an approximate value, and it comes back with its kind of rationale for why it picked the value and comparable transactions. Oh, by the way, this domain sold for published. Okay. So that agent now, let's say, existed on the web, on agent.ai. Then imagine someone else says, oh, you know, I want to build a brand-building agent for startups and entrepreneurs to come up with names for their startup. Like a common problem, every startup is like, ah, I don't know what to call it. And so they type in five random words that kind of define whatever their startup is. And you can do all manner of things, one of which is like, oh, well, I need to find the domain for it. What are possible choices? Now it's like, okay, well, it would be nice to know if there's an aftermarket price for it, if it's listed for sale. Awesome. Then imagine calling this valuation agent. It's like, okay, well, I want to find where the arbitrage is, where the agent valuation tool says this thing is worth $25,000. It's listed on GoDaddy for $5,000. It's close enough. Let's go do that. Right? And that's a kind of composition use case that in my future state. Thousands of agents on the network, all discoverable through something like MCP. And then you as a developer of agents have access to all these kind of Lego building blocks based on what you're trying to solve. Then you blend in orchestration, which is getting better and better with the reasoning models now. Just describe the problem that you have. 
Now, the next layer that we're all contending with is that how many tools can you actually give an LLM before the LLM breaks? That number used to be like 15 or 20 before you kind of started to vary dramatically. And so that's the thing I'm thinking about now. It's like, okay, if I want to... If I want to expose 1,000 of these agents to a given LLM, obviously I can't give it all 1,000. Is there some intermediate layer that says, based on your prompt, I'm going to make a best guess at which agents might be able to be helpful for this particular thing? Yeah.Alessio [00:44:37]: Yeah, like RAG for tools. Yep. I did build the Latent Space Researcher on agent.ai. Okay. Nice. Yeah, that seems like, you know, then there's going to be a Latent Space Scheduler. And then once I schedule a research, you know, and you build all of these things. By the way, my apologies for the user experience. You realize I'm an engineer. It's pretty good.swyx [00:44:56]: I think it's a normie-friendly thing. Yeah. That's your magic. HubSpot does the same thing.Alessio [00:45:01]: Yeah, just to like quickly run through it. You can basically create all these different steps. And these steps are like, you know, static versus like variable-driven things. How did you decide between this kind of like low-code-ish versus doing, you know, low-code with code backend versus like not exposing that at all? Any fun design decisions? Yeah. And this is, I think...Dharmesh [00:45:22]: I think lots of people are likely sitting in exactly my position right now, coming through the choosing between deterministic. Like if you're like in a business or building, you know, some sort of agentic thing, do you decide to do a deterministic thing? Or do you go non-deterministic and just let the alum handle it, right, with the reasoning models? The original idea and the reason I took the low-code stepwise, a very deterministic approach. A, the reasoning models did not exist at that time. That's thing number one. Thing number two is if you can get... If you know in your head... If you know in your head what the actual steps are to accomplish whatever goal, why would you leave that to chance? There's no upside. There's literally no upside. Just tell me, like, what steps do you need executed? So right now what I'm playing with... So one thing we haven't talked about yet, and people don't talk about UI and agents. Right now, the primary interaction model... Or they don't talk enough about it. I know some people have. But it's like, okay, so we're used to the chatbot back and forth. Fine. I get that. But I think we're going to move to a blend of... Some of those things are going to be synchronous as they are now. But some are going to be... Some are going to be async. It's just going to put it in a queue, just like... And this goes back to my... Man, I talk fast. But I have this... I only have one other speed. It's even faster. So imagine it's like if you're working... So back to my, oh, we're going to have these hybrid digital teams. Like, you would not go to a co-worker and say, I'm going to ask you to do this thing, and then sit there and wait for them to go do it. Like, that's not how the world works. So it's nice to be able to just, like, hand something off to someone. It's like, okay, well, maybe I expect a response in an hour or a day or something like that.Dharmesh [00:46:52]: In terms of when things need to happen. So the UI around agents. 
So if you look at the output of agent.ai agents right now, they are the simplest possible manifestation of a UI, right? That says, oh, we have inputs of, like, four different types. Like, we've got a dropdown, we've got multi-select, all the things. It's like back in HTML, the original HTML 1.0 days, right? Like, you're the smallest possible set of primitives for a UI. And it just says, okay, because we need to collect some information from the user, and then we go do steps and do things. And generate some output in HTML or markup are the two primary examples. So the thing I've been asking myself, if I keep going down that path. So people ask me, I get requests all the time. It's like, oh, can you make the UI sort of boring? I need to be able to do this, right? And if I keep pulling on that, it's like, okay, well, now I've built an entire UI builder thing. Where does this end? And so I think the right answer, and this is what I'm going to be backcoding once I get done here, is around injecting a code generation, UI generation into the agent.ai flow, right? As a builder, you're like, okay, I'm going to describe the thing that I want, much like you would do in a vibe coding world. But instead of generating the entire app, it's going to generate the UI that exists at some point in either that deterministic flow or something like that. It says, oh, here's the thing I'm trying to do. Go generate the UI for me. And I can go through some iterations. And what I think of it as a, so it's like, I'm going to generate the code, generate the code, tweak it, go through this kind of prompt style, like we do with vibe coding now. And at some point, I'm going to be happy with it. And I'm going to hit save. And that's going to become the action in that particular step. It's like a caching of the generated code so that I don't incur any inference time costs. It's just the actual code at that point.

Alessio [00:48:29]: Yeah, I invested in a company called E2B, which does code sandbox. And they powered the LM arena web arena. So it's basically the, just like you do LMS, like text to text, they do the same for like UI generation. So if you're asking a model, how do you do it? But yeah, I think that's kind of where.

Dharmesh [00:48:45]: That's the thing I'm really fascinated by. So the early LLMs, you know, were understandably, but laughably bad at simple arithmetic, right? That's the thing like my wife, Normies would ask us, like, you call this AI, like it can't, my son would be like, it's just stupid. It can't even do like simple arithmetic. And then like we've discovered over time that, and there's a reason for this, right? It's like, it's a large, there's, you know, the word language is in there for a reason in terms of what it's been trained on. It's not meant to do math, but now it's like, okay, well, the fact that it has access to a Python interpreter that I can actually call at runtime, that solves an entire body of problems that it wasn't trained to do. And it's basically a form of delegation. And so the thought that's kind of rattling around in my head is that that's great. So it's, it's like took the arithmetic problem and took it first. Now, like anything that's solvable through a relatively concrete Python program, it's able to do a bunch of things that I couldn't do before. Can we get to the same place with UI? I don't know what the future of UI looks like in an agentic AI world, but maybe let the LLM handle it, but not in the classic sense.
Maybe it generates it on the fly, or maybe we go through some iterations and hit cache or something like that. So it's a little bit more predictable. Uh, I don't know, but yeah.

Alessio [00:49:48]: And especially when is the human supposed to intervene? So, especially if you're composing them, most of them should not have a UI because then they're just web hooking to somewhere else. I just want to touch back. I don't know if you have more comments on this.

swyx [00:50:01]: I was just going to ask when you, you said you got, you're going to go back to code. What
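The "RAG for tools" idea that comes up in this exchange — embed the catalog of agent/tool descriptions, then expose to the model only the handful that best match the user's prompt — is easy to sketch. The snippet below is a minimal, illustrative version, not agent.ai's actual implementation: the hashing "embedding" is a toy stand-in for a real embedding model, and the tool catalog is invented for the example.

```python
# A minimal sketch of the "RAG for tools" idea discussed above: instead of
# handing an LLM all 1,000 agents/tools at once, index their descriptions and
# expose only the top-k that look relevant to the current prompt.
# Everything here is illustrative -- the toy hashing "embedding" stands in for
# a real embedding model, and the tool catalog is invented for this example.

import hashlib
import math
from typing import Dict, List, Tuple

def embed(text: str, dims: int = 256) -> List[float]:
    """Toy bag-of-words hashing embedding (stand-in for a real embedding model)."""
    vec = [0.0] * dims
    for token in text.lower().split():
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dims
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: List[float], b: List[float]) -> float:
    """Dot product of two already-normalized vectors."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical catalog: agent name -> natural-language description of what it does.
TOOL_CATALOG: Dict[str, str] = {
    "domain_valuation": "estimate the market value of a website domain from comparable sales",
    "startup_namer": "suggest brand and company names from a few descriptive keywords",
    "social_post_writer": "draft social media posts for a product announcement",
    "meeting_scheduler": "find a meeting time and send calendar invites",
}

# Pre-compute embeddings once; at request time only the prompt is embedded.
TOOL_VECTORS = {name: embed(desc) for name, desc in TOOL_CATALOG.items()}

def select_tools(prompt: str, k: int = 2) -> List[Tuple[str, float]]:
    """Return the k tools whose descriptions best match the prompt."""
    query = embed(prompt)
    scored = [(name, cosine(query, vec)) for name, vec in TOOL_VECTORS.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

if __name__ == "__main__":
    prompt = "I want to buy agent.ai -- what is that domain actually worth?"
    # Only the selected tools would be passed to the LLM as callable functions.
    print(select_tools(prompt))
```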
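And here is a similarly hedged sketch of the other pattern described above: generate a step's UI with a model once, hit save, and from then on serve the cached code with no inference cost. The llm_generate_ui() function is a hypothetical stand-in for whatever model call would actually produce the markup, and the cache layout is invented for illustration; nothing here is agent.ai's real implementation.

```python
# Sketch of "generate the UI once, then cache it as the step's action".
# First call for a given spec pays the (stubbed) generation cost; later calls
# just return the saved code.

import hashlib
import json
from pathlib import Path

CACHE_DIR = Path("./ui_cache")
CACHE_DIR.mkdir(exist_ok=True)

def llm_generate_ui(spec: str) -> str:
    """Hypothetical stand-in for an LLM call that turns a spec into markup."""
    # A real system would call a model here; we fake a tiny form instead.
    return (
        f"<form><!-- generated from spec: {spec} -->"
        "<input name='domain'/><button>Check value</button></form>"
    )

def ui_for_step(spec: str) -> str:
    """Return cached UI for this spec if present; otherwise generate and save it."""
    key = hashlib.sha256(spec.encode()).hexdigest()
    cache_file = CACHE_DIR / f"{key}.json"
    if cache_file.exists():
        # Cache hit: no inference cost, just the code saved at build time.
        return json.loads(cache_file.read_text())["html"]
    html = llm_generate_ui(spec)
    cache_file.write_text(json.dumps({"spec": spec, "html": html}))
    return html

if __name__ == "__main__":
    print(ui_for_step("a small form that asks for a domain name and shows its estimated value"))
```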

Postgres FM
PgDog

Postgres FM

Play Episode Listen Later Mar 28, 2025 48:34


Nikolay and Michael are joined by Lev Kokotov to discuss PgDog — including whether or when sharding is needed, the origin story (via PgCat), what's already supported, and what's coming next.

Here are some links to things they mentioned:
Lev Kokotov https://postgres.fm/people/lev-kokotov
PgDog https://github.com/pgdogdev/pgdog
PgCat https://github.com/postgresml/pgcat
Adopting PgCat (Instacart blog post) https://www.instacart.com/company/how-its-made/adopting-pgcat-a-nextgen-postgres-proxy
PgDog discussion on Hacker News https://news.ycombinator.com/item?id=43364668
Citus https://github.com/citusdata/citus
Sharding & IDs at Instagram (blog post) https://instagram-engineering.com/sharding-ids-at-instagram-1cf5a71e5a5c
Sharding pgvector (blog post by Lev) https://pgdog.dev/blog/sharding-pgvector

~~~
What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!
~~~

Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai

With special thanks to:
Jessie Draws for the elephant artwork

Changelog News
Revenge of the junior developer

Changelog News

Play Episode Listen Later Mar 24, 2025 8:14


Steve Yegge's latest rant about the future of "coding", Ethan McCue shares some life altering Postgres patterns, Hillel Wayne makes the case for Verification-First Development, Gerd Zellweger experienced lots of pain setting up GitHub Actions & Cascii is a web-based ASCII diagram builder.

Postgres FM
Snapshots

Postgres FM

Play Episode Listen Later Mar 21, 2025 44:05


Nikolay talks Michael through using cloud snapshots — how they can be used to reduce RTO for huge Postgres setups, also to improve provisioning time, and some major catches to be aware of.

Here are some links to things they mentioned:
Snapshots on RDS https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_CreateSnapshot.html
pgBackRest https://pgbackrest.org
WAL-G https://github.com/wal-g/wal-g
pg_backup_start and pg_backup_stop (docs) https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-BACKUP
How to troubleshoot long Postgres startup (by Nikolay) https://gitlab.com/postgres-ai/postgresql-consulting/postgres-howtos/-/blob/main/0003_how_to_troubleshoot_long_startup.md
Restoring to a DB instance (RDS docs mentioning lazy loading) https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_RestoreFromSnapshot.html
Amazon EBS fast snapshot restore https://docs.aws.amazon.com/ebs/latest/userguide/ebs-fast-snapshot-restore.html
Our 100th episode “To 100TB, and beyond!” https://postgres.fm/episodes/to-100tb-and-beyond

~~~
What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!
~~~

Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai

With credit to:
Jessie Draws for the elephant artwork

COMPRESSEDfm
200 | Creating Databases as Easily as Notion Pages with Prisma Postgres

COMPRESSEDfm

Play Episode Listen Later Mar 19, 2025 42:09


Join Amy, Brad, and special guest Ryan Chenkie as they unpack Prisma's expanding ecosystem of database tools. Ryan explains why Prisma launched their own hosted Postgres service and what sets it apart from competitors in the space. The trio examines Prisma's comprehensive feature set including Accelerate for connection pooling, Pulse for real-time events, and optimization tools that help identify performance bottlenecks. They also discuss the upcoming transition from Rust to TypeScript for Prisma's core engine, making it lighter and faster. If you've been curious about modern approaches to database management or wondering which ORM is right for your next project, this conversation provides practical insights and expert perspectives.

Show Notes
0:00 - Intro
1:12 - Working with Prisma and Supabase
2:29 - Prisma Postgres Introduction
4:17 - Why Choose Postgres
6:36 - Prisma's Database Adapter Flexibility
8:14 - Serverless Database Architecture
11:13 - Connection Pooling with Accelerate
14:13 - Pulse for Real-time Database Events
16:54 - Studio Integration in Prisma Console
18:01 - Database Optimization Tools
20:00 - Benefits of Prisma Schema Language
22:10 - Prisma Schema vs SQL Definitions
23:01 - Comparing Prisma and Drizzle
26:24 - Future Improvements to Prisma
28:52 - Ryan's History with Prisma
32:05 - Learning Resources for Prisma
33:37 - Picks and Plugs

Links and Resources

Prisma Resources
Prisma Website
Prisma Twitter/X
Prisma YouTube Channel
Prisma Postgres Documentation
Prisma Console
Prisma VS Code Extension
Prisma Accelerate
Prisma Pulse
Prisma Optimize
Prisma Studio

Ryan Chenkie Resources
Ryan's Website: https://holodeck.run
Ryan's YouTube Channel: https://youtube.com/@holodeck_run
Ryan on Twitter/X

Framework and Technologies Mentioned
Remix
Redwood JS
Supabase
PlanetScale
Drizzle ORM
Postgres
MySQL
MongoDB

Brad's Resources
YouTube Channel: https://youtube.com/@bradgarropy
Remix Starter: https://github.com/bradgarropy/remix-app

Amy's Resources
Build12 Projects: https://buildtwelve.com

Other Resources Mentioned
Skylight Frame
Aura Frame
Netflix Show: "Making Fun"
Netflix Show: "Is It Cake"

Thinking Elixir Podcast
245: Supply Chain Security and SBoMs

Thinking Elixir Podcast

Play Episode Listen Later Mar 18, 2025 74:36


News includes a new library called phoenix_sync for real-time sync in Postgres-backed Phoenix applications, Peter Solnica released a Text Parser for extracting structured data from text, a useful tip on finding Hex package versions locally with mix hex.info, Wasmex updated to v0.10 with WebAssembly component support, and Chrome introduces a new browser feature similar to LiveView.JS. We also talked with Alistair Woodman and Jonatan Männchen from the EEF about Jonatan's role as CISO, the Security Working Group, and their work on OpenChain compliance for supply-chain security, Software Bill of Materials (SBoMs), and what these initiatives mean for the Elixir community, and more! Show Notes online - http://podcast.thinkingelixir.com/245 (http://podcast.thinkingelixir.com/245) Elixir Community News https://gigalixir.com/thinking (https://gigalixir.com/thinking?utm_source=thinkingelixir&utm_medium=shownotes) – Gigalixir is sponsoring the show, offering 20% off standard tier prices for a year with promo code "Thinking". https://github.com/electric-sql/phoenix_sync (https://github.com/electric-sql/phoenix_sync?utm_source=thinkingelixir&utm_medium=shownotes) – New library called phoenix_sync providing real-time sync for Postgres-backed Phoenix applications. https://hexdocs.pm/phoenix_sync/readme.html (https://hexdocs.pm/phoenix_sync/readme.html?utm_source=thinkingelixir&utm_medium=shownotes) – Documentation for phoenix_sync, a solution for building modern, real-time apps with local-first/sync in Elixir. https://github.com/josevalim/sync (https://github.com/josevalim/sync?utm_source=thinkingelixir&utm_medium=shownotes) – José Valim's original proof of concept repo that was promptly archived. https://electric-sql.com/ (https://electric-sql.com/?utm_source=thinkingelixir&utm_medium=shownotes) – Electric SQL's platform that syncs subsets of Postgres data into local apps and services, allowing data to be available offline and in-sync. https://solnic.dev/posts/announcing-textparser-for-elixir/ (https://solnic.dev/posts/announcing-textparser-for-elixir/?utm_source=thinkingelixir&utm_medium=shownotes) – Peter Solnica released TextParser, a library for extracting interesting parts of text like hashtags and links. https://hexdocs.pm/text_parser/readme.html (https://hexdocs.pm/text_parser/readme.html?utm_source=thinkingelixir&utm_medium=shownotes) – Documentation for the Text Parser library that helps parse text into structured data. https://www.elixirstreams.com/tips/mix-hex-info (https://www.elixirstreams.com/tips/mix-hex-info?utm_source=thinkingelixir&utm_medium=shownotes) – Elixir stream tip on using mix hex.info to find the latest package version for a Hex package locally, without needing to search on hex.pm or GitHub. https://github.com/phoenixframework/tailwind/blob/main/README.md#updating-from-tailwind-v3-to-v4 (https://github.com/phoenixframework/tailwind/blob/main/README.md#updating-from-tailwind-v3-to-v4?utm_source=thinkingelixir&utm_medium=shownotes) – Guide for upgrading Tailwind to V4 in existing Phoenix applications using Tailwind's automatic upgrade helper. https://gleam.run/news/hello-echo-hello-git/ (https://gleam.run/news/hello-echo-hello-git/?utm_source=thinkingelixir&utm_medium=shownotes) – Gleam 1.9.0 release with searchability on hexdocs, Echo debug printing for improved debugging, and ability to depend on Git-hosted dependencies. 
https://d-gate.io/blog/everything-i-was-lied-to-about-node-came-true-with-elixir (https://d-gate.io/blog/everything-i-was-lied-to-about-node-came-true-with-elixir?utm_source=thinkingelixir&utm_medium=shownotes) – Blog post discussing how promises made about NodeJS actually came true with Elixir. https://hexdocs.pm/wasmex/Wasmex.Components.html (https://hexdocs.pm/wasmex/Wasmex.Components.html?utm_source=thinkingelixir&utm_medium=shownotes) – Wasmex updated to v0.10 with support for WebAssembly components, enabling applications and components to work together regardless of original programming language. https://ashweekly.substack.com/p/ash-weekly-issue-8 (https://ashweekly.substack.com/p/ash-weekly-issue-8?utm_source=thinkingelixir&utm_medium=shownotes) – AshWeekly Issue 8 covering AshOps with mix task capabilities for CRUD operations and BeaconCMS being included in the Ash HQ installer script. https://developer.chrome.com/blog/command-and-commandfor (https://developer.chrome.com/blog/command-and-commandfor?utm_source=thinkingelixir&utm_medium=shownotes) – Chrome update brings new browser feature with commandfor and command attributes, similar to Phoenix LiveView.JS but native to browsers. https://codebeamstockholm.com/ (https://codebeamstockholm.com/?utm_source=thinkingelixir&utm_medium=shownotes) – Code BEAM Lite announced for Stockholm on June 2, 2025 with keynote speaker Björn Gustavsson, the "B" in BEAM. https://alchemyconf.com/ (https://alchemyconf.com/?utm_source=thinkingelixir&utm_medium=shownotes) – AlchemyConf coming up March 31-April 3 in Braga, Portugal. Use discount code THINKINGELIXIR for 10% off. https://www.gigcityelixir.com/ (https://www.gigcityelixir.com/?utm_source=thinkingelixir&utm_medium=shownotes) – GigCity Elixir and NervesConf on May 8-10, 2025 in Chattanooga, TN, USA. https://www.elixirconf.eu/ (https://www.elixirconf.eu/?utm_source=thinkingelixir&utm_medium=shownotes) – ElixirConf EU on May 15-16, 2025 in Kraków & Virtual. https://goatmire.com/#tickets (https://goatmire.com/#tickets?utm_source=thinkingelixir&utm_medium=shownotes) – Goatmire tickets are on sale now for the conference on September 10-12, 2025 in Varberg, Sweden. Do you have some Elixir news to share? 
Tell us at @ThinkingElixir (https://twitter.com/ThinkingElixir) or email at show@thinkingelixir.com (mailto:show@thinkingelixir.com) Discussion Resources https://elixir-lang.org/blog/2025/02/26/elixir-openchain-certification/ (https://elixir-lang.org/blog/2025/02/26/elixir-openchain-certification/?utm_source=thinkingelixir&utm_medium=shownotes) https://cna.erlef.org/ (https://cna.erlef.org/?utm_source=thinkingelixir&utm_medium=shownotes) – EEF CVE Numbering Authority https://erlangforums.com/t/security-working-group-minutes/3451/22 (https://erlangforums.com/t/security-working-group-minutes/3451/22?utm_source=thinkingelixir&utm_medium=shownotes) https://podcast.thinkingelixir.com/220 (https://podcast.thinkingelixir.com/220?utm_source=thinkingelixir&utm_medium=shownotes) – previous interview with Alistair https://digital-strategy.ec.europa.eu/en/policies/cyber-resilience-act (https://digital-strategy.ec.europa.eu/en/policies/cyber-resilience-act?utm_source=thinkingelixir&utm_medium=shownotes) – CRA - Cyber Resilience Act https://www.cisa.gov/ (https://www.cisa.gov/?utm_source=thinkingelixir&utm_medium=shownotes) – CISA US Government Agency https://www.cisa.gov/sbom (https://www.cisa.gov/sbom?utm_source=thinkingelixir&utm_medium=shownotes) – Software Bill of Materials https://oss-review-toolkit.org/ort/ (https://oss-review-toolkit.org/ort/?utm_source=thinkingelixir&utm_medium=shownotes) – Desire to integrate with tooling outside the Elixir ecosystem like OSS Review Toolkit https://github.com/voltone/rebar3_sbom (https://github.com/voltone/rebar3_sbom?utm_source=thinkingelixir&utm_medium=shownotes) https://cve.mitre.org/ (https://cve.mitre.org/?utm_source=thinkingelixir&utm_medium=shownotes) https://openssf.org/projects/guac/ (https://openssf.org/projects/guac/?utm_source=thinkingelixir&utm_medium=shownotes) https://erlef.github.io/security-wg/securityvulnerabilitydisclosure/ (https://erlef.github.io/security-wg/security_vulnerability_disclosure/?utm_source=thinkingelixir&utm_medium=shownotes) – EEF Security WG Vulnerability Disclosure Guide Guest Information - https://x.com/maennchen_ (https://x.com/maennchen_?utm_source=thinkingelixir&utm_medium=shownotes) – Jonatan on Twitter/X - https://bsky.app/profile/maennchen.dev (https://bsky.app/profile/maennchen.dev?utm_source=thinkingelixir&utm_medium=shownotes) – Jonatan on Bluesky - https://github.com/maennchen/ (https://github.com/maennchen/?utm_source=thinkingelixir&utm_medium=shownotes) – Jonatan on Github - https://maennchen.dev (https://maennchen.dev?utm_source=thinkingelixir&utm_medium=shownotes) – Jonatan's Blog - https://www.linkedin.com/in/alistair-woodman-51934433 (https://www.linkedin.com/in/alistair-woodman-51934433?utm_source=thinkingelixir&utm_medium=shownotes) – Alistair Woodman on LinkedIn - awoodman@erlef.org - https://github.com/ahw59/ (https://github.com/ahw59/?utm_source=thinkingelixir&utm_medium=shownotes) – Alistair on Github - http://erlef.org/ (http://erlef.org/?utm_source=thinkingelixir&utm_medium=shownotes) – Erlang Ecosystem Foundation Website Find us online - Message the show - Bluesky (https://bsky.app/profile/thinkingelixir.com) - Message the show - X (https://x.com/ThinkingElixir) - Message the show on Fediverse - @ThinkingElixir@genserver.social (https://genserver.social/ThinkingElixir) - Email the show - show@thinkingelixir.com (mailto:show@thinkingelixir.com) - Mark Ericksen on X - @brainlid (https://x.com/brainlid) - Mark Ericksen on Bluesky - @brainlid.bsky.social (https://bsky.app/profile/brainlid.bsky.social) 
- Mark Ericksen on Fediverse - @brainlid@genserver.social (https://genserver.social/brainlid) - David Bernheisel on Bluesky - @david.bernheisel.com (https://bsky.app/profile/david.bernheisel.com) - David Bernheisel on Fediverse - @dbern@genserver.social (https://genserver.social/dbern)

Postgres FM
GIN Indexes

Postgres FM

Play Episode Listen Later Mar 14, 2025 40:59


Nikolay and Michael discuss GIN indexes in Postgres — what they are, what they're used for, and some limitations to be aware of.

Here are some links to things they mentioned:
GIN Indexes https://www.postgresql.org/docs/current/gin.html
Generalized Search Trees for Database Systems (Hellerstein, Naughton, Pfeffer) https://dsf.berkeley.edu/papers/vldb95-gist.pdf
RUM extension https://pgxn.org/dist/rum/1.1.0/
Understanding Postgres GIN Indexes: The Good and the Bad (Lukas Fittl) https://pganalyze.com/blog/gin-index

~~~
What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!
~~~

Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai

With credit to:
Jessie Draws for the elephant artwork

Postgres FM
Best Practices

Postgres FM

Play Episode Listen Later Mar 7, 2025 40:49


Nikolay and Michael use a recent "best practices" article as a prompt — giving a few tips each on the topics mentioned, like schema design, performance, backups, and more.

Here are some links to things they mentioned:
7 Crucial PostgreSQL Best Practices (recent blog post) https://speakdatascience.com/postgresql-best-practices
"Don't do this" episode https://postgres.fm/episodes/dont-do-this
Article discussion on Hacker News https://news.ycombinator.com/item?id=42992913
Mozilla's SQL Style Guide https://docs.telemetry.mozilla.org/concepts/sql_style
"SQL vs NoSQL" episode with Franck Pachot https://postgres.fm/episodes/sql-vs-nosql
HA episode https://postgres.fm/episodes/high-availability

~~~
What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!
~~~

Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai

With credit to:
Jessie Draws for the elephant artwork

Syntax - Tasty Web Development Treats
882: Aaron Francis is putting PHP in Your JS Files

Syntax - Tasty Web Development Treats

Play Episode Listen Later Mar 5, 2025 54:10


Wes and Scott talk with Aaron Francis about Fusion for Laravel, a new way to seamlessly integrate PHP into JavaScript. They discuss how Fusion expands on Inertia, its potential for React support, and how it simplifies full-stack development.

Show Notes
00:00 Welcome to Syntax!
01:22 Aaron's background in PHP Yii Laravel
02:27 What is Fusion for Laravel? Fusion for Laravel
09:14 How Fusion works
13:57 The benefits of Laravel
19:18 Invalidation and caching
25:20 Brought to you by Sentry.io
25:32 Optimistic UI
28:28 React integration?
31:44 Fusion's original name (and the naming process)
33:30 Laravel's approach to frontend frameworks Livewire
37:32 Databases and scaling
41:27 Postgres extensibility and hosting options Crunchy Data Xata
47:44 The vision for Fusion
48:31 Sick Picks + Shameless Plugs

Sick Picks
Aaron: Better Display CLI

Shameless Plugs
Aaron: High Performance SQLite Mastering Postgres Screencasting.com

Hit us up on Socials!
Syntax: X Instagram Tiktok LinkedIn Threads
Wes: X Instagram Tiktok LinkedIn Threads
Scott: X Instagram Tiktok LinkedIn Threads
Randy: X Instagram YouTube Threads

Thinking Elixir Podcast
242: Magic Links and Sudo Mode

Thinking Elixir Podcast

Play Episode Listen Later Feb 25, 2025 20:21


News includes exciting updates to Phoenix gen_auth with magic links and sudo mode security features, a comprehensive guide on Elixir and Phoenix security best practices from Paraxial.io, significant updates to the DaisyUI Components library for Phoenix LiveView reaching version 0.7.0, more on LiveDebugger tool for Phoenix applications, performance improvements in PostgreSQL's self-join handling, and more! Show Notes online - http://podcast.thinkingelixir.com/242 (http://podcast.thinkingelixir.com/242) Elixir Community News https://gigalixir.com/thinking (https://gigalixir.com/thinking?utm_source=thinkingelixir&utm_medium=shownotes) – Visit to sign up and get 20% off your first year. Or use the promo code "Thinking" during signup. https://github.com/phoenixframework/phoenix/pull/6081 (https://github.com/phoenixframework/phoenix/pull/6081?utm_source=thinkingelixir&utm_medium=shownotes) – Phoenix gen_auth is adding support for magic links (passwordless login) and sudo mode for sensitive operations. https://elixirstream.dev/gendiff (https://elixirstream.dev/gendiff?utm_source=thinkingelixir&utm_medium=shownotes) – Additional resource for Phoenix gen_auth updates. https://github.com/9elements/hex-mcp (https://github.com/9elements/hex-mcp?utm_source=thinkingelixir&utm_medium=shownotes) – New Model Context Protocol server providing real-time Hex package version information for AI tools like Cursor. https://paraxial.io/blog/elixir-best (https://paraxial.io/blog/elixir-best?utm_source=thinkingelixir&utm_medium=shownotes) – Michael Lubas shares 11 best practices for security in Elixir and Phoenix applications. https://elixirstatus.com/p/7bQOj-daisyuicomponents---a-phoenix-liveview--daisyui-library (https://elixirstatus.com/p/7bQOj-daisyuicomponents---a-phoenix-liveview--daisyui-library?utm_source=thinkingelixir&utm_medium=shownotes) – DaisyUI Components library for Phoenix LiveView updated to version 0.7.0. https://github.com/phcurado/daisyuicomponents (https://github.com/phcurado/daisy_ui_components?utm_source=thinkingelixir&utm_medium=shownotes) – GitHub repository for DaisyUI Components, featuring over 30 pre-styled components. https://daisy-ui-components-site.fly.dev/storybook/welcome (https://daisy-ui-components-site.fly.dev/storybook/welcome?utm_source=thinkingelixir&utm_medium=shownotes) – Interactive Storybook for exploring DaisyUI Components. https://github.com/phcurado/daisyuicomponents/blob/main/CHANGELOG.md (https://github.com/phcurado/daisy_ui_components/blob/main/CHANGELOG.md?utm_source=thinkingelixir&utm_medium=shownotes) – Changelog showing recent updates to DaisyUI Components. https://github.com/software-mansion-labs/live-debugger (https://github.com/software-mansion-labs/live-debugger?utm_source=thinkingelixir&utm_medium=shownotes) – LiveDebugger tool for Phoenix LiveView applications, providing insights into LiveViews, components, and state transitions. https://www.phoronix.com/news/PostgreSQL-Self-Join-Eliminate (https://www.phoronix.com/news/PostgreSQL-Self-Join-Eliminate?utm_source=thinkingelixir&utm_medium=shownotes) – Postgres adds optimization for self-joins, improving query performance. https://www.lambdadays.org/lambdadays2025 (https://www.lambdadays.org/lambdadays2025?utm_source=thinkingelixir&utm_medium=shownotes) – Lambda Days conference tickets on sale, happening June 12-13 in Kraków, Poland, focusing on functional programming. 
https://alchemyconf.com/ (https://alchemyconf.com/?utm_source=thinkingelixir&utm_medium=shownotes) – Alchemy Conf happening April 2-3 in Braga, Portugal with 10% discount code "THINKINGELIXIR". https://membrz.club/alchemyconf/events (https://membrz.club/alchemyconf/events?utm_source=thinkingelixir&utm_medium=shownotes) – Direct link for purchasing Alchemy Conf tickets. Do you have some Elixir news to share? Tell us at @ThinkingElixir (https://twitter.com/ThinkingElixir) or email at show@thinkingelixir.com (mailto:show@thinkingelixir.com) Find us online - Message the show - Bluesky (https://bsky.app/profile/thinkingelixir.com) - Message the show - X (https://x.com/ThinkingElixir) - Message the show on Fediverse - @ThinkingElixir@genserver.social (https://genserver.social/ThinkingElixir) - Email the show - show@thinkingelixir.com (mailto:show@thinkingelixir.com) - Mark Ericksen on X - @brainlid (https://x.com/brainlid) - Mark Ericksen on Bluesky - @brainlid.bsky.social (https://bsky.app/profile/brainlid.bsky.social) - Mark Ericksen on Fediverse - @brainlid@genserver.social (https://genserver.social/brainlid) - David Bernheisel on Bluesky - @david.bernheisel.com (https://bsky.app/profile/david.bernheisel.com) - David Bernheisel on Fediverse - @dbern@genserver.social (https://genserver.social/dbern)