IGeometry

Welcome to the IGeometry podcast with your host Hussein Nasser. If you like GIS and software engineering, you’ve come to the right place. All opinions are my own.

Hussein Nasser

Jun 17, 2026 LATEST EPISODE
monthly NEW EPISODES
21m AVG DURATION
531 EPISODES

Latest episodes from IGeometry

Postgres is half as fast in Linux 7.0

Play Episode Listen Later Jun 17, 2026 34:08

An AWS engineer discovered a 50% regression in Postgres throughput while testing the new Linux 7.0 kernel. The cause turns out to be massive TLB and page-fault overhead, exacerbated by Postgres's process-based design. In this episode of the backend engineering show I dive deep into how this was discovered, the root cause, and the possible fixes and workarounds. Intermediate and Advanced Backend Engineering Course Bundle https://courses.husseinnasser.com/bundle My Book, Root Cause: Stories and Lessons from Two Decades of Backend Engineering Bugs https://amzn.to/4cKfZhe 0:00 Intro 2:30 The Discovery 6:30 Spinlocks 9:25 Preemption 13:00 Root Cause 17:00 How Postgres Processes exacerbated the problem 22:30 Is the fix easy? 25:50 Summary

lessons linux intermediate two decades postgres tlb preemption

Don't let AI rob you

Play Episode Listen Later Jun 8, 2026 31:53

A discussion about why many engineers still love the struggle, the mistakes, and the process of figuring things out themselves. This is how we grow and get better and stronger. Letting AI do everything (even though it can't) robs us of this feeling.

My new book - Root cause, Stories from two decades of backend bugs

Play Episode Listen Later Apr 15, 2026 9:48

I wrote a new book that has been in the works for years. It is called Root Cause, and it is for those who enjoy the art of backend engineering. Early in my career, 20 years ago, I built backend and database applications without fully grasping their inner mechanics. Performance issues, race conditions, bugs, and even data corruption often left me lost. Since then, I have resolved to truly understand how systems work, from networking protocols and intermediary proxies to backend services and various database engines. I made it a habit to follow every request on its journey through the dark alleys of the network, down to the bowels of the database engine, interacting with various kernel data structures at every hop, and back. I became obsessed with understanding what happens behind the scenes in software. Not just what breaks and how, but also why, and what was the source of the bleed. Root Cause is a collection of the most interesting bugs I encountered, ranging from performance bottlenecks and non-deterministic crashes to subtle data inconsistencies and incorrect results. This book is for anyone curious about how production backend systems really behave under pressure, and how to debug them when they don't, even when you don't have access to the source code. Root Cause consists of 15 chapters. Each is a story about a backend bug, with an investigation, diagrams, and a section on a fundamental concept, until the root cause is revealed. Grab your copy here, paperback or Kindle ebook: paperback https://amzn.to/4cKfZhe ebook https://amzn.to/4cfQjJj

stories performance root new books bugs root cause backend two decades

5 Backend Design Patterns for Managing Threads and Sockets

Play Episode Listen Later Jan 19, 2026 46:09

In this video I introduce 5 different design patterns for building backend applications. Each one explains how a socket listener is established, how connections are established, and how threads and connections are managed to read, write and process requests.

managing threads backend design patterns sockets

Page Tables

Play Episode Listen Later Dec 15, 2025 46:39

Page tables provide the mapping between virtual memory and physical memory for each process. This means they need to be as efficient and as fast as possible. I explore the inner workings of page tables in this episode. 0:00 Intro 2:00 Virtual Memory 8:00 MMU 10:00 Page Tables 11:30 Single Table Byte Addressability 16:00 Single Table Page addressability 19:00 Multi-level Paging (Radix tree) 31:00 Huge Tables 33:00 TLB Summary
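To make the multi-level paging idea a little more concrete, here is a minimal sketch of how a virtual address can be split into one index per table level plus a page offset. The layout (4 levels, 9 index bits per level, 4 KiB pages) is an assumption borrowed from the common x86-64 scheme, not something stated in the episode notes, and `split_virtual_address` is a made-up name for illustration.

```python
# Assumed layout (x86-64 style, 48-bit virtual addresses):
# 4 levels x 9 index bits each, plus 12 offset bits for a 4 KiB page.
LEVELS = 4
BITS_PER_LEVEL = 9
OFFSET_BITS = 12


def split_virtual_address(vaddr: int) -> tuple[list[int], int]:
    """Return ([top-level index, ..., last-level index], offset within the page)."""
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    indexes = []
    # Walk from the top level down, peeling off 9 bits per level
    for level in range(LEVELS - 1, -1, -1):
        shift = OFFSET_BITS + level * BITS_PER_LEVEL
        indexes.append((vaddr >> shift) & ((1 << BITS_PER_LEVEL) - 1))
    return indexes, offset
```

Each index selects an entry in one table of the radix tree, and only the last level points at the physical frame. A TLB hit skips this walk entirely.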

tables virtual memory

CPU and Kernel Page Faults

Play Episode Listen Later Nov 24, 2025 48:37

Page faults occur when a process tries to access memory that isn't backed by a physical page: a fault is raised and the kernel loads a page. This happens on first access, stack expansion, COW, swap and much more. However, it comes with a cost. In this episode of the backend engineering show I dissect the need for and the cost of page faults in the kernel. 0:00 Intro 4:00 Virtual memory (abstraction of physical memory; memory sharing; allow more processes to run, unused memory goes to disk; NUMA, kernel can place memory near the CPU) 12:00 VMA areas (text/code, data, BSS, heap, stack) 19:50 Kernel mode 25:30 What is a Page fault? 30:30 First access page fault 33:00 Stack Expansion page fault 34:30 CoW page fault 38:00 Swap page fault 39:39 File backed page fault 40:29 Permission page fault 45:30 Summary
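First-access page faults can be observed from user space. The sketch below, which assumes a Unix-like system where Python's `resource` module is available, maps a fresh anonymous region and reads the process's minor-fault counter before and after touching each page. The function name is hypothetical, and the exact count depends on the kernel, huge-page settings and the interpreter's own activity, so treat the number as indicative only.

```python
import mmap
import resource


def count_first_touch_faults(pages: int) -> int:
    """Return how many minor page faults this process took while touching `pages` fresh pages."""
    page_size = mmap.PAGESIZE
    # Anonymous mapping: address space is reserved, physical pages are not assigned yet
    region = mmap.mmap(-1, pages * page_size)
    before = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
    for i in range(pages):
        region[i * page_size] = 1  # first write to each page
    after = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
    region.close()
    return after - before
```

On a typical Linux box the result tends to be close to the number of pages touched, which is the cost the episode is talking about.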

permission cows file swap faults kernel

Amazon US-EAST-1 Outage in Details

Play Episode Listen Later Oct 31, 2025 24:26

On October 19, 2025, AWS experienced an outage that lasted over a day. Ten days later we finally got the root cause analysis, and we know exactly what caused the DNS to fail. 0:00 Summary 5:30 How did Dynamo lose its DNS? 13:41 EC2 Errors 16:16 Network Load Balancer Errors RCA here https://aws.amazon.com/message/101925/

aws outage dns dynamo amazon us us east

Graceful shutdown in HTTP

Play Episode Listen Later Oct 17, 2025 25:49

There are cases where the backend may need to close the connection to prevent unexpected situations, stop bad actors, or simply free up resources. Closing a connection gracefully allows clients and backends to clean up and finish any pending requests. In this episode of the backend engineering show I discuss graceful connection shutdown in both HTTP/1.1 via the Connection header and HTTP/2 via the GOAWAY frame. 0:00 Intro 4:58 Why shutdown connection? 6:46 HTTP/1.1 Graceful shutdown 12:26 Cost of HTTP/2 17:40 HTTP/2 GoAWAY frame 23:40 Summary Links https://www.youtube.com/watch?v=fVKPrDrEwTI&t=1s https://chromium.googlesource.com/chromium/src/net/%2B/master/socket/client_socket_pool_manager.cc#76 https://issues.chromium.org/issues/40555364 https://issues.chromium.org/issues/40501721
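For the HTTP/1.1 case, the signal is just a response header. A minimal sketch, assuming a hand-rolled server with a hypothetical `draining` flag that is set once the backend has decided to shut the connection down:

```python
def build_response(body: bytes, draining: bool) -> bytes:
    """Serialize a minimal HTTP/1.1 response (hypothetical helper, for illustration only)."""
    headers = [
        b"HTTP/1.1 200 OK",
        b"Content-Length: " + str(len(body)).encode(),
        # "Connection: close" tells the client this is the last response on this connection
        b"Connection: " + (b"close" if draining else b"keep-alive"),
    ]
    return b"\r\n".join(headers) + b"\r\n\r\n" + body
```

After sending a response with `Connection: close`, the server finishes the write and then closes the socket, so the client knows not to reuse the connection for another request.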

cost shutdowns 2b graceful

Asynchronous IO in Postgres 18

Play Episode Listen Later Oct 3, 2025 41:12

Postgres 18 has been released with many exciting features such as UUIDv7, the overexplain module, composite index skip scans, and the most anticipated, asynchronous IO with worker and io_uring modes, which I uncover in this show. Hope you enjoy it. 0:00 Intro 1:30 Synchronous vs Asynchronous calls 3:00 Synchronous IO 6:30 Asynchronous IO 10:00 Postgres 17 synchronous io 17:20 The challenge of Async IO in Postgres 18 20:00 io_method worker 23:00 io_method io_uring 29:30 io_method sync 31:08 Async IO isn't done! 31:30 Support for backend writers 32:36 Improve worker io_method 33:00 direct io support 37:00 Summary

io asynchronous postgres synchronous

kTLS - Kernel level TLS

Play Episode Listen Later Jun 13, 2025 22:55

Fundamentals of Operating Systems Course https://oscourse.win ktls is brilliant. TLS encryption/decryption often happens in userland, while TCP lives in the kernel. With ktls, userland can hand the keys to the kernel and the kernel does the crypto. When calling write, the kernel encrypts the packet and sends it to the NIC. When calling read, the kernel decrypts the packet and hands it to userspace. This mode still taxes the host's CPU of course, so there is another mode where the kernel offloads the crypto to the NIC device! The host CPU becomes free. Incoming packets to the NIC are decrypted in the device before they are DMAed to the kernel. Outgoing packets are encrypted before they leave the NIC to the network. ktls still needs the handshake to happen in userspace. Zerocopy can also be enabled in some cases (now that the kernel has context). Deserves a video. So much good stuff. 0:00 Intro 2:00 Userspace SSL Libraries 3:00 ktls 6:00 Kernel Encrypts/Decrypts (TLS_SW) 8:20 NIC offload mode (TLS_HW) 10:15 NIC does it all (TLS_HW_RECORD) 12:00 Write TX Example 13:50 Read RX Example 17:00 Zero copy (sendfile) https://docs.kernel.org/networking/tls-offload.html

fundamentals nic incoming deserves cpu kernel tls

The beauty of the CPU

Play Episode Listen Later May 9, 2025 9:38

If you are bored of contemporary topics of AI and need a breather, I invite you to join me to explore a mundane, fundamental and earthy topic. The CPU. A reading of my substack article https://hnasr.substack.com/p/the-beauty-of-the-cpu

ai beauty cpu

Sequential Scans in Postgres just got faster

Play Episode Listen Later Apr 18, 2025 27:36

This new PostgreSQL 17 feature is a game changer. It can now combine I/Os when performing a sequential scan. Grab my database course https://courses.husseinnasser.com

ios scans sequential postgresql postgres

Does discipline work?

Play Episode Listen Later Apr 11, 2025 10:06

No technical video today, just talking about the idea of discipline and consistency.

discipline

Socket management and Kernel Data structures

Play Episode Listen Later Apr 4, 2025 31:26

Fundamentals of Operating Systems Course This video is an overview of how the operating system kernel does socket management and the different data structures it utilizes to achieve that. Timestamps: 0:00 Intro 1:38 Socket vs Connections 7:50 SYN and Accept Queue 18:56 Socket Sharding 23:14 Receive and Send buffers 27:00 Summary

management receive fundamentals kernel syn socket data structures

The genius of long polling

Play Episode Listen Later Dec 6, 2024 28:57

Polling is the ability to interrogate a backend to see if a piece of information is ready. It can introduce a chatty system and as a result long polling was born. In this video I explain the beauty of this design pattern and how we can push it to its limit. 0:00 Intro 0:45 Polling 2:30 Problem with Polling 3:50 Long Polling 8:18 Timeouts 10:00 Long Polling Benefits 12:00 Make requests into Long Polling 17:36 Request Resumption 21:40 Summary
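The difference between polling and long polling can be sketched in a few lines. This is an in-process stand-in rather than a real HTTP server: `JobStatus` and its methods are hypothetical names, and a `threading.Event` plays the role of "the backend holds the request open until the result is ready or a timeout fires".

```python
import threading


class JobStatus:
    """Hypothetical in-memory job whose result clients poll for."""

    def __init__(self) -> None:
        self._done = threading.Event()
        self._result = None

    def complete(self, result: str) -> None:
        self._result = result
        self._done.set()

    def poll(self):
        """Short poll: answer immediately, whether or not the result is ready."""
        return self._result if self._done.is_set() else None

    def long_poll(self, timeout: float):
        """Long poll: hold the request until the result is ready or the timeout expires."""
        if self._done.wait(timeout):
            return self._result
        return None  # the caller is expected to re-issue the request
```

With short polling the client has to keep asking, which is the chattiness mentioned above. With long polling a single request usually suffices, and the timeout bounds how long the backend keeps the request parked.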

genius polling

Six stages of a good software engineer

Play Episode Listen Later Nov 1, 2024 39:27

You get better as a software engineer when you go through these stages. 0:00 Intro 1:15 Understand a technology 7:07 Articulate how it works 15:30 Understand its limitations 19:48 Try to build something better 27:45 Realize what you built also has limitations 32:48 Appreciate the original tech as is

Understand a technology: We use technologies all the time without knowing how they work. And it is ok not to know how things work if the interest isn't there. But when there is interest in understanding how something works, pursue it. It feels good when you understand how something works because you work better with it; you swim with the tide instead of against it. When I learned how TCP/IP works, I came to appreciate every connection request and how requests are read. You will ask questions: What is my code doing here? When exactly am I creating connections? When am I reading from the connection? Is it safe to share connections?

Articulate how it works: This one is not easy. You might think you understand something until you try to explain how it works. If you find yourself using jargon, you probably don't understand it and are just trying to impress others. Have you seen people who want to talk about something to show they understand it? It's the opposite. Try to truly articulate how it works and you will really understand it, which takes you back to stage 1. I thought I understood how a backend reads requests until I tried to speak to it.

Understand the technology's limitations: Once stages 1 and 2 are done you will truly understand the tech. Now you are confident and excited about the tech, and you will truly see when you can use it to its full potential and also know its weak points, where it breaks. This happens a lot with TCP/IP. We know TCP's limitations.

Try to build something better: This one is optional and can be skipped, but attempting to design or build something better than the tech, because you know its limitations, will truly reveal how much better you have become. But the challenge here is the ego: you might understand the limitations, but your problem is thinking that what you will build is flawless. This step must be approached with caution.

Realize what you built also has limitations: Dust settles. This step hurts, and you may take a while to realize it, but whatever you build will have flaws, and when you realize this, that is when you get better as an engineer.

Appreciate the tech as is: This is when you come full circle, back to the first stage. Look at the technology and understand it, but don't judge it. Just know its limitations and its strengths and flow with it. Stop fighting and instead build around a tech. Does that mean you shouldn't build anything new? Of course not. Go build, but don't stress about making something better to defeat existing tech. Build it for the sake of building it.

realize software engineers articulate tcp ip six stages

This new Linux patch can speed up Reading Requests

Play Episode Listen Later Oct 25, 2024 18:12

Fundamentals of Operating Systems Course https://oscourse.win Very clever! We often call the read/recv system call to read requests from a connection. This copies data from the kernel receive buffer to user space, which has a cost. This new patch changes this to allow zero copy with notification. “Reading data out of a socket instead becomes a “notification” mechanism, where the kernel tells userspace where the data is.” This kernel patch enables zero copy from the receive queue. https://lore.kernel.org/io-uring/ZwW7_cRr_UpbEC-X@LQ3V64L9R2/T/ 0:00 Intro 1:30 patch summary 7:00 Normal Connection Read (Kernel Copy) 12:40 Zero copy Read 15:30 Performance

performance reading fundamentals patch requests speed up new linux

Cloudflare's 150ms global cache purge | Deep Dive

Play Episode Listen Later Oct 18, 2024 62:21

Cloudflare built a global cache purge system that runs under 150 ms. This is how they did it. Using RocksDB to maintain the local CDN cache, a peer-to-peer distributed system across data centers, and clever engineering, they went from a 1.5 second purge down to 150 ms. However, this isn't the full picture, because that 150 ms is actually just the P50. In this video I explore how the Cloudflare CDN works, and how the old core-based, centralized Quicksilver lazy purge compares to the new coreless, decentralized active purge. I explore the pros and cons of both systems and give you my thoughts on this system. 0:00 Intro 4:25 From Core Base Lazy Purge to Coreless Active 12:50 CDN Basics 16:00 TTL Freshness 17:50 Purge 20:00 Core-Based Purge 24:00 Flexible Purges 26:36 Lazy Purge 30:00 Old Purge System Limitations 36:00 Coreless / Active Purge 39:00 LSM vs BTree 45:30 LSM Performance issues 48:00 How Active Purge Works 50:30 My thoughts about the new system 58:30 Summary Cloudflare blog https://blog.cloudflare.com/instant-purge/ Mentioned Videos Percentile Tail Latency Explained (95%, 99%) Monitor Backend performance with this metric https://www.youtube.com/watch?v=3JdQOExKtUY How Discord Stores Trillions of Messages | Deep Dive https://www.youtube.com/watch?v=xynXjChKkJc Fundamentals of Operating Systems Course https://os.husseinnasser.com Backend Troubleshooting Course https://performance.husseinnasser.com

global deep dive purge cache cloudflare cdn lsm p50

MySQL is having a bumpy journey

Play Episode Listen Later Sep 28, 2024 28:34

Fundamentals of Database Engineering udemy course https://databases.win MySQL has been having a bumpy journey since 2018 with the release of version 8.0: critical crashes that made it into the final product, significant performance regressions, and tons of stability issues and bugs. In this video I explore what happened to MySQL. Are these issues getting fixed? And what is the current state of MySQL at the end of 2024? 0:00 Intro 2:00 MySQL 8.0 vs 5.7 Performance 11:00 Critical Crash in 8.0.38, 8.4.1 and 9.0.0 15:40 Is 8.4 better than 8.0.36? 16:30 More Features = More Bugs 22:30 Summary and my thoughts Resources https://x.com/MarkCallaghanDB/status/1786428909376164263 https://www.percona.com/blog/do-not-upgrade-to-any-version-of-mysql-after-8-0-37/ http://smalldatum.blogspot.com/2024/09/mysql-innodb-vs-sysbench-on-large-server.html https://www.percona.com/blog/mysql-8-0-vs-5-7-are-the-newer-versions-more-problematic/

fundamentals bumpy mysql

How many kernel calls in NodeJS vs Bun vs Python vs native C

Play Episode Listen Later Sep 20, 2024 20:41

Fundamentals of Operating Systems Course https://oscourse.win In this video I use strace, a tool that traces the system calls a process makes, to count how many there are. We compare a simple task of reading from a file, and we run the program in different runtimes, namely NodeJS, Bun, Python and native C. We discuss the cost of kernel mode switches, system calls and pe 0:00 Intro 5:00 Code Explanation 6:30 Python 9:30 NodeJS 12:30 BunJS 13:12 C 16:00 Summary

native fundamentals python kernel nodejs

When do you use threads?

Play Episode Listen Later Sep 13, 2024 31:08

Fundamentals of Operating Systems Course https://os.husseinnasser.com When do you use threads? I would say in scenarios where the task is either 1) an IO blocking task, 2) CPU heavy, or 3) a large volume of small tasks. In any of the cases above, it is favorable to offload the task to a thread.

1) IO blocking task: When you read from or write to disk, depending on how you do it and the kernel interface you use, the write might be blocking. This means the process that executes the IO will not be allowed to execute any more code until the write/read completes. That is why you see most logging operations done on a secondary thread (like libuv that Node uses). This way the thread is blocked but the main process/thread can resume its work. If you can do file reads/writes asynchronously with, say, io_uring, then you technically don't need threading. Notice how I said file IO, because it is different from socket IO, which is always done asynchronously with epoll/select etc.

2) CPU heavy: The second use case is when the task requires lots of CPU time, which then starves/blocks the rest of the process from doing its normal job. Offloading that task to a thread so that it runs on a different core allows the main process to continue running on its original core.

3) Large volume of small tasks: The third use case is when you have a large number of small tasks and a single process can't deliver enough throughput. An example would be accepting connections. A single process can only accept connections so fast, so to increase throughput when you have a massive number of clients connecting, you would spin up multiple threads to accept those connections and of course read and process requests. Perhaps you would also enable port reuse so that you avoid accept mutex locking.

Keep in mind threads come with challenges and problems, so avoid them when they are not required. 0:00 Intro 1:40 What are threads? 7:10 IO blocking Tasks 17:30 CPU Intensive Tasks 22:00 Large volume of small tasks
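The IO-blocking case can be sketched with a small thread pool. Everything here is illustrative: `blocking_write` is a hypothetical stand-in that sleeps instead of touching the disk, and the pool size of 4 is an arbitrary choice.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_write(record: str) -> str:
    """Stand-in for a blocking disk write (hypothetical; sleeps instead of doing real I/O)."""
    time.sleep(0.05)
    return f"wrote {record}"


def write_all(records: list[str]) -> list[str]:
    # Each blocking call runs on a worker thread, so the waits overlap
    # instead of happening back to back on the calling thread.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(blocking_write, records))
```

Four records here take roughly one sleep interval rather than four. The same shape applies to the "large volume of small tasks" case, while CPU-heavy work in Python specifically would usually want processes rather than threads.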

large threads fundamentals io cpu node

Frontend and Backends Timeouts

Play Episode Listen Later Sep 7, 2024 25:23

I am fascinated by how timeouts affect backend and frontend programming. When a party is waiting on something you can place a timeout to break the wait. This is useful for freeing resources to more critical processes, detecting slow operations and even avoiding DOS attacks. Contrary to common beliefs, timeouts are not exclusive to request processing, they can be applied to other parts of the frontend-backend communications. Let us explore this briefly. 0:00 Intro 2:30 Connection Timeout 5:00 Request Read timeout 10:00 Wait Timeout 12:00 Usage Timeout 14:00 Response Timeout 16:00 Canceling a request 19:50 Proxies and timeouts
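Two of the timeouts listed above, the connection timeout and the wait for a response, map directly onto socket options. The following is a client-side sketch under assumed names (`read_first_bytes` and its parameters are made up for illustration), not a description of how any particular framework does it.

```python
import socket
from typing import Optional


def read_first_bytes(host: str, port: int,
                     connect_timeout: float, read_timeout: float) -> Optional[bytes]:
    """Return the first bytes the peer sends, or None if the read timeout fires."""
    # Connection timeout: bound how long we wait for the TCP handshake
    with socket.create_connection((host, port), timeout=connect_timeout) as conn:
        # Response timeout: bound how long we wait for the peer to send anything
        conn.settimeout(read_timeout)
        try:
            return conn.recv(4096)
        except socket.timeout:
            return None  # give up waiting; the socket is closed on exit
```

Breaking the wait frees the socket and the calling thread for other work, which is the resource argument made in the description.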

contrary canceling backend frontend proxies timeouts

Postgres is combining IO in version 17

Play Episode Listen Later Sep 2, 2024 27:39

Learn more about database and OS internals, check out my courses: Fundamentals of database engineering https://databases.win Fundamentals of operating systems https://oscourse.win This new PostgreSQL 17 feature is a game changer. You see, Postgres, like most databases, works with fixed-size pages. Pretty much everything is in this format: indexes, table data, etc. Those pages are 8K in size; each page has the rows or index tuples and a fixed header. The pages are just bytes in files, and they are read and cached in the buffer pool. To read page 0, for example, you would call read on offset 0 for 8192 bytes. To read page 1, that is another read system call from offset 8192 for 8192 bytes; page 7 is offset 57,344 for 8192 bytes, and so on. If a table is 100 pages stored in a file, to do a full table scan we would be making 100 system calls, and each system call has an overhead (I talk about all of that in my OS course). The enhancement in Postgres 17 is to combine I/Os. You can specify how much I/O to combine, so while it is technically possible to scan that entire table in one system call, that doesn't mean it's always a good idea of course, and I'll talk about that. This also seems to include vectorized I/O, with the preadv system call, which takes an array of buffers and fills them from one contiguous range of the file. The challenge will become how to not read too much. Say I'm doing a seq scan to find something: I read page 0, find it and quit, so I don't need to read any more pages. With this feature I might read 10 pages in one I/O and pull all their content into shared buffers only to find my result in the first page (essentially wasting disk bandwidth, memory, etc.). It is going to be interesting to balance this out.
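The arithmetic is simple: page N starts at byte N × 8192. The sketch below contrasts one system call per page with one combined call for a run of adjacent pages. It is an illustration in Python using `os.pread` on a plain file, not Postgres's actual code path, and the function names are made up.

```python
import os

PAGE_SIZE = 8192  # Postgres default page size in bytes


def page_offset(page_no: int) -> int:
    """Byte offset of a page within the file."""
    return page_no * PAGE_SIZE


def read_page(fd: int, page_no: int) -> bytes:
    """One system call per page."""
    return os.pread(fd, PAGE_SIZE, page_offset(page_no))


def read_pages_combined(fd: int, first_page: int, count: int) -> list[bytes]:
    """One system call for `count` adjacent pages, split back into pages afterwards."""
    data = os.pread(fd, PAGE_SIZE * count, page_offset(first_page))
    return [data[i:i + PAGE_SIZE] for i in range(0, len(data), PAGE_SIZE)]
```

A 100-page scan with `read_page` costs 100 system calls, while `read_pages_combined(fd, 0, 100)` costs one, at the price of reading pages the scan may never need. A production version would also have to handle short reads.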

os ios fundamentals io ill 8k postgresql postgres

Windows vs Linux Kernel

Play Episode Listen Later Aug 30, 2024 37:23

Fundamentals of Operating Systems Course https://os.husseinnasser.com Why the Windows kernel connects slower than Linux: I explore the behavior of the TCP/IP stack in the Windows kernel when it receives a RST from the backend server, especially when the host is available but the port we are trying to connect to is not. This behavior is exacerbated by having both IPv6 and IPv4, and when the Happy Eyeballs protocol is in place and IPv6 is favored. 0:00 Intro 0:30 Fundamentals TCP/IP 3:00 Unreachable Port Behavior 6:00 Client Kernel Behavior (Linux vs Windows) 11:40 Slow TCP Connect on Windows 15:00 localhost, IPv6 and IPv4 20:00 Happy Eyeballs 28:00 Registry keys to change the behavior 31:00 Port Unreachable vs Host Unreachable https://daniel.haxx.se/blog/2024/08/14/slow-tcp-connect-on-windows/

windows fundamentals registry ipv6 tcp ip ipv4 linux kernel rst

Running out of TCP ephemeral source ports

Play Episode Listen Later Aug 25, 2024 20:06

In this episode of the backend engineering show I describe an interesting bug I ran into where the web server ran out of ephemeral ports causing the system to halt. 0:00 Intro 0:30 System architecture 2:20 The behavior of the bug 4:00 Backend Troubleshooting 7:00 The cause 15:30 Ephemeral ports on loopback

running system ports ephemeral

io uring gets even faster

Play Episode Listen Later May 20, 2024 16:35

Fundamentals of Operating Systems Course https://os.husseinnasser.com Linux I/O expert and subsystem maintainer Jens Axboe has submitted all of the IO_uring feature updates ahead of the imminent Linux 6.10 merge window. In this video I explore this with a focus on zero copy. 0:00 Intro 0:30 IO_uring gets faster 2:00 What is io_uring 7:00 How Normal Copying Work 12:00 How Zero Copy Works 13:50 ZeroCopy and TLS https://www.phoronix.com/news/Linux-6.10-IO_uring https://lore.kernel.org/io-uring/fef75ea0-11b4-4815-8c66-7b19555b279d@kernel.dk/?s=09

fundamentals io linux

They made Python faster with this compiler option

Play Episode Listen Later May 7, 2024 29:04

Fundamentals of Operating Systems Course https://oscourse.win Looks like Fedora is compiling CPython with the -O3 flag, which does aggressive function inlining among other optimizations. This seems to improve Python benchmark performance by at most 1.16x at a cost of an extra 3MB in binary size (text segment), although it does seem to slow down some benchmarks as well, though not significantly. O1 - local register allocation, subexpression elimination O2 - Function inlining, only small functions O3 - Aggressive inlining, SIMD 0:00 Intro 1:00 Fedora Linux gets Fast Python 5:40 What is Compiling? 9:00 Compiling with No Optimization 12:10 Compiling with -O1 15:30 Compiling with -O2 20:00 Compiling with -O3 23:20 Showing Numbers Backend Troubleshooting Course https://performance.husseinnasser.com

option fundamentals python compiling compiler smid 3mb o1

How Apache Kafka got faster by switching ext4 to XFS

Play Episode Listen Later Apr 29, 2024 33:52

https://oscourse.win Allegro improved their Kafka produce tail latency by over 80% when they switched from ext4 to xfs. What I enjoyed most about this article is the detailed analysis and tweaking the team did on ext4 before considering switching to xfs. This is a classic case of what a good tech blog looks like, in my opinion. 0:00 Intro 0:30 Summary 2:35 How Kafka Works? 5:00 Producers Writes are Slow 7:10 Tracing Kafka Protocol 12:00 Tracing Kernel System Calls 16:00 Journaled File Systems 21:00 Improving ext4 26:00 Switching to XFS Blog https://blog.allegro.tech/2024/03/kafka-performance-analysis.html

improving switching kafka apache kafka ext4

Google Patches Linux kernel with 40% TCP performance

Play Episode Listen Later Mar 5, 2024 13:35

Get my backend course https://backend.win Google submitted a patch to Linux Kernel 6.8 to improve TCP performance by 40%, this is done via rearranging the tcp structures for better cpu cache lines, I explore this here. 0:00 Intro 0:30 Google improves Linux Kernel TCP by 40% 1:40 How CPU Cache Line Works 6:45 Reviewing the Google Patch https://www.phoronix.com/news/Linux-6.8-Networking https://lore.kernel.org/netdev/20231129072756.3684495-1-lixiaoyan@google.com/ Discovering Backend Bottlenecks: Unlocking Peak Performance https://performance.husseinnasser.com

google performance reviewing linux patches tcp linux kernel

Database Torn pages

Play Episode Listen Later Feb 29, 2024 15:33

0:00 Intro 2:00 File System Block vs Database Pages 4:00 Torn pages or partial page 7:40 How Oracle Solves torn pages 8:40 MySQL InnoDB Doublewrite buffer 10:45 Postgres Full page writes

pages torn databases

Cloudflare Open sources Pingora (NGINX replacement)

Play Episode Listen Later Feb 28, 2024 31:05

Get my backend course https://backend.win Cloudflare has announced they are open sourcing Pingora as a networking framework! Big news, let us discuss. 0:00 Intro 0:30 Reasons why Cloudflare built Pingora? 3:00 It is a framework! 7:30 What in Pingora? 11:50 Security in Pingora 13:45 Multi-threading in Pingora 21:00 Customization vs Configuration 25:00 Summary https://blog.cloudflare.com/pingora-open-source/?utm_campaign=cf_blog&utm_content=20240228&utm_medium=organic_social&utm_source=twitter

security replacement cloudflare customization nginx

The Internals of MongoDB

Play Episode Listen Later Feb 19, 2024 44:57

https://backend.win https://databases.win I'm a big believer that database systems share similar core fundamentals at their storage layer, and understanding them allows one to compare different DBMSs objectively. For example, how documents are stored in MongoDB is no different from how MySQL or PostgreSQL store rows. Everything goes to pages of fixed size and those pages are flushed to disk. Each database defines its page size differently based on its workload. For example, MongoDB's default page size is 32KB, MySQL InnoDB's is 16KB and PostgreSQL's is 8KB. The trick is to fetch what you need from disk efficiently with as few I/Os as possible; the rest is API. In this video I discuss the evolution of MongoDB's internal architecture and how documents are stored and retrieved, focusing on the index storage representation. I assume the reader is well versed in the fundamentals of database engineering such as indexes, B+Trees, data files, WAL etc. You may pick up my database course to learn the skills. Let us get started.

ios api wal mongodb mysql postgresql internals

The Beauty of Programming Languages

Play Episode Listen Later Feb 19, 2024 17:33

In this video I explore the type of languages, compiled, garbage collected, interpreted, JIT and more.

beauty programming languages jit

The Danger of Defaults - A PostgreSQL Story

Play Episode Listen Later Feb 18, 2024 11:34

I talk about default values and how PostgreSQL 14 got slower when a default parameter changed. Mark's blog https://smalldatum.blogspot.com/2024/02/it-wasnt-performance-regression-in.html

danger defaults postgresql

Database Background writing

Play Episode Listen Later Feb 16, 2024 9:08

Background writing is a process that writes dirty pages in the shared buffer to disk (well, they go to the OS file cache and then get flushed to disk by the OS). I go into this process in this video.

writing os databases

The Cost of Memory Fragmentation

Play Episode Listen Later Jan 29, 2024 39:07

Fragmentation is a very interesting topic to me, especially when it comes to memory. While virtual memory does solve external fragmentation (you can still allocate logically contiguous memory in non-contiguous physical memory), it does however introduce performance delays as we jump all over physical memory to read what appears to us, for example, as a contiguous array in virtual memory. You see, DDR RAM consists of banks, rows and columns. Each row has around 1024 columns and each column has 64 bits, which makes a row around 8 KiB. The cost of accessing the RAM is the cost of “opening” a row and all its columns (around 50-100 ns). Once the row is opened, all the columns are opened and the 8 KiB is cached in the row buffer in the RAM. The CPU can ask for an address and transfer 64 bytes at a time (called bursts), so if the CPU (or the MMU to be exact) asks for the next 64 bytes next to it, it comes at no cost because the entire row is cached in the RAM. However, if the CPU sends a different address in a different row, the old row must be closed and a new row opened, taking an additional 50 ns hit. So spatial locality of access ensures efficiency, and fragmentation does hurt performance if the data you are accessing is not contiguous in physical memory (of course it doesn't matter whether it is contiguous in virtual memory). This kind of reminds me of the old days of HDDs and how the disk needle physically travels across the disk to read one file, which prompted the need for “defragmentation”, although RAM access (and SSD NAND for that matter) isn't as bad. Moreover, virtual memory introduces internal fragmentation because of the use of fixed-size blocks (called pages, often 4 KiB in size), which are mapped to frames in physical memory. So if you want to allocate a 32-bit integer (4 bytes) you get 4 KiB worth of memory, leaving a whopping 4092 bytes allocated to the process but unused, which cannot be used by the OS. These little pockets of memory can add up across many processes. Another reason developers should take care when allocating memory for efficiency.
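The numbers in the paragraph above can be checked with a few lines of arithmetic. This sketch takes the stated figures at face value (about 1024 columns of 64 bits per row, 64-byte bursts, 4 KiB pages) and models allocation at page granularity, as the paragraph does. Real allocators subdivide pages, so the waste per small object is usually far lower in practice. The function names are made up for illustration.

```python
ROW_COLUMNS = 1024   # columns per DRAM row (approximate figure from the text)
COLUMN_BITS = 64     # bits per column
BURST_BYTES = 64     # bytes transferred per burst
PAGE_SIZE = 4096     # bytes in a 4 KiB page


def row_size_bytes() -> int:
    """Size of one DRAM row, and therefore of the row buffer."""
    return ROW_COLUMNS * COLUMN_BITS // 8


def bursts_per_row() -> int:
    """How many 64-byte bursts can be served from one open row."""
    return row_size_bytes() // BURST_BYTES


def internal_fragmentation(requested_bytes: int, page_size: int = PAGE_SIZE) -> int:
    """Bytes left unused when a request is rounded up to whole pages."""
    pages = -(-requested_bytes // page_size)  # ceiling division
    return pages * page_size - requested_bytes
```

So one open row is 8192 bytes and serves 128 bursts before another row has to be opened, and a 4-byte request rounded up to a page leaves 4092 bytes unused.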

cost memory os ram cpu fragmentation hdd mmu

The Real Hidden Cost of a Request

Play Episode Listen Later Dec 13, 2023 13:08

In this video I explore the hidden costs of sending a request from the frontend to the backend. Medium article https://medium.com/@hnasr/the-journey-of-a-request-to-the-backend-c3de704de223

hidden cost

Why create Index blocks writes

Play Episode Listen Later Oct 28, 2023 12:04

Fundamentals of Database Engineering udemy course (link redirects to udemy with coupon) https://database.husseinnasser.com In this video I explore how create index works, why it blocks writes, and how create index concurrently works and allows writes. 0:00 Intro 1:28 How Create Index works 4:45 Create Index blocking Writes 5:00 Create Index Concurrently

fundamentals blocks index writes

Consider this before migrating the Backend to HTTP/3

Play Episode Listen Later Oct 5, 2023 12:19

HTTP/3 is getting popular in the cloud scene but before you migrate to HTTP/3 consider its cost. I explore it here. 0:00 Intro HTTP/3 is getting popular 3:40 HTTP/1.1 Cost 5:18 HTTP/2 Cost 6:30 HTTP/3 Cost https://blog.apnic.net/2023/09/25/why-http-3-is-eating-the-world/

backend migrating

Encrypted Client Hello - The Pros & Cons

Play Episode Listen Later Sep 29, 2023 33:17

The Encrypted Client Hello or ECH is a new RFC that encrypts the TLS client hello to hide sensitive information like the SNI. In this video I go through pros and cons of this new rfc. 0:00 Intro 2:00 SNI 4:00 Client Hello 8:40 Encrypted Client Hello 11:30 Inner Client Hello Encryption 18:00 Client-Facing Outer SNI 21:20 Decrypting Inner Client Hello 23:30 Disadvantages 26:00 Censorship vs Privacy ECH https://blog.cloudflare.com/announcing-encrypted-client-hello/ https://chromestatus.com/feature/6196703843581952

clients censorship pros cons disadvantages tls encrypted rfc ech sni

The Journey of a Request to the Backend

Play Episode Listen Later Aug 1, 2023 52:14

From the frontend through the kernel to the backend process. When we send a request to a backend, most of us focus on the processing aspect of the request, which is really just the last step. There is so much more happening before a request is ready to be processed, and most of these steps happen in the kernel. I break this into 6 steps; each step can theoretically be executed by a dedicated thread or process. Pretty much all backends, web servers, proxies, frameworks and even databases have to do all these steps, and they all choose to do them differently. Grab my backend performance course https://performance.husseinnasser.com 0:00 Intro 3:50 What is a Request? 10:14 Step 1 - Accept 21:30 Step 2 - Read 29:30 Step 3 - Decrypt 34:00 Step 4 - Parse 40:36 Step 5 - Decode 43:14 Step 6 - Process Medium article https://medium.com/@hnasr/the-journey-of-a-request-to-the-backend-c3de704de223
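The six steps listed in the chapters can be sketched in Python on a loopback socket. This is a minimal illustration, not how any particular web server does it: the `/sum` endpoint, the JSON payload and the function names are made up for the example, the connection is plaintext so the decrypt step is empty, and everything runs in one thread.

```python
import json
import socket


def handle_one_request(server):
    # Step 1 - Accept: the kernel has already completed the TCP handshake;
    # accept() only takes the finished connection off the accept queue.
    conn, _addr = server.accept()
    with conn:
        # Step 2 - Read: copy raw bytes out of the socket receive buffer.
        raw = b""
        while b"\r\n\r\n" not in raw:
            chunk = conn.recv(4096)
            if not chunk:
                break
            raw += chunk
        # Step 3 - Decrypt: nothing to do here, this sketch does not use TLS.
        # Step 4 - Parse: find the protocol boundaries (request line, headers, body).
        head, _, body = raw.partition(b"\r\n\r\n")
        request_line, *header_lines = head.decode("ascii").split("\r\n")
        method, path, _version = request_line.split(" ")
        headers = dict(line.split(": ", 1) for line in header_lines)
        length = int(headers.get("Content-Length", "0"))
        while len(body) < length:  # keep reading until the whole body arrived
            chunk = conn.recv(4096)
            if not chunk:
                break
            body += chunk
        # Step 5 - Decode: turn the body bytes into an application object.
        payload = json.loads(body.decode("utf-8"))
        # Step 6 - Process: the application logic most of us focus on.
        result = {"method": method, "path": path, "sum": sum(payload["numbers"])}
        reply = json.dumps(result).encode("utf-8")
        conn.sendall(
            b"HTTP/1.1 200 OK\r\n"
            + f"Content-Length: {len(reply)}\r\n\r\n".encode("ascii")
            + reply
        )
    return result


def demo():
    # Port 0 lets the OS pick a free port on the loopback interface.
    with socket.create_server(("127.0.0.1", 0)) as server:
        with socket.create_connection(server.getsockname()) as client:
            body = json.dumps({"numbers": [1, 2, 3]}).encode("utf-8")
            client.sendall(
                b"POST /sum HTTP/1.1\r\nHost: localhost\r\n"
                + f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
                + body
            )
            result = handle_one_request(server)
            response = client.recv(65536)
    return result, response


if __name__ == "__main__":
    print(demo())
```

Note that the client connects and sends before the server ever calls `accept()`. That works because the handshake and the buffering happen in the kernel, which is the point the episode makes about where most of the work happens.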

accept backend kernel

They Enabled Postgres Partitioning and their Backend fell apart

Play Episode Listen Later Jun 24, 2023 32:40

In a wonderful blog, Kyle explores the pains he faced managing a Postgres instance for a startup he works for, and how enabling partitioning significantly increased wait events, causing the backend and subsequently NGINX to throw 500 errors. We discuss this in this video/podcast https://www.kylehailey.com/post/postgres-partition-pains-lockmanager-waits

fell enabled backend postgres nginx partitioning

WebTransport - A Backend Game Changer

Play Episode Listen Later Jun 9, 2023 15:01

WebTransport is a cutting-edge protocol framework designed to support multiplexed and secure transport over HTTP/2 and HTTP/3. It brings together the best of web and transport technologies, providing an all-in-one solution for real-time, bidirectional communication on the web. Watch full episode (subscribers only) https://spotifyanchor-web.app.link/e/cTSGkq5XuAb

game changers backend

Your SSD lies but that's ok | Postgres fsync

Play Episode Listen Later May 25, 2023 30:04

fsync is a Linux system call that flushes all pages and metadata for a given file to the disk. It is indeed an expensive operation but is required for durability, especially for database systems. Regular writes that make it to the disk controller are often placed in the SSD's local cache to accumulate more writes before getting flushed to the NAND cells. However, when the disk controller receives the flush command it is required to immediately persist all of the data to the NAND cells. Some SSDs, however, don't do that because they don't trust the host, and they no-op the fsync. In this video I explain this in detail and go through the many options Postgres provides to fine-tune fsync. 0:00 Intro 1:00 A Write doesn't write 2:00 File System Page Cache 6:00 Fsync 7:30 SSD Cache 9:20 SSD ignores the flush 9:30 15 Year old Firefox fsync bug 12:30 What happens if SSD loses power 15:00 What options does Postgres expose? 15:30 open_sync (O_SYNC) 16:15 open_datasync (O_DSYNC) 17:10 O_DIRECT 19:00 fsync 20:50 fdatasync 21:13 fsync = off 23:30 Don't make your API simple 26:00 Database on metal?
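As a rough illustration of the difference between a plain write and a flushed one, here is a Python sketch using the standard `os` module. The file names and helper functions are made up for the example. The first helper mirrors the write-then-fsync approach; the second opens the file with `O_DSYNC` (falling back to `O_SYNC`, or to no flag where neither exists), which is the flag the chapter list pairs with `open_datasync`.

```python
import os
import tempfile


def write_then_fsync(path, data):
    """Plain write() followed by an explicit fsync()."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # write() normally returns once the bytes are in the file system
        # page cache; nothing has necessarily reached the disk yet.
        os.write(fd, data)
        # fsync() asks the kernel to flush the file's dirty pages and
        # metadata to the device. Whether the device then persists them
        # to NAND right away is outside the program's control.
        os.fsync(fd)
    finally:
        os.close(fd)


def write_with_sync_flag(path, data):
    """Open with O_DSYNC (or O_SYNC) so each write() is itself synchronous."""
    # Not every platform defines these flags, so fall back gracefully.
    sync_flag = getattr(os, "O_DSYNC", getattr(os, "O_SYNC", 0))
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | sync_flag, 0o644)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path_a = os.path.join(tmp, "a.log")
        path_b = os.path.join(tmp, "b.log")
        write_then_fsync(path_a, b"commit 1\n")
        write_with_sync_flag(path_b, b"commit 2\n")
        with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
            return fa.read(), fb.read()


if __name__ == "__main__":
    print(demo())
```

Both helpers read back the same bytes afterwards; the difference is only in when the program is told the data is safe, and even then the program has to trust that the device honored the flush.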

lies write regular api databases firefox ssd postgres nand

The problem with software engineering

Play Episode Listen Later May 21, 2023 17:39

Ego is the main cause of a defective software product. The ego of the engineer or the tech lead seeps into the quality of the product. Fundamentals of Backend Engineering Design patterns udemy course (link redirects to udemy with coupon) https://backend.husseinnasser.com

fundamentals software engineering

2x Faster Reads and Writes with this MongoDB feature | Clustered Collections

Play Episode Listen Later May 11, 2023 27:01

Fundamentals of Database Engineering udemy course (link redirects to udemy with coupon) https://database.husseinnasser.com In version 5.3, MongoDB introduced a feature called clustered collections, which stores documents in the _id index as opposed to the hidden WiredTiger index. This eliminates an entire B+tree seek for reads using the _id index and also removes the additional write to the hidden index, speeding up both reads and writes. However, as we know in software engineering, everything has a cost. This feature does come with a few limitations that one must be aware of before using it. In this video I discuss the following: How Original MongoDB Collections Work, How Clustered Collections Work, Benefits of Clustered Collections, Limitations of Clustered Collections
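The saved seek and the saved write can be pictured with a toy model in Python. This is only an assumption-laden sketch of the description above: plain dictionaries stand in for B+trees, the class and attribute names are invented, and none of this is MongoDB or WiredTiger code.

```python
class RegularCollection:
    """Toy model: the _id index points at a record id in a hidden index."""

    def __init__(self):
        self.id_index = {}      # _id -> record id
        self.hidden_index = {}  # record id -> document
        self.next_record_id = 1
        self.seeks = 0
        self.writes = 0

    def insert(self, doc):
        record_id = self.next_record_id
        self.next_record_id += 1
        self.hidden_index[record_id] = doc        # write 1: the document itself
        self.id_index[doc["_id"]] = record_id     # write 2: the _id index entry
        self.writes += 2

    def find_by_id(self, _id):
        record_id = self.id_index[_id]            # seek 1: the _id index
        doc = self.hidden_index[record_id]        # seek 2: the hidden index
        self.seeks += 2
        return doc


class ClusteredCollection:
    """Toy model: documents live directly in the _id index."""

    def __init__(self):
        self.id_index = {}  # _id -> document
        self.seeks = 0
        self.writes = 0

    def insert(self, doc):
        self.id_index[doc["_id"]] = doc           # single write
        self.writes += 1

    def find_by_id(self, _id):
        self.seeks += 1                           # single seek
        return self.id_index[_id]


if __name__ == "__main__":
    for collection in (RegularCollection(), ClusteredCollection()):
        collection.insert({"_id": 7, "name": "ada"})
        collection.find_by_id(7)
        print(type(collection).__name__, collection.writes, collection.seeks)
```

The model only counts index touches; it says nothing about the limitations of clustered collections that the episode also covers.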

feature fundamentals reads collections writes mongodb clustered

Prime Video Swaps Microservices for Monolith: 90% Cost Reduction

Play Episode Listen Later May 6, 2023 35:58

The Prime Video engineering team has posted a blog detailing how they moved their live stream monitoring service from microservices to a monolith, reducing their cost by 90%. Let us discuss this. 0:00 Intro 2:00 Overview 10:35 Distributed System Overhead 21:30 From Microservices to Monolith 29:00 Scaling the Monolith 32:30 Takeaways https://www.primevideotech.com/video-streaming/scaling-up-the-prime-video-audio-video-monitoring-service-and-reducing-costs-by-90 Fundamentals of Backend Engineering Design patterns udemy course (link redirects to udemy with coupon) https://backend.husseinnasser.com

cost prime scaling prime video reduction monolith swaps microservices

A Deep Dive in How Slow SELECT * is

Play Episode Listen Later May 2, 2023 39:23

Fundamentals of Database Engineering udemy course (link redirects to udemy with coupon) https://database.husseinnasser.com In a row-store database engine, rows are stored in units called pages. Each page has a fixed header and contains multiple rows, with each row having a record header followed by its respective columns. When the database fetches a page and places it in the shared buffer pool, we gain access to all rows and columns within that page. So, the question arises: if we have all the columns readily available in memory, why would SELECT * be slow and costly? Is it really as slow as people claim it to be? And if so why is it so? In this post, we will explore these questions and more. 0:00 Intro 1:49 Database Page Layout 5:00 How SELECT Works 10:49 No Index-Only Scans 18:00 Deserialization Cost 21:00 Not All Columns are Inline 28:00 Network Cost 36:00 Client Deserialization https://medium.com/@hnasr/how-slow-is-select-8d4308ca1f0c
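The "No Index-Only Scans" point can be observed with a small Python sketch. It uses SQLite from the standard library purely because it runs anywhere; SQLite is a stand-in here, not the engine discussed in the episode, and the table, index and column names are invented for the example. The query plan shows that selecting only the indexed column can be answered from the index alone, while SELECT * has to go back to the table rows.

```python
import sqlite3


def query_plans():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, bio TEXT)")
    conn.execute("CREATE INDEX idx_users_name ON users (name)")
    conn.executemany(
        "INSERT INTO users (name, bio) VALUES (?, ?)",
        [(f"user{i}", "x" * 500) for i in range(1000)],
    )
    queries = {
        "narrow": "SELECT name FROM users WHERE name = 'user42'",
        "star": "SELECT * FROM users WHERE name = 'user42'",
    }
    plans = {}
    for label, sql in queries.items():
        rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
        # The last column of each plan row is the human-readable detail.
        plans[label] = " | ".join(row[-1] for row in rows)
    conn.close()
    return plans


if __name__ == "__main__":
    for label, plan in query_plans().items():
        print(label, "->", plan)
```

SQLite calls an index-only scan a "covering index" in its plan output. The narrow query gets one; the SELECT * query uses the same index to find the row but then has to fetch the remaining columns from the table.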

deep dive fundamentals select

AWS Serverless Lambda Supports Response Streaming

Play Episode Listen Later Apr 7, 2023 13:14

Lambda now supports response payload streaming. You can now flush data to the network socket as soon as it is available, and it will be written to the client socket. I think this is a game-changing feature. 0:00 Intro 1:00 Traditional Lambda 3:00 Server Sent Events & Chunk-Encoding 5:00 What happens to clients? 6:00 Supported Regions 7:00 My thoughts Fundamentals of Backend Engineering Design patterns udemy course (link redirects to udemy with coupon) https://backend.husseinnasser.com

streaming lambda serverless

The Cloudflare mTLS vulnerability - A Deep Dive Analysis

Play Episode Listen Later Apr 6, 2023 43:13

Cloudflare released a blog detailing a vulnerability that has been in their system for nearly two years. It is related to mTLS, or mutual TLS, and specifically to client certificate revocation. I explore this in detail. 0:00 Intro 3:00 The Vulnerability 7:00 What happened? 8:50 Certificate Revocation 12:30 Rejecting certain endpoints 17:00 Certificate Authentication 20:30 Certificate serial number 24:00 Session Resumption (PSK) 35:00 The bug 37:00 How they addressed the problem Fundamentals of Backend Engineering Design patterns udemy course (link redirects to udemy with coupon) https://backend.husseinnasser.com

deep dive vulnerability certificates rejecting cloudflare tls

The Virgin Media ISP outage - What happened?

Play Episode Listen Later Apr 6, 2023 23:23

BGP (Border Gateway Protocol) withdrawals caused Virgin Media ISP customers to lose their Internet connection. I go into the details in this video. 0:00 Intro 2:00 What happened? 4:11 How BGP works? 11:50 Virgin Media withdrawals 15:00 Deep dive Fundamentals of Backend Engineering Design patterns udemy course (link redirects to udemy with coupon) https://backend.husseinnasser.com

internet deep fundamentals virgin outage isp virgin media
