Podcasts about site reliability engineering sre

53 podcasts
64 episodes
41m average duration
1 new episode monthly
Latest: Jun 8, 2026

Popularity by year, 2019 to 2026 (chart not reproduced)

Best podcasts about site reliability engineering sre

S.R.E.path Podcast

5 episodes with site reliability engineering sre

Packet Pushers - Full Podcast Feed

2 episodes with site reliability engineering sre

Software Engineering Radio - The Podcast for Professional Software Developers

2 episodes with site reliability engineering sre

TestGuild Performance Testing and Site Reliability Podcast

2 episodes with site reliability engineering sre

Google SRE Prodcast

2 episodes with site reliability engineering sre

Packet Pushers - Full Stack Journey

2 episodes with site reliability engineering sre

Find Flow

2 episodes with site reliability engineering sre

Latest podcast episodes about site reliability engineering sre

Software Development When Budget and Velocity Fade Away with Tyler Wells

Silicon Valley Tech And AI With Gary Fowler

Jun 8, 2026 31:54

Join Tyler Wells, Co-founder and CTO of BrainGrid, for a forward-looking discussion on how artificial intelligence is rewriting the rules of product development. Boasting over 25 years of distributed systems engineering—including a foundational tenure at Skype building Facebook's first video-calling engine and 7+ years directing Video and global SRE at Twilio—Tyler has built infra where structural failure was not an option. In this episode, we explore why the traditional constraints of software engineering—headcount, timelines, and budgets—are dissolving, leaving a brand-new bottleneck at the front of the innovation cycle: human imagination.


The Best Open Source US Model (Right behind China)

The Generative AI Meetup Podcast

Jun 7, 2026 114:55 Transcription Available

https://novacut.ai/ https://genaimeetup.com/

Anthropic has officially closed a $65 billion Series H at a $965 billion valuation, nearly 2.5x its valuation from just 100 days ago. Meanwhile, funding is flowing across the ecosystem: Fireworks AI at $15B, Baseten at $11B, OpenRouter's $113M Series B, and Cognition AI's $1B Series D. NVIDIA went on an open-source super week with Nemotron 3 Ultra, Cosmos 3, and Nemotron 3.5 ASR. Microsoft dropped 5 new MAI models. Google released Gemma 4 12B, and Anthropic shipped Opus 4.8. On the benchmarks front, DeepSWE crowns GPT-5.5 as the leader in long-horizon coding tasks, while ITBench shows even frontier models struggle with real-world SRE incidents: Claude Opus 4.7 tops out at just 47%. Plus: Cloudflare acquires VoidZero to build the future of AI-native edge development, and Google is paying SpaceX $920M/month for compute.

Topics covered:
• Anthropic's $65B Series H and path to $1T
• Fireworks AI, Baseten, OpenRouter & Cognition funding rounds
• Microsoft's 5 new MAI models
• NVIDIA's open-source super week (Nemotron, Cosmos 3)
• MiniMax M3, Gemma 4 12B, JetBrains Mellum2, Opus 4.8
• DeepSWE benchmark: GPT-5.5 leads long-horizon coding
• ITBench: Frontier models under 50% on real SRE tasks
• Cloudflare + VoidZero for AI-native edge dev
• Google's $920M/month SpaceX compute deal

#AI #Anthropic #NVIDIA #OpenAI #AInews #TechNews #LLM

Funding rounds

Anthropic formally confirmed the closure of its $65 billion Series H funding round at a post-money valuation of $965 billion. This represents a 2.5-fold increase over its $380 billion Series G valuation from February 2026, adding $585 billion in value in approximately 100 days. https://www.anthropic.com/news/series-h

Fireworks AI is raising at a $15B valuation, a near fourfold increase from its $4 billion Series C valuation recorded in October 2025. It processes 15 trillion tokens daily for major production clients including Cursor, Notion, and Perplexity. https://finance.yahoo.com/sectors/technology/articles/fireworks-ai-eyes-15-billion-174609357.html

Baseten is raising $1B at an $11B valuation. Its annualized revenue skyrocketed from $200 million to $600 million over a single quarter. https://techstartups.com/2026/05/26/ai-inference-startup-baseten-in-talks-to-raise-1-billion-at-11-billion-valuation/

OpenRouter has secured a $113 million Series B. It has experienced exponential traffic growth, with weekly production throughput expanding fivefold from 5 trillion to 25 trillion tokens over a six-month horizon. https://www.businesswire.com/news/home/20260526953416/en/OpenRouter-Raises-%24113-Million-CapitalG-led-Series-B-as-Weekly-Volume-Explodes-to-25T-Tokens

Further up the stack: Cognition AI secured a $1 billion Series D round led by Lux Capital and 8VC. https://cognition.ai/blog/series-d

Model Releases

MAI models:
• MAI-Code-1-Flash: A 5-billion active parameter model optimized for ultra-low latency within GitHub Copilot and VS Code.
• MAI-Image-2.5: A high-fidelity image generation model ranking third on global image evaluation arenas, outperforming competing architectures like Nano Banana Pro.
• MAI-Transcribe-1.5: A multi-lingual speech processing engine offering fivefold speed improvements across 43 languages.
• MAI-Voice-2: Natural audio and voice generation across 15 languages, available at a highly competitive price point.
• Web IQ: A search-grounding API engineered to directly compete with Perplexity.
https://microsoft.ai/models/
https://www.peoplematters.in/news/ai-and-emerging-tech/uber-imposes-dollar1500-monthly-ai-spending-limit-on-employees-amid-rising-costs-50073

Nvidia has executed an "Open-Source Super Week," positioning itself as a dominant software and model publisher:
• Nemotron 3 Ultra (the best US open-source, open-weights model, but behind China): A massive 550-billion parameter MoE (55 billion active) designed with a 1-million token context window, optimized specifically for high-throughput, cyclical agent loops. It achieved peak throughput rates of 400 tokens per second on day-zero optimized clusters.
• Cosmos 3: A physical AI world-modeling framework comprising 16-billion Nano and 64-billion Super variants. Built on a Mixture-of-Transformers (MoT) architecture, Cosmos 3 natively binds textual, visual, auditory, and physical kinetic vectors.
• Nemotron 3.5 ASR: A highly compact 0.6-billion parameter streaming speech recognition model pushing sub-100 millisecond latencies across 40 language locales.

MiniMax M3: A 1-million token context model hitting 59.0% on SWE-Bench Pro and 74.2% on MCP Atlas, though noted for high token consumption due to intensive internal self-validation loops. https://www.minimax.io/models/text/m3

Gemma 4 12B: Google's Apache 2.0 on-device model, which utilizes an encoder-free architecture that projects vision and audio vectors directly into the text-token space, bypassing separate CLIP-style encoders to minimize local memory footprints. https://blog.google/innovation-and-ai/technology/developers-tools/introducing-gemma-4-12b/

JetBrains Mellum2: A compact 12-billion parameter MoE (2.5 billion active) engineered for ultra-low latency routing and retrieval-augmented generation (RAG) sub-agents within developer IDEs. https://www.jetbrains.com/mellum/

Opus 4.8: https://www.anthropic.com/news/claude-opus-4-8

https://www.cnbc.com/2026/06/05/google-to-pay-spacex-920-million-a-month-for-xai-compute-capacity.html

Benchmarks:

https://deepswe.datacurve.ai/blog
https://venturebeat.com/technology/deepswe-blows-up-the-ai-coding-leaderboard-crowns-gpt-5-5-and-finds-claude-opus-exploiting-a-benchmark-loophole
DeepSWE (GPT-5.5 is the winner in long-horizon tasks) is a highly complex software engineering benchmark focused on original, long-horizon tasks across five distinct programming languages. Comprising 113 chaotic tasks across 91 live, production-grade repositories, DeepSWE forces agents to generate 5.5 times more code and modify an average of 7 separate files per task compared to standard evaluations. On this challenging leaderboard, GPT-5.5 leads with a score of 70%, establishing a significant 16-percentage-point lead over contemporary alternatives. I think older benchmarks where models reach ~90% accuracy can be considered saturated. A few percentage points don't give us any good signal.

https://research.ibm.com/publications/developing-ai-agents-for-it-automation-tasks-with-itbench
ITBench-AA is an evaluation framework focusing on live Kubernetes incident response and Site Reliability Engineering (SRE) operations. It comprises 59 live, containerized SRE incident snapshots, and the results are remarkably sobering: every frontier model scored under 50% on successful incident resolution, with Claude Opus 4.7 leading at 47% and GPT-5.5 following closely at 46%.

Edge AI announcements:

https://www.cloudflare.com/press/press-releases/2026/cloudflare-acquires-voidzero-to-build-the-future-of-the-ai-native-web/
The consolidation of the AI-native developer stack has reached the runtime virtualization layer. Cloudflare recently completed the acquisition of VoidZero, the development group responsible for Vite, Vitest, Rolldown, and Oxc, backing the transaction with a $1 million open-source ecosystem fund. This acquisition is highly strategic; as autonomous agents write an increasing proportion of production software, local development environments, compilation pipelines, and bundlers must be optimized for execution speeds that match agent speeds. Cloudflare's goal is to construct a localized, full-stack edge playground. In this sandbox, AI agents can generate, test, bundle (utilizing the highly parallelized, Rust-based Oxc and Rolldown engines), and deploy entire web applications end-to-end within milliseconds. This architecture completely bypasses traditional local machine container bottlenecks, enabling high-velocity agent loops to execute in a fully sandboxed, web-scale edge runtime.


SO MANY THINGS need to go right just so you can watch a TikTok! | E2215

This Week in Startups

Nov 26, 2025 75:55


The WordPress Ferrolterra community presents the talk "El equipo azul ataca" ("The Blue Team Attacks"), on cybersecurity and digital defense

Voces de Ferrol - RadioVoz

Oct 22, 2025 19:02

Next Friday, October 24, at 7:30 p.m., the Centro Cívico de Canido (Ferrol) will host the free talk "El equipo azul ataca" ("The Blue Team Attacks"), given by Rodrigo Dantés González Mantuano, a systems and cybersecurity technician specializing in Site Reliability Engineering (SRE). The session, organized by Ferrolterra WordPress, will offer a practical guide to protecting businesses and projects against digital threats, covering passive and active defense strategies, essential documentation, and actions for responding effectively to incidents. It will also include useful resources and key contacts for managing cyber crises. The event will close with a networking session designed to encourage the exchange of experiences among technology professionals and entrepreneurs. It is held in collaboration with the Asociación Vecinal de Canido and the Concello de Ferrol, and is sponsored by Raiola Networks and the Fundación Universidade da Coruña. Registration is free and can be completed by scanning the QR code on the event poster.


The One with Technical Program Managers and Karanveer Anand

Google SRE Prodcast

Jul 16, 2025 27:48

This episode features Google Technical Program Manager (TPM) Karanveer Anand, who joins our hosts to discuss the unique role of TPMs in Site Reliability Engineering (SRE). The conversation highlights how SRE TPMs bridge the gap between technical details and business impact, managing complex projects with inter-team dependencies and ensuring system reliability, particularly in the rapidly evolving AI landscape.


#175 - DevOps VS SRE - with Luriel Santana

Getup Kubicast

Jul 10, 2025 57:51

In episode 175 of Kubicast, we welcome specialist Luriel Santana for a duel of ideas between DevOps and Site Reliability Engineering (SRE). Over coffee and laughs, we dive into discussions about organizational culture, infrastructure automation, reliability metrics, and field practices ranging from data centers in Angola to modern cloud pipelines.

1. The Landscape: DevOps and SRE in the Market
Since it emerged, the DevOps movement has brought a breath of speed and integration between development and operations teams. SRE, conceived at Google, raised the bar by introducing clear metrics (SLIs, SLOs, and SLAs) and error-management processes. In this battle there is no single "winner": DevOps speeds up delivery; SRE ensures it happens without interruptions.

2. Field Lessons in Angola
Luriel shared his adventures in physical data centers, running Linux and configuring Cisco routers in one of the most challenging regions of the African continent. The message was clear: without minimal automation, keeping servers running in extreme conditions becomes a bottleneck. That is where we learned the importance of Infrastructure as Code and of versioning configurations.

3. Culture vs. Tooling
Teams often fall in love with tools and forget the culture. We discussed how CI/CD pipelines, containers, and Kubernetes orchestration only make sense when there is a mindset of collaboration and shared responsibility. Otherwise, they become just another "bag of tricks" without consistent results.

4. Reliability Metrics: SLOs and SLIs in Practice
We explored examples of SLOs for critical applications and saw that defining acceptable error limits is as much art as science. We talked about the trade-offs between speed and stability, and about how incident routing can rely on well-configured dashboards, without forgetting alerts designed to avoid alert fatigue.

5. The Pandemic and Accelerated Adoption
The global crisis pushed many companies to the cloud and to automation practices. We discussed how remote work reinforced the need for automation and resilient infrastructure, and reflected on cases of pipelines that were born in a matter of days to support unexpected peaks.

Conclusion and Next Steps
We left this episode with one certainty: DevOps and SRE are not antagonists but partners in the journey of delivering software with speed and reliability. If you are just starting out, begin by defining your SLIs. For veterans, the tip is to revisit processes and invest in culture.

Links and Recommendations:
Connect with Luriel Santana on LinkedIn: https://www.linkedin.com/in/lurielsantana/
João Brito - https://www.linkedin.com/in/juniorjbn
Watch the film TEArapia - https://youtu.be/M4QFmW_HZh0?si=HIXBDWZJ8yPbpflM
Learn more about DevOps Days Feira de Santana: https://www.devopsdays.org/events/2025-feira-de-santana/
Check out the Pro Evolua channel: https://www.youtube.com/c/ProEvolua
Discover the Zero CVE Project (Getup): https://getup.io/zerocve
Join our early access program and get a more secure environment in moments! https://getup.io/zerocve
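
The SLI and SLO discussion in these episode notes can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the episode: the `Slo` class, the function names, and the 99.9% target are assumptions chosen for the example. It computes an SLI as the share of good events, then the fraction of the error budget that is still unspent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """A hypothetical availability SLO, e.g. 99.9% of requests succeed."""
    target_percent: float  # assumed target, not a value from the episode


def sli_percent(good_events: int, total_events: int) -> float:
    """SLI: the measured share of good events, as a percentage."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return 100.0 * good_events / total_events


def error_budget_remaining(slo: Slo, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent.

    1.0 means no failures so far; 0.0 means the budget is exactly used up;
    a negative value means the SLO has been breached.
    """
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    # Failures the SLO tolerates over this window.
    allowed_bad = total_events * (100.0 - slo.target_percent) / 100.0
    actual_bad = total_events - good_events
    if allowed_bad == 0:
        # A 100% target leaves no budget at all.
        return 0.0 if actual_bad == 0 else float("-inf")
    return 1.0 - actual_bad / allowed_bad
```

For example, with `Slo(target_percent=99.9)` and 999,500 good requests out of 1,000,000, the SLI is about 99.95% and roughly half of the error budget remains. Expressing the budget as a fraction rather than a raw count keeps the number comparable across services with different traffic volumes.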


We're back with Season 4!

Google SRE Prodcast

Apr 16, 2025 15:03

In this "bumpisode", hosts and producers of Prodcast (including our new co-host, Matt Siegler!) reflect on the previous season and introduce the new season's focus on upcoming trends in Site Reliability Engineering (SRE) and AI, and the friends we make along the way. They also introduce new elements we are bringing in with Season 4, such as a video format and a feedback form.


How to Make B2B Marketing Exciting – Dan Ruby – Nobl9

Marketing B2B Technology

Jan 23, 2025 26:49

Mike is joined by Dan Ruby, VP of Marketing at Nobl9, a leading reliability platform that helps manage and monitor application reliability. Dan discusses the challenges of marketing a product that aims to keep issues unnoticed by end users and how storytelling can make a traditionally "unexciting" product compelling and engaging. The conversation also covers the importance of data-driven marketing, balancing brand building with lead generation, and innovative campaign strategies.

About Nobl9
Founded in 2019 by ex-Googlers Marcin Kurc and Brian Singer, Nobl9 is the premier Service Level Objectives-based platform for driving a reliable digital experience. With a strong enterprise customer base as well as strategic investments from Cisco and ServiceNow, Nobl9 is recognized as a bleeding-edge solution for modernizing Site Reliability Engineering (SRE) strategies, ensuring that reliability is measured not primarily by availability but by users' ability to do what they expect to be able to do within an application.

About Dan Ruby
Dan is an eighteen-year veteran of digital marketing, with the vast majority of his experience coming as the head of marketing for various B2B SaaS organizations in the Boston area. He has been acquired at various points by Google and Snap, and is currently the VP of Marketing for Nobl9, a B2B SaaS platform for user-centric site reliability. He holds a Bachelor of Journalism from the University of Missouri as well as an MBA from Brandeis University. He occasionally teaches an undergraduate course on marketing at Bentley University. Throughout his career, Dan has become increasingly stubborn about the fact that marketing must focus on creating value for potential leads, and is quite fond of telling anyone who will listen that "nobody gives a **** about your product, give them valuable information, not product pitches."

Time Stamps
00:00:42 - Dan Ruby's Career Journey
00:02:09 - Overview of Nobl9
00:05:48 - Challenges in Marketing a Reliability Product
00:07:03 - Using Stories to Make Marketing Exciting
00:12:43 - Balancing Brand Building and Lead Generation
00:17:07 - Innovative Campaign Example: DORA
00:22:24 - The Importance of Partnerships in Marketing
00:22:41 - Best Marketing Advice Received
00:23:41 - Advice for New Marketing Professionals
00:25:44 - How to Contact Dan Ruby
00:26:18 - Closing Remarks

Quotes
"Marketing is such an interesting field. It takes pretty much any skill set and makes it useful." Dan Ruby, VP of Marketing at Nobl9
"Nothing is boring if you can make it into a story that resonates." Dan Ruby, VP of Marketing at Nobl9
"You can find partners who believe in your product, believe in your company, believe in your people, who will work with you." Dan Ruby, VP of Marketing at Nobl9

Follow Dan:
Dan Ruby on LinkedIn: https://www.linkedin.com/in/danielruby/
Nobl9's website: https://www.nobl9.com/
Nobl9 on LinkedIn: https://www.linkedin.com/company/nobl9inc/

Follow Mike:
Mike Maynard on LinkedIn: https://www.linkedin.com/in/mikemaynard/
Napier website: https://www.napierb2b.com/
Napier LinkedIn: https://www.linkedin.com/company/napier-partnership-limited/

If you enjoyed this episode, be sure to subscribe to our podcast for more discussions about the latest in Marketing B2B Tech and connect with us on social media to stay updated on upcoming episodes. We'd also appreciate it if you could leave us a review on your favourite podcast platform.

Want more? Check out Napier's other podcast - The Marketing Automation Moment: https://podcasts.apple.com/ua/podcast/the-marketing-automation-moment-podcast/id1659211547


Dev Harmony: Communication & Proven SRE Practices • Liz Fong-Jones & Marit van Dijk

GOTO - Today, Tomorrow and the Future

May 31, 2024 33:50 Transcription Available

This interview was recorded at GOTO Copenhagen for GOTO Unscripted. http://gotopia.tech

Read the full transcription of this interview here

Liz Fong-Jones - Field CTO at Honeycomb.io
Marit van Dijk - Developer Advocate at JetBrains & Open Source Contributor

RESOURCES
Liz
https://twitter.com/lizthegrey
https://linkedin.com/in/efong
https://www.lizthegrey.com
Marit
https://twitter.com/MaritvanDijk77
https://linkedin.com/in/maritvandijk
https://mastodon.social/@maritvandijk
https://github.com/mlvandijk
https://medium.com/@mlvandijk
https://maritvandijk.com

DESCRIPTION
Explore the intricacies of efficient development collaboration and gain valuable insights into Site Reliability Engineering (SRE) strategies in this engaging conversation.

Liz Fong-Jones and Marit van Dijk delve into the challenges developers face, emphasizing streamlined communication and workflow optimization. From managing software dependencies to the evolving role of SRE teams, they share practical experiences and thoughts on building internal platforms, shedding light on the collaborative dynamics that shape successful development endeavors.

Discover how embracing effective communication and proven SRE practices can pave the way for improved team efficiency and impactful software development outcomes.

RECOMMENDED BOOKS
Charity Majors, Liz Fong-Jones & George Miranda • Observability Engineering
Beyer, Murphy, Rensin, Kawahara & Thorne • The Site Reliability Workbook
Kelly Shortridge & Aaron Rinehart • Security Chaos Engineering
Nora Jones & Casey Rosenthal • Chaos Engineering
Russ Miles • Learning Chaos Engineering
Mark Seemann & Steven van Deursen • Dependency Injection Principles, Practices & Patterns

Looking for a unique learning experience? Attend the next GOTO conference near you! Get your ticket: gotopia.tech

SUBSCRIBE TO OUR YOUTUBE CHANNEL - new videos posted daily!


Site Reliability Engineering with Dan Salinas & Sarv Shah of Nobl9 – IT in the D 484

IT in the D

May 23, 2024

In this episode Bob and Randy invite Dan Salinas and Sarv Shah from Nobl9 to dive deep into the complexities of Site Reliability Engineering (SRE) and Service Level Objectives (SLOs). Discover the origins of SRE, the significance of SLOs in improving customer experience, and the impact of digital reliability on businesses today. From the challenges of maintaining microservices to the advent of cloud dependency, this episode is packed with insights on ensuring operational excellence in our digital world.


Ep. 162 - The cloud native journey

Next in Tech

Apr 9, 2024 25:44

While getting to a cloud native development pattern is a goal for most organizations, it can be a significant journey to transform both infrastructure and processes. Analyst Carl Lehmann joins host Eric Hanselman to explore the paths that can move enterprises forward. DevOps approaches can speed development, Site Reliability Engineering (SRE) can change ways of managing risk and platform engineering can simplify tool sets, but adoption does not always follow a straight line.


#5.01 - The Life of an SRE, with Pelado Nerd

Charlas técnicas de AWS (AWS en Español)

Feb 12, 2024 69:19

In this first episode of Season 5, we chat with Pelado Nerd, a well-known SRE and YouTube content creator. We explore his career from his beginnings to his success on YouTube, as well as his experience in the world of Site Reliability Engineering (SRE). We discuss the day-to-day of an SRE and the essential tools for the role, among other topics.

Table of Contents
01:34 Intro to the guest, the origins of Pelado...
03:50 Your side as a content creator.
11:00 Applying what was learned on YouTube and vice versa.
14:00 Balancing exercise with work / improving productivity
18:30 The day-to-day of an SRE.
25:47 The 3 essential tools of the SRE
27:04 The great advantage of Kubernetes
31:21 Kubernetes is NOT the golden option for everything
33:45 Launching 300 nodes... in 30 min!
35:12 Discovering warm-ups
40:13 Stories to keep you up at night: Goodbye to the certificates
44:28 Advice for future SREs
48:30 Why do you want Jenkins? Use Dagger.
52:20 Scaling clusters with Karpenter
55:20 Lambdas in containers
58:12 The future of K8s and the 3rd wave of containers: WASM
1:01:45 Impact of AI on infrastructure
1:04:50 Final recommendations

Guest's Social Media
Twitter: https://twitter.com/peladonerd
YouTube: https://www.youtube.com/@PeladoNerd
LinkedIn: https://www.linkedin.com/in/pablofredrikson/

Videos Mentioned
Docker from Novice to Pro: https://www.youtube.com/watch?v=CV_Uf3Dq-EU&t=115s
Introduction to Dagger: https://www.youtube.com/watch?v=lGl1UlcODLQ
WASM, the 3rd wave of containers: https://www.youtube.com/watch?v=bgWTf3m6HG0
LENS, the best interface for K8s: https://www.youtube.com/watch?v=DFMKcR4BqwM
Crossplane, better than Terraform? https://www.youtube.com/watch?v=dWbEvHOtljg&t=129s

Recommendations
Book: Time Management for System Administrators: https://amzn.eu/d/fL7FiUl
Book: Site Reliability Engineering (free): https://sre.google/books/
Pelado Entrena channel, the challenge of running a marathon: https://www.youtube.com/@PeladoEntrena

✉️ If you want to write to us, you can do so at this address: podcast-aws-espanol@amazon.com
You can find the podcast at this link: https://aws-espanol.buzzsprout.com/
Or on your favorite podcast platform
More information and tutorials on the Charlas Técnicas YouTube channel

#foobar #AWSenEspañol


Evo Nordics #475 - SRE & Ownership

The Evolution Exchange Podcast Nordics

Jan 25, 2024 57:27

Explore the evolving landscape of Site Reliability Engineering (SRE) and Ownership in the latest episode of Evo Nordics. Hosted by Georgia Benton, this episode features insights from Christian Holmboe, Engineering Manager at Volvo Cars, Alex Ewerlöf, Senior Staff Engineer also at Volvo Cars, and Jens Rantil, Senior Software Engineer. Dive into discussions on fostering ownership in engineering teams and implementing SRE practices for optimal performance, only on Evo Nordics.


Site Reliability Engineering – How to run production systems

IT-Management Podcast | Für den Service-Management Nerd in Dir.

Dec 9, 2023 57:58

Site Reliability Engineering (SRE) is a discipline that combines a deep understanding of software engineering with a strong focus on reliability and operational stability. Originally developed by Google, SRE aims to close the gap between the development and the operation of software by applying engineering principles to operations tasks. SRE teams are responsible for ensuring the scalability, performance, and resilience of services while also supporting the rapid development and deployment of new features. They use a range of methods, such as automation and continuous integration/delivery, to reduce manual work and minimize sources of error. Today I talk about exactly these methods, and about SRE itself, with Alex Lichtenberger.


Full-Stack Mindfulness and AI: Allison Durham's Vision for the Future

Crazy Wisdom

Oct 31, 2023 56:36

Intro
Allison Durham
Focus: Exploring AI, Software Development, and the Human Mind

What is the Human Mind?
Allison doesn't make a distinction between the brain and the mind. She sees the mind as a dynamic range of cognitive experiences that include thoughts, perception, and self-awareness. The mind exists alongside the human experience and is fully integrated with bodily sensations.

On Consciousness
Allison discusses the topic of consciousness, noting that awareness can vary in its intensity. She mentions an intriguing question: Can awareness exist without the brain? She recalls an interesting conversation with a friend who asked her about consciousness and awareness.

The Experience of Dreams
Allison describes a dream she had that was "rooted in Earth," contrasting it with another dream featuring a monstrous, otherworldly creature. She emphasizes her ability to fully visualize experiences in her dreams, even though she struggles with visualization in her waking life.

Aphantasia and Visualization
Allison brings up the concept of Aphantasia, where people have difficulty visualizing images. She explores the idea that visualization might be trainable, mentioning techniques such as the "candle technique" to improve skill. She notes that while most people can recall memories with images, these people also often have underdeveloped other sensory recall like smell and hearing.

Software Development and AI
Allison talks about Rust, a systems-level programming language she enjoys using. She delves into the concept of Site Reliability Engineering (SRE), explaining it stems from Google's earlier operations methods. She praises GitLab for packaging all the tools needed for DevOps, making it more accessible. She explores the concept of MLOps, which focuses on getting machine learning models into production. She finds the speed of open-source AI development both exciting and challenging, noting that problems can't be fully solved before new ones appear.

Personal Psychology Framework
Allison discusses her psychological framework, leaning heavily on mindfulness-based tactics. She believes in being fully aware of one's thoughts and emotional state, and she finds this awareness essential for taking proper action in life.

Final Thoughts
She mentions her website, AdjectiveAllison.com, and her social media handle, AdjectiveAllison on X.

Time Stamps:
2:30 - Discussing the nature of the mind and its relationship to the brain and awareness
5:00 - Allison explains her experience with aphantasia
7:30 - Stuart talks about training himself to visualize through meditation
9:00 - Whether imagination and visualization can be trained as skills
11:00 - Allison's perspective on not training her own visualization abilities right now
12:00 - Allison's interest in learning Rust programming language
14:00 - Using ChatGPT to assist with engineering problems as a "rubber duck debugger"
16:00 - Explanation of DevOps, APIs, serverless solutions like Repl.it
19:00 - How AI may or may not change API and engineering architectures
21:00 - Automation as connecting APIs; engineers building instead of using no-code
23:00 - AI unlikely to change API interface itself, complexity happens behind it
24:00 - Allison's favorite psychological framework is mindfulness
25:30 - Aligning with specific frameworks depending on the problem

ai google earth chatgpt mindfulness stuart automation aligning durham rust explanation api apis devops software development gitlab fullstack using chatgpt human mind vision for the future aphantasia repl site reliability engineering sre

A Microservices Outcome: Testing Boomed

The New Stack Podcast

Play Episode Listen Later Sep 15, 2023 21:45

Over the past five to ten years, the testing of microservices has seen significant growth. This surge in testing can be attributed to the increasing adoption of microservices and Kubernetes, which signify a shift away from monolithic application architectures. Bruno Lopes, a leader at Kubernetes company incubator Kubeshop, noted this trend. Kubeshop has initiated six Kubernetes projects, including TestKube, a Kubernetes native testing framework led by Lopes. This rise in testing is making it more accessible to a wider audience and is enhancing the developer experience through automation. Developers now have more time to focus on innovation rather than manual testing. However, there is often a disconnect between development and testing, as developers move quickly, outpacing organizational adaptation to modern testing methods. Lopes emphasized the importance of testing before production deployment and advocated for creating production-resembling testing environments that allow for rapid deployment without waiting for manual tests. This approach is particularly critical for Site Reliability Engineering (SRE) teams who need to respond quickly to issues and minimize downtime for customers. In some cases, it's necessary to run tests within Kubernetes itself, a concept that may take time for companies to fully embrace as the developer experience continues to improve. Learn more from The New Stack about Kubernetes, Testing and TestKube: Testkube: A Cloud Native Testing Framework for Kubernetes; Top 5 Challenges in Modern Kubernetes Testing; Why You Should Start Testing in the Cloud Native Way

challenges tech testing developers outcome devops lopes software engineers kubernetes software engineering software developers tech podcast microservices cloud native software testing new stack site reliability engineering sre bruno lopes developer podcast new stack makers

TechChat Tuesdays #66: The DevOps of Sports Betting with Drew Rogers

Chariot TechCast

Play Episode Listen Later Aug 22, 2023 43:02

Today we talk to our own Drew Rogers about some of the nuanced aspects of Site Reliability Engineering (SRE) and Development Operations (DevOps) within the rapidly evolving domain of sports betting. The post TechChat Tuesdays #66: The DevOps of Sports Betting with Drew Rogers appeared first on Chariot Solutions.

rogers sports betting devops site reliability engineering sre chariot solutions

Establishing SRE Foundations with Vlad Ukis

TestGuild Performance Testing and Site Reliability Podcast

Play Episode Listen Later Jul 19, 2023 28:11

On this episode of DevOps Toolchain, host Joe Colantonio interviews Vlad Ukis, the head of R&D for Siemens Healthineers, about the implementation and benefits of Site Reliability Engineering (SRE). Vlad emphasizes the importance of involving product management, product development, and product operations from the beginning to ensure the success of SRE in an organization. He discusses how to prioritize and communicate the importance of SRE in large organizations with competing initiatives and how introducing a role like SRE and creating a community of practice can facilitate cross-pollination of ideas and best practices. Vlad also dives into the concept of Service Level Objectives (SLOs), their importance in managing services, and the process of defining them by bringing together different teams. He shares his experience introducing SRE in a healthcare domain within a medical device vendor and addresses the challenge of orchestrating organizational buy-in for SRE. Vlad highlights the need for unique approaches to engaging each party in the organization and stresses the importance of culture in implementing new processes at scale. Listeners are encouraged to check out Vlad's book, 'Establishing SRE Foundations.' The interview provides valuable insights into the changes and efforts required for successful SRE implementation and the shift in mindset towards prioritizing reliability. Vlad also discusses the role of coaching and learning over time and the transformation of traditional product management, development, and operations models in the software-as-a-service world. The episode concludes with a discussion on the definition and practice of SRE, its role within an organization, and the potential creation of new positions. Don't miss out on this informative and thought-provoking episode featuring Vlad Ukis, a true expert in SRE and continuous delivery.

foundations establishing vlad r d sre site reliability engineering sre

#5 Where does SRE fit into your organization's structure?

S.R.E.path Podcast

Play Episode Listen Later Jun 15, 2023 17:02

Throughout this episode, we discuss the different engagement models for Site Reliability Engineering (SRE) and how to contextualize SRE into an organization's structure. Sebastian Vietz, an experienced SRE practitioner, suggests five different engagement models for SRE and emphasizes the importance of considering the cost associated with each model. The hosts also discuss the different types of SREs that can exist within these engagement models, including SRE champions and unicorns. They stress the importance of considering organizational context when implementing SRE and tease a future episode where they will delve deeper into a framework for identifying the capabilities needed to solve SRE-related problems.
Timestamps of key concepts:
Where and how SRE fits into an organization [00:00:20] We discuss the importance of considering organizational context when implementing SRE and explore different engagement models for SRE.
Center of Excellence for Reliability Engineering [00:02:14] We discuss the idea of a center of excellence for reliability engineering, where a few practitioners take on an advisory role for the organization.
Embedded SREs [00:04:14] We discuss the idea of embedding SREs into teams, where each team has an embedded SRE whose focus is to implement reliability engineering principles and best practices.
Five SRE Engagement Models [00:08:23] We discuss five different engagement models for SRE, including embedded SREs, a center of excellence, and a consulting or ambassador model.
Types of SREs [00:10:25] We discuss different personas that an SRE can take, including champions, advocates, and unicorns.
Unicorn SREs [00:13:50] We discuss the rare and sought-after unicorn SREs, who have extensive experience and exposure to different business domains and contexts.
This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit srepath.substack.com

excellence types structure sre sres reliability engineering site reliability engineering sre

Ep 9 - Cloud journey spotlight with Mersedes ( @blkwomenread )

Cables2Clouds

Play Episode Play 31 sec Highlight Listen Later Jun 14, 2023 46:45 Transcription Available

In this episode, we are joined by Mersedes, aka @blkwomenread. We talk about her journey from working at one of the country's busiest Domino's, to a call center, to today. These days Mersedes works as a Systems Engineer/Monitoring Engineer, which lines up very nicely with her strong interest in the field of Site Reliability Engineering (SRE). We dive into the nitty-gritty of her transition from fast food worker to network engineer and learn what her day-to-day looks like in her current role. This was a great roundtable discussion, suitable for anyone wondering about or currently working in the cloud!
How to connect with Mersedes:
Twitter: [https://twitter.com/blkwomenread]
YouTube: [https://www.youtube.com/@blkwomenread]
Twitch: [https://www.twitch.tv/techsavvysadie]
LinkedIn: [https://linkedin.com/in/mersedeshenderson]
Topics:
Site Reliability Engineering (SRE): https://sre.google
Books to look out for: Building Secure and Reliable Systems; The Site Reliability Workbook; Site Reliability Engineering
Check out the Fortnightly Cloud Networking News
Visit our website and subscribe: https://www.cables2clouds.com/
Follow us on Twitter: https://twitter.com/cables2clouds
Follow us on YouTube: https://www.youtube.com/@cables2clouds/
Follow us on TikTok: https://www.tiktok.com/@cables2clouds
Merch Store: https://store.cables2clouds.com/
Join the Discord Study group: https://artofneteng.com/iaatj
Art of Network Engineering (AONE): https://artofnetworkengineering.com

tiktok twitch cloud dominos site reliability engineering sre

#4 Should organizations care about SRE?

S.R.E.path Podcast

Play Episode Listen Later Jun 1, 2023 18:44

This episode discusses how Site Reliability Engineering (SRE) can be important to organizations. SRE can optimize software operations, reduce costs, support revenue-driving areas, mitigate risks, improve cybersecurity, and enhance customer experiences. We will also cover how to integrate SRE into the organization's culture for continuous improvement and innovation.

care organizations sre site reliability engineering sre

#3 SRE vs DevOps vs Platform Engineering

S.R.E.path Podcast

Play Episode Listen Later May 17, 2023 22:53

In this episode of SREpath, Ash and Sebastian discuss the unnecessary debate surrounding Site Reliability Engineering (SRE), DevOps, and platform engineering. They argue that these disciplines should not be pitted against each other, but rather seen as complementary and able to coexist within an organization. The focus should be on continuous improvement, learning from failures, and making things better. The hosts emphasize that practitioners in all three areas share the common goal of improvement and should collaborate rather than compete. They briefly distinguish SRE as focusing on system reliability and scalability, DevOps on collaboration and automation, and platform engineering on building and maintaining infrastructure. The decision to establish dedicated teams for each discipline depends on the organization's scale and needs. The hosts encourage a context-driven approach, where individuals from diverse backgrounds and skill sets can contribute to the SRE field. Ultimately, the key is to prioritize improvement and learning, regardless of labels or titles.

ash devops sre platform engineering site reliability engineering sre

On Zero Trust and the Concept of Site Reliability Engineering (SRE)

FlyTalks - Podcast o chmurze w biznesie

Play Episode Listen Later May 16, 2023 37:28

What is Web3, and is it a step toward decentralized interactions between people? How does Ramp Network work, and how does it help people get into and out of the Web3 world? Naturally, the conversation also covers Site Reliability Engineering (SRE), fintech security, and why Paweł Dawidowicz cannot imagine building a startup on-premise. Listen closely, because we recommend a free downloadable book on SRE, and our guest lifts the veil on how Ramp Network implements Zero Trust.

web3 jak czym wam zero trust podczas sre site reliability engineering sre

#2 What is Site Reliability Engineering (SRE) and what is not SRE?

S.R.E.path Podcast

Play Episode Listen Later May 4, 2023 23:55

In this episode of the SREpath podcast, Ash and Sebastian explore what Site Reliability Engineering (SRE) is and how it manifests in a highly functional organization. We also cover the controversial issue of what SRE is not.

ash sre site reliability engineering sre

#1 Introducing the SREpath podcast

S.R.E.path Podcast

Play Episode Listen Later Apr 20, 2023 21:06

Welcome to the first episode of the SREpath podcast! In this episode, we'll introduce you to our podcast hosts and give you their broad-level view of Site Reliability Engineering (SRE). We'll also share some points about how we'll be running future episodes. Whether you're an SRE expert or new to the field, this episode will provide valuable insights into SRE and what you can expect from our podcast series.

sre site reliability engineering sre

Episode 228: Web Apps and Site Reliability Engineering (SRE) with Brian Love

Real Talk JavaScript

Play Episode Listen Later Apr 6, 2023 40:02

const podcast = { episode: 228, title: 'Web Apps and Site Reliability Engineering', topics: [ 'reliability', 'web apps', 'user focused' ], guest: 'Brian Love', hosts: [ 'John Papa', 'Ward Bell' ] };
Recording date: March 23, 2023
John Papa @John_Papa, Ward Bell @WardBell, Dan Wahlin @DanWahlin, Craig Shoemaker @craigshoemaker, Brian Love @Brian_love
Brought to you by AG Grid and IdeaBlade
Resources: Google Books on SRE; What is SRE; Introduction to Site Reliability Engineering (SRE); Reliable systems in DevOps; Ping test; Voting with your feet; What is an SLA; Service Level Objectives and Indicators; SLA vs SLO vs SLI; SLIs, SLOs, and SLAs, oh my; Interview with Dave Rensen, SRE Engineering Director on the SRE Workbook; The Origins of SRE; What it means to be a SRE; Get Polaris (SRE tool); Send Beacon API; GitHub Copilot X; Prompt Engineering; Learn with Introduction to Prompt Engineering
Timejumps: 00:29 Welcome; 01:37 Guest introduction; 02:55 What is SRE?; 05:38 What is it like if you don't have an SRE?; 09:29 Sponsor: Ag Grid; 10:36 Available vs reliable; 13:35 Is SRE the same as health monitoring?; 21:29 Sponsor: IdeaBlade; 22:30 How do I make sure I don't cause more reliability issues?; 27:36 Who's providing the infrastructure?; 31:04 Where's the AI in all of this?; 33:59 Final thoughts
Podcast editing on this episode done by Chris Enns of Lemon Productions.

ai interview recording sre web apps slo slas site reliability engineering slos site reliability engineering sre chris enns brian love

SRE for the non-unicorns (aka Enterprises) with James Brookbank

PurePerformance

Play Episode Listen Later Dec 5, 2022 52:50

You have a CISO (Chief Information Security Officer) but no CRO (Chief Reliability Officer)? You blame people if systems crash? You scale your people at the same rate as your infrastructure? If you answer any of those questions with YES, then you should tune in to this podcast, as you probably struggle with adopting Site Reliability Engineering (SRE) in your organization. James Brookbank, Cloud Solutions Architect, dealt with resiliency topics in a large enterprise prior to joining Google. In our conversation he shares the advice he gives enterprises to convert the excitement about SRE into actual implementation. James gave some good guidance on which projects are good, and which are not so good, to start with. He gives practical examples of what it means to change your company culture and why there doesn't have to be an SRE for every service. In our call we discussed the SRE in Enterprise talk at DevOpsDays Boston and SRECon EMEA as well as their recent book. Here are all the relevant links:
James Brookbank on LinkedIn: https://www.linkedin.com/in/jamesbrookbank/
SRECon EMEA Slides: https://www.usenix.org/system/files/srecon22_slides_mcghee.pdf
DevOpsDays Boston 2022 Session Recording: https://www.youtube.com/watch?v=__e7b25QOHc
Enterprise Roadmap to SRE Book: https://sre.google/resources/practices-and-processes/enterprise-roadmap-to-sre/

google unicorns enterprise cro reliability enterprises ciso sre site reliability engineering sre

Making sense of production

JS Party

Play Episode Listen Later Nov 4, 2022 63:59 Transcription Available

Maggie Johnson-Pint from Stanza sits down with Amal & Divya for a deep dive into the production side of the development world. If you're at all curious about (and/or intimidated by) terms like Site Reliability Engineering (SRE), Service Level Objective (SLO), OpenTelemetry, distributed tracing, and the like… this episode's for you!

web production programming animation robotics iot making sense javascript html css node backend amal frontend divya stanza changelog site reliability engineering sre

Making sense of production (JS Party #250)

Changelog Master Feed

Play Episode Listen Later Nov 4, 2022 63:59 Transcription Available

web production programming animation robotics iot making sense javascript html css node backend amal frontend divya stanza changelog site reliability engineering sre

S1 Ep 08 In Conversation With Ramón Medrano: From Site Reliability Engineering to Data Reliability Engineering

The Soda Podcast

Play Episode Listen Later Oct 26, 2022 69:40

Maarten is in conversation with Ramón Medrano, Senior Staff Site Reliability Engineer at Google. In this conversation Maarten and Ramón discuss how the principles and practices of Site Reliability Engineering (SRE) can be applied to the practices of Data Reliability Engineering and data quality management. They deep-dive into four topics - SLOs, lineage, debuggability, and how to operate as a team - from the book Site Reliability Engineering: How Google Runs Production Systems, co-authored by Ramón's manager, Jennifer Petoff. As the book explains how Google's SRE team builds, deploys, monitors, and maintains some of the largest software systems in the world, Maarten and Ramón's conversation explores how data practitioners can apply some of the best practices, processes, and thinking when it comes to data and systems. More about our host, Maarten Masschelein. Read the transcript of this episode. Learn more about the chosen charity, Open Arms. Connect with us on social media: Twitter, LinkedIn, Facebook. From Soda, the provider of data reliability tools and an observability platform that enable data teams to find, analyze, and resolve data issues.

conversations google data ram soda maarten sre medrano data quality site reliability engineering slos site reliability engineering sre

EP85 Deploy Security Capabilities at Scale: SRE Explains How

Cloud Security Podcast by Google

Play Episode Listen Later Sep 26, 2022 30:50

Guest: Steve McGhee, Reliability Advocate, Google Cloud. Topics: What can security teams learn from the Site Reliability Engineering (SRE) art of rapid and safe deployment? Is this all about the process, or do SREs possess some magical technology to do this? What is the SRE approach to automation? What are the pillars / components of the SRE approach to deployment? SRE is also about scaling. Some security teams have to manage 1000s of detection rules; how can this be done in a manner that does not conflict or cause other problems? Resources: Google SRE book; a companion Google SRE workbook; “How We Scale Detection and Response at Google: Automation, Metrics, Toil” (ep75); “Achieving Autonomic Security Operations: Why metrics matter (but not how you think)” blog; “Achieving Autonomic Security Operations: Reducing toil” blog.

security scale cybersecurity metrics capabilities deploy toil sre sres site reliability engineering site reliability engineering sre

S4 Ep88: The Download on SRE with Jayne Groll

The Humans of DevOps Podcast Series

Play Episode Listen Later Sep 14, 2022 24:04 Transcription Available

Welcome to a new season of the Humans of DevOps Podcast with your host Eveline Oehrlich. In this episode Eveline is joined by DevOps Institute CEO Jayne Groll to discuss Site Reliability Engineering (SRE). Jayne and Eveline discuss the findings of the 2022 Global SRE Pulse report, how SRE came into being, and the developments and frameworks that are leading SRE into the future. Special thanks to our sponsor Range! Enjoy the Humans of DevOps Podcast? We're incredibly grateful to be voted one of the Best 25 DevOps Podcasts by Feedspot. Want access to more DevOps-focused content and learning? When you join SKILup IT Learning you gain the tools, resources and knowledge to help your organization adapt and respond to the challenges of today. And if you're looking for the answers to DevOps' persistent questions, pop on in to SKILup Discussions, one of the fastest-growing DevOps communities around! Have questions, feedback or just want to chat about the podcast? Send us an email at podcast@devopsinstitute.com

humans range devops feedspot sre site reliability engineering sre jayne groll

All Things Site Reliability Engineering (SRE)

Find Flow

Play Episode Listen Later Aug 26, 2022 34:27

This week, Sean sat down with Emily Arnott of Blameless, who is making it her mission to spread “the Gospel of SRE.” Their discussion covered the philosophy underpinning Site Reliability Engineering, its origins in the world of manufacturing, and a few detailed scenarios for how this approach plays out in real-world incident response teams.

gospel open source evolved blameless sre practicewhat site reliability engineering site reliability engineering sre

Is Site Reliability Engineering (SRE) the Next Evolution of DevOps?

Find Flow

Play Episode Listen Later Aug 19, 2022 35:00

This week, host Sean McDermott is speaking with Eveline Oehrlich, Industry Analyst and Chief Research Officer at the DevOps Institute. Eveline is an industry analyst, author, speaker and business advisor focused on digital transformation. They discuss the emerging topic of Site Reliability Engineering, or SRE, and challenges relative to the broad adoption of DevOps. Has site reliability become the next natural extension of DevOps? Operations always lags behind engineering in a "build and ship" world. Can SRE get these two organizations collaborating from the outset, on both processes and outcomes? Join us for this deep-dive into the future of technology.

operations devops sre sean mcdermott chief research officer next evolution industry analyst site reliability engineering site reliability engineering sre devops institute

For the Back of the Room: Gerard Spivey, Senior Systems Development Engineer at Amazon Web Services.

Outspoken with Shana Cosgrove

Play Episode Listen Later Jun 7, 2022 55:55

Curiosity, Focus, and Forging a Path. In this episode of The Outspoken Podcast, host Shana Cosgrove talks to Gerard Spivey, Senior Systems Development Engineer at Amazon Web Services. Gerard speaks in detail about Amazon's interview process, giving us insight into their procedures and how he prepared himself. We also hear about Gerard's time at Amazon and the types of work he's taking on. Side hustles are a way of life for Gerard, and he speaks about his latest experiences managing his YouTube channel, Gerard's Curious Tech. Lastly, Gerard talks about his time at NYLA and how he was able to bring his full self to work thanks to NYLA's culture. QUOTES “I can do slow and steady, I can find my target audience, and then once I have that I can figure out what I want to parlay that into later.” - Gerard Spivey [25:59] “‘I'm a Senior Director [at Intel], and I can do what I want' is basically what he told me. He's like ‘the company has a 3.0 thing, but for someone like you who actually knows what they're talking about it's not a problem.' So I said, ‘Ooh this is my time, they're letting me in'” - Gerard Spivey [42:07] “You're in a good spot in your career when you're valued for the thing you're going to do next versus the thing you did previously.
What you're going to do next is your competitive value - that is what you bring to the table.” - Gerard Spivey [48:27] TIMESTAMPS [00:04] Intro [01:31] Gerard's Wedding Ceremony [02:32] Working at Amazon Web Services (AWS) [05:33] Amazon's Interview Process [12:06] Gerard's Experience with the Job Market [15:54] Working at Amazon [19:11] Starting a New Job During COVID [19:43] Side Hustles [23:21] Gerard's YouTube Channel [31:08] Gerard's Childhood [31:52] How Gerard Decided to Study Electrical Engineering [34:19] Choosing a College [45:13] Gerard's Advice to his Younger Self [47:42] Favorite Books [50:57] Gerard's Time at NYLA [55:36] Outro RESOURCES https://aws.amazon.com/ec2/ (Amazon EC2) https://aws.amazon.com/ec2/instance-types/ (Amazon EC2 Instance Types) https://aws.amazon.com/dynamodb/ (Amazon DynamoDB) https://sre.google/ (Site Reliability Engineering (SRE)) https://www.c2stechs.com/ (Commercial Cloud Services (C2S)) https://www.thebalancecareers.com/what-is-the-star-interview-response-technique-2061629 (STAR Interview Response Method) https://www.microsoft.com/en-us/microsoft-365/exchange/email (Microsoft Exchange) https://azure.microsoft.com/en-us/ (Microsoft Azure) https://www.synopsys.com/glossary/what-is-cicd.html (CI/CD) https://mlt.org/ (Management Leadership for Tomorrow (MLT)) https://www.hbs.edu/ (Harvard Business School) https://a16z.com/ (Andreessen Horowitz) https://www.youtube.com/ (YouTube) https://www.nsbe.org/K-12/Programs/PCI-Programs (NSBE Pre-College Initiative Program) https://www.jhu.edu/ (Johns Hopkins University) https://www.abet.org/ (Accreditation Board for Engineering and Technology (ABET)) https://www.ncat.edu/ (North Carolina A&T State University) https://www.morgan.edu/ (Morgan State University) https://howard.edu/ (Howard University) https://www.rit.edu/ (Rochester Institute of Technology) https://www.psu.edu/ (Penn State University) 
https://www.digitaltechnologieshub.edu.au/teach-and-assess/classroom-resources/topics/digital-systems/ (Digital Systems) https://www.xilinx.com/products/silicon-devices/fpga/what-is-an-fpga.html (Field Programmable Gate Arrays (FPGAs)) https://www.gwu.edu/ (The George Washington University) https://www.intel.com/content/www/us/en/homepage.html (Intel) https://www.pcmag.com/encyclopedia/term/pci-express (PCI Express) https://www.intel.com/content/www/us/en/io/serial-ata/serial-ata-developer.html (Serial ATA (SATA)) https://consortium.org/ (Consortium of Universities of the Washington Metropolitan Area) https://www.amazon.com/Zero-One-Notes-Startups-Future/dp/0804139296 (Zero to One) by Peter Thiel and Blake Masters https://www.richdad.com/...

amazon time starting technology college advice focus childhood engineering senior engineers curiosity intel senior director side hustles universities harvard business school johns hopkins university gerard george washington university howard university forging peter thiel penn state university amazon web services job market younger self consortium favorite books andreessen horowitz microsoft azure ci cd rochester institute morgan state university spivey microsoft exchange wedding ceremony management leadership t state university interview process amazon ec2 pci express development engineer site reliability engineering sre systems development washington metropolitan area zero one notes startups future amazon dynamodb

Getting to Know Site Reliability Engineering (SRE)

Kode Nol

Play Episode Listen Later Apr 15, 2022 15:14

Although building a DevOps culture has helped teams collaborate better and deliver software faster and more reliably, a DevOps team should also have people dedicated specifically to improving system reliability and software performance. That is where Site Reliability Engineering comes in. Site Reliability Engineering, commonly abbreviated SRE, was originally initiated by Google engineer Ben Treynor. Not long after adopting SRE, they published an eBook to promote SRE across the technology industry. Now we are joined by Tara Baskara, Engineering Manager at Gojek, to dig deeper into SRE.

ebooks nah tak devops sre mengenal meski gojek site reliability engineering site reliability engineering sre

Episode 293: Moving TOO fast and following my manager

Soft Skills Engineering

Play Episode Listen Later Feb 28, 2022 21:37

In this episode, Dave and Jamison answer these questions: Is it possible to move too fast and do you believe in too much enthusiasm? I am one of the youngest members of the team and am always willing to start new projects and balance a few different things. Is there a point where this can start hurting my career? I've gotten bumped in compensation fairly, almost a 25% raise since I first started. My career goal is to stay on the programming side but I want to become a possible trainer for newer engineers/devs. Listener Michael asks, I'm a backend engineer in an engineering/coding role with a small bit of SRE type work. I love the work as I get to dig deep into tech we use and have become a subject matter expert on databases within the company. I really like my team and my manager in particular, and get to learn a lot every week. My manager is leaving my team to lead a new team within the company that is focused on the company's SaaS offering and I've been given the option of joining this new team if I wish. I like their managerial style and how they have helped me with my career progression so far. However, I'd be doing Site Reliability Engineering (SRE) work. I'm not sure if I'm ready yet to commit to being an SRE and code less/focus more on ensuring the reliability of mission critical production systems. I don't know how easy it would be to switch back to more of a coding role in a year's time or if it would pigeonhole me into that type of role. Have you got any advice?

saas sre moving too fast site reliability engineering sre

What is Site Reliability Engineering [ SRE ] | How to think like an SRE | Responsibilities of an SRE | SRE vs System Admin vs DevOps

Being a pro

Play Episode Listen Later Jan 3, 2022 23:59

Hey there! Follow the podcast if you like the episode. This is Tharun. In the Developer Tharun Podcast, I speak about software engineering. Thank you for listening. In this episode: site reliability engineering, the 4 aspects of Site Reliability Engineering according to me, and more. Thank you for listening to my podcast. Follow my podcast if you find it helpful. Check out my other episodes. I talk about programming & software engineering. YouTube: https://youtube.com/c/developerTharun Blog Article on: https://tharunshiv.com Instagram: @developerTharun Dev.to: https://dev.to/developertharun Udemy: https://www.udemy.com/user/tharun-shiv/ LinkedIn: https://linkedin.com/in/tharunshiv

responsibility devops site reliability engineering site reliability engineering sre vs system tharun system admin

Episode 225 – SRE is a Journey with Dave Stanke

The 6 Figure Developer Podcast

Play Episode Listen Later Dec 20, 2021 42:18

Dave Stanke joins us to talk all about Site Reliability Engineering. Dave is a Developer Relations Engineer with Google Cloud Platform specializing in DevOps, Site Reliability Engineering (SRE), and other flavors of technical relationship therapy. He loves chatting with practitioners: listening to stories, telling stories, sharing a healthy cry. Prior to Google, he was the CTO of OvationTix/TheaterMania, a SaaS startup in the performing arts industry, where he specialized in feeding memory to Java servers. He chose on purpose to live in New Jersey, where he enjoys baking, indie rock, and fatherhood. Links https://stanke.dev/ https://twitter.com/davidstanke https://cloud.google.com/developers/advocates/dave-stanke Resources https://sre.google/ https://bit.ly/reliability-discuss https://bit.ly/dora-sodr Thinking, Fast and Slow Site Reliability Engineering The Site Reliability Workbook Want to supercharge your DevOps practice? Research says try SRE Eliminating Toil Identifying and tracking toil using SRE principles How maintenance windows affect your error budget—SRE tips "Tempting Time" by Animals As Leaders used with permissions - All Rights Reserved

google research new jersey saas cto java devops sre google cloud platform site reliability engineering animals as leaders site reliability engineering sre developer relations engineer

T2E1 - Cultura da Inovação

Google Cloud Cast

Jul 21, 2021 48:05

The market is going through a period of great change and challenge. For business leaders, the current landscape demands strategic decisions that help accelerate the digital transformation of the business and optimize investments. In this context, organizational culture gains importance as leaders and managers rethink how people, structures, and processes interact day to day. In the first episode of the second season of Google Cloud Cast, Daniel Leite, Sales Executive at Google Cloud, and Marcelo Gomes, Infrastructure Modernization Specialist at Google Cloud, welcome Google Cloud Senior Innovation Advisor Renato Nobre to discuss how to build an organizational culture that values innovation. If you want to see this conversation in full and uncut, visit the Google Cloud LATAM channel on YouTube and watch the complete recording; the video will be available soon. Google Cloud Cast is the official Google Cloud podcast in Brazil, in which we discuss, every two weeks, topics such as digital transformation, innovation, and the journey to the cloud with executives, specialists, and special guests.
Check out the links for this episode: What is Site Reliability Engineering (SRE): https://sre.google Learn more about DevOps & SRE: https://cloud.google.com/blog/products/devops-sre See the survey "The future of work in Brazil: Insights on collaboration and new ways of working": https://bit.ly/FuturodoTrabalhoGCC See this timeline of computing history: https://www.computerhistory.org/timeline/1945/ Learn more about the Grace Hopper transatlantic communications cable: https://cloud.google.com/blog/products/infrastructure/announcing-googles-grace-hopper-subsea-cable-system Learn more about Paul Otlet, one of the founders of documentation science: https://daily.jstor.org/internet-before-internet-paul-otlet/ Solving for Innovation: what we are solving: https://cloudonair.withgoogle.com/events/reimagine-negocio-business-solution-2021?talk=oq_estamos_solucionando Reinventing innovation: https://cloudonair.withgoogle.com/events/reimagine-negocio-business-solution-2021?talk=reinventar_a_inovacao Enjoyed the episode or have a suggestion? Share it with us by email at googlecloudcast@google.com

innovation brasil cultura nesse confira saiba especialista devops inova google cloud vendas executivo sre infraestrutura reinventar compartilha grace hopper moderniza marcelo gomes site reliability engineering sre

State of DevOps and SRE 2021

ServiceNow DevOps

Jul 8, 2021 17:08

ServiceNow partnered with EMA Research to understand the state of the DevOps and Site Reliability Engineering (SRE) industry. This video offers a synopsis of the key findings. Download the DevOps and SRE reports to see the complete results. See omnystudio.com/listener for privacy information.

devops servicenow sre site reliability engineering sre

State of DevOps and SRE 2021

ServiceNow Podcasts

Jul 8, 2021 17:08

devops servicenow sre site reliability engineering sre

Ep 26. 和 xintao 聊聊新加坡的工作与生活

捕蛇者说

Mar 7, 2021 75:40

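If you enjoy our show, you are welcome to support us with a tip via Afdian: https://afdian.net/@pythonhunter Hosts: Manjusaka, laike9m, laixintao. Timeline: 00:02:00 Why did xintao leave Alibaba? 00:22:43 Applying for a Singapore visa 00:28:30 Cost of living and taxes in Singapore 00:29:57 Renting in Singapore 00:43:20 Daily life in Singapore 00:58:17 Dealing with scams 01:03:13 xintao's work at Shopee and Shopee's company culture 01:06:06 How do you get a job at Shopee? 01:11:05 Manjusaka's hiring ad. Links: What is Site Reliability Engineering (SRE)?; Google December 2020 services outage; AIOps series (1) | The rise and practice of AIOps; Discussion of the translation "期物" (for "future") in the Chinese edition of Fluent Python; HDB flats (组屋); An itemized breakdown of my monthly living expenses in Singapore - by laixintao; Join Shopee & Work with Me! - xintao's referral link; PyCon US 2021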

work shopee aiops site reliability engineering sre

1472: The Security Risks of Unmanaged Tech In Businesses

The Tech Blog Writer Podcast

Jan 21, 2021 20:28

New data reveals the competing – and sometimes conflicting – challenges and priorities of IT leaders from 2020 that are shaping IT’s agenda for 2021 when it comes to managing risk. According to a new global survey, 72% of IT leaders and 52% of employees agreed that security is the biggest issue when it comes to unaccounted for and unmanaged technology. It seems that IT’s continuous efforts to reinforce security best practices may finally be paying off. But there is a lower level of awareness for additional issues, especially among employees, with 16% believing unaccounted for and unmanaged technologies do not cause any business problems whatsoever. Snow Software CIO Alastair Pooley joins me on Tech Talks Daily to dive deeper into the findings. On joining Snow, Alastair championed the idea of launching SaaS services and established both a hosting and Site Reliability Engineering (SRE) function to support such growth. This provides the infrastructure and Support for over 200 customers who have adopted such services and provided a path to future growth for the business. By adopting a SaaS/IaaS approach to IT services Snow has managed to pass $100M of ARR with barely any owned infrastructure. Over 90% of Snow’s IT lives either with a SaaS provider or in the public cloud. This is achieved on a zero-trust design which focuses on building single sign on capabilities with strong cybersecurity controls. Alastair also initiated, and now oversees, the cybersecurity function at Snow to provide a central risk and compliance function for the business.

tech businesses snow saas 100m arr alastair security risks unmanaged site reliability engineering sre tech talks daily

Site Reliability Engineering 101 - Bob Strecansky - MailChimp

ShipTalk

Nov 3, 2020 35:35 Transcription Available

In this episode, we talk to Bob Strecansky who is a Staff SRE at MailChimp. A packed podcast about all things Site Reliability Engineering (SRE). Learn about how to become an SRE, the rise of blameless culture, a clear definition of black-box vs white-box approaches, and much more!

mailchimp sre site reliability engineering site reliability engineering sre

S01-E34: CNCF sigue creciendo, también Kubernetes en Netflix y Rust en el Linux kernel

Cloud Native MX

Jul 24, 2020 63:18

# Podcast S01-E34: CNCF keeps growing, plus Kubernetes at Netflix and Rust in the Linux kernel
- Hosted by @_marKox, @domix
## News review
- [Cloud Native Computing Foundation Takes Charge of Red Hat’s Operator Framework](https://thenewstack.io/cloud-native-computing-foundation-takes-charge-of-red-hats-operator-framework/)
- [KubeCon + CloudNativeCon North America 2020 is now an online experience](https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america/attend/virtual-event-update/)
- [Will 2020 Be The Year Of Rust In The Linux Kernel?](https://hackaday.com/2020/07/15/will-2020-be-the-year-of-rust-in-the-linux-kernel/)
- [Site Reliability Engineering (SRE) 101 with DevOps vs SRE](https://www.cncf.io/blog/2020/07/17/site-reliability-engineering-sre-101-with-devops-vs-sre/)
## Twitter!
- [Netflix moved to Kubernetes](https://twitter.com/aspyker/status/1283836267646431234)
## References and resources
- [Setting SLOs: a step-by-step guide](https://cloud.google.com/blog/products/management-tools/practical-guide-to-setting-slos)
- [GKE best practices: Exposing GKE applications through Ingress and Services](https://cloud.google.com/blog/products/containers-kubernetes/exposing-services-on-gke)
- [Announcing the New Version of the Well-Architected Framework](https://aws.amazon.com/blogs/architecture/announcing-the-new-version-of-the-well-architected-framework/)
## Awesome code repos
- [Kubelive](https://github.com/ameerthehacker/kubelive)
- [kubevol](https://github.com/bmaynard/kubevol)
### Music credits
Music by Scott Buckley – www.scottbuckley.com.au

netflix services sigue rust linux tambi n devops red hat kubernetes creciendo revisi sre repos scott buckley referencias new version conducido ingress cncf linux kernel gke site reliability engineering sre

Johnny Boursiquot on Serverless Go and Site Reliability Engineering at Heroku

The InfoQ Podcast

Jun 19, 2020 41:07

In this podcast, Johnny Boursiquot, Site Reliability Engineer at Heroku, sat down with InfoQ podcast co-host Daniel Bryant and discussed topics that included: why Go is a useful language for building Function-as-a-Service (FaaS) style applications; how Heroku implement the role of Site Reliability Engineer (SRE); and why the ability to teach is such a valuable skill. Why listen to this podcast: - Go is a useful language for building Function-as-a-Service (FaaS) style applications. The ability to build Go applications into a static binary reduces the need for dependency management, and the quick runtime and application start time is good for initiation and scaling - The FaaS development toolchain has improved over the years. Many cloud providers now provide local runtimes, e.g. AWS SAM Local, and service simulators, e.g. LocalStack. Testing in production is facilitated by the ability to do dark launches and canary releasing at the ingress/API gateway - Developing “serverless” applications typically does not remove the need for operational expertise on a development team. Designing systems appropriately and getting the most out of the runtime (with minimal cost) requires knowledge of the underlying infrastructure components - The role of Site Reliability Engineering (SRE) looks different across practically every organisation. The Heroku SRE team have adapted well-established patterns and practices into their roles. They act as “diplomats”, working closely with product teams to share knowledge around operational best practices - The ability to teach is a valuable skill, regardless of your job. Teaching people to code or to embrace important operational principles is extremely rewarding. - Engineers who teach must seek to escape the pull of their ego; by focusing on the needs of the people you are teaching, much more progress can be made. 
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2UV0tqK You can also subscribe to the InfoQ newsletter to receive weekly updates on the hottest topics from professional software development. bit.ly/24x3IVq Subscribe: www.youtube.com/infoq Like InfoQ on Facebook: bit.ly/2jmlyG8 Follow on Twitter: twitter.com/InfoQ Follow on LinkedIn: www.linkedin.com/company/infoq Check the landing page on InfoQ: https://bit.ly/2UV0tqK

teaching developing testing engineers designing function api faa serverless heroku site reliability engineering site reliability engineer infoq site reliability engineering sre daniel bryant

Site Reliability Engineering the Big Picture with Elton Stoneman

TestGuild Performance Testing and Site Reliability Podcast

Apr 14, 2020 34:20

DevOps is great, but it needs a huge cultural shift, which many organizations find too hard. That's where Site Reliability Engineering (SRE) comes in. In this episode, Elton Stoneman, author of the Pluralsight course Site Reliability Engineering (SRE): The Big Picture, shares why SRE might be better than DevOps for most organizations. Discover how SRE brings a software engineering approach to operations, making it easy to implement and to get quick results. Listen in to discover some critical aspects of SRE and how to transform your organization.

discover big picture devops elton sre pluralsight stoneman site reliability engineering site reliability engineering sre

sp.80【ゲスト: songmu】語学学校の元営業が楽しいPerlやGoコミュニティに関わって成長し、NatureのCTOになるまで

しがないラジオ

Mar 17, 2020 113:14

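We welcomed songmu as our guest and talked about starting a business in China, language schools, his time as a systems engineer, the Perl community, OSS activities, Nature Inc., and more. [Show Notes] Nature Inc.; Rebuild.fm; Keio University SFC; Faculty of Policy Management and Faculty of Environment and Information Studies; Shunde District - Wikipedia; Shibuya Perl Mongers; Sugamo.css; KAYAC Inc. (面白法人カヤック); @fujiwara | Twitter; @typester | Twitter; IRC - Wikipedia; ISUCON; YAPC - Wikipedia; Audrey Tang - Wikipedia; To the engineers who have joined Hatena - jkondo's Hatena Blog; @miyagawa | Twitter; Plagger - Wikipedia; Plack - Wikipedia; Announcement of my resignation and free-agent declaration | おそらくはそれさえも平凡な日々; @stanaka | Twitter; Mackerel; The Infrastructure team is now the Site Reliability Engineering (SRE) team - Mercari Engineering Blog; Sales Engineers are now Customer Reliability Engineers (CRE) - Hatena Developer Blog; @maaash | Twitter; Publishing materials on what goes on behind the Nature Remo system - An Epicurean; Nature Remo E; YAPC::Kyoto 2020; Start small with OSS contributions, build your technical skills, and let them blossom - YAPC::Kyoto 2020; Careers - Nature; Announcing the ghq v1 release and ghq-handbook | おそらくはそれさえも平凡な日々; ghq-handbook. Release information is available on Twitter at @shiganaiRadio. Please tweet your feedback with the hashtag #しがないラジオ! Your impressions, topics you would like us to cover, and things you would like us to improve all help us make future episodes better, so we look forward to plenty of feedback. [Hosts] gami (@jumpei_ikegami), zuckey (@zuckey_17) [Guest] songmu (@songmu) [Equipment] Blue Micro Yeti USB 2.0 microphone 15374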

nature wikipedia rebuild fa oss epicurean sfc mackerel site reliability engineering sre internet relay chat nature remo

058: DevOps vs. SRE

CloudSkills.fm

Jan 15, 2020 39:15

In this episode I catch up with Josh Duffney to discuss the differences between DevOps and Site Reliability Engineering (SRE).

career cloud certification devops sre site reliability engineering sre

DEV313: Shift-Left SRE: Self-Healing with AWS Lambda Functions

AWS re:Invent 2018

Nov 30, 2018 55:16

Even the best continuous delivery and DevOps practices cannot guarantee that there will be no issues in production. The rise of Site Reliability Engineering (SRE) has promoted new ways to automate resilience into your system and applications to circumvent potential problems, but it's time to 'shift-left' this effort into engineering. In this session, learn to leverage AWS Lambda functions as 'remediation as code.' We show how to make it part of your continuous delivery process and orchestrate the invocation of Self-Healing Lambda functions in case of unexpected situations impacting the reliability of your system. Gone are the days of traditional operations teams; it's the rise of 'shift-lefters'! This session is brought to you by AWS partner, Dynatrace.

aws devops self healing aws lambda dynatrace shift left site reliability engineering sre aws lambda functions

How can SRE help organizations achieve better and faster results?

On Cloud

Nov 13, 2018 21:26

Tune in to host Mike Kavis and guest Damon Edwards as they discuss how Site Reliability Engineering (SRE) works, some of the common benefits, and why it is important to define an SRE model specific to your organization. Learn how SRE’s shared responsibility model and feedback mechanisms can help organizations gain control and enable the operations, development, and business teams to work together.

cloud achieve websites organizations devops sre site reliability engineering sre damon edwards mike kavis

Full Stack Journey 022: Site Reliability Engineering (SRE) With Michael Kehoe

Packet Pushers - Full Podcast Feed

Jun 20, 2018

Site Reliability Engineering (SRE) is the topic for the latest Full Stack Journey podcast. Guest Michael Kehoe explores SRE, its relationship w/ DevOps, essential skills, and more.

devops sre fullstack site reliability engineering sre michael kehoe

Full Stack Journey 022: Site Reliability Engineering (SRE) With Michael Kehoe

Packet Pushers - Full Stack Journey

Jun 20, 2018

Site Reliability Engineering (SRE) is the topic for the latest Full Stack Journey podcast. Guest Michael Kehoe explores SRE, its relationship w/ DevOps, essential skills, and more.

devops sre fullstack site reliability engineering sre michael kehoe

Site Reliability Engineering (SRE) @ Bloomberg w/ Stig Sorenson

DevOps Chat

May 25, 2018 18:42

SRE is a very hot field right now. Some say it is "the ops in DevOps". We chat with Stig Sorenson of the Bloomberg SRE team about how Bloomberg is using SRE to make their business more responsive to their customers. Stig and the Bloomberg team are really at the forefront of what is happening in the SRE field, so this is a great look in.

bloomberg devops stig sre sorenson site reliability engineering sre

SRE vs Devops with Liz Fong-Jones and Seth Vargo

Google Cloud Platform Podcast

May 15, 2018 35:45

This week is a clash of titans! Liz Fong-Jones and Seth Vargo join Mark and Melanie, to battle out on which is better: SRE or Devops (hint - everyone wins!). Liz Fong-Jones Liz is a Staff Site Reliability Engineer at Google and works on the Google Cloud Customer Reliability Engineering team in New York. She has worked on services ranging from Google Flights to Cloud Bigtable in her 10+ years at Google. She lives with her wife, metamour, and a Samoyed/Golden Retriever mix in Brooklyn. In her spare time, she plays classical piano, leads an EVE Online alliance, and advocates for transgender rights. Seth Vargo Seth Vargo is a Developer Advocate at Google. Previously he worked at HashiCorp, Chef Software, CustomInk, and a few Pittsburgh-based startups. He is the author of Learning Chef and is passionate about reducing inequality in technology. Seth is an active member of the DevOps community and has written thought-leader-y pieces such as the 10 Myths of DevOps. Cool things of the week Google I/O session youtube What’s new in Firebase at I/O 2018 blog Introducing ML Kit for Firebase blog Jeff Dean is new Head of AI wired Introducing Cloud Memorystore: A fully managed in-memory data store service for Redis blog Google Group Issue tracker Interview class SRE implements DevOps youtube series DevOps wikipedia Site Reliability Engineering (SRE) site Terraform site Chef site Puppet site Ansible site SaltStack site Prometheus site Datadog site Stackdriver site The Site Reliability Workbook: Practical Ways to Implement SRE amazon Seeking SRE o’reilly Customer Reliability Engineering Blog Series blogs Question of the week I’m a researcher at a regionally accredited academic institution and I need compute resources. Does Google Cloud have any programs that can help me out? Google Cloud Platform announces new credits program for researchers blog faq Where can you find us next? 
Mark will be speaking at the Monthly SF Game Development Community, presenting on You Can’t Just Add More Servers on May the 30th in San Francisco. Melanie is speaking at the Understand Risk Forum on May 17th, in Mexico City.

new york head ai google interview san francisco chefs myths pittsburgh software mexico city puppets io faq prometheus devops cre google i o sre terraform eve online datadog developer advocate redis hashicorp neurotic firebase ansible google cloud platform google flights jeff dean google groups saltstack site reliability engineering sre liz fong jones seth vargo

The Cloudcast #331 - Has SRE replaced DevOps?

The Cloudcast

Jan 25, 2018 39:24

Aaron and Brian talk with Rob Hirschfeld (@zehicle, CEO @rackngo; Kubernetes Cluster Ops Co-Chair) about the consistency, continuum and confusion between the concepts of DevOps and Site Reliability Engineering (SRE). Show Links: Google SRE Book DevOps vs. SRE DevOps vs. SRE (Rob on Datanauts) Love DevOps? Wait until you meet SRE? Open Source PXE "L8istSh9y" - Rob's Edge & Automation Podcast [PODCAST] @PodCTL - Containers | Kubernetes - RSS Feed, iTunes, Google Play, Stitcher, TuneIn and all your favorite podcast players [A CLOUD GURU] Get The Cloudcast Alexa Skill [A CLOUD GURU] A Cloud Guru Membership - Start your free trial. Unlimited access to the best cloud training and new series to keep you up-to-date on all things AWS. [A CLOUD GURU] FREE access to AWS Certification Exam Prep Guide - At A Cloud Guru, the #1 question received from students is "I want to pass the AWS cert exam, so where do I start?" This course is your answer. [FREE] eBook from O'Reilly Show Notes Topic 1 - What is the State of DevOps today? On one hand, there’s Gene Kim’s DevOps Reports (all is great), on another hand is DevOps Days which has become about Empathy, and somewhere in between are companies struggling with all of this silo-busting and automation and constant change. So where are we? Topic 2 - The DevOps community seemed to want to reject all sorts of labels and titles (DevOps engineer, DevOps certified, etc.) and now there is this “SRE” (Site Reliability Engineering) concept. Is this just a new name for DevOps? Topic 3 - Like Netflix had microservices, so everybody needed microservices - Google has SRE, so now everyone needs SRE? How does SRE fit into a non-Google company? Topic 4 - Many Infra/Ops-centric people have been trying to learn automation and some basic programming (e.g. Python, Powershell/Scripting). SREs are often described as programmers that live in the Ops world. Can these current Infra/Ops people evolve to SRE?
Topic 5 - Do you find that DevOps or SRE apply more (or less) to using certain types of technologies vs. other technologies? Feedback? Email: show at thecloudcast dot net Twitter: @thecloudcastnet and @ServerlessCast

ceo netflix google state chefs empathy stitcher google play puppets python aws tunein replaced devops ops kubernetes free ebooks sre microservices ansible gene kim sres devopsdays like netflix site reliability engineering sre cloudcast rob hirschfeld rackn

Do you know DevOps? What about SRE? Learn More with Founder of RackN

HPE Helion Podcast

Feb 14, 2017

In this podcast, Rob Hirschfeld, Founder and CEO of RackN, discusses the latest trends in IT management at scale, including DevOps and the emergence of Site Reliability Engineering (SRE). SRE is a response to the limitations of DevOps faced by Google, providing an answer to the significant challenges of operating global Hybrid IT infrastructure that continues to grow at a rapid rate. For more information on RackN, visit www.rackn.com

ceo founders google devops sre site reliability engineering sre hybrid it rob hirschfeld rackn

SE-Radio Episode 276: Björn Rabenstein on Site Reliability Engineering

Software Engineering Radio - The Podcast for Professional Software Developers

Dec 6, 2016 57:24

Björn Rabenstein discusses the field of Site Reliability Engineering (SRE) with host Robert Blumen. The term SRE has recently emerged to mean Google's approach to DevOps. The publication of Google's book on SRE has brought many of their practices into more public discussion. The interview covers: what is distinct about SRE versus devops; the SRE focus on development of operational software to minimize manual tasks; the emphasis on reliability; Dickerson's hierarchy of reliability; how reliability can be measured; is there such a thing as too much reliability?; can Google's approach to SRE be applied outside of Google?; Björn's experience in applying SRE to Soundcloud - what worked and what did not; how can engineers best apply SRE to their organizational situation?; the importance of monitoring; monitoring and alerting; being on call, responding to incidents; the importance of documentation for responding to problems; they wrap up with a discussion of why people from non-computer science backgrounds are often found in devops and SRE.

google soundcloud engineering bj monitoring prometheus reliability devops dickerson sre site reliability engineering site reliability engineering sre robert blumen se radio

SE-Radio Episode 276: Björn Rabenstein on Site Reliability Engineering

Software Engineering Radio - The Podcast for Professional Software Developers

Dec 6, 2016 57:24

Björn Rabenstein discusses the field of Site Reliability Engineering (SRE) with host Robert Blumen. The term SRE has recently emerged to mean Google’s approach to DevOps. The publication of Google’s book on SRE has brought many of their practices into more public discussion. The interview covers: what is distinct about SRE versus devops; the SRE […]

google development testing software engineering architecture patterns bj enterprise programming languages devops embedded scripting sre soa mda concurrency site reliability engineering site reliability engineering sre robert blumen se radio

CZ Podcast 160 - Site Reliability Engineering

CZPodcast

Nov 29, 2016 53:03

DevOps is dead, long live Site Reliability Engineering (SRE). For this episode we invited Ladislav Prskavec, who leads the SRE team at Apiary and is therefore more than qualified to tell us something about this new approach.

programming devops sre filemon apiary site reliability engineering site reliability engineering sre dagi
