Systems with high uptime, a.k.a. "always on"
In Part 1 of Redundancy vs. High Availability, we said that sometimes high availability and redundancy are considered to be the same thing, but we disagree. Holly and Ethan do agree that high availability can be considered a network design goal, and that redundancy is just one technique that can be used to help make... Read more »
In today’s chat, Holly and Ethan consider a question from listener Douglas who asks, “How do you approach designing a network for high availability and redundancy?” They start by defining differences between redundancy and high availability, and talk about Holly’s experience with her own customers. Then they share examples of how to achieve redundancy in... Read more »
Dive into KubeVirt with the vBrownBag crew and guest Eric Shanks!
What keeps a business alive when the lights flicker, a server drops, or an ISP blinks? We pull back the curtain on practical resilience—how continuity planning, capacity, and clear runbooks turn chaos into a minor hiccup—then pressure-test the plan with drills, documentation, and ruthless honesty.

We start by grounding COOP in the messy reality of people and places: cross-training gaps, pandemic downsizing, and the strain of return-to-office on infrastructure that never fully grew back. From there, we break down high availability without fluff—hot, warm, and cold sites, plus cloud recovery that scales on demand. Testing gets real with load and failover exercises, because hope is not a strategy. We go deep on clustering choices (active-active vs. active-passive), health checks, and the power stack that actually carries you through outages: dual PSUs, smart PDUs, UPS coverage, and generators that are not just installed but tested.

Security on paper fails at the door, so we layer physical controls that work in the real world: lighting, sight lines, bollards, access vestibules, badges, biometrics, CCTV, alarms, and trained guards who can respond when seconds matter. We add deception technologies to slow attackers and capture valuable telemetry. A blunt backup story drives the point home—retention policies, daily verification, and restoration drills aren't optional. Snapshots enable quick rollback; off-site copies protect against building-level incidents; simple file naming saves hours under pressure. We even share personal lessons on NAS setups, cloud sync, and the small frictions that derail good intentions.

If you care about uptime, user trust, and sleeping at night, this conversation gives you a blueprint: map critical services, set real RPO/RTO goals, diversify dependencies, practice failover, and verify backups every day.
Subscribe, share with a teammate who owns "the pager," and leave a review with your best resilience win—or the failure that taught you most.

Support the show

If you want to help me with my research, or want to join my question/answer Zoom class, e-mail me at ProfessorJRod@gmail.com

Art by Sarah/Desmond
Music by Joakim Karud
Little Chacha Productions

Juan Rodriguez can be reached at:
TikTok @ProfessorJrod
ProfessorJRod@gmail.com
@Prof_JRod
Instagram ProfessorJRod
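The blueprint at the end of this summary, verify backups every day against real RPO goals, can be sketched as a small drill script. This is an illustrative sketch, not anything from the episode; the function name and the specific checks (file age against an RPO window, a recorded SHA-256 digest) are assumptions:

```python
import hashlib
import os
import time

def verify_backup(path, expected_sha256, max_age_seconds):
    """Return a list of problems found with a backup file; an empty list means it passes."""
    problems = []
    if not os.path.exists(path):
        return ["backup file is missing"]
    # Freshness check: a backup older than the RPO window is a silent failure.
    age = time.time() - os.path.getmtime(path)
    if age > max_age_seconds:
        problems.append(f"backup is {age:.0f}s old, exceeds RPO window of {max_age_seconds}s")
    # Integrity check: hash the file in chunks and compare to the recorded digest.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    if h.hexdigest() != expected_sha256:
        problems.append("checksum mismatch: backup is corrupt or incomplete")
    return problems
```

Running a check like this from cron every day, and alerting on a non-empty result, is one way to make "verify backups daily" an automated habit rather than a hope.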
How do you keep a computer running non-stop? This week Technology Now explores the world of fault tolerant computing. We dive into how fault tolerance works, what industries use it, and why such a useful form of computing isn't as ubiquitous as we might expect. Casey Taylor, Vice President and General Manager of HPE NonStop Compute, tells us more.

This is Technology Now, a weekly show from Hewlett Packard Enterprise. Every week, hosts Michael Bird and Aubrey Lovell look at a story that's been making headlines, take a look at the technology behind it, and explain why it matters to organizations.

About Casey Taylor: https://www.linkedin.com/in/getcaseytaylor
Our previous episode with Casey: https://hpe.lnk.to/missioncriticalfa

Sources:
https://edition.cnn.com/2024/07/24/tech/crowdstrike-outage-cost-cause
https://www.kovrr.com/reports/the-uk-cost-of-the-crowdstrike-incident
https://science.nasa.gov/mission/voyager/mission-overview/
https://science.nasa.gov/mission/voyager/where-are-voyager-1-and-voyager-2-now/
A. Avizienis, G. C. Gilley, F. P. Mathur, D. A. Rennels, J. A. Rohr and D. K. Rubin, "The STAR (Self-Testing And Repairing) Computer: An Investigation of the Theory and Practice of Fault-Tolerant Computer Design," IEEE Transactions on Computers, vol. C-20, no. 11, pp. 1312-1321, Nov. 1971, doi: 10.1109/T-C.1971.223133. https://www.cs.unc.edu/~anderson/teach/comp790/papers/Siewiorek_Fault_Tol.pdf
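As a quick aside, the "nines" that non-stop systems chase translate directly into allowed downtime per year, which is what makes the last nine so expensive. A minimal illustration (the function name is ours, not from the episode):

```python
def downtime_per_year(availability_percent):
    """Convert an availability target (e.g. 99.999) into allowed downtime per year, in minutes."""
    minutes_per_year = 365.25 * 24 * 60  # 525,960 minutes, averaging over leap years
    return minutes_per_year * (1 - availability_percent / 100.0)

# Roughly: 99.9% allows about 526 minutes of downtime a year,
# while "five nines" (99.999%) allows only about 5.3 minutes.
```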
Sponsored by SEC Playground
In this episode we are looking at how technology is enabling as close as possible to 100% uptime for the most mission-critical business operations. We'll be looking at how software and hardware are coming together to ensure the absolute pinnacle of reliability, and what it means for our organizations.

Joining us to discuss is Casey Taylor, Vice President and General Manager of HPE NonStop.

This is Technology Now, a weekly show from Hewlett Packard Enterprise. Every week we look at a story that's been making headlines, take a look at the technology behind it, and explain why it matters to organizations and what we can learn from it.

About this week's guest, Casey Taylor: https://www.linkedin.com/in/getcaseytaylor/

Sources cited in this week's episode:
TahawulTech report into the cost of IT downtime: https://www.tahawultech.com/insight/why-dns-exploits-continue-to-be-a-top-attack-vector-in-2024/
Siemens report into tech downtime in manufacturing: https://assets.new.siemens.com/siemens/assets/api/uuid:3d606495-dbe0-43e4-80b1-d04e27ada920/dics-b10153-00-7600truecostofdowntime2022-144.pdf
Octopus suckers mimicked for better denture grip: https://www.kcl.ac.uk/news/octopus-suckers-fix-dentures
Send Your Question

Should I be considering HA/DR for my application? What if my application is not that big? Are high availability and disaster recovery the same thing? Should I be spending lots of money on HA/DR, or can I avoid it? What level of HA/DR is safe? These are the questions we will answer in today's episode of Dev Questions.

Website: https://www.iamtimcorey.com/
Patreon: https://www.patreon.com/IAmTimCorey
Ask Your Question: https://suggestions.iamtimcorey.com/
Sign Up to Get More Great Developer Content in Your Inbox: https://signup.iamtimcorey.com/
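One way to reason about the "should I spend lots of money on HA/DR?" question is a simple break-even comparison between the expected cost of downtime and the cost of the HA investment. This sketch is illustrative only; the inputs and function name are our assumptions, not from the episode:

```python
def ha_worth_it(expected_outage_hours_per_year, cost_per_downtime_hour,
                annual_ha_cost, residual_outage_hours_with_ha=0.0):
    """Compare expected annual downtime cost without HA against HA cost plus residual downtime.

    Returns (worth_it, cost_without_ha, cost_with_ha).
    """
    cost_without = expected_outage_hours_per_year * cost_per_downtime_hour
    cost_with = annual_ha_cost + residual_outage_hours_with_ha * cost_per_downtime_hour
    return cost_with < cost_without, cost_without, cost_with

# Example: 10 expected outage hours/year at $5,000/hour vs. a $20,000/year HA setup.
# Here HA clearly pays for itself; for a small app with cheap downtime it often won't.
```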
Is high availability always a good thing? Today our discussion takes an operations perspective. We look at places where you may be over- or under-committing to high availability, confusing disaster recovery with high availability, or even securing the wrong service or looking at it the wrong way. We cover all of these scenarios with practical, hands-on examples that I know you will get a lot out of. This is good prep for talking about HA clusters, because the idea of coordinating and monitoring systems is core to HA and HA clusters. In our journey with RackN, a lot of customers who thought they needed very aggressive HA systems, once confronted with the overhead of maintaining an HA system, had to ask whether they really needed it. We started with an active/passive HA implementation using third-party monitoring to detect when the system failed and spin up the second system, creating a live streaming backup to the failover system. Transcript: https://otter.ai/u/vOVZadHvRTFCZGqcI2DC3nQzDgY?utm_source=copy_url
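The active/passive setup described here, external monitoring that spins up the second system when the primary fails, can be sketched as a single monitoring tick. All names and the failure threshold are illustrative assumptions, not RackN's implementation:

```python
def monitor_step(check_primary, start_standby, state):
    """One tick of an external active/passive monitor.

    check_primary: callable returning True when the primary answers its health check.
    start_standby: callable that promotes the streaming standby to active.
    state: dict tracking consecutive failures, so one dropped probe doesn't trigger failover.
    """
    FAILURE_THRESHOLD = 3  # require several consecutive misses to avoid flapping
    if check_primary():
        state["failures"] = 0
        return "primary-ok"
    state["failures"] = state.get("failures", 0) + 1
    if state["failures"] >= FAILURE_THRESHOLD and not state.get("failed_over"):
        start_standby()                # promote the standby exactly once
        state["failed_over"] = True
        return "failed-over"
    return "primary-suspect"
```

The consecutive-failure counter is the interesting part: without it, a single lost health probe would trigger an unnecessary (and expensive) failover.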
If this past week's CrowdStrike outage taught us anything, it's that we seem to have forgotten the basics of how to run highly available environments. But what other critical skills are deteriorating?

SHOW: 842
SHOW TRANSCRIPT: The Cloudcast #842 Transcript
SHOW VIDEO: https://youtube.com/@TheCloudcastNET
CLOUD NEWS OF THE WEEK - http://bit.ly/cloudcast-cnotw
CHECK OUT OUR NEW PODCAST - "CLOUDCAST BASICS"

SHOW SPONSOR:
The CrowdStrike outage and market-driven brittleness
Network Engineering is a Dying Profession
Vista Equity writes off Pluralsight after $3.5B buyout

SHOW NOTES:
WE'RE ALWAYS LOOKING TO THE FUTURE
Systems are more interconnected and more critical
Companies under-staff critical operations
Developers are the new kingmakers, but where are all the kings?
PEOPLE DON'T IGNORE STABLE FOUNDATIONS, THEY FORGET ABOUT THEM
Not enough people have home labs, or go after certifications
The industry highlights the content creators, instead of the builders/fixers
There might be an opportunity to restart a focus on foundational skills

FEEDBACK?
Email: show at the cloudcast dot net
Twitter: @cloudcastpod
Instagram: @cloudcastpod
TikTok: @cloudcastpod
On this episode of the Futurum Tech Webcast, hosts Randy Kerns and Krista Macomber are joined by SIOS Technical Evangelist Dave Bermingham, for a conversation on the importance of high availability solutions for regulated industries such as financial services. Their discussion covers: The complexity and criticality of ensuring uptime for applications and data in regulated industries Exploring high availability (HA) options for regulated industries, including financial services, with a focus on cost and performance considerations An introduction to SIOS DataKeeper as a solution for high availability needs
In this video I speak with Philippe Noël about ParadeDB, an Elasticsearch alternative built on Postgres, modernizing the features of Elasticsearch's product suite, starting with real-time search and analytics. I hope you will enjoy and learn about the product.

Chapters:
00:00 Introduction
01:12 Challenges with Elasticsearch and the Need for ParadeDB
02:29 Why Postgres?
06:30 Technical Details of ParadeDB's Search Functionality
18:25 Analytics Capabilities of ParadeDB
24:00 Understanding ParadeDB Queries and Transactions
24:22 Application Logic and Data Workflows
25:14 Using PG Cron for Data Migration
30:05 Scaling Reads and Writes in Postgres
31:53 High Availability and Distributed Systems
34:31 Isolation of Workloads
39:38 Database Upgrades and Migrations
41:21 Using ParadeDB Extensions and Distributions
43:02 Observability and Monitoring
44:42 Upcoming Features and Roadmap
46:34 Final Thoughts

Important links:
GitHub: https://github.com/paradedb/paradedb
Website: https://paradedb.com
Docs: https://docs.paradedb.com/
Blog: https://blog.paradedb.com

Follow me on LinkedIn and Twitter: https://www.linkedin.com/in/kaivalyaapte/ and https://twitter.com/thegeeknarrator

If you like this episode, please hit the like button and share it with your network. Also please subscribe if you haven't yet.

Database internals series: https://youtu.be/yV_Zp0Mi3xs

Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN

Stay Curious! Keep Learning!

#postgresql #datafusion #parquet #sql #OLAP #apachearrow #database #systemdesign #elasticsearch
Guy and Eitan discuss the various High Availability and Disaster Recovery options available to us in Microsoft SQL Server, their main advantages, limitations, and when it's most suitable to use or not to use them.

Relevant links for further reading:
AlwaysOn Basic Availability Groups / Database Mirroring
AlwaysOn "Enterprise" Availability Groups
Failover Cluster Instance
Log Shipping
Transactional Replication
In Elixir Wizards Office Hours Episode 8, hosts Sundi Myint and Owen Bickford lead an engaging Q&A session with co-host Dan Ivovich, diving deep into the nuances of DevOps. Drawing from his extensive experience, Dan navigates topics from the early days before Docker to managing diverse polyglot environments and optimizing observability. This episode offers insights for developers of all levels looking to sharpen their DevOps skills. Explore the realms of Docker, containerization, DevOps workflows, and the deployment intricacies of Elixir applications.

Key topics discussed in this episode:
Understanding DevOps and starting points for beginners
Best practices for deploying applications to the cloud
Using Docker for containerization
Managing multiple programming environments with microservices
Strategies for geographic distribution and ensuring redundancy
Localization considerations involving latency and device specs
Using Prometheus and OpenTelemetry for observability
Adjusting scaling based on application metrics
Approaching failure scenarios, including database migrations and managing dependencies
Tackling challenges in monitoring setups and alert configurations
Implementing incremental, zero-downtime deployment strategies
The intricacies of hot code upgrades and effective state management
Recommended learning paths, including Linux and CI/CD workflows
Tools for visualizing system health and monitoring
Identifying actionable metrics and setting effective alerts

Links mentioned:
Ansible open source IT automation engine https://www.ansible.com/
Wikimedia engine https://doc.wikimedia.org/
Drupal content management software https://www.drupal.org/
Capistrano remote server automation and deployment https://capistranorb.com/
Docker https://www.docker.com/
Circle CI CI/CD tool https://circleci.com/
DNS Cluster https://hex.pm/packages/dnscluster
ElixirConf 2023 Chris McCord Phoenix Field Notes https://youtu.be/Ckgl9KO4E4M
Nerves https://nerves-project.org/
Oban job processing in Elixir https://getoban.pro/
Sidekiq background jobs for Ruby https://sidekiq.org/
Prometheus https://prometheus.io/
PromEx https://hexdocs.pm/promex/PromEx.html
GitHub Actions - Setup BEAM: https://github.com/erlef/setup-beam
Jenkins open source automation server https://www.jenkins.io/
DataDog Cloud Monitoring https://www.datadoghq.com/
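The incremental, zero-downtime deployment strategies mentioned among the topics can be illustrated with a minimal rolling-upgrade loop: replace one instance at a time and stop the rollout the moment a health check fails. This is a hypothetical sketch, not SmartLogic's tooling:

```python
def rolling_deploy(instances, new_version, health_check):
    """Upgrade instances one at a time; roll back an instance that fails its health check.

    instances: list of dicts like {"name": "web-1", "version": "v1"}.
    health_check: callable(instance) -> bool, run after each upgrade.
    Returns the names of instances successfully upgraded, in order.
    """
    upgraded = []
    for inst in instances:
        old_version = inst["version"]
        inst["version"] = new_version       # upgrade this one instance
        if health_check(inst):
            upgraded.append(inst["name"])   # healthy: move on to the next instance
        else:
            inst["version"] = old_version   # unhealthy: roll back and halt the rollout
            break
    return upgraded
```

Because the remaining instances keep serving the old version throughout, users see no downtime, and a bad release is contained to a single instance.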
Today on Elixir Wizards Office Hours, SmartLogic Engineer Joel Meador joins Dan Ivovich to discuss all things background jobs. The behind-the-scenes heroes of app performance and scalability, background jobs take center stage as we dissect their role in optimizing user experience and managing heavy-lifting tasks away from the main application flow. From syncing with external systems to processing large datasets, background jobs are pivotal to successful application management. Dan and Joel share their perspectives on monitoring, debugging, and securing background jobs, emphasizing the need for a strategic approach to these hidden workflows.

Key topics discussed in this episode:
The vital role of background jobs in app performance
Optimizing user experience through background processing
Common pitfalls: resource starvation and latency issues
Strategies for effective monitoring and debugging of task runners and job schedulers
Data integrity and system security in open source software
Background job tools like Oban, Sidekiq, Resque, cron jobs, and Redis pub/sub
CPU utilization and processing speed
Best practices for implementing background jobs
Keeping jobs small, focused, and well-monitored
Navigating job uniqueness, locking, and deployment orchestration
Leveraging Task.async for asynchronous operations
The art of continuous improvement in background job management

Links mentioned in this episode:
Redis https://redis.io/
Oban job processing library https://hexdocs.pm/oban/Oban.html
Resque Ruby library for background jobs https://github.com/resque
Sidekiq background processing for Ruby https://github.com/sidekiq
Delayed Job priority queue system https://github.com/collectiveidea/delayed_job
RabbitMQ messaging and streaming broker https://www.rabbitmq.com/
Mnesia distributed telecommunications DBMS https://www.erlang.org/doc/man/mnesia.html
Task for Elixir https://hexdocs.pm/elixir/1.12/Task.html
ETS in-memory store for Elixir and Erlang objects https://hexdocs.pm/ets/ETS.html
Cron https://en.wikipedia.org/wiki/Cron
Donate to Miami Indians of Indiana https://www.miamiindians.org/take-action
Joel Meador on Tumblr https://joelmeador.tumblr.com/

Special Guest: Joel Meador.
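The job uniqueness and bounded-retry ideas discussed in this episode are language-agnostic; this toy in-memory queue illustrates them in Python. It is our own sketch, not Oban's or Sidekiq's actual API:

```python
class JobQueue:
    """Tiny in-memory sketch of a background job queue with uniqueness and bounded retries."""

    def __init__(self, max_attempts=3):
        self.max_attempts = max_attempts
        self.pending = []        # each entry: [unique_key, func, args, attempts]
        self.seen_keys = set()   # enforce job uniqueness, like Oban's unique option

    def enqueue(self, unique_key, func, *args):
        if unique_key in self.seen_keys:
            return False         # duplicate suppressed: the same work is queued only once
        self.seen_keys.add(unique_key)
        self.pending.append([unique_key, func, args, 0])
        return True

    def run_all(self):
        """Drain the queue, retrying failed jobs up to max_attempts times."""
        results, failed = [], []
        while self.pending:
            job = self.pending.pop(0)
            key, func, args, attempts = job
            try:
                results.append(func(*args))
            except Exception:
                job[3] = attempts + 1
                if job[3] < self.max_attempts:
                    self.pending.append(job)   # re-queue for a later retry
                else:
                    failed.append(key)         # give up and surface the failure
        return results, failed
```

Real systems add persistence, per-job backoff, and locking across workers, but the core contract, dedupe on enqueue and cap the retries, is the same.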
In Elixir Wizards Office Hours Episode 2, "Discovery Discoveries," SmartLogic's Project Manager Alicia Brindisi and VP of Delivery Bri LaVorgna join Elixir Wizards Sundi Myint and Owen Bickford on an exploratory journey through the discovery phase of the software development lifecycle. This episode highlights how collaboration and communication transform the client-project team dynamic into a customized expedition. The goal of discovery is to reveal clear business goals, understand the end user, pinpoint key project objectives, and meticulously document the path forward in a Product Requirements Document (PRD). The discussion emphasizes the importance of fostering transparency, trust, and open communication. Through a mutual exchange of ideas, we are able to create the most tailored, efficient solutions that meet the client's current goals and their vision for the future. Key topics discussed in this episode: Mastering the art of tailored, collaborative discovery Navigating business landscapes and user experiences with empathy Sculpting project objectives and architectural blueprints Continuously capturing discoveries and refining documentation Striking the perfect balance between flexibility and structured processes Steering clear of scope creep while managing expectations Tapping into collective wisdom for ongoing discovery Building and sustaining a foundation of trust and transparency Links mentioned in this episode: https://smartlogic.io/ Follow SmartLogic on social media: https://twitter.com/smartlogic Contact Bri: bri@smartlogic.io What is a PRD? https://en.wikipedia.org/wiki/Product_requirements_document Special Guests: Alicia Brindisi and Bri LaVorgna.
Join me February 22/24 at 1pm EST, as I once again talk to experienced Business Continuity / Disaster Recovery professional Ray Holloman. Today we talk about 2 topics. In Segment 1 we talk about 'Are your plans and exercises thinking about equity and inclusion?' where we touch on: 1. Consideration for people with disabilities in tests and exercises, 2. Having those tough conversations, 3. Increasing morale and productivity - making people feel they belong, 4. Addressing fear, 5. Universal Design, 6. Changing the thinking, 7. Using Artificial Intelligence (AI) in exercise plans... and more. In Segment 2 we touch on the topic 'In Resilience, where does disaster recovery and high availability fit?' and talk about: 1. The difference between disaster recovery and high availability, 2. Critical conversations with various parties, 3. New approaches and ideas for DR and HA testing, 4. Resilience in DR and HA, 5. Difficult discussions about RTOs, 6. Selling 'resilience' to leadership, 7. Priorities - making sure you can recover first... and much more. It's another great conversation with Ray, who shares many helpful insights that DR, HA, BCM, and resilience professionals can use in their daily roles. You don't want to miss what Ray has to share. Enjoy!
ZFS High Availability with Asynchronous Replication and zrep, Stop Blogging and start documenting, 2023 in Review: Infrastructure, NovaCustom NV41 laptop review, OpenBSD Video Audio Screen Recording, HDMI Audio sound patches into GhostBSD source code, DSA removal from OpenSSH, NetBSD/evbppc 10.99.10 on the Nintendo Wii, NetBSD/amd64 current performance patch

NOTES
This episode of BSDNow is brought to you by Tarsnap (https://www.tarsnap.com/bsdnow) and the BSDNow Patreon (https://www.patreon.com/bsdnow)

Headlines
ZFS High Availability with Asynchronous Replication and zrep (https://klarasystems.com/articles/zfs-high-availability-with-asynchronous-replication-and-zrep/)
Stop Blogging and start documenting (https://callfortesting.org/stopblogging/)

News Roundup
2023 in Review: Infrastructure (https://freebsdfoundation.org/blog/2023-in-review-infrastructure/)
NovaCustom NV41 laptop review (https://dataswamp.org/~solene/2024-01-03-laptop-review-novacustom-nv41.html)
OpenBSD Video Audio Screen Recording (https://rsadowski.de/posts/2024-01-14-openbsd-video-audio-screen-recording/)
HDMI Audio sound patches into GhostBSD source code /usr/ghost14/ghostbsd-src SOLVED Jan 20 2024 (https://ghostbsd-arm64.blogspot.com/2024/01/hdmi-audio-sound-patches-into-ghostbsd.html)

Beastie Bits
DSA removal from OpenSSH (http://undeadly.org/cgi?action=article;sid=20240111105900)
NetBSD/evbppc 10.99.10 on the Nintendo Wii (https://youtu.be/n-MShCcFm_w?si=-bl2725c1WwT8PBg)
NetBSD/amd64 current performance patch (https://mail-index.netbsd.org/tech-kern/2024/01/23/msg029450.html)
November/December 2023 FreeBSD Journal Issue (https://freebsdfoundation.org/past-issues/freebsd-14-0/)

Feedback
Rick - Questions (https://github.com/BSDNow/bsdnow.tv/blob/master/episodes/545/feedback/rick%20-%20questions.md)

Tarsnap
This week's episode of BSDNow was sponsored by our friends at Tarsnap, the only secure online backup you can trust your data to. Even paranoids need backups.
Send questions, comments, show ideas/topics, or stories you want mentioned on the show to feedback@bsdnow.tv (mailto:feedback@bsdnow.tv) Join us and other BSD Fans in our BSD Now Telegram channel (https://t.me/bsdnow)
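The zrep-style asynchronous ZFS replication in this episode's headline article comes down to repeatedly choosing the right incremental send: find the newest snapshot both sides share and send everything after it. This planning function is our own hypothetical sketch (zrep's real snapshot bookkeeping differs), with the `zfs send -i` semantics only referenced in comments:

```python
def plan_incremental_send(source_snapshots, target_snapshots):
    """Pick the arguments for an incremental replication step.

    source_snapshots / target_snapshots: snapshot names, oldest first.
    Returns (base, latest) for an incremental send (`zfs send -i base latest`),
    (None, latest) when no common snapshot exists and a full send is required,
    or None when the target is already up to date.
    """
    if not source_snapshots:
        return None                    # nothing to replicate
    latest = source_snapshots[-1]
    common = [s for s in source_snapshots if s in set(target_snapshots)]
    if not common:
        return (None, latest)          # no shared base: a full send is needed
    base = common[-1]                  # newest snapshot both sides share
    if base == latest:
        return None                    # target already has the latest snapshot
    return (base, latest)
```

The incremental case is what keeps asynchronous replication cheap: only the blocks changed since the shared base cross the wire.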
In this episode, open source guru Kris Buytaert discusses open source ecosystems, the benefits of collaboration, and the shifts towards proprietary models in certain tools. We explore OpenTofu as a reaction to Terraform, ponder whether an "Ansible of IaC" will emerge, and delve into the deeper meaning of licenses, ecosystems, and governance models, emphasizing that "one open source is not equal to another." Join us in the exploration of the hallmarks of healthy open source and what lies beyond licenses as we assess community integrity.

Kris Buytaert is a long-time Linux and open source consultant. He's one of the instigators of the DevOps movement, currently working for o11y.eu / @inuits. He frequently speaks at, or organizes, international conferences and has written about the same subjects in various books, papers, and articles. He spends most of his time bridging the gap between developers and operations, with a strong focus on high availability, scalability, virtualization, and large infrastructure management projects. Hence, he is trying to build infrastructures that can survive the 10th-floor test—better known today as the cloud—while actively promoting the DevOps idea.

Sponsored by: https://www.env0.com/
In this episode of the Kubernetes Bytes podcast, Bhavin sits down with Jason Dobies, Director of Edge Engineering at SUSE, to talk about all things K3s. They discuss why Kubernetes is best suited for Edge deployments, why K3s was built, and how it helps users architect their edge solutions. The discussion goes into topics like Security, Storage, and High Availability at the Edge. Check out our website at https://kubernetesbytes.com/

Episode Sponsor: Elotl
https://elotl.co/luna
https://www.elotl.co/luna-free-trial

Timestamps:
01:20 Cloud Native News
06:01 Interview with Jason
50:42 Key takeaways

Cloud Native News:
https://vmblog.com/archive/2024/01/29/dynatrace-to-acquire-runecast-to-enhance-cloud-native-security-and-compliance.aspx
https://chronosphere.io/news/chronosphere-acquires-calyptia/

Show links:
https://k3s.io/
https://www.linkedin.com/in/jdob/
Sergio Castro joins Lois Houston and Nikita Abraham to explore multicloud, some of its use cases, and the reasons why many businesses are embracing this strategy. A-Team Chronicles: https://www.ateam-oracle.com/ Oracle University Blog: https://blogs.oracle.com/oracleuniversity/ Oracle MyLearn: https://mylearn.oracle.com/ Oracle University Learning Community: https://education.oracle.com/ou-community X (formerly Twitter): https://twitter.com/Oracle_Edu LinkedIn: https://www.linkedin.com/showcase/oracle-university/ Special thanks to Arijit Ghosh, David Wright, the OU Podcast Team, and the OU Studio Team for helping us create this episode. -------------------------------------------------------- Episode Transcript: 00:00 Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we'll bring you foundational training on the most popular Oracle technologies. Let's get started. 00:26 Nikita: Welcome to the Oracle University Podcast! I'm Nikita Abraham, Principal Technical Editor with Oracle University, and with me is Lois Houston, Director of Innovation Programs. Lois: Hi there! If you've been following along with us, you'll know we just completed our first three seasons of the Oracle University Podcast. We've had such a great time exploring OCI, Data Management, and Cloud Applications business processes. And we've had some pretty awesome special guests, too. 00:56 Nikita: Yeah, it's been so great having them on and so educational so do check out those episodes if you missed any of them. Lois: As we close out the year, we thought this would be a good time to revisit some of our most popular episodes with you. Over the next few weeks, you'll be able to listen to six of our most popular episodes from this year. Nikita: Right, this is the best of the best–according to you–our listeners. 
01:20 Lois: Today's episode is #1 of 6 and is a throwback to a discussion with our Principal OCI Instructor Sergio Castro on multi-cloud. Keep in mind that this chat took place before the release of Oracle University's course and certification on multi-cloud. It's available now on mylearn.oracle.com so if it interests you, you should go check it out. Nikita: We began by asking Sergio to help us with the basics and explain what multi-cloud is. So, let's dive right in. Here we go! 01:51 Sergio: Good question. So multi-cloud is leveraging the best offering of two or more cloud service providers. This as a strategy for an IT solution. And Oracle embraces multi-cloud. This strategy was clearly communicated during Open World in Las Vegas last year. We even had demos where OCI presenters opened the cloud Graphic User Interface of other providers during our live sessions. So the concise answer to the question is multi-cloud is two or more cloud vendors providing a consolidated solution to a customer. 02:29 Nikita: So, would an example of this be when a customer uses OCI and Azure? Sergio: Absolutely. Yes, exactly. That's what it is. We can say that our official multi-cloud approach started with the interconnect agreement with Azure. But customers, they have already been leveraging our FastConnect partners for interconnecting with other cloud providers. The interconnect agreement with Azure just made it easier. Oracle tools such as Oracle Integration and Golden Gate have been multi-cloud ready even prior to our official announcement. And if you look at the Oracle's document... the documents from Oracle, you can find VPN access to other cloud providers, but we can talk about that shortly. 03:16 Nikita: OK. So, why would organizations use a multi-cloud strategy? What do they gain by doing that? Sergio: Oh, there are many reasons why organizations might want to use a multi-cloud strategy. For example, a customer might want to have vendor redundancy. 
Having the application running with one vendor and having the other vendor just stand by in case something goes wrong with that cloud provider. So it is best practice not to rely on just one cloud service provider. Another customer might want to have the application tier with one cloud provider and their database tier with another cloud provider. 03:53 Sergio: So this is a solution leveraging the best of two cloud providers. Another reason might be that a company acquired another one, you know, purchasing a second company, and they have different cloud providers and they just want to integrate their cloud resources. So every single cloud provider offers unique solutions and customers want to leverage these strong points. For example, we all know that AWS was the first infrastructure-as-a-service provider, and the industry adopted them. Then other players came along, like OCI, and customers realized that there are better and less expensive options that now they can take advantage of. So cloud migration is another reason why multi-cloud interconnectivity is needed. 04:42 Lois: Wow! There really are a lot of different use cases for multi-cloud. Sergio: Yeah, absolutely. There is, Lois. So GoldenGate, for example, this is an Oracle product. Oracle GoldenGate allows replication between two different databases. So if a customer wants to replicate the Oracle Database in OCI, in Oracle Cloud Infrastructure, to a SQL Server in Azure, this is possible. And now there's an OCI to Azure interconnect (live) and it can facilitate this database replication. And if a start-up needs to connect OCI to Google Cloud Platform, for example, but a dedicated circuit is not economically viable, then we have published step-by-step configuration instructions for site-to-site VPN, and this includes all the steps on the Google Cloud Platform as well. So these are some of the different use cases.
05:37 Lois: So, what should you keep in mind when you're designing a multi-cloud solution? Sergio: The first thing that comes to mind is business continuity. It is very important to have High Availability and Disaster Recovery strategies. This is to keep the lights on, and focus on the organization's current technology, the organization's current needs, the company's vision, and the offering from the cloud service providers out there. The current offerings that each cloud service provider brings to this company. For example, if an organization's current on-premises deployment consists of Microsoft applications and Oracle Databases, and they want to use as much as they can of their current knowledge base that their staff has acquired through the years, it only makes sense to take the apps to Azure and the database to Oracle Cloud Infrastructure and either leverage ODSA, the Oracle Database Service for Azure, or our OCI-Azure interconnect regions. We have 12 of those. 06:39 Sergio: So ODSA was designed with Azure cloud architects in mind. The Oracle Database Service for Azure. For each database provisioned using ODSA, the service delivers OCI database metrics, OCI events, and OCI logs to tools such as Azure Application Insights, Azure Event Grid, and Azure Log Analytics. But the concise key points to keep in mind are latency, security, data movement, orchestration, and operation management. 07:10 Nikita: So, latency... security... Can you tell us a little bit more about these? Sergio: Yes, latency is crucial. If an application needs, let's say X milliseconds, 3 milliseconds response time, the multi-cloud solution had better meet these needs. We recently published a blog post where we released the millisecond response of our 12 interconnect sites to Azure and OCI. We have 12 interconnect sites of Azure regions to 12 regions from OCI. Now, regarding security, in Oracle, we pride ourselves on being a security company. 
Security is at the core of who we are, and we have taken this approach to multi-cloud. This goes for encryption of data at rest, encryption of data in transit, masking the data in the database, security key management, patching service, Identity and Access Management, Web Application Firewall. All of these solutions from Oracle are very well suited for a multi-cloud approach. 08:17 Lois: OK, what about data movement, orchestration and operation management? You mentioned those. Sergio: I mentioned Golden Gate earlier. So you can use this awesome tool for replication. You can also use this for migration. But data movement is much more than replication, like real live transactions taking place and backup strategies. We have options for all of this. Our object storage, our cross-region backup strategies. Now for orchestration, the Oracle API Gateway avoids vendor lock-in and enables you to publish APIs with private endpoints that are accessible from within your network and which you can expose with a public IP address. This is in case you want to accept traffic from the internet. 09:07 Nikita: Ah, that makes sense. Thanks for explaining those, Sergio. Now, what multi-cloud services does OCI have? Sergio: So I already mentioned a few like ODSA, the Oracle Database Service for Azure. So, this is where Azure customers can easily provision, access, and operate an enterprise-grade Oracle Database in Oracle Cloud Infrastructure with a familiar Azure-like experience. ODSA was jointly announced back in July 2022 by our CTO Larry Ellison and Microsoft's Satya Nadella. He's the CEO. This was last year. And we also announced MySQL HeatWave, which is available on AWS. This solution offers online transaction processing, analytics, machine learning, and automation with a single MySQL database. So OCI's multi-cloud approach started when the OCI regions interconnected via FastConnect to Azure regions' ExpressRoute. This was back in June of 2019. 
10:12 Sergio: Other products for multi-cloud include OCI integration services, OCI Golden Gate, the Oracle API Gateway, Observability and Management, and Oracle Data Sync to name a few. Nikita: So we've been working in multi-cloud services since 2019. Interesting. Lois: It really is. Sergio, can you tell us a little bit about the type of organizations that can benefit from multi-cloud? 10:36 Sergio: Absolutely. My pleasure. So organizations of all sizes and of all industries can benefit from multi-cloud, from start-ups to companies in the top 100 of the Forbes list and from every corner of the world, you name it, every corner of the world. So it's available worldwide for customers, the Oracle customers. There are also customers, and we know this, of other providers. So in terms of cloud, it's to the customers' benefit that cloud service providers have a multi-cloud strategy. OCI has been a pioneer in multi-cloud. It was in 2019 when the FastConnect-to-ExpressRoute partnership was announced. And Site-to-Site VPN is also available to all three of our major cloud competitors. So the beauty of the last word, cloud competitors, is that indeed they are our competitors and we try to win business away from them. 11:29 Sergio: But at the same time, our customers demand the ability for cloud providers to work with each other, and our customers are right. And for this reason, we embrace multi-cloud. Recently, the federal government announced that they selected four cloud providers: OCI, AWS, Azure, and Google Cloud Platform. And also, Uber announced a major deal with OCI and Google Cloud Platform. So these customers, they want us to work together. So multi-cloud is the way to go as a strategy, and we want to make our customers happy. So we will cooperate and work with these cloud service providers. 12:09 Nikita: That's really great. So a customer can take advantage of the benefits of OCI, even if they have other services running on another cloud provider. 
Now if I wanted to become a multi-cloud developer or a cloud architect, how would I go about getting started? Is there a certification I can get? Sergio: Absolutely. Excellent question. I love this question. So this depends on where you are in your cloud journey. If you are already a cloud-knowledgeable engineer with either AWS or Azure, you can start with our OCI for Azure Architect and OCI for AWS Architect. We have courses for both. And if you are just getting started with cloud and you want to learn OCI, you can start with our OCI Foundations as the path to OCI, and as you progress along, we have OCI Architect Associate, we have OCI Architect Professional. So there's a clear path, but if you want a specialty, like a developer, operations, or multi-cloud certification, we have all of this for you. And regarding the OCI Architect Professional certification, it contains in the learning path a lesson and a demo on how to interconnect OCI and Azure from the ground up. 13:23 Lois: And all of this training is available for free on mylearn.oracle.com, right? Sergio: Yes, that is correct, Lois. Just visit the site, mylearn.oracle.com, and create an account. The site keeps track of your learning progress and you can always come back and continue from where you left off, at your own speed. 13:42 Lois: That's great. And what if I don't want to get certified right now? Sergio: Of course, you do not have to be pursuing a certification to gain access to the training in MyLearn. If you are only interested in the OCI to Azure interconnection lesson, for example, you can go right to that course in MyLearn, bypassing all the other material. Just watch that lesson. If you're interested, follow along with the demo in your own environment. 14:09 Nikita: So you can take as much or as little training as you want. That's wonderful. Sergio: Absolutely it is. 
And with regards to other OCI products that are great for multi-cloud, our API Gateway is covered in depth in our OCI Developer Professional certification. The awesome news that I'm bringing to you right now is that soon Oracle University will release a new OCI multi-cloud certification. This is going to be accompanied by a learning path, and the multi-cloud certification is what I'm working on at this moment. We are designing the material. We are having fun right now doing the labs, and shortly, we will write the test questions. 14:51 Lois: That's great news. You know I love to share a sneak peek at new training we're working on. Thank you so much, Sergio, for giving us your time today. This was really insightful. Sergio: On the contrary, thank you. And thanks to everyone who's listening. I encourage you to go ahead and link your multiple cloud accounts and if you have questions, feel free to reach out. You can find me in the Oracle University Learning Community. 15:15 Nikita: We hope you enjoyed that conversation. And like we were saying before, the multi-cloud course has been released and has quickly become one of our most sought-after certifications. So, if you want to access the multi-cloud course, visit mylearn.oracle.com. Lois: Join us next week for another throwback episode. Until then, this is Lois Houston… Nikita: And Nikita Abraham, signing off! 15:39 That's all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We'd also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.
Hey Everyone, In this video I talk to Franck Pachot about the internals of YugabyteDB. Franck has joined the show previously to talk about general database internals, and it's again a pleasure to host him and talk about DistributedSQL, YugabyteDB, ACID properties, PostgreSQL compatibility etc. Chapters: 00:00 Introduction 01:26 What does Cloud Native mean? 02:57 What is Distributed SQL? 03:47 Is DistributedSQL also based on Sharding? 05:44 What problem does DistributedSQL solve? 07:32 Writes: Behind the scenes. 10:59 Reads: Behind the scenes. 17:01 BTrees vs LSM: How is the data written to disk? 25:02 Why RocksDB? 29:52 How is data stored? Key Value? 33:56 Transactions: Complexity, SQL vs NoSQL 42:51 MVCC in YugabyteDB: How does it work? 45:08 Default Transaction Isolation level in YugabyteDB 51:57 Fault Tolerance & High Availability in Yugabyte 56:48 Thoughts on Postgres Compatibility and Future of Distributed SQL 01:03:53 Use cases not suitable for YugabyteDB Previous videos: Database Internals: Part1: https://youtu.be/DiLA0Ri6RfY?si=ToGv9NwjdyDE4LHO Part2: https://youtu.be/IW4cpnpVg7E?si=ep2Yb-j_eaWxvRwc Geo Distributed Applications: https://youtu.be/JQfnMp0OeTA?si=Rf2Y36-gnpQl18yj Postgres Compatibility: https://youtu.be/2dtu_Ki9TQY?si=rcUk4tiBmlsFPYzY I hope you liked this episode, please hit the like button and subscribe to the channel for more. 
Popular playlists: Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA- Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17 Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN Franck's Twitter and Linkedin: https://twitter.com/FranckPachot and https://www.linkedin.com/in/franckpachot/ Connect and follow here: https://twitter.com/thegeeknarrator and https://www.linkedin.com/in/kaivalyaapte/ Keep learning and growing. Cheers, The GeekNarrator
Nikolay and Michael discuss HA (high availability) — what it means, tools and techniques for maximising it, while going through some of the more common causes of downtime. Here are some links to some things they mentioned:
https://en.wikipedia.org/wiki/High_availability
https://postgres.fm/episodes/upgrades
https://github.com/shayonj/pg_easy_replicate/
pg_easy_replicate discussion on Hacker News https://news.ycombinator.com/item?id=36405761
https://postgres.fm/episodes/connection-poolers
https://www.postgresql.org/docs/current/libpq.html
Support load balancing in libpq (new feature in Postgres 16) https://commitfest.postgresql.org/42/3679/
target_session_attrs options for high availability and scaling (2021; a post by Laurenz Albe) https://www.cybertec-postgresql.com/en/new-target_session_attrs-settings-for-high-availability-and-scaling-in-postgresql-v14/
Postgres 10 highlight - read-write and read-only mode of libpq (2016, a post by Michael Paquier) https://paquier.xyz/postgresql-2/postgres-10-libpq-read-write/
Postgres 10 highlight - Quorum set of synchronous standbys (2017, a post by Michael Paquier) https://paquier.xyz/postgresql-2/postgres-10-quorum-sync/
https://github.com/zalando/patroni
https://postgres.fm/episodes/replication
https://blog.rustprooflabs.com/2021/06/postgres-bigint-by-default
Zero-downtime Postgres schema migrations need this: lock_timeout and retries (2021) https://postgres.ai/blog/20210923-zero-downtime-postgres-schema-migrations-lock-timeout-and-retries
A fix in Patroni to mitigate a very long shutdown attempt when archive_command has a lot of WALs to archive https://github.com/zalando/patroni/pull/2067
What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!
Postgres FM is brought to you by:
Nikolay Samokhvalov, founder of Postgres.ai
Michael Christofides, founder of pgMustard
With special thanks to:
Jessie Draws for the amazing artwork
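One client-side technique the linked posts cover is libpq's multi-host connection strings with `target_session_attrs` (and, in Postgres 16, `load_balance_hosts`). A minimal sketch, assuming hypothetical hosts pg1/pg2/pg3; the helper function is illustrative, not part of any library:

```python
# Sketch of libpq multi-host DSNs for failover. Host names (pg1, pg2, pg3)
# are hypothetical. libpq tries the hosts left to right and keeps only a
# connection matching target_session_attrs (e.g. the current primary for
# read-write); load_balance_hosts=random requires Postgres 16+ libpq.

def failover_dsn(hosts, dbname, session_attrs="read-write", load_balance=False):
    """Build a keyword/value DSN usable with psql or any libpq-based driver."""
    parts = [
        "host=" + ",".join(hosts),
        f"dbname={dbname}",
        f"target_session_attrs={session_attrs}",
    ]
    if load_balance:
        parts.append("load_balance_hosts=random")
    return " ".join(parts)

# Writes should land on whichever node is currently primary:
print(failover_dsn(["pg1", "pg2", "pg3"], "app"))
# Reads can be spread across standbys at random (Postgres 16+):
print(failover_dsn(["pg1", "pg2", "pg3"], "app",
                   session_attrs="standby", load_balance=True))
```

The same strings can be passed straight to `psql "…"`, so no extra proxy is needed for basic client-side failover.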
What does high availability look like in 2023? Richard chats with Allan Hirt about his work with high-availability solutions today - not just on-premises but also in the cloud. Allan talks about the frustration folks had with moving workloads into the cloud during the pandemic panic, lift-and-shifting workloads focusing on getting things working quickly rather than cost-effectively. The results can be costly, to the point where some folks are considering moving back off the cloud again - but does that make sense? Allan talks about creating high availability efficiently wherever you want to run your workloads!
Links:
Always On on SQL Server with Azure VMs
SQL Server 2022
Azure Regions and Availability Zones
Operations Manager
Recorded May 11, 2023
Although everyone wants high availability from IT systems, the cost to achieve it must be weighed against the benefits. This episode of Utilizing Edge focuses on HA solutions at the edge with Bruce Kornfeld of StorMagic, Alastair Cooke, and Stephen Foskett. Although it might be tempting to build the same infrastructure at the edge as in the data center, this can get very expensive. Thinking about multi-node server clusters and RAID storage, the risk of a so-called split brain means not just two nodes but three must be deployed in most cases. StorMagic addresses this issue in a novel way, with a remote node providing a quorum witness and reducing the need for on-site hardware. Edge infrastructure also relies on so-called hyperconverged systems, which use software to create advanced services on simple and inexpensive hardware. Hosts: Stephen Foskett: https://www.twitter.com/SFoskett Alastair Cooke: https://www.twitter.com/DemitasseNZ StorMagic Representative: Bruce Kornfeld, Chief Marketing and Product Officer at StorMagic: https://www.linkedin.com/in/brucekornfeld/ Tags: #UtilizingEdge, #EdgeStorage, #EdgeComputing, @StorMagic
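The split-brain risk mentioned above comes down to majority voting: with only two nodes, a partition leaves each side with one vote and no way to decide who keeps serving, which is why a third vote (a full node or a lightweight witness) is needed. A toy sketch of the quorum arithmetic, illustrative only and not StorMagic's implementation:

```python
# Minimal sketch of quorum voting with a remote witness: two data nodes plus
# a lightweight third vote, so a network partition cannot leave both sides
# believing they are primary.

def has_quorum(reachable_voters: int, total_voters: int) -> bool:
    """A side may keep serving writes only if it reaches a strict majority."""
    return reachable_voters > total_voters // 2

# Two-node cluster plus a remote witness = 3 voters.
TOTAL = 3

# Healthy: node A sees node B and the witness (itself + 2 peers = 3 votes).
assert has_quorum(3, TOTAL)

# Partition between A and B: whichever node still reaches the witness holds
# 2 of 3 votes and keeps serving; the isolated node has only its own vote
# and must stop, avoiding split brain.
assert has_quorum(2, TOTAL)
assert not has_quorum(1, TOTAL)
print("quorum checks passed")
```

With two voters alone, a partition gives each side 1 of 2 votes and neither has a majority, so either both stop (no availability) or both serve (split brain); the witness breaks that tie without a third full data node on site.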
Mark and I share more highly researched, thoughtful conversation on human welfare and the environment. We see things differently, but I consider our conversations the type we should have more of. This session we cover:
The book Limits to Growth, as well as the concepts underlying limits to growth
Earth's carrying capacity
How much wealth is consumed by food and fuel, now and historically, and how much it's dropped
How the low cost and high availability of energy has allowed us to devote more money to other things, inventions, and life improvements
What is pollution?
and plenty more. Hosted on Acast. See acast.com/privacy for more information.
In this episode of Scaling Postgres, we discuss how the stats collector disappears in PG15, steps to mitigate high latency connections, how to run Postgres in the browser and the future of high availability. Subscribe at https://www.scalingpostgres.com to get notified of new episodes. Links for this episode: https://www.percona.com/blog/postgresql-15-stats-collector-gone-whats-new/ https://pganalyze.com/blog/5mins-postgres-improve-query-network-latency-performance-pipeline-mode-copy-tc https://www.crunchydata.com/blog/crazy-idea-to-postgres-in-the-web-browser https://www.enterprisedb.com/blog/pg-phriday-defining-high-availability-postgres-world https://www.enterprisedb.com/blog/ansible-benchmark-framework-postgresql https://www.timescale.com/blog/what-does-a-postgresql-commitfest-manager-do-and-should-you-become-one/ https://postgis.net/2022/08/25/tip-upgrading-postgis-sfcgal/ https://www.cybertec-postgresql.com/en/migrate-scheduled-jobs-to-pg_timetable-from-pgagent/ https://postgres.fm/episodes/how-to-become-a-dba https://postgresql.life/post/antonin_houska/ https://www.rubberduckdevshow.com/episodes/56-live-streaming-laravel-with-aaron-francis/
In this episode of Scaling Postgres, we discuss new Postgres releases, a new privilege escalation CVE, chaos testing a high availability kubernetes cluster as well as addressing other H/A questions. Subscribe at https://www.scalingpostgres.com to get notified of new episodes. Links for this episode: https://www.postgresql.org/about/news/postgresql-145-138-1212-1117-1022-and-15-beta-3-released-2496/ https://www.enterprisedb.com/blog/postgresql-extensions-impacted-cve-2022-2625-privilege-escalation https://coroot.com/blog/chaos-testing-zalando-postgres-operator https://www.timescale.com/blog/how-high-availability-works-in-our-cloud-database/ https://www.enterprisedb.com/blog/pg-phriday-dos-and-donts-postgres-high-availability-qa https://pganalyze.com/blog/5mins-postgres-linux-readahead-effective-io-concurrency https://smallthingssql.com/having-a-less-understood-sql-clause https://postgrespro.com/blog/pgsql/5969673 https://postgres.fm/episodes/vacuum https://postgresql.life/post/adam_wright/ https://www.rubberduckdevshow.com/episodes/54-open-source-experiences-pay-gem-with-chris-oliver/
Since the launch of our Horizon Subscription Upgrade Program in 2019, many VMware Horizon customers have upgraded to our Horizon Universal Subscription offering, taking advantage of several benefits including:
Flexible multi-cloud deployment options – deploy and host Horizon anywhere and in multiple clouds at the same time, including on-premises, Microsoft Azure, AWS, Google Cloud, Oracle Cloud, and IBM Cloud. IT can continue to deploy virtual desktops and apps on-premises, leverage the speed and scalability of hyperscaler capacity, or use a combination of both to help their employees quickly get secure access to corporate resources.
Simplified cross-pod and cross-cloud management – use the cloud-hosted, VMware-managed Horizon Control Plane, which includes multi-cloud SaaS services that allow IT to monitor and manage images, desktops, and apps across Horizon deployments, including both on-premises and in the cloud. For example, IT can use the Universal Broker service to enable a global entitlement layer across a multi-pod or multi-cloud deployment, providing a great user experience to access any desktop from anywhere with a single URL, and reducing TCO by removing the need for a global server load balancer.
Hybrid and multi-cloud use cases – add new cloud capacity to existing on-premises deployments to enable High Availability, Disaster Recovery, and burst capacity for temporary and seasonal workers without the CapEx requirements for additional data center hardware. Additionally, customers can move desktop and app workloads strategically to the cloud to be closer to cloud-hosted applications to reduce latency and optimize performance and user experience.
Host: Andy Whiteside
Co-host: Rizwan Shaikh
In this episode of Cyber Security Inside, Camille and Tom take a dive into cloud security with Jo Peterson, Vice President Cloud & Security Services, Forbes Technology Council, CompTIA Advisory-Infrastructure. The conversation covers: - How the cloud has become a more integral part of businesses, and where we are headed with cloud. - What your responsibilities are for cloud security and the questions to ask when choosing a provider. - How artificial intelligence and the Internet of Things interact with cloud security. - What the biggest concerns are in cloud security and what experts are doing to make it more secure. ...and more. Don't miss it! We were honored to have Jo on the podcast, who has accomplished some amazing things this year! Check out her accolades: - Onalytica Who's Who in Cybersecurity https://onalytica.com/wp-content/uploads/2022/02/Whos-Who-in-Cybersecurity.pdf - Engati LinkedIn 30 Top Voices in Tech https://www.engati.com/blog/linkedin-top-voices-in-tech - Thinkers360 Top 150 Women B2B Leaders to Follow in 2022 https://www.thinkers360.com/150-women-b2b-thought-leaders-you-should-follow-in-2022/ - 2016-2022 CRN Women of the Channel Recipient https://www.crn.com/rankings-and-lists/wotc2022.htm - Onalytica Who's Who in the Cloud https://onalytica.com/blog/posts/whos-who-in-cloud/ The views and opinions expressed are those of the guests and author and do not necessarily reflect the official policy or position of Intel Corporation. Here are some key takeaways: - When you don't own the hardware you are using, what can you do to keep yourself secure? We are all sharing a lot of the same underlying infrastructure, and sharing information and data on the public cloud. Keeping it secure is very important. - Customers are concerned with outages and resiliency of cloud systems. And there are some things that customers can do. 
Having things like High Availability, housing workloads across multiple availability zones, supporting region routing, backing up data, encrypting data, and more! These are the top concerns of customers right now. - Often cloud breaches happen with unsecured assets because someone internally made a mistake somewhere. The Shared Responsibility Model needs to be flexible and apply to each cloud provider. And knowing your responsibility within that model is important. - Splitting up an application between on-prem and in a CSP sometimes depends on financial means. It is more costly to run something in the cloud because of bandwidth and latency. So for one application, some of the storage might be on-prem and some of the computing might be done in the cloud, with you traversing back and forth. - Splitting up presences across geographic locations is also smart. If you have a west coast presence, but there is an earthquake that damages your systems, having an east coast presence as well is useful. And you can balance the application with application load balancers in those different availability zones. - There are a lot of suggestions and how-tos for best practices and using availability zones. But it also takes some technical knowledge and practice on how to build a secure cloud environment. At the end of the day, you are building your own infrastructure. - Cloud has grown and changed a lot over time, and it is still growing and changing. Especially with work from home, how we connect to the cloud and use it has changed. Maybe it's time to go identity-based; maybe the tech for VPNs still holds. We have to rethink who we are letting access data, and continually rethink as things change. - What advice does Jo Peterson have for people trying to select a cloud service provider partner? Know your inventory first, and know what you want to move to the cloud. Then look at what you have chosen and decide what specialization you might want to go with based on what you have. 
- When looking for a cloud service provider, it is important to know what you need and to find someone who specializes in that. If you are multinational, find someone who knows the regulations. If you are a beginner, find someone that can guide you and help you with what you need, specifically. - Artificial intelligence and the cloud are both increasing in use and they support each other. Businesses need both in the future, and they work together. With the Internet of Things, AI and the cloud will both be utilized. Some interesting quotes from today's episode: “Recently, one of the major cloud hyperscalers had an outage. They actually had a couple in a row. And cloud systems are expected to always be on and news like that makes the headlines. What I'm hearing customers talk about is maybe the need to rethink a strategy about having all their eggs in one basket.” - Jo Peterson “Have you secured your user end points? That translates into all end points. You might have the users squared away, but maybe you don't have your VM squared away. Maybe you don't have your server squared away.” - Jo Peterson “Wherever the disaster happens, it's still a disaster. So if you're running in a different availability zone, you're theoretically dealing with a whole other stack of infrastructure.” - Jo Peterson on how availability zones are useful protections from natural disasters to hackers “All of the hyperscalers do a really great job of helping to inform and educate potential clients. So every one of them has how-to guides. But at the end of the day, it's you building your infrastructure. So what I see happen in shops that don't have a lot of help, is they'll go to a managed service provider, a CSP, first to get that sort of architectural best practice from that company. And they'll learn as they go.” - Jo Peterson “Well, cloud is a teenager, and it's growing up. There's things that are happening as it grows up and matures. 
The world around it is changing and its world is changing. So there's this sort of dual effect.” - Jo Peterson “Current estimates expect today's $2.5 billion ML market (cloud ML market) to reach $13 billion by 2025. It's a pretty big increase, right? And Deloitte put out a 2020 study of AI that revealed that 83% of organizations expect AI to be critical to their business success in the next two years. So cloud drives measurable benefits for AI programs.” - Jo Peterson “I think we're just going to be seeing more AI and cloud together, like peanut butter and jelly.” - Jo Peterson “I think we're going to see, particularly in certain verticals, like retail, healthcare… We'll see edge cloud deployments. And he who has the data and he who uses the data is going to be first. You're going to see market disruption. You're going to see first to market advantage by companies that are using that edge, that customer data most creatively.” - Jo Peterson
In Episode 275, Scott sits down with Karl Rautenstrauch, a Principal Product Manager at Microsoft to discuss the new Azure File Migration program. Karl explains how the program enables access to best of breed migration tooling which can help you easily perform managed migrations of files to Azure Storage, including Blobs, Files, and Azure NetApp Files. Sponsors ShareGate - ShareGate's industry-leading products help IT professionals worldwide migrate their business to the Office 365 or SharePoint, automate their Office 365 governance, and understand their Azure usage & costs Spot by NetApp – The cloud automation platform that makes it easy to deliver continuously optimized infrastructure at the lowest possible cost Office365AdminPortal.com - Providing admins the knowledge and tools to run Office 365 successfully Intelligink - We focus on the Microsoft Cloud so you can focus on your business Show Notes Migrate the critical file data you need to power your applications New Azure File Migration program streamlines unstructured data migration Migrating your files to Azure has never been easier Migration guides Migrate your files to Azure - Azure IaaS Day 2021 Migration videos – https://aka.ms/filemigrationvideos Data Dynamics - https://aka.ms/datadynamicsguide Komprise - https://aka.ms/kompriseguide Accelerate infrastructure migration with Azure Storage | Azure Storage Day 2021 Azure Marketplace links Komprise Elastic Data Migration and Analytics, offer sponsored by Microsoft Azure Data Dynamics StorageX Migration, offer sponsored by Azure Azure Storage migration overview About our guest Karl is an IT veteran with 23 years of experience. He has been a systems administrator and architect for Fortune 500 companies, a national solutions architect for NetApp and Microsoft, and now a Principal Product Manager in Azure Engineering. Karl likes to be in the thick of emerging technologies. 
In his 9 years with Microsoft, he has focused on Azure Storage, Backup, Disaster Recovery, Edge Computing, and High Availability. In his role as a Product Manager with the Azure Storage team, he focuses on strategic partnerships to enable any and every workload to run on Azure. You can find him on Twitter as @Kloud_Karl or connect with him on LinkedIn. About the sponsors Intelligink utilizes their skill and passion for the Microsoft cloud to empower their customers with the freedom to focus on their core business. They partner with them to implement and administer their cloud technology deployments and solutions. Visit Intelligink.com for more info.
Goldman Sachs uses Trino to reduce last-mile ETL and provide a unified way of accessing data through federated joins. Making a variety of data sets from different sources available in one spot for our data science team was a tall order. Data must be quickly accessible to data consumers, and systems like Trino must be reliable for users to trust this singular access point for their data. Join us on this next episode as we discuss with engineers from Goldman Sachs how they integrated Trino and achieved scaling and high availability.
- Intro Song: 00:00
- Intro: 00:28
- News: 8:39
- Concept of the month: High Availability with Trino: 20:23
- PR of the month: PR 8956 Add support for external db for schema management in mongodb connector: 1:04:09
- Bonus PR of the month: PR 8202 Metadata for alias in elasticsearch connector only uses the first mapping: 1:15:15
- Demo of the month: Trino Fiddle: A tool for easy online testing and sharing of Trino SQL problems and their solutions: 1:32:08
- Question of the month: Does trino hive connector supports CarbonData?: 1:38:09
Show Notes: https://trino.io/episodes/33.html
Show Page: https://trino.io/broadcast/
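A federated join in Trino works by fully qualifying each table as catalog.schema.table, so one query can span different backends. A sketch with hypothetical catalog, schema, and table names (`postgresql` and `hive` are assumed catalogs an operator would have configured), held in a Python string as it might be submitted via the Trino client library:

```python
# Sketch of a Trino federated join across two catalogs. All catalog, schema,
# and table names below are hypothetical; Trino's catalog.schema.table
# qualification is what lets one query join data from different systems.
FEDERATED_JOIN = """
SELECT c.customer_id, c.region, sum(o.amount) AS total
FROM postgresql.public.orders AS o
JOIN hive.warehouse.customers AS c
  ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.region
"""

# With the trino Python client (pip install trino), this could be submitted as:
#   import trino
#   conn = trino.dbapi.connect(host="trino.example.com", port=8080, user="etl")
#   rows = conn.cursor().execute(FEDERATED_JOIN).fetchall()
# (host and user above are placeholders, not from the episode)
print("query joins:", "postgresql.public.orders", "with", "hive.warehouse.customers")
```

Because the coordinator plans the join, clients see one SQL endpoint regardless of where the underlying data lives, which is the "singular access point" the episode describes.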
It's frustrating when critical infrastructure encounters an issue that results in a disruption of service. High Availability is a concept that aims to help alleviate (or hopefully eliminate) such downtime, and is a very attractive goal for system administrators. In this episode, Jay and Joao discuss high availability, as well as its pros and cons.
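The active-passive idea at the heart of discussions like this one can be reduced to a health-checked priority list: serve from the highest-priority healthy node, and promote the next node when checks fail. A toy sketch, illustrative only and not a production failover manager:

```python
# Toy active-passive failover: a monitor consults a health check and always
# serves from the first healthy node in priority order. The pro is less
# downtime; the cons (idle standby hardware, failback policy decisions,
# monitoring complexity) are exactly what the trade-off discussion is about.

def pick_active(nodes, healthy):
    """Return the first healthy node in priority order, or None if all are down."""
    for node in nodes:
        if healthy(node):
            return node
    return None

nodes = ["primary", "standby"]           # priority order
up = {"primary": True, "standby": True}

assert pick_active(nodes, up.get) == "primary"

up["primary"] = False                    # simulated failure of the active node
assert pick_active(nodes, up.get) == "standby"   # automatic failover

up["primary"] = True                     # recovery; whether to fail back
assert pick_active(nodes, up.get) == "primary"   # immediately is a policy choice
print("failover sketch ok")
```

Real HA stacks layer fencing, quorum, and state replication on top of this loop, but the promote-the-next-healthy-node core is the same.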
Irene Huang joins Scott Hanselman to show him Cross-region Load Balancer, which recently became available for Public Preview. Cross-region Load Balancer is a Public layer-4 network load balancer serving as a single point of contact for global traffic. It provides efficient routing by leveraging Microsoft's global backbone network and geo-proximity load balancing algorithm. You can build regional resilient application by setting up a Cross-region Load Balancer in front of regional deployments.[0:00:00]– Introduction[0:04:12]– Concepts[0:06:31]– Demo: Config & deploy[0:14:10]– Demo: Verifying normal behavior[0:15:53]– Demo: Testing failover[0:17:46]– Discussion and wrap-upCross-region load balancerTutorial: Create a cross-region Azure Load Balancer using the Azure portal Azure Load Balancer overviewImprove application scalability and resiliency by using Azure Load BalancerCreate a free account (Azure)
Rick Spencer joins Donovan to chat about deploying Bitnami Node.js High Availability with Azure Cosmos DB, a free listing in Azure Marketplace that uses ARM to automatically spin up a three-node Node.js cluster behind a load balancer with a shared file system and Azure Cosmos DB integration. See how you can quickly get a sample MEAN app from GitHub to a highly available production environment in the Azure cloud, with very little configuration or sysadmin knowledge required.

For more information, see:
Bitnami Node.js For Microsoft Azure Multi-Tier Solutions (docs)
Bitnami Node.js High-Availability Cluster (Azure Marketplace)
Bitnami sample MEAN application (GitHub)
Create a Free Account (Azure)

Follow @donovanbrown | Follow @AzureFriday | Follow @rickspencer_3
We have a first PS4 kernel exploit, the long-awaited OpenZFS devsummit report by Allan, DragonFlyBSD 5.0 is out, we show you vmadm to manage jails, and parallel processing with Unix tools.

Headlines

The First PS4 Kernel Exploit: Adieu (https://fail0verflow.com/blog/2017/ps4-namedobj-exploit/)

Plenty of time has passed since we first demonstrated Linux running on the PS4. Now we will step back a bit and explain how we managed to jump from the browser process into the kernel such that ps4-kexec et al. are usable. Over time, ps4 firmware revisions have progressively added many mitigations and in general tried to lock down the system. This post will mainly touch on vulnerabilities and issues which are not present on the latest releases, but should still be useful for people wanting to investigate ps4 security.

Vulnerability Discovery

As previously explained, we were able to get a dump of the ps4 firmware 1.01 kernel via a PCIe man-in-the-middle attack. Like all FreeBSD kernels, this image included “export symbols” - symbols which are required to perform kernel and module initialization processes. However, the ps4 1.01 kernel also included full ELF symbols (obviously an oversight, as they have been removed in later firmware versions). This oversight was beneficial to the reverse engineering process, although of course not a true prerequisite. Indeed, we began exploring the kernel by examining built-in metadata in the form of the syscall handler table, focusing on the ps4-specific entries. Each process object in the kernel contains its own “idt” (ID Table) object. The hash table essentially just stores pointers to opaque data blobs, along with a given kind and name. Entries may be accessed (and thus “locked”) with either read or write intent. Note that IDTTYPE is not a bitfield consisting of only unique powers of 2.
This means that if we can control the kind of an identry, we may be able to cause a type confusion to occur (it is assumed that we may control name).

Exploitation

To an exploiter without ps4 background, it might seem that the easiest way to exploit this bug would be to take advantage of the write off the end of the malloc'd namedobjusrt object. However, this turns out to be impossible (as far as I know) because of a side effect of the ps4 page size being changed to 0x4000 bytes (from the normal 0x1000). It appears that in order to change the page size globally, the ps4 kernel developers opted to directly change the related macros. One of the many changes resulting from this is that the smallest actual amount of memory which malloc may give back to a caller becomes 0x40 bytes. While this also results in tons of memory being completely wasted, it does serve to nullify certain exploitation techniques (likely completely by accident…).

Adieu

The namedobj exploit was present and exploitable (albeit using a slightly different method than described here) until it was fixed in firmware version 4.06. This vulnerability was also found and exploited by (at least) Chaitin Tech, so props to them! Taking a quick look at the 4.07 kernel, we can see a straightforward fix (4.06 is assumed to be identical - only had 4.07 on hand while writing this post):

```
int sys_namedobj_create(struct thread *td, void *args)
{
    // ...
    rv = EINVAL;
    kind = *((_DWORD *)args + 4);
    if ( !(kind & 0x4000) && *(_QWORD *)args )
    {
        // ... (unchanged)
    }
    return rv;
}
```

And so we say goodbye to a nice exploit. I hope you enjoyed this blast from the past :) Keep hacking!

OpenZFS Developer Summit 2017 Recap (https://www.ixsystems.com/blog/openzfs-devsummit-2017/)

The 5th annual OpenZFS Developer Summit was held in San Francisco on October 24-25.
Hosted by Delphix at the Children's Creativity Museum in San Francisco, over a hundred OpenZFS contributors from a wide variety of companies attended and collaborated during the conference and developer summit. iXsystems was a Gold sponsor and several iXsystems employees attended the conference, including the entire Technical Documentation Team, the Director of Engineering, the Senior Analyst, a Tier 3 Support Engineer, and a Tier 2 QA Engineer. Day 1 of the conference had 9 highly detailed, informative, and interactive technical presentations from companies which use or contribute to OpenZFS. The presentations highlighted improvements to OpenZFS developed “in-house” at each of these companies, with most improvements looking to be made available to the entire OpenZFS community in the near to long term. There's a lot of exciting stuff happening in the OpenZFS community and this post provides an overview of the presented features and proof-of-concepts.

The keynote was delivered by Mark Maybee, who spoke about the past, present, and future of ZFS at Oracle. An original ZFS developer, he outlined the history of closed-source ZFS development after Oracle's acquisition of Sun. ZFS has a fascinating history, as the project has evolved over the last decade in both open and closed source forms, independent of one another. While Oracle's proprietary internal version of ZFS has diverged from OpenZFS, it has implemented many of the same features. Mark was very proud of the work his team had accomplished over the years, claiming Oracle's ZFS products have accounted for over a billion dollars in sales and are used in the vast majority of Fortune 100 companies. However, with Oracle aggressively moving into cloud storage, the future of closed-source ZFS is uncertain. Mark presented a few ideas to transform ZFS into a mainstream and standard file system, including adding more robust support for Linux.
Allan Jude from ScaleEngine talked about ZStandard, a new compression method he is developing in collaboration with Facebook. It offers compression comparable to gzip, but at speeds fast enough to keep up with hard drive bandwidth. According to early testing, it improves both the speed and compression efficiency over the current LZ4 compression algorithm. It also offers a new “dictionary” feature for improving image compression, which is of particular interest to Facebook. In addition, when using ZFS send and receive, it will adapt the compression ratio to make the most efficient use of the network bandwidth.

Currently, deleting a clone on ZFS is a time-consuming process, especially when dealing with large datasets that have diverged over time. Sara Hartse from Delphix described how “clone fast delete” speeds up clone deletion. Rather than traversing the entire dataset during clone deletion, changes to the clone are tracked in a “live list” which the delete process uses to determine which blocks to free. In addition, rather than having to wait for the clone to finish, the delete process backgrounds the task so you can keep working without any interruptions. Sara shared the findings of a test they ran on a clone with 500MB of data, which took 45 minutes to delete with the old method, and under a minute using the live list. This behavior is an optional property as it may not be appropriate for long-lived clones where deletion times are not a concern. At this time, it does not support promoted clones.

Olaf Faaland from Lawrence Livermore National Labs demonstrated the progress his team has made to improve ZFS pool imports with MMP (Multi-Modifier Protection), a watchdog system to make sure that ZFS pools in clustered High Availability environments are not imported by more than one host at a time. MMP uses uberblocks and other low-level ZFS features to monitor pool import status and otherwise safeguard the import process.
MMP adds fields to on-disk metadata so it does not depend on hardware, such as SAS. It supports multi-node HA configs and does not affect non-HA systems. However, it does have issues with long I/O delays, so existing HA software is recommended as an additional fallback.

Jörgen Lundman of GMO Internet gave an entertaining talk on the trials and tribulations of porting ZFS to OS X. As a bonus, he talked about porting ZFS to Windows, and showed a working demo. While not yet in a usable state, it demonstrated a proof-of-concept of ZFS support for other platforms.

Serapheim Dimitropoulos from Delphix discussed Faster Allocation with the Log Spacemap as a means of optimizing ZFS allocation performance. He began with an in-depth overview of metaslabs and how log spacemaps are used to track allocated and freed blocks. Since blocks are only allocated from loaded metaslabs but freed blocks may apply to any metaslab, over time logging the freed blocks to each appropriate metaslab with every txg becomes less efficient. Their solution is to create a pool-wide metaslab for unflushed entries.

Shailendra Tripathi from Tegile presented iFlash: Dynamic Adaptive L2ARC Caching. This was an interesting talk on what is required to allow very different classes of resources to share the same flash device: in their case, ZIL, L2ARC, and metadata. To achieve this, they needed to address the following differences for each class: queue priority, metaslab load policy, allocation, and data protection (as cache has no redundancy).

Isaac Huang of Intel introduced DRAID, or parity declustered RAID. Once available, this will provide the same levels of redundancy as traditional RAIDZ, effectively doubling the administrator's options for providing redundancy for their use case. The goals of DRAID are to address slow resilvering times and the write throughput of a single replacement drive being a bottleneck.
This solution skips block pointer tree traversal when rebuilding the pool after drive failure, which is the cause of long resilver times. This means that redundancy is restored quickly, mitigating the risk of losing additional drives before the resilver completes, but it does require a scrub afterwards to confirm data integrity. This solution supports logical spares, which must be defined at vdev creation time and are used to quickly restore the array.

Prakash Surya of Delphix described how ZIL commits currently occur in batches, where waiting threads have to wait for the batch to complete. His proposed solution is to replace batch commits and instead notify each waiting thread after its own ZIL commit completes, greatly increasing throughput. A new tunable for the log write block timeout can also be used to log write blocks more efficiently.

Overall, the quality of the presentations at the 2017 OpenZFS conference was high. While quite technical, they clearly explained the scope of the problems being addressed and how the proposed solutions worked. We look forward to seeing the described features integrated into OpenZFS. The videos and slides for the presentations should be made available over the next month or so at the OpenZFS website.

OpenZFS Photo Album (https://photos.google.com/share/AF1QipNxYQuOm5RDxRgRQ4P8BhtoLDpyCuORKWiLPT0WlvUmZYDdrX3334zu5lvY_sxRBA?key=MW5fR05MdUdPaXFKVDliQVJEb3N3Uy1uMVFFdVdR)

DragonflyBSD 5.0 (https://www.dragonflybsd.org/release50/)

DragonFly version 5.0 brings the first bootable release of HAMMER2, DragonFly's next-generation file system.

HAMMER2

Preliminary HAMMER2 support has been released into the wild as of the 5.0 release. This support is considered EXPERIMENTAL and should generally not yet be used for production machines and important data. The boot loader will support both UFS and HAMMER2 /boot.
The installer will still use a UFS /boot even for a HAMMER2 installation because the /boot partition is typically very small and HAMMER2, like HAMMER1, does not instantly free space when files are deleted or replaced. DragonFly 5.0 has single-image HAMMER2 support, with live dedup (for cp's), compression, fast recovery, snapshot, and boot support. HAMMER2 does not yet support multi-volume or clustering, though commands for it exist. Please use non-clustered single images for now.

ipfw Updates

IPFW has gone through a number of updates in DragonFly and now offers better performance. pf and ipfw3 are also still supported.

Improved graphics support

The i915 driver has been brought up to match what's in the Linux 4.7.10 kernel. Intel GPUs are supported up to the Kabylake generation. The vga_switcheroo(4) module was added, allowing the use of Intel GPUs on hybrid-graphics systems. The new apple_gmux driver enables switching to the Intel video chipset on dual Intel/NVIDIA and Intel/Radeon MacBook computers.

Other user-affecting changes

efisetup(8) added. DragonFly can now support over 900,000 processes on a single machine. Client-side SSH by default does not try password authentication, which is the default behavior in newer versions of OpenSSH. Pass an explicit '-o PasswordAuthentication=yes' or change /etc/ssh/ssh_config if you need the old behavior. Public key users are unaffected.

Clang status

A starting framework has been added for using clang as the alternate base compiler in DragonFly, to replace gcc 4.7. It's not yet complete. Clang can of course be added as a package.

Package updates

Many package updates, but most notably chrome60 has finally landed in dports with accelerated video and graphics support.

64-bit status

Note that DragonFly is a 64-bit-only operating system as of 4.6, and will not run on 32-bit hardware.
AMD Ryzen is supported, and DragonFly 5.0 has a workaround for a hardware bug (http://lists.dragonflybsd.org/pipermail/commits/2017-August/626190.html). DragonFly quickly released a v5.0.1 with a few patches.

Download link (https://www.dragonflybsd.org/download/)

News Roundup

(r)vmadm – managing FreeBSD jails (https://blog.project-fifo.net/rvmadm-managing-freebsd-jails/)

We are releasing the first version (0.1.0) of our clone of vmadm for FreeBSD jails today. It is not done or feature complete, but it does provide basic functionality. At this point, we think it would be helpful to get it out there and get some feedback. As of today, it allows basic management of datasets, as well as creating, starting, stopping, and destroying jails.

Why another tool to manage jails

Before we go into details, let's talk about why we built yet another jail manager. It is not the frequent NIH syndrome; actually quite the opposite. In FiFo 0.9.2 we experimented with iocage as a way to control jails. While iocage is a useful tool when used as a CLI utility, it has some issues when used programmatically. When managing jails automatically and not via a CLI tool, things like performance or a machine-parsable interface matter. While on a CLI it is acceptable if a call takes a second or two, for a tool that is consumed automatically this delay is problematic. Another reason for the decision was that vmadm is an excellent tool. It is very well designed. SmartOS has used vmadm for years now. Given all that, we opted for adopting a proven interface rather than trying to create a new one. Since we already interface with it on SmartOS, we can reuse a majority of our management code between SmartOS and FreeBSD.

What can we do

Today we can manage datasets, which are jail templates in the form of ZFS volumes. We can list and serve them from a dataset-server, and fetch those we want. At this point, we provide datasets for FreeBSD 10.0 to 11.1, but it is very likely that the list will grow.
As an idea, here is a community-driven list of datasets (https://datasets.at/) that exists for SmartOS today. While those datasets will not work for BSD jails, we hope to see something similar emerge. After fetching the dataset, we can define jails by using a JSON file. This file is compatible with the zone description used on SmartOS. It does not provide all the same features, but a subset. Resources such as CPU and memory can be defined, networking configured, a dataset selected, and necessary settings like the hostname set. With the jail created, vmadm allows managing its lifetime: starting and stopping it, accessing the console, and finally destroying it. Updates to jails are supported; however, as of today they are only taken into account after restarting the jail. This is largely not a technical limitation; it simply wasn't high up on the TODO list.

It is worth mentioning that vmadm will not pick up jails created with other tools or manually. Managing only vmadm-created jails was a conscious decision to prevent it from interfering with existing setups or other utilities. While conventional tools can manage jails set up with vmadm just fine, we use some special tricks, like nested jails, to allow for restrictions required for multi-tenancy that are hard or impossible to achieve otherwise.

What's next

First and foremost, we hope to get some feedback and perhaps community engagement. In the meantime, as announced earlier this year (https://blog.project-fifo.net/fifo-in-2017/), we are hard at work integrating FreeBSD hypervisors in FiFo, and as of writing this, the core actions work quite well. Right now only the barebones functions are supported, and some of the output is not as clear as we would like. We hope to eventually add support for bhyve to vmadm the same way that it supports KVM on SmartOS. Moreover, the groundwork for this already exists in the nested jail techniques we are using.
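As a rough illustration of the JSON jail definition described above, here is a minimal sketch. The key names (alias, image_uuid, max_physical_memory, nics) are borrowed from the SmartOS zone payload format that vmadm aims to stay compatible with, but the exact schema r-vmadm accepts is an assumption here; the uuid and addresses are placeholders. Consult the project README for the real format.

```shell
# Hypothetical jail definition, modeled on the SmartOS vmadm payload format.
# Key names and values are illustrative, not the authoritative r-vmadm schema.
cat > jail.json <<'EOF'
{
  "alias": "web01",
  "hostname": "web01.example.com",
  "image_uuid": "e5b50cf2-0000-0000-0000-000000000000",
  "max_physical_memory": 512,
  "cpu_shares": 100,
  "nics": [
    {
      "interface": "em0",
      "ip": "192.0.2.10",
      "netmask": "255.255.255.0",
      "gateway": "192.0.2.1"
    }
  ]
}
EOF

# Sanity-check the JSON before handing it to vmadm
# (creation itself would be: vmadm create -f jail.json).
python3 -m json.tool jail.json > /dev/null && echo "jail.json is valid JSON"
```

From there, the lifecycle commands described above (start, stop, console, destroy) would operate on the created jail by its uuid.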
Other than that, we are exploring ways to allow PCI passthrough in jails, something not possible in SmartOS zones right now that would be beneficial for some users. In general, we want to improve compatibility with SmartOS as much as possible, and features that we add over time should not make the specifications invalid for SmartOS. You can get the tool from github (https://github.com/project-fifo/r-vmadm).

***

Parallel processing with unix tools (http://www.pixelbeat.org/docs/unix-parallel-tools.html)

There are various ways to use parallel processing in UNIX:

piping: An often underappreciated idea in the unix pipe model is that the components of the pipe run in parallel. This is a key advantage, leveraged when combining simple commands that do "one thing well".

split -n, xargs -P, parallel: Note that programs invoked in parallel by these need to output atomically for each item processed, which the GNU coreutils are careful to do for factor and sha*sum, etc. Generally, commands that use stdio for output can be wrapped with the stdbuf -oL command to avoid intermixing lines from parallel invocations.

make -j: Most implementations of make(1) now support the -j option to process targets in parallel. make(1) is generally a higher-level tool designed to process disparate tasks and avoid reprocessing already generated targets. For example, it is used very effectively when testing coreutils, where about 700 tests can be processed in 13 seconds on a 40-core machine.

implicit threading: This goes against the unix model somewhat and definitely adds internal complexity to those tools. The advantages can be less data copying overhead and simpler usage, though its use needs to be carefully considered. A disadvantage is that one loses the ability to easily distribute commands to separate systems.
Examples are GNU sort(1) and turbo-linecount; the example provided counts lines in parallel. The examples below will compare the above methods for implementing multi-processing, for the function of counting lines in a file.

First of all, let's generate some test data. We use both long and short lines to compare the overhead of the various methods against the core cost of the function being performed:

$ seq 100000000 > lines.txt # 100M lines
$ yes $(yes longline | head -n9) | head -n10000000 > long-lines.txt # 10M lines

We'll also define the add() { paste -d+ -s | bc; } helper function to add a list of numbers. Note the following runs were done against cached files, and thus are not I/O bound. Therefore we limit the number of processes in parallel to $(nproc), though you would generally benefit from raising that if your jobs are waiting on network or disk, etc.

We'll use this command to count lines for most methods, so here is the base non-multi-processing performance for comparison:

$ time wc -l lines.txt
$ time wc -l long-lines.txt

split -n

Note using -n alone is not enough to parallelize. For example, this will run serially with each chunk, because since --filter may write files, the -n pertains to the number of files to split into rather than the number to process in parallel:

$ time split -n$(nproc) --filter='wc -l' lines.txt | add

You can either run multiple invocations of split in parallel on separate portions of the file, like:

$ time for i in $(seq $(nproc)); do split -n$i/$(nproc) lines.txt | wc -l& done | add

Or split can do parallel mode using round robin on each line, but that's huge overhead in this case. (Note also the -u option is significant with -nr):

$ time split -nr/$(nproc) --filter='wc -l' lines.txt | add

Round robin would only be useful when the processing per item is significant. Parallel isn't well suited to processing a large single file, rather focusing on distributing multiple files to commands.
It can't efficiently split to lightweight processing if reading sequentially from a pipe:

$ time parallel --will-cite --block=200M --pipe 'wc -l' < lines.txt | add

Like parallel, xargs is designed to distribute separate files to commands, and with the -P option can do so in parallel. If you have a large file then it may be beneficial to presplit it, which could also help with I/O bottlenecks if the pieces were placed on separate devices:

$ split -d -n l/$(nproc) lines.txt l.

Those pieces can then be processed in parallel like:

$ time find -maxdepth 1 -name 'l.*' | xargs -P$(nproc) -n1 wc -l | cut -f1 -d' ' | add

If your file sizes are unrelated to the number of processors then you will probably want to adjust -n1 to batch together more files to reduce the number of processes run in total. Note you should always specify -n with -P to avoid xargs accumulating too many input items, thus impacting the parallelism of the processes it runs.

make(1) is generally used to process disparate tasks, though it can be leveraged to provide low-level parallel processing on a bunch of files. Note also the make -O option, which avoids the need for commands to output their data atomically, letting make do the synchronization. We'll process the presplit files as generated for the xargs example above, and to support that we'll use the following Makefile (note the recipe line must begin with a tab):

%: FORCE # Always run the command
	@wc -l < $@
FORCE: ;
Makefile: ; # Don't include Makefile itself

One could generate this and pass it to make(1) with the -f option, though we'll keep it as a separate Makefile here for simplicity. This performs very well and matches the performance of xargs:

$ time find -name 'l.*' -exec make -j$(nproc) {} + | add

Note we use the POSIX-specified "find ... -exec ... {} +" construct, rather than conflating the example with xargs. This construct, like xargs, will pass as many files to make as possible, which make(1) will then process in parallel.
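To tie the presplit-plus-xargs approach above together, here is a self-contained, scaled-down run: 1M lines instead of the article's 100M so it finishes in seconds, awk in place of the add() helper so the sketch doesn't depend on bc, and arbitrary file names (lines.txt, piece.*). It assumes GNU split and nproc are available.

```shell
# Generate a modest test file: 1,000,000 lines.
seq 1000000 > lines.txt

# Serial baseline count.
serial=$(wc -l < lines.txt)

# Presplit into one line-aligned piece per CPU (GNU split's -n l/N mode),
# then count the pieces in parallel with xargs -P and sum the partial counts.
split -d -n l/"$(nproc)" lines.txt piece.
parallel=$(find . -maxdepth 1 -name 'piece.*' \
    | xargs -P"$(nproc)" -n1 wc -l \
    | awk '{sum += $1} END {print sum}')

echo "serial=$serial parallel=$parallel"
```

Both counts should agree, since split's l/N mode never breaks a line across pieces; the parallel version only pays off on files large enough that per-process startup cost is amortized.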
OpenBSD gives a hint on forgetting unlock mutex (http://nanxiao.me/en/openbsd-gives-a-hint-on-forgetting-unlock-mutex/)

Check the following simple C++ program:

```
#include <mutex>

int main(void)
{
    std::mutex m;
    m.lock();
    return 0;
}
```

The program forgets to unlock the mutex m before exiting the main function; the missing call would be m.unlock();. Test it on GNU/Linux, and I chose ArchLinux as the testbed:

$ uname -a
Linux fujitsu-i 4.13.7-1-ARCH #1 SMP PREEMPT Sat Oct 14 20:13:26 CEST 2017 x86_64 GNU/Linux
$ clang++ -g -pthread -std=c++11 test_mutex.cpp
$ ./a.out
$

The process exited normally, and nothing more was printed. Build and run it on OpenBSD 6.2:

$ clang++ -g -pthread -std=c++11 test_mutex.cpp
$ ./a.out
pthread_mutex_destroy on mutex with waiters!

OpenBSD prints “pthread_mutex_destroy on mutex with waiters!”. Interesting!

***

Beastie Bits

Updates to the NetBSD operating system since OSHUG #57 & #58 (http://mailman.uk.freebsd.org/pipermail/ukfreebsd/2017-October/014148.html)
Creating a jail with FiFo and Digital Ocean (https://blog.project-fifo.net/fifo-jails-digital-ocean/)
I'm thinking about OpenBSD again (http://stevenrosenberg.net/blog/bsd/openbsd/2017_0924_openbsd)
Kernel ASLR on amd64 (https://blog.netbsd.org/tnf/entry/kernel_aslr_on_amd64)
Call for Participation - BSD Devroom at FOSDEM (https://people.freebsd.org/~rodrigo/fosdem18/)
BSD Stockholm Meetup (https://www.meetup.com/BSD-Users-Stockholm/)

***

Feedback/Questions

architect - vBSDCon (http://dpaste.com/15D5SM4#wrap)
Brad - Packages and package dependencies (http://dpaste.com/3MENN0X#wrap)
Lars - dpb (http://dpaste.com/2SVS18Y)
Alex re: PS4 Network Throttling (http://dpaste.com/028BCFA#wrap)

***