containerd was born from community desire for a core, standalone runtime to act as a piece of plumbing that applications like Kubernetes could use. It sits between command line tools like Docker, which it was spun out from, and lower-level runtimes like runC or gVisor, which execute the container’s code. This week’s guest is Derek McGowan, a Software Engineer at Docker and a containerd maintainer-d. Along with the news of the week, Adam and Craig discuss the many Vancouvers. Do you have something cool to share? Some questions? Let us know: web: kubernetespodcast.com mail: kubernetespodcast@google.com twitter: @kubernetespod Chatter of the week Vancouver, Vancouver, and George Vancouver South Bend, North Bend, and Bend Cosmopolis “50 Year Sensation: the Dave McMacken Retrospective” (album art show in Astoria, Oregon) News of the week Istio 1.3 is out Google’s Anthos now includes Anthos Service Mesh, Cloud Run for Anthos and more Cloud Native Application Bundles hit 1.0 Episode 61 with Ralph Squillace and Jeremy Rickard Nominations for the annual CNCF Community Awards Bloomberg hits 90% utilization with Kubernetes Mistakes that “cost” thousands by Gajus Kuizinas Kubernetes Edge working group publishes whitepaper Isopod, by Cruise Pulumi 1.0 5 RBAC mistakes you must avoid (number 4 will shock you) OpenShift 4.2 disconnected install Red Hat Quay 3.1 Microsoft AKS brings Scale Sets and Standard LB to GA Upstream kernel bugs Amazon EKS adds cluster tagging and IAM roles for service accounts Deep dive into AWS Fargate by Abhisheck Ray from Amazon Kong introduces Kuma, “universal service mesh” Google introduces Cloud Dataproc for Kubernetes Apache Flink operator from Google Cloud Container runtime security bypasses on Falco by Mark “Antitree” Manning Rafay Systems lands $8m in Series A funding Links from the interview containerd Original announcement The many meanings of ‘container runtime’ kubelet and Container Runtime Interfaces runC, gVisor, Kata Containers, and the Windows Host Compute Service (HCS) ctr debug tool containerd’s graduation from the CNCF containerd shim API gVisor shim Firecracker containerd integration Kata Containers shim Windows Container shim rkt announced in 2014 with appC spec Open Container Initiative libcontainer, which became runC Web Assembly (WASM) BuildKit 1.3.0 releases are coming Contribution opportunities: Reporting issues Plugin ecosystem Derek McGowan and containerd on Twitter
Happy Independence Day to our American listeners! Mark Mandel is back today as he and Gabi Ferrara interview Bill Creekbaum of Informatica to learn how they work with Google Cloud for a better big data user experience. Mark Mirchandani is hanging around the studio as well, bringing some cool things of the week and helping with the question of the week! Informatica provides data management products that offer complete solutions focusing on metadata management, integration, governance, security, data quality, and discoverability. Bill’s job at Informatica is to ensure these products really take advantage of the strengths of Google Cloud Platform. One such example is a product that allows customers to design in Informatica and push their projects to Cloud Dataproc. Informatica also offers similar capabilities in BigQuery. When moving data from on-prem to the cloud, customers can use Informatica and Google Cloud together for a seamless transition, cost savings, and easier data control. Together, Informatica and Google Cloud can also facilitate the acquisition of high quality data. To have better, more trustworthy output, input data needs to be safe to access, have few or no duplicates and null values, and be complete. To achieve this, developers usually use a combination of the Informatica tools Intelligent Cloud Services, Enterprise Data Catalog, and Big Data Management, and the Google tools BigQuery, Cloud Storage, Analytics, Dataproc, and Pub/Sub. Bill’s closing advice for companies comes in three parts: take stock of the data you’ve got, set goals, and develop a well-rounded team. Bill Creekbaum Bill Creekbaum is Sr. Director of Product Management for Cloud, Big Data, and Analytic Ecosystems at Informatica. He is focused on delivering market leading unified data management platforms and services that help customers take advantage of their greatest assets, data. 
Bill has been in product management and product marketing for more than 20 years and for the past 10 has been focused on successfully delivering SaaS and Cloud Applications to the market. Prior to joining Informatica, Bill worked at SnapLogic, GoodData, Oracle, Microsoft, Mindjet, and more. See more of Bill’s experience on LinkedIn. Cool things of the week Google Cloud + Chronicle: The security moonshot joins Google Cloud blog GCP Podcast Episode 135: VirusTotal with Emi Martinez podcast Introducing Equiano, a subsea cable from Portugal to South Africa blog Kubernetes 1.15: Extensibility and Continuous Improvement blog Future of CRDs: Structural Schemas blog See how your code actually executes with Stackdriver Profiler, now GA blog Interview Informatica site Informatica for GCP site BigQuery site Cloud Storage site Cloud Dataproc site Intelligent Cloud Services site Enterprise Data Catalog site Big Data Management site Google Analytics site Pub/Sub site Google Cloud & Informatica: Accelerate your Data-Driven Digital Transformation webinar Informatica for Google BigQuery data sheet Informatica Intelligent Cloud Services for Google BigQuery site Question of the week If I want to have my App Engine Application serve any subdomain on my custom domain, how do I do that? Where can you find us next? Gabi is done traveling. Mark Mirch’ is working on Stack Chat. Mark Mandel is going to Tokyo Next, Open Source in Gaming Day, and the North American Open Source Summit. Sound Effect Attribution “small group laugh 6.flac” by tim.kahn of Freesound.org “Chewing, Carrot, A” by Inspector J of Freesound.org “Testtone1000hz” by Jobro of Freesound.org
In the Cloud
Cloud Data Warehouse Benchmark: Redshift, Snowflake, Azure, Presto, BigQuery https://fivetran.com/blog/warehouse-benchmark
Extending the SQL capabilities of your Cloud Dataproc cluster with the Presto optional component https://cloud.google.com/blog/products/data-analytics/extending-the-sql-capabilities-of-your-cloud-dataproc-cluster-with-the-presto-optional-component
Give meaning to 100 billion analytics events a day https://medium.com/teads-engineering/give-meaning-to-100-billion-analytics-events-a-day-d6ba09aa8f44
Introducing Amazon Corretto, a No-Cost Distribution of OpenJDK with Long-Term Support https://aws.amazon.com/fr/blogs/opensource/amazon-corretto-no-cost-distribution-openjdk-long-term-support/
Uber’s Big Data Platform: 100+ Petabytes with Minute Latency https://eng.uber.com/uber-big-data-platform/ https://eng.uber.com/hoodie/
AWS Releases New Pricing Calculator https://www.cbronline.com/news/aws-pricing-calculator
Will Cloud Computing Kill Open Source Development? https://www.infoq.com/articles/will-cloud-computing-kill-open-source
Database
“This is What Happens Larry”: Amazon Finally Dumps Oracle Data Warehouse https://www.cbronline.com/news/aws-oracle-data-warehouse
CockroachDB 2.0 geo-partitioning https://www.youtube.com/watch?v=v2QK5VgLx6E
TiKV: A distributed transactional key-value database https://tikv.org/ https://github.com/pingcap/tidb
Kafka world
Certifications for the community!!!
Data science
Uber Introduces PyML: Their Secret Weapon for Rapid Machine Learning Development https://towardsdatascience.com/uber-introduces-pyml-their-secret-weapon-for-rapid-machine-learning-development-c0f40009a617
Paperspace Gradient: SaaS data science platform https://www.paperspace.com/gradient
Pandora wants to map the “podcast genome” so it can recommend your next favorite show http://www.niemanlab.org/2018/11/pandora-wants-to-map-the-podcast-genome-so-it-can-recommend-your-next-favorite-show/
Read the Affini-Tech blog http://blog.affini-tech.com
http://www.bigdatahebdo.com https://twitter.com/bigdatahebdo
Vincent: https://twitter.com/vhe74
Alex: https://twitter.com/alexanderDeja
This publication is sponsored by Affini-Tech (http://affini-tech.com https://twitter.com/affinitech)
We're hiring! Come crunch data with us! Write to us at recrutement@affini-tech.com
Juliet Hougland and Michelle Casbon are on the podcast this week to talk about data science with Melanie and Mark. We had a great discussion about methodology, applications, tools, pipelines, challenges and resources. Juliet shared insights into the unique data science ownership workflow from idea to deployment at Stitch Fix, and Michelle dove into how Kubeflow is playing a role to help drive reliability in model development and deployment. Juliet Hougland Juliet Hougland leads the Workflow, Environment, and Execution team at Stitch Fix. She is a data scientist and engineer with expertise in computational mathematics and years of hands-on machine learning and big data experience. She has built and deployed production ML models, advised Fortune 500 companies on infrastructure and worked on a variety of open source projects (Apache Spark, Scalding, and Kiji) at the intersection of big data and machine learning. Michelle Casbon Michelle Casbon is a Senior Engineer on the Google Cloud Platform Developer Relations team, where she focuses on open source contributions and community engagement for machine learning and big data tools. Prior to joining Google, she was at several San Francisco-based startups as a Senior Engineer and Director of Data Science. Within these roles, she built and shipped machine learning products on distributed platforms using both AWS and GCP. Michelle’s development experience spans more than a decade and has primarily focused on multilingual natural language processing, system architecture and integration, and continuous delivery pipelines for machine learning applications. She especially loves working with open source projects and is an active contributor to Kubeflow. Michelle holds a master’s degree from the University of Cambridge. 
Cool things of the week Sandeep Dinesh: Kubernetes Best Practices YouTube CNCF TOC voted to accept Helm as an incubation-level hosted project to CNCF site Android P in Beta blog Agones 0.2.0 site Securing cloud-connected devices with Cloud IoT and Microchip blog Interview flotilla-os repo Kubeflow repo Cloud Dataproc site & docs Spark site & community site scikit-learn site xgboost repo PyTorch site TensorFlow site and github Kubernetes site github Introducing ultramem Google Compute Engine machine types blog #114 Machine Learning Bias and Fairness with Timnit Gebru and Margaret Mitchell podcast Machine Learning Flash Cards site Open Source Data Science Masters site DockerCon SF site Question of the week If I have written a gRPC Service, but I’m using a language/platform that isn’t supported - is there any way I can access it as REST? grpc-gateway Envoy proxy Transcoding Where can you find us next? Mark is speaking at the San Francisco Kubernetes Meetup: Scaling Game Servers and the Conduit Service Mesh on June 14th. Melanie is speaking at a joint WiMLDS and PyLadies event “Paths to Data Science” on June 26th and Stanford AI4ALL on June 28th.
Holden Karau is on the podcast this week to talk all about Spark and Beam, two open source tools that help process data at scale, with Mark and Melanie. Holden Karau Holden Karau is a transgender Canadian open source developer advocate @ Google with a focus on Apache Spark, BEAM, and related “big data” tools. She is the co-author of Learning Spark, High Performance Spark, and another Spark book that’s a bit more out of date. She is a committer and PMC member on Apache Spark and a committer on the SystemML & Mahout projects. She was tricked into the world of big data while trying to improve search and recommendation systems and has long since forgotten her original goal. Cool things of the week Twitter’s collaboration with Google Cloud blog & tweet Kaggle CERN TrackML Particle Tracking Challenge Competition site Open-sourcing gVisor, a sandboxed container runtime blog & repo Announcing Stackdriver Kubernetes Monitoring blog MLPerf: collaborative effort to standardize ML benchmarks site Interview Spark site & community site Beam site Cloud Dataflow site & docs Cloud Dataproc site & docs Using Spark on Kubernetes Engine blog Testing future Apache Spark releases and changes on Google Kubernetes Engine and Cloud Dataproc blog Spark Packages site Spark testing base repo Flink site Arrow site Upcoming Talks: PyCon 2018 & Debugging PySpark talk Scala Days & Keeping the “fun” in Spark talk Strata London & Understanding Spark tuning with auto-tuning talk J on the Beach & General Purpose Big Data Systems are eating the world talk Spark Summit 2018 & Accelerating TF with Apache Arrow on Spark talk Question of the week I have a continuous integration build process set up with Container Builder, but it’s all sequential. I want to speed things up by processing parts of it in parallel. How do I do that? Configure Build Step Order docs Where can you find us next? Mark can be found streaming Agones development on Twitch. 
Melanie is speaking at the internet2 Global Summit, May 9th in San Diego, and will also be talking at the Understand Risk Forum on May 17th, in Mexico City. Special shout out: Google I/O and PyCon are both happening this week
Today Francesc and Mark have the honor to be joined by Alim Jaffer and Mo Firouz from Heroic Labs to discuss their open source framework for social and realtime apps and games. About Alim Jaffer A member of the founding team, Alim joined Heroic Labs in 2016 as the VP of Product after having worked in startups focused on the games and health verticals. He is based in Vancouver, Canada and San Francisco. About Mo Firouz Mo cofounded Heroic Labs and is part of the core engineering team. Mo has worked on various products in Heroic Labs including the core Nakama server as well as Heroic Managed Cloud where he was primarily responsible for automating server provisioning and the monitoring stack with Kubernetes. Mo previously worked as a system architect at VisualDNA and built scalable big-data analytics systems, and prior to that built realtime high frequency trading systems. Cool things of the week CRE life lessons: What is a dark launch, and what does it do for me? blog post Cloud Dataproc is now even faster and easier to use for running Apache Spark and Apache Hadoop announcement Canary Deployments using Istio blog post Interview Heroic Labs heroiclabs.com Heroic Labs on GitHub repository Heroic Labs Documentation Google Container Engine CockroachDB Question of the week Accessing Cloud SQL instances from Cloud Functions? Use SQL Proxy, as for the Managed Instance Group, which we cover in episode 81. Connecting MySQL Client from Compute Engine About the Cloud SQL Proxy CloudSQL Proxy GitHub repo Where can you find us next? Francesc just released a justforfunc episode on Contributing to the Go project. He'll soon be taking some well-deserved holidays! Mark will be speaking at Pax Dev and then attending Pax West right after.
In the tenth episode of this podcast, your hosts Francesc and Mark interview Graham Polley and Pablo Caif, who are both Google Developer Experts who work at Shine Technologies. About Graham Graham is a senior software engineer based out of Melbourne, Australia. He's passionate about promoting the adoption of cloud technologies into software development, and regularly blogs and gives presentations. Graham has extensive experience in building big data solutions for clients using the Google technology stack, and in particular with BigQuery & Dataflow. Graham works very closely with the GCP engineering team in the US, where he is a member of their cloud platform trusted tester program, and the solutions he helps build are used as internal exemplars of developer use cases. Graham is also a GDE on GCP. You can contact Graham through Twitter, blog and Google Developer Expert Profile. About Pablo Pablo is a passionate software engineer who enjoys solving complex problems, and devising simple solutions. He works at Shine Technologies and he is part of a team that uses BigQuery and Dataflow to solve challenging and complex data processing business requirements. Pablo considers that scalability and performance are paramount to developing a great solution, and that is why he has been using Dataflow and BigQuery to bring these solutions to reality. Pablo is also a GDE on GCP. You can contact Pablo through Twitter, and blog. Cool thing of the week Google Cloud Platform Next Join the largest gathering of the Google Cloud Platform community to explore the latest developments in cloud technology. Come meet the people that help build Google Cloud Platform, such as engineers and product managers as well as network with experienced cloud architects, managers and engineers who have deployed GCP in their organizations. 
Interview Shine Technologies homepage Google Developer Experts about page BigQuery docs Cloud Dataflow docs Cloud Dataproc docs Google Cloud Dataproc and the 17 minute train challenge blog post A week in the life of a Google Developer Expert blog post Messages in the sky blog post Shine with BigQuery: The 30 Terabyte challenge video Question of the week The Google App Engine Admin API doc