Health insurance denials are a big topic in America right now. Holden Karau joins me to talk about using AI to appeal health insurance claim denials. Fight Health Insurance: https://fighthealthinsurance.com/
How might Russia's war on Ukraine change after President-elect Trump takes office? NPR's Joanna Kakissis explains. Then, Here & Now's Karyn Miller-Medzon brings us to a Boston hospital that is helping Ukrainian doctors rebuild their country's decimated health care system. And, President-elect Donald Trump has promised to place tariffs on goods from China. Scott Kennedy of the Center for Strategic and International Studies joins us to explain what that could mean for consumers. Then, a new artificial intelligence-fueled platform called Fight Health Insurance helps people generate appeals to denied health insurance claims. Holden Karau, the site's creator, joins us to explain how it works.
This interview was recorded for the GOTO Book Club (gotopia.tech/bookclub). Adi Polak - VP of Developer Experience at Treeverse & Contributing to lakeFS OSS. Holden Karau - Co-Author of "Kubeflow for Machine Learning" & many more books & Open Source Engineer at Netflix. DESCRIPTION: Learn how to build end-to-end scalable machine learning solutions with Apache Spark. With this practical guide, author Adi Polak introduces data and ML practitioners to creative solutions that supersede today's traditional methods. You'll learn a more holistic approach that takes you beyond specific requirements and organizational goals--allowing data and ML practitioners to collaborate and understand each other better. Scaling Machine Learning with Spark examines several technologies for building end-to-end distributed ML workflows based on the Apache Spark ecosystem with Spark MLlib, MLflow, TensorFlow, and PyTorch. If you're a data scientist who works with machine learning, this book shows you when and why to use each technology. You will: • Explore machine learning, including distributed computing concepts and terminology • Manage the ML lifecycle with MLflow • Ingest data and perform basic preprocessing with Spark • Explore feature engineering, and use Spark to extract features • Train a model with MLlib and build a pipeline to reproduce it • Build a data system to combine the power of Spark with deep learning • Get a step-by-step example of working with distributed TensorFlow • Use PyTorch to scale machine learning and its internal architecture. * Book description: © O'Reilly. The interview is based on the book "Scaling Machine Learning with Spark". RECOMMENDED BOOKS: Adi Polak • Machine Learning with Apache Spark; Holden Karau, Trevor Grant, Boris Lublinsky, Richard Liu & Ilan Filonenko • Kubeflow for Machine Learning; Holden Karau • Distributed Computing 4 Kids; Holden Karau • Scaling Python with Dask; Holden Karau & Boris Lublinsky • Scaling Python with Ray; Holden Karau & Rachel Warren • High Performance Spark; Holden Karau, Konwinski, Wendell & Zaharia • Learning Spark; Holden Karau & Krishna Sankar • Fast Data Processing with Spark 2nd Edition; Holden Karau • Fast Data Processing with Spark 1st Edition.
This episode features an interview with Holden Karau, an Open Source Engineer at Netflix. Holden is best known for her work on Apache Spark, her advocacy in the open source software movement, and her creation of a variety of related projects including spark-testing-base. Previously, Holden worked at Big Tech companies like Apple, IBM, and Google as a software engineer and developer advocate. In this episode, Sam sits down with Holden to discuss the data analysis stack, functional programming, and the future of open source software data tooling. “These things are not one off. We may think that they're one off and they don't need testing, but that's not the reality. When you write something, it needs to be maintainable and as software people, the only real way that I think we know to make something vaguely maintainable is to at least have tests. And these tests need to cover common failure cases that we've experienced. And certainly, there's different approaches to this. There's property based testing, there's golden sets, all kinds of different options. I don't think necessarily any one approach is right or better here, but I think we need something. We need less untitled 5.IPython Notebook running in production, scheduled every hour. That is not a way to run a company.” – Holden Karau. Episode Timestamps: (02:27) What open source data means to Holden; (04:37) What interested Holden in mathematical computer science; (09:51) What drew Holden to Spark; (12:49) What Holden has learned about cognitive systems; (20:02) What we need to learn as developers and data specialists; (25:28) The future of the data analysis stack; (31:21) Improvements in data tooling over the next 5 years; (34:25) A question Holden wishes to be asked; (40:51) Holden's advice for open source data project committers; (43:18) Executive producer Audra Montenegro's backstage takeaways. Links: LinkedIn - Connect with Holden; Buy Holden's books; Visit Holden's website.
From DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY). Abstract: This talk covers both the improvements that have been made in Kubernetes for batch analytic workloads and some of the current pain experienced by users and developers moving their workloads to Kube. You will learn how we “cheated” back in the YARN and Mesos days to make things go fast, why Kubernetes doesn't like those cheats, and what some alternatives are.
This interview was recorded for the GOTO Book Club (gotopia.tech/bookclub). Holden Karau - Co-Author of "Kubeflow for Machine Learning" & Open Source Engineer at Netflix. Adi Polak - VP of Developer Experience at Treeverse & Contributing to lakeFS OSS. DESCRIPTION: Machine Learning has been declared dead several times, but that's far from true. Join Adi Polak, vice president of developer experience at Treeverse, and Holden Karau, open source engineer at Netflix, in their conversation about Kubeflow and how it provides better tooling in the ML space. The discussion touches on Holden's book “Kubeflow for Machine Learning” and expands to cover the worlds of Ray and Dask. RECOMMENDED BOOKS: Holden Karau, Trevor Grant, Boris Lublinsky, Richard Liu & Ilan Filonenko • Kubeflow for Machine Learning; Holden Karau • Distributed Computing 4 Kids; Holden Karau • Scaling Python with Dask; Holden Karau & Boris Lublinsky • Scaling Python with Ray; Holden Karau & Rachel Warren • High Performance Spark; Holden Karau, Konwinski, Wendell & Zaharia • Learning Spark; Holden Karau & Krishna Sankar • Fast Data Processing with Spark 2nd Edition; Holden Karau • Fast Data Processing with Spark 1st Edition; Adi Polak • Machine Learning with Apache Spark; Phil Winder • Reinforcement Learning; Aurélien Géron • Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow.
Holden's tech story started at a very early age, and her tech-savviness grew very quickly. She told us how her parents and childhood neighbors helped her get on the right track. Going from one nostalgic memory to the next, Holden took us from her love for "Search" to Amazon through internships and moving to the West Coast. We talked about functional programming, Scala, and going to New York. We finally discussed interviewing, joining Google, and teaching. Here are the links from the show: https://www.twitter.com/holdenkarau https://en.wikipedia.org/wiki/Holden_Karau https://tokentransit.com/ https://cs.uwaterloo.ca/~plragde/ https://www.amazon.ca/Dead-Water-Creek-Morgan-Mystery/dp/1550024523/ref=sr_1_1?crid=3QPD3BKFLZUG0 Credits: Cover Heliotrope by Blue Dot Sessions is licensed CC BY-NC-ND 4.0. Your host is Timothée (Tim) Bourguignon, more about him at timbourguignon.fr. Gift the podcast a rating on one of the significant platforms: https://devjourney.info/subscribe Support the show (https://www.patreon.com/timbourguignon)
Holden Karau is a true champion of open source and of becoming a better data engineer. Feel the power of her spark as you hear Holden's perspective as an individual contributor in a data team, and enjoy as she shares her hobbies, practical advice and approaches to working in data.
Holden Karau is best known for her work on Apache Spark™, her advocacy for open source software, and her creation and maintenance of a variety of related projects, including spark-testing-base. Get to know Holden before we look at the data dream team from the perspective of an individual contributor.
Diana Pojar, Staff Data Engineer at Slack. April, 2020. Tell us a little about your current role: your title, the company you wor... https://staffeng.com/stories/diana-pojar technical leadership, Josh Wills, Stan Babourine, Bogdan Gaza, Travis Crawford, Camille Fournier, Lara Hogan, Vicki Boykis, David Gasca, Julia Grace, Holden Karau, John Allspaw, Charity Majors, Theo Schlossnagle, Jessica Joy Kerr, Sarah Catanzaro, Orange Book, my Goodreads account
Distributed stream processing allows developers to build applications on top of large sets of data that are being rapidly created. The post Streaming with Holden Karau appeared first on Software Engineering Daily.
Patrick talks with Holden Karau, Developer Advocate at Google, about ways to incorporate Spark in your application without impacting performance or having to learn Scala.
Data is constantly flowing through a system. From telemetry data to data from IoT devices, the more data there is, the harder it becomes to process. Holden Karau, Open Source Big Data Developer Advocate at Google, talked about how data can be processed and analyzed in batches or in streams. Holden explained the infrastructure needed to process data at scale and fundamental performance improvements for these systems.
Mandy Waite joins Mark and Melanie to share what developer relations is and how trust and empathy are key to its success. We discuss meeting developers where they are and the wide variety of differing communities that exist across the technology ecosystem. Mandy Waite has worked at Google for nearly 8 years, 6 of which have been spent growing and nurturing the Cloud Advocacy team. She heads up the Infrastructure and Ops Advocacy team in Google Cloud with a focus on Cloud Native, DevOps, SRE, Observability and Security. Cool things of the week: Better cost control with Google Cloud Billing programmatic notifications (blog); Music in Motion: a Firebase and IoT story (blog); Google Cloud Codelabs and Challenges (codelabs); Kubernetes Podcast (site and blog). Interview: Google Cloud Platform (site); #46 Borg and K8s with John Wilkes (podcast); #118 OpenCensus with Morgan McLean and JBD (podcast); Felipe Hoffa & BigQuery (reddit, blog and podcast); Livestreaming with Jen Tong (Twitch), Holden Karau (Twitch), and Chris Broadfoot (Twitch); Ben Treynor on What is ‘Site Reliability Engineering’ (interview); Solomon Hykes at dotScale on Docker (video); Istio (site) and #85 Istio with Varun Talwar and Sven Mawson (podcast); Kubernetes (site); Docker (site); The Core Competencies of Developer Relations (blog). Question of the week: Where do I go to learn about GDPR in regards to Google Cloud Platform? Google Cloud: Ready for GDPR (blog); Google Cloud & the General Data Protection Regulation (site). Where can you find us next? Mark is speaking at the Monthly SF Game Development Community, presenting on You Can’t Just Add More Servers on May 30th in San Francisco. Melanie is speaking at a joint WiMLDS and PyLadies event “Paths to Data Science” on June 26th. More details to come.
Holden Karau is on the podcast this week to talk with Mark and Melanie all about Spark and Beam, two open source tools that help process data at scale. Holden Karau is a transgender Canadian open source developer advocate at Google with a focus on Apache Spark, Beam, and related “big data” tools. She is the co-author of Learning Spark, High Performance Spark, and another Spark book that’s a bit more out of date. She is a committer and PMC member on Apache Spark and a committer on the SystemML & Mahout projects. She was tricked into the world of big data while trying to improve search and recommendation systems and has long since forgotten her original goal. Cool things of the week: Twitter’s collaboration with Google Cloud (blog & tweet); Kaggle CERN TrackML Particle Tracking Challenge Competition (site); Open-sourcing gVisor, a sandboxed container runtime (blog & repo); Announcing Stackdriver Kubernetes Monitoring (blog); MLPerf: collaborative effort to standardize ML benchmarks (site). Interview: Spark (site & community site); Beam (site); Cloud Dataflow (site & docs); Cloud Dataproc (site & docs); Using Spark on Kubernetes Engine (blog); Testing future Apache Spark releases and changes on Google Kubernetes Engine and Cloud Dataproc (blog); Spark Packages (site); Spark testing base (repo); Flink (site); Arrow (site). Upcoming Talks: PyCon 2018 & Debugging PySpark (talk); Scala Days & Keeping the “fun” in Spark (talk); Strata London & Understanding Spark tuning with auto-tuning (talk); J on the Beach & General Purpose Big Data Systems are eating the world (talk); Spark Summit 2018 & Accelerating TF with Apache Arrow on Spark (talk). Question of the week: I have a continuous integration build process set up with Container Builder, but it’s all sequential. I want to speed things up by processing parts of it in parallel. How do I do that? Configure Build Step Order (docs). Where can you find us next? Mark can be found streaming Agones development on Twitch. Melanie is speaking at the internet2 Global Summit, May 9th in San Diego, and will also be talking at the Understand Risk Forum on May 17th, in Mexico City. Special shout out: Google I/O and PyCon are both happening this week.