O'Reilly Data Show - O'Reilly Media Podcast

Machine learning for operational analytics and business intelligence

Play Episode Listen Later Oct 10, 2019 51:38

In this episode of the Data Show, I speak with Peter Bailis, founder and CEO of Sisu, a startup that is using machine learning to improve operational analytics. Bailis is also an assistant professor of computer science at Stanford University, where he conducts research into data-intensive systems and where he is co-founder of the DAWN […]

ceo stanford university sisu data show

Machine learning and analytics for time series data

Play Episode Listen Later Sep 26, 2019 40:31

In this episode of the Data Show, I speak with Arun Kejariwal of Facebook and Ira Cohen of Anodot (full disclosure: I’m an advisor to Anodot). This conversation stemmed from a recent online panel discussion we did, where we discussed time series data, and, specifically, anomaly detection and forecasting. Both Kejariwal (at Machine Zone, Twitter, […]

data show

Understanding deep neural networks

Play Episode Listen Later Sep 12, 2019 39:31

In this episode of the Data Show, I speak with Michael Mahoney, a member of RISELab, the International Computer Science Institute, and the Department of Statistics at UC Berkeley. A physicist by training, Mahoney has been at the forefront of many important problems in large-scale data analysis. On the theoretical side, his work spans algorithmic […]

statistics uc berkeley mahoney michael mahoney data show

Becoming a machine learning practitioner

Play Episode Listen Later Aug 29, 2019 33:22

In this episode of the Data Show, I speak with Kesha Williams, technical instructor at A Cloud Guru, a training company focused on cloud computing. As a full stack web developer, Williams became intrigued by machine learning and started teaching herself the ML tools on Amazon Web Services. Fast forward to today, Williams has built […]

ml amazon web services cloud guru kesha williams data show

Labeling, transforming, and structuring training data sets for machine learning

Play Episode Listen Later Aug 15, 2019 40:51

In this episode of the Data Show, I speak with Alex Ratner, project lead for Stanford’s Snorkel open source project; Ratner also recently garnered a faculty position at the University of Washington and is currently working on a company supporting and extending the Snorkel project. Snorkel is a framework for building and managing training data. […]

university washington stanford ratner snorkel data show

Make data science more useful

Play Episode Listen Later Aug 1, 2019 35:04

In this episode of the Data Show, I speak with Cassie Kozyrkov, technical director and chief decision scientist at Google Cloud. She describes “decision intelligence” as an interdisciplinary field concerned with all aspects of decision-making, and which combines data science with the behavioral sciences. Most recently she has been focused on developing best practices that […]

google cloud cassie kozyrkov data show

Acquiring and sharing high-quality data

Play Episode Listen Later Jul 18, 2019 39:20

In this episode of the Data Show, I spoke with Roger Chen, co-founder and CEO of Computable Labs, a startup focused on building tools for the creation of data networks and data exchanges. Chen has also served as co-chair of O’Reilly’s Artificial Intelligence Conference since its inception in 2016. This conversation took place the day […]

ceo reilly chen roger chen data show

Tools for machine learning development

Play Episode Listen Later Jul 3, 2019 39:24

In this week’s episode of the Data Show, we’re featuring an interview Data Show host Ben Lorica participated in for the Software Engineering Daily Podcast, where he was interviewed by Jeff Meyerson. Their conversation mainly centered around data engineering, data architecture and infrastructure, and machine learning (ML). Here are a few highlights: Tools for productive […]

tools ml jeff meyerson ben lorica data show

Enabling end-to-end machine learning pipelines in real-world applications

Play Episode Listen Later Jun 20, 2019 42:53

In this episode of the Data Show, I spoke with Nick Pentreath, principal engineer at IBM. Pentreath was an early and avid user of Apache Spark, and he subsequently became a Spark committer and PMC member. Most recently his focus has been on machine learning, particularly deep learning, and he is part of a group […]

ibm spark pmc apache spark data show

Bringing scalable real-time analytics to the enterprise

Play Episode Listen Later Jun 9, 2019 37:12

In this episode of the Data Show, I spoke with Dhruba Borthakur (co-founder and CTO) and Shruti Bhat (SVP of Product) of Rockset, a startup focused on building solutions for interactive data science and live applications. Borthakur was the founding engineer of HDFS and creator of RocksDB, while Bhat is an experienced product and marketing […]

product cto bhat hdfs rocksdb data show

Applications of data science and machine learning in financial services

Play Episode Listen Later May 23, 2019 42:32

In this episode of the Data Show, I spoke with Jike Chong, chief data scientist at Acorns, a startup focused on building tools for micro-investing. Chong has extensive experience using analytics and machine learning in financial services, and he has experience building data science teams in the U.S. and in China. We had a great […]

china chong acorns data show

Real-time entity resolution made accessible

Play Episode Listen Later May 9, 2019 27:09

In this episode of the Data Show, I spoke with Jeff Jonas, CEO, founder and chief scientist of Senzing, a startup focused on making real-time entity resolution technologies broadly accessible. He was previously a fellow and chief scientist of context computing at IBM. Entity resolution (ER) refers to techniques and tools for identifying and linking […]

ceo er ibm entity jeff jonas senzing data show

Why companies are in need of data lineage solutions

Play Episode Listen Later Apr 25, 2019 34:29

In this episode of the Data Show, I spoke with Neelesh Salian, software engineer at Stitch Fix, a company that combines machine learning and human expertise to personalize shopping. As companies integrate machine learning into their products and systems, there are important foundational technologies that come into play. This shouldn’t come as a shock, as […]

stitch fix data show

What data scientists and data engineers can do with current generation serverless technologies

Play Episode Listen Later Apr 11, 2019 36:32

In this episode of the Data Show, I spoke with Avner Braverman, co-founder and CEO of Binaris, a startup that aims to bring serverless to web-scale and enterprise applications. This conversation took place shortly after the release of a seminal paper from UC Berkeley (“Cloud Programming Simplified: A Berkeley View on Serverless Computing”), and this […]

ceo data show

It’s time for data scientists to collaborate with researchers in other disciplines

Play Episode Listen Later Mar 28, 2019 36:08

In this episode of the Data Show, I spoke with Forough Poursabzi-Sangdeh, a postdoctoral researcher at Microsoft Research New York City. Poursabzi works in the interdisciplinary area of interpretable and interactive machine learning. As models and algorithms become more widespread, many important considerations are becoming active research areas: fairness and bias, safety and reliability, security […]

data show

Algorithms are shaping our lives—here’s how we wrest back control

Play Episode Listen Later Mar 14, 2019 44:15

In this episode of the Data Show, I spoke with Kartik Hosanagar, professor of technology and digital business, and professor of marketing at The Wharton School of the University of Pennsylvania. Hosanagar is also the author of a newly released book, A Human’s Guide to Machine Intelligence, an interesting tour through the recent evolution of […]

university pennsylvania guide human wharton school machine intelligence kartik hosanagar data show

Why your attention is like a piece of contested territory

Play Episode Listen Later Feb 28, 2019 43:05

In this episode of the Data Show, I spoke with P.W. Singer, strategist and senior fellow at the New America Foundation, and a contributing editor at Popular Science. He is co-author of an excellent new book, LikeWar: The Weaponization of Social Media, which explores how social media has changed war, politics, and business. The book […]

social media singer popular science new america foundation likewar the weaponization data show

The technical, societal, and cultural challenges that come with the rise of fake media

Play Episode Listen Later Feb 14, 2019 30:53

In this episode of the Data Show, I spoke with Siwei Lyu, associate professor of computer science at the University at Albany, State University of New York. Lyu is a leading expert in digital media forensics, a field of research into tools and techniques for analyzing the authenticity of media files. Over the past year, […]

new york university albany state university data show

Using machine learning and analytics to attract and retain employees

Play Episode Listen Later Jan 31, 2019 46:54

In this episode of the Data Show, I spoke with Maryam Jahanshahi, research scientist at TapRecruit, a startup that uses machine learning and analytics to help companies recruit more effectively. In an upcoming survey, we found that a “skills gap” or “lack of skilled people” was one of the main bottlenecks holding back adoption of […]

data show

How machine learning impacts information security

Play Episode Listen Later Jan 17, 2019 39:49

In this episode of the Data Show, I spoke with Andrew Burt, chief privacy officer and legal engineer at Immuta, a company building data management tools tuned for data science. Burt and cybersecurity pioneer Daniel Geer recently released a must-read white paper (“Flat Light”) that provides a great framework for how to think about information […]

burt immuta data show

In the age of AI, fundamental value resides in data

Play Episode Listen Later Jan 3, 2019 29:41

In this episode of the Data Show, I spoke with Haoyuan Li, CEO and founder of Alluxio, a startup commercializing the open source project with the same name (full disclosure: I’m an advisor to Alluxio). Our discussion focuses on the state of Alluxio (the open source project that has roots in UC Berkeley’s AMPLab), specifically […]

ceo uc berkeley alluxio data show

Trends in data, machine learning, and AI

Play Episode Listen Later Dec 20, 2018 28:37

For the end-of-year holiday episode of the Data Show, I turned the tables on Data Show host Ben Lorica to talk about trends in big data, machine learning, and AI, and what to look for in 2019. Lorica also showcased some highlights from our upcoming Strata Data and Artificial Intelligence conferences. Here are some highlights […]

ai artificial intelligence lorica ben lorica strata data data show

Tools for generating deep neural networks with efficient network architectures

Play Episode Listen Later Dec 6, 2018 32:20

In this episode of the Data Show, I spoke with Alex Wong, associate professor at the University of Waterloo, and co-founder of DarwinAI, a startup that uses AI to address foundational challenges with deep learning in the enterprise. As the use of machine learning and analytics becomes more widespread, we’re beginning to see tools that […]

university ai waterloo alex wong data show

Building tools for enterprise data science

Play Episode Listen Later Nov 21, 2018 31:28

In this episode of the Data Show, I spoke with Vitaly Gordon, VP of data science and engineering at Salesforce. As the use of machine learning becomes more widespread, we need tools that will allow data scientists to scale so they can tackle many more problems and help many more people. We need automation tools […]

salesforce data show

Lessons learned while helping enterprises adopt machine learning

Play Episode Listen Later Nov 8, 2018 31:31

In this episode of the Data Show, I spoke with Francesca Lazzeri, an AI and machine learning scientist at Microsoft, and her colleague Jaya Mathew, a senior data scientist at Microsoft. We conducted a couple of surveys this year—“How Companies Are Putting AI to Work Through Deep Learning” and “The State of Machine Learning Adoption […]

state ai microsoft data show

Machine learning on encrypted data

Play Episode Listen Later Oct 25, 2018 41:22

In this episode of the Data Show, I spoke with Alon Kaufman, CEO and co-founder of Duality Technologies, a startup building tools that will allow companies to apply analytics and machine learning to encrypted data. In a recent talk, I described the importance of data, various methods for estimating the value of data, and emerging […]

ceo data show

How social science research can inform the design of AI systems

Play Episode Listen Later Oct 11, 2018 45:30

In this episode of the Data Show, I spoke with Jacob Ward, a Berggruen Fellow at Stanford University. Ward has an extensive background in journalism, mainly covering topics in science and technology, at National Geographic, Al Jazeera, Discovery Channel, BBC, Popular Science, and many other outlets. Most recently, he’s become interested in the interplay between […]

bbc stanford university national geographic ward discovery channel al jazeera popular science jacob ward berggruen fellow data show

Why it’s hard to design fair machine learning models

Play Episode Listen Later Sep 27, 2018 34:24

In this episode of the Data Show, I spoke with Sharad Goel, assistant professor at Stanford, and his student Sam Corbett-Davies. They recently wrote a survey paper, “A Critical Review of Fair Machine Learning,” where they carefully examined the standard statistical tools used to check for fairness in machine learning models. It turns out that […]

stanford critical review data show

Using machine learning to improve dialog flow in conversational applications

Play Episode Listen Later Sep 13, 2018 45:07

In this episode of the Data Show, I spoke with Alan Nichol, co-founder and CTO of Rasa, a startup that builds open source tools to help developers and product teams build conversational applications. About 18 months ago, there was tremendous excitement and hype surrounding chatbots, and while things have quieted lately, companies and developers continue […]

cto rasa data show

Building accessible tools for large-scale computation and machine learning

Play Episode Listen Later Aug 30, 2018 53:32

In this episode of the Data Show, I spoke with Eric Jonas, a postdoc in the new Berkeley Center for Computational Imaging. Jonas is also affiliated with UC Berkeley’s RISE Lab. It was at a RISE Lab event that he first announced Pywren, a framework that lets data enthusiasts proficient with Python run existing code […]

uc berkeley python berkeley center eric jonas data show

Simplifying machine learning lifecycle management

Play Episode Listen Later Aug 16, 2018 37:25

In this episode of the Data Show, I spoke with Harish Doddi, co-founder and CEO of Datatron, a startup focused on helping companies deploy and manage machine learning models. As companies move from machine learning prototypes to products and services, tools and best practices for productionizing and managing models are just starting to emerge. Today’s […]

ceo data show

How privacy-preserving techniques can lead to more robust machine learning models

Play Episode Listen Later Aug 2, 2018 36:43

In this episode of the Data Show, I spoke with Chang Liu, applied research scientist at Georgian Partners. In a previous post, I highlighted early tools for privacy-preserving analytics, both for improving decision-making (business intelligence and analytics) and for enabling automation (machine learning). One of the tools I mentioned is an open source project for […]

georgian partners data show

Specialized hardware for deep learning will unleash innovation

Play Episode Listen Later Jul 19, 2018 41:18

In this episode of the Data Show, I spoke with Andrew Feldman, founder and CEO of Cerebras Systems, a startup in the blossoming area of specialized hardware for machine learning. Since the release of AlexNet in 2012, we have seen an explosion in activity in machine learning, particularly in deep learning. A lot of the […]

ceo andrew feldman data show

Data regulations and privacy discussions are still in the early stages

Play Episode Listen Later Jul 5, 2018 33:19

In this episode of the Data Show, I spoke with Aurélie Pols of Mind Your Privacy, one of my go-to resources when it comes to data privacy and data ethics. This interview took place at Strata Data London, a couple of days before the EU General Data Protection Regulation (GDPR) took effect. I wanted her […]

aurélie pols data show

Managing risk in machine learning models

Play Episode Listen Later Jun 21, 2018 32:34

In this episode of the Data Show, I spoke with Andrew Burt, chief privacy officer at Immuta, and Steven Touw, co-founder and CTO of Immuta. Burt recently co-authored a white paper on managing risk in machine learning models, and I wanted to sit down with them to discuss some of the proposals they put forward […]

cto burt immuta data show

The real value of data requires a holistic view of the end-to-end data pipeline

Play Episode Listen Later Jun 7, 2018 31:05

In this episode of the Data Show, I spoke with Ashok Srivastava, senior vice president and chief data officer at Intuit. He has a strong science and engineering background, combined with years of applying machine learning and data science in industry. Prior to joining Intuit, he led the teams responsible for data and artificial intelligence […]

intuit data show

The evolution of data science, data engineering, and AI

Play Episode Listen Later May 24, 2018 30:14

This episode of the Data Show marks our 100th episode. This podcast stemmed out of video interviews conducted at O’Reilly’s 2014 Foo Camp. We had a collection of friends who were key members of the data science and big data communities on hand and we decided to record short conversations with them. We originally conceived […]

o'reilly data show

Companies in China are moving quickly to embrace AI technologies

Play Episode Listen Later May 10, 2018 28:52

In this episode of the Data Show, I spoke with Jason Dai, CTO of Big Data Technologies at Intel, and one of my co-chairs for the AI Conference in Beijing. I wanted to check in on the status of BigDL, specifically how companies have been using this deep learning library on top of Apache Spark, […]

cto beijing intel apache spark ai conference bigdl data show

Teaching and implementing data science and AI in the enterprise

Play Episode Listen Later Apr 26, 2018 38:46

In this episode of the Data Show, I spoke with Jerry Overton, senior principal and distinguished technologist at DXC Technology. I wanted the perspective of someone who works across industries and with a variety of companies. I specifically wanted to explore the current state of data science and AI within companies and public sector agencies. […]

ai dxc technology data show

The importance of transparency and user control in machine learning

Play Episode Listen Later Apr 12, 2018 23:19

In this episode of the Data Show, I spoke with Guillaume Chaslot, an ex-YouTube engineer and founder of AlgoTransparency, an organization dedicated to helping the public understand the profound impact algorithms have on our lives. We live in an age when many of our interactions with companies and services are governed by algorithms. At a […]

guillaume chaslot data show

What machine learning engineers need to know

Play Episode Listen Later Mar 29, 2018 32:16

In this episode of the Data Show, I spoke with Jesse Anderson, managing director of the Big Data Institute, and my colleague Paco Nathan, who recently became co-chair of Jupytercon. This conversation grew out of a recent email thread the three of us had on machine learning engineers, a new job role that LinkedIn recently pegged […]

jesse anderson big data institute jupytercon paco nathan data show

How to train and deploy deep learning at scale

Play Episode Listen Later Mar 15, 2018 39:10

In this episode of the Data Show, I spoke with Ameet Talwalkar, assistant professor of machine learning at CMU and co-founder of Determined AI. He was an early and key contributor to Spark MLlib and a member of AMPLab. Most recently, he helped conceive and organize the first edition of SysML, a new academic conference […]

cmu sysml data show

Using machine learning to monitor and optimize chatbots

Play Episode Listen Later Mar 6, 2018 27:47

In this episode of the Data Show, I spoke with Ofer Ronen, GM of Chatbase, a startup housed within Google’s Area 120. With tools for building chatbots becoming accessible, conversational interfaces are becoming more prevalent. As Ronen highlights in our conversation, chatbots are already enabling companies to automate many routine tasks (mainly in customer interaction). […]

google gm data show

Unleashing the potential of reinforcement learning

Play Episode Listen Later Mar 1, 2018 33:24

In this episode of the Data Show, I spoke with Danny Lange, VP of AI and machine learning at Unity Technologies. Lange previously led data and machine learning teams at Microsoft, Amazon, and Uber, where his teams were responsible for building data science tools used by other developers and analysts within those companies. When I […]

amazon ai microsoft uber lange unity technologies data show

Graphs as the front end for machine learning

Play Episode Listen Later Feb 15, 2018 45:13

In this episode of the Data Show, I spoke with Leo Meyerovich, co-founder and CEO of Graphistry. Graphs have always been part of the big data revolution (think of the large graphs generated by the early social media startups). In recent months, I’ve come across companies releasing and using new tools for creating, storing, and […]

ceo graphs data show

Machine learning needs machine teaching

Play Episode Listen Later Feb 1, 2018 45:12

In this episode of the Data Show, I spoke with Mark Hammond, founder and CEO of Bonsai, a startup at the forefront of developing AI systems in industrial settings. While many articles have been written about developments in computer vision, speech recognition, and autonomous vehicles, I’m particularly excited about near-term applications of AI to manufacturing, […]

ceo ai bonsai mark hammond data show

How machine learning can be used to write more secure computer programs

Play Episode Listen Later Jan 18, 2018 28:12

In this episode of the Data Show, I spoke with Fabian Yamaguchi, chief scientist at ShiftLeft. His 2015 Ph.D. dissertation sketched out how the combination of static analysis, graph mining, and machine learning can be used to develop tools to augment security analysts. In a recent post, I argued for machine learning tools to augment […]

data show

Bringing AI into the enterprise

Play Episode Listen Later Jan 4, 2018 44:13

In this episode of the Data Show, I spoke with Kristian Hammond, chief scientist of Narrative Science and professor of EECS at Northwestern University. He has been at the forefront of helping companies understand the power, limitations, and disruptive potential of AI technologies and tools. In a previous post on machine learning, I listed types […]

ai northwestern university narrative science eecs data show

How machine learning will accelerate data management systems

Play Episode Listen Later Dec 21, 2017 34:46

In this episode of the Data Show, I spoke with Tim Kraska, associate professor of computer science at MIT. To take advantage of big data, we need scalable, fast, and efficient data management systems. Database administrators and users often find themselves tasked with building index structures (“indexes” in database parlance), which are needed to speed […]

mit database data show
