Data Crunch


If you want to learn how data science, artificial intelligence, machine learning, and deep learning are being used to change our world for the better, you’ve subscribed to the right podcast. We talk to entrepreneurs and experts about their experiences employing new technology—their approach, their s…

Vault Analytics


    • Latest episode: Mar 24, 2022
    • New episodes: infrequent
    • Average duration: 21m
    • Episodes: 80

    4.8 from 78 ratings Listeners of Data Crunch that love the show mention: data, quality, thought, interesting, life, informative, great, like, new, ginette.



    Latest episodes from Data Crunch

    Creating a Database for AI with Activeloop

    Play Episode Listen Later Mar 24, 2022 27:50


    Working with structured and semistructured data can be hard, but it's currently much easier than working with unstructured data, like images, video, audio, and text. We talk with Davit Buniatyan, CEO of Activeloop, who chats with us about how he works to make unstructured data for machine learning easier and faster to work with. 

    Streamlining Construction with AI

    Play Episode Listen Later Dec 17, 2021 24:14


    What's it like to build an AI product in an industry that still uses outdated project management technology and has no clear conceptual model? After receiving his PhD at Stanford, René Morkos speaks to this very situation. He describes his journey building an AI product in the construction industry.

    The Future of Unstructured Data with Graviti

    Play Episode Listen Later Oct 29, 2021 29:38 Transcription Available


    After graduating from the University of Pennsylvania with a master's degree in artificial intelligence/robotics, Edward Cui was one of the first Uber self-driving car engineers. He's had a lot of experience working with unstructured data and shares how we can increase the efficiency of modeling unstructured data by an order of magnitude.

    Data Strategy in the Education Sector

    Play Episode Listen Later Oct 1, 2021 17:41


    What is the secret culprit behind overworked teachers and administrators in much of the educational system? We're joined by one of Data Crunch's finest, James Thomas, who tells from both a technological and personal standpoint the real difficulties faced by our teachers and students, and how the right approach to data can solve many of their problems.

    CEOs: Here Is What You Should Know about GPT-3

    Play Episode Listen Later Sep 1, 2021 6:13


    If you haven't heard, GPT-3 is a machine learning model that can write text. Text that looks like a human wrote it—almost out of thin air. The real opportunity is in the ease of use. GPT-3 doesn't replace your writers. It augments them, making writing faster and more accessible without compromising your results. The last thing you want to do is compromise the quality of your content because copy is an exponential business multiplier. The more you can apply it, the more business you'll get. Despite advances in technology, one thing that hasn't changed is that people still talk about products and services—and what they say matters. 

    Cyber Security in Higher Ed

    Play Episode Listen Later Jul 31, 2021 18:50


    Higher education institutions house lots of important data that bad actors can access and sell on the dark web, like students' social security numbers, financial aid information, and national security research. Protecting this information should be a high priority for institutions, but it's not always easy to apply best practices and enforce compliance measures. We chat with Brandon Sherman about these issues.

    GE Aviation's Dinakar Deshmukh Discusses Data

    Play Episode Listen Later Jun 28, 2021 17:16


    As the VP of Data Science and Analytics for GE Aviation, Dinakar Deshmukh talks about how he and the large team he leads solve big problems internally and externally by splicing the power of data science with deep domain knowledge.

    Telmo Silva Talks ClicData

    Play Episode Listen Later May 4, 2021 30:24


    Telmo Silva created ClicData, an end-to-end SaaS BI platform, which, as he describes it, is the little guy coming up in the BI platform world. He talks about how his company was started, where it’s been, and where it’s going with cutting-edge R&D. He also offers additional thoughts on the role of data in the business world today.

    Pricing with Cactus Raazi

    Play Episode Listen Later Apr 16, 2021 27:11


    Keeping quality customers is the aim of nearly every healthy business. Cactus Raazi challenges the typical methods of doing this and suggests alternative data-focused pricing strategies in order for businesses to survive in the future. 

    AI Making Developers more Effective

    Play Episode Listen Later Mar 25, 2021 26:06


    Robin Purohit talks to us about how he and his company are creating AI tools to help developers be more effective. Learn what their approach is, how they're training their models, and where they're headed in the future. 

    Overcoming Cultural Hurdles in Tech

    Play Episode Listen Later Feb 27, 2021 23:40


    Traffic Equilibrium and a PhD

    Play Episode Listen Later Jan 30, 2021 25:18


    Machine Learning and Flight with Ian Cassidy

    Play Episode Listen Later Dec 31, 2020 22:08


    Ian Cassidy: When you did a PCA, a principal component analysis, like, it was like beautiful. There was, like, a red circle in the middle of, you know, the blue un-purchased, you know, data points. And there were the red purchase ones and they were all clustered together. It was, it was really interesting. And like the, the machine learning model had a really good time trying to predict that the ones in that red cluster were the things that people were interested in purchasing. Ginette: I'm Ginette, Curtis: and I'm Curtis, Ginette: and you are listening to Data Crunch, Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics, training, and consulting company. If you want to become the type of tech talent we talk about on our show today, you’ll need to master algorithms, machine learning concepts, computer science basics, and many other important concepts. Brilliant is a great place to start digging into these. The nice thing about Brilliant is that you can learn in bite-sized pieces at your own pace, and with a bit of consistent effort, you can tackle some really tough subjects. With 60+ courses that combine story-telling, code-writing, and interactive challenges, Brilliant helps develop the skills that are crucial to school, job interviews, and careers. Sign up for free and start learning by going to Brilliant.org slash Data Crunch, and also the first 200 people that go to that link will get 20% off the annual premium subscription. Now onto our show. We’ve waited to publish today’s episode because Covid has taken a toll on the travel industry and lots of things have changed since we recorded this episode, but there’s good information in this episode, so we don’t want to wait too long to publish it. Hopefully 2021 changes the travel industry’s fortunes and this information becomes even more applicable. 
So today we chat with Ian Cassidy, former senior data scientist at Upside Business Travel. Ian: I'm Ian Cassidy. And my interests are in the machine learning optimization realm, since I have experience with that from my grad school days, and a little bit about Upside is we are a travel company, travel management company. We offer a product that is no fees, 100% free. And in fact, if you spend over a hundred thousand dollars booking travel on our website, we offer a 3% cash back, as well as free customer service, 24/7, no contracts. So that's you sign up with us, no contracts, you get all of this as soon as you sign up. We are a one-stop shop to book and manage all of your travel. In one place, we offer flights, hotels, rental cars, and we also offer expense integration and reporting for companies looking to, to manage all of their, their travelers and, and their expenses for that. Curtis: Right on. We talked before about the journey that your company has gone through, uh, to figure out how to best use data, you know, how to target and what really works with, with machine learning and things like this. So I'd love to just talk a little bit about that: where you guys started and how you guys made some decisions, what you learned along the way and what you're, what you're up to from a data science perspective. Ian: Yeah, sure. So, uh, you know, like you mentioned, things have changed quite a bit at Upside. We started off as a B2C company where we were targeting what we were calling do it yourself travelers. You did not have to be logged into our site in order to start doing a search and book flights or hotels. So that kind of made it interesting from a data collection perspective. We had like some unique IDs about who the people were that were doing the searching, but it was, it was largely kind of, you know, we didn't really know much about you when you, when you were searching. 
So when we started, one of the main things that we were trying to improve upon was our sorting of inventory....

    Implementing ML Algorithms with Ylan Kazi

    Play Episode Listen Later Dec 1, 2020 26:28


    Hiring Top Tech Talent

    Play Episode Listen Later Oct 31, 2020 18:49


    Making Data Assets Profitable with VDC

    Play Episode Listen Later Sep 30, 2020 23:34


    Many companies are sitting on data assets that could be revenue streams for them, without knowing it. Matt Staudt of VDC discusses making latent data profitable. Ginette: I'm Ginette, Curtis: and I'm Curtis, Ginette: and you are listening to Data Crunch, Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics, training, and consulting company. Ginette: Today, we chat with the president and CEO at the Venture Development Center, Matt Staudt. Matt Staudt: The company that I'm with is VDC, Venture Development Center. Basically VDC is an organization that works in the alternative big data, bringing buyer and seller together. So we have a unique perspective on available data assets that are out in the marketplace and a unique perspective of the companies that utilize them, and what they're specifically looking for in the way of points of, uh, value for various data assets. My background was originally in the marketing and advertising area, where I owned a company for 20 years, IMG, Interactive Marketing Group. I left that in 2007 and joined this, which was more or less of a lifestyle organization. And we made it a full-fledged organization company back in 2010. Curtis: Now, when you say data assets, can you put a little bit of definition around that for the listeners? Just so they understand how you define a data asset? 'Cause I imagine there may be some things that you think are valuable that maybe they haven't thought of, or maybe it'll help expand our thinking around what a data asset is. Matt: Yeah, sure. In my, in my terminology "data asset" basically falls into eight different categories, where assets basically come from within the information world. So they could be things like transaction data or crowdsource data. They could be things like search data or social data sets. 
They fall into various categories, traditional data, meaning assets that are business to business or business to consumer generally aggregated by large companies that most everybody's heard of: Dun & Bradstreet, Infogroup, Acxiom, the credit bureaus, et cetera. Alternative data in our world are companies that have unique data points, unique. They're collecting unique pieces of information, usually as a byproduct of their core business. And we look at the assets that the data sets, the actual data points that they collect. And we figure out if there might be something of value to take to the marketplace, usually to the large consumers of the data, the big aggregators that I previously mentioned, but oftentimes it also fits well with some of our mid-tier players. And we have a significant amount of relationships in the brand grouping, meaning large organizations that they themselves are looking to try and take advantage of big data and utilize data in sales, marketing operations, in order to transform or help to administer certain activities that they have going on. Curtis: Do you find that this is maybe industry specific, like for example, a big insurance company, or if you're in healthcare or something like this, it tends to be more data intensive that you see more activity there or, or is this really applicable across the board? What kind of industries do you find have a lot of applications? Matt: Yeah. Well, it's interesting on the surface, you certainly think that there's probably industries that would have a larger appetite and a larger need for data than, than other organizations, but going, you know, through the list of companies that we've helped over the last 15 or 20 years, it really runs the gamut. I mean, we've worked with insurances, you mentioned insurance, insurance companies. I mentioned credit bureaus. We work with credit bureaus, risk and fraud, sales and marketing, sometimes large brands within those retail environments. 
So it really truly has run the gamut for us. There's,

    Machine Learning with Max Sklar

    Play Episode Listen Later Aug 28, 2020 20:57


    Think Differently with Graph Databases

    Play Episode Listen Later Jul 31, 2020 31:26


    Data, Epidemiology, and Public Health

    Play Episode Listen Later Jul 17, 2020 29:41


    With recent events being what they are, epidemiology has come into the spotlight. What do epidemiologists do and how does data shape their everyday experience? Sitara and Mee-a from "Donuts and Data" fill us in. Ginette: I'm Ginette, Curtis: and I'm Curtis, Ginette: and you are listening to Data Crunch, Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company. Many people are on the lookout for online math and science resources right now, particularly data and statistics courses, and whether you're a student looking to get ahead, a professional brushing up on cutting-edge topics, or someone who just wants to use this time to understand the world better, you should check out Brilliant. Brilliant’s thought-provoking math, science, and computer science content helps guide you to mastery by taking complex concepts and breaking them up into bite-sized understandable chunks. You'll start by having fun with their interactive explorations, and over time you'll be amazed at what you can accomplish. Sign up for free and start learning by going to Brilliant.org slash Data Crunch, and also the first 200 people that go to that link will get 20% off the annual premium subscription. Now onto the show. Curtis: I'd like to welcome Sitara and Mee-a from the Instagram account Donuts and Data to talk to us today. I guess let's just have you guys introduce yourselves, as opposed to me trying to introduce you 'cause you know what you do better than I do. So maybe we just have some introductions. Sitara: So I'm Sitara one half of Donuts and Data. I'm a PhD student in epidemiology at the University of Texas Health Science Center. I'm also a research assistant in a lab that I work in. Mee-a: And I'm Mee-a. I am an infectious disease epidemiologist that works in the public sector. 
I actually met Sitara through the lab that she's currently working in. Curtis: Nice. And I'm excited to have you guys on. I just, I think epidemiology is a really interesting space, especially with what, you know, with what's going on now with COVID. I think it's more pertinent than it ever has been. Not that it ever hasn't been pertinent, but maybe it's more top of mind for people. So I'd love maybe just to have you guys level set with everybody, like what is epidemiology. There's probably some confusion about what that is and maybe how you guys got into it. And then we can get into what your day to day is and, and what it's all about. Sitara: So, epidemiology, I think everyone's kind of understanding is studying patterns of disease in the, in the human population. And so in that sense, what Mee-a and I do are the same, but instead of studying infectious diseases or the natural science part of epidemiology, what I focus on is how human behavior contributes to those patterns of disease. So I look for patterns in data associated like demographics or just behaviors, diet, nutrition, and how that contributes to getting diseases. Mee-a: For me in the public sector, it's going to be a lot of looking at incidence rates of infectious diseases. It . . . primarily with COVID-19 right now, and just different ways that we can try to possibly implement infection prevention measures. So we are dealing a little bit more with, I don't want to say the medical side of it because we aren't clinicians, but we are dealing more with the medical side of, of the infectious disease than we are with, with the data compared to when I was in academia, at least. Curtis: So take us through maybe the end goal, right? So what you guys are working on. You're hoping to come out with, I think, some recommendations for people to, to take maybe a better understanding of how the disease spreads, so we get in front of it. What does that look like? 
Mee-a: I always thought that epidemiology's gold standard of what we try to achieve is probably...

    Vast ETL Efficiency Gain with Upsolver

    Play Episode Listen Later Jul 1, 2020 22:40


    Data Flexibility in Healthcare

    Play Episode Listen Later May 31, 2020 27:28


    Education and AI

    Play Episode Listen Later Apr 24, 2020 27:39


    For David Guralnick, education, AI, and cognitive psychology have always held possibility. With many years of experience in this niche, David runs a company that designs education programs, which employ AI and machine learning, for large companies, universities, and everything in between.   David Guralnick: Somehow what's happened in a lot of the uses of technology and education to this point is we've taken the mass education system that was there only to solve a scalability problem, not because it was the best educational method. So we've taken that and now we've scaled that even further online because it's easy to do and easy to track. Ginette Methot: I’m Ginette, Curtis Seare: and I’m Curtis, Ginette: and you are listening to Data Crunch, Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company. Curtis: First off, I'd like to thank everyone who has taken the Tableau fundamentals zombie course that we announced the last episode. We've been getting a lot of great feedback from you. It's fun to see how people are enjoying the course and thinking that it's fun and also clear and it's helping them learn the fundamentals of Tableau. The reason we made that course is because Tableau and data visualization are really important skills. They can help you get a better job, they can help you add value to your organization. And so we hope that the course is helping people out. Also, according to the feedback that we have received, we've made a couple of enhancements to the course, so there are now quizzes to test your knowledge. There are quick tips with each of the videos to help you go a little bit further than even what the videos teach. We've also included a way to earn badges and a certificate so that you can show off your skills to your employer or whoever. 
And we've also thrown in a couple other bonuses. One is our 100-plus-page manual that we actually use to train at Fortune 500 companies so that'll have screenshots and tutorials and tips and tricks on the Tableau fundamentals. And we have also included a checklist and a cheat sheet, both of which we actually use internally in our consulting practice to help us do good work. One of them will help you know which kind of chart to use in any given scenario that you may encounter, whether that's a bar chart or a scatter plot or any number of other more advanced charts. And the other is a checklist that you can run down and say, "do I have this, this, this and this in my visualization before I take it to present to someone to make sure that that's going to be a good experience." So hopefully all of that equals something that is really going to help you guys. And something also where you can learn Tableau and have fun doing it, saving the world from the zombie apocalypse, and the price has risen a little bit since last time. But for our long-time listeners here, if you use the code "podcastzombie" without any spaces in the middle, then that'll go ahead and take off 25% of the list price that is currently on the page. So hopefully more of you guys can take it and keep giving us feedback so we can keep improving it. And we would love to hear from you. Ginette: Now onto the show. Today we chat with David Guralnick, president and CEO of Kaleidoscope Learning. David: I've had a long time interest in both education and technology going way, way back. I was, I was lucky enough to go to an elementary school outside of Washington DC called Green Acres School in Rockville, Maryland, which was very project based. So it was non-traditional education. You worked on projects, you worked collaboratively with people, your teachers' role was almost as much an advisor and mentor as a traditional teacher. 
It wasn't person in front of the room talking at you, and you learn how to, you know,

    Upskilling from Home

    Play Episode Listen Later Apr 1, 2020 13:25


    How to Reduce Uncertainty in Early Stage Venture Funding

    Play Episode Listen Later Feb 29, 2020 24:11


    Data in Healthcare with Ron Vianu

    Play Episode Listen Later Jan 30, 2020 20:01


    If you've ever tried to find a doctor in the United States, you likely know how hard it is to find one who's the right fit—it takes quite a bit of research to find good information to make an informed choice. Wouldn't it be nice to easily find a doctor who is the right fit for you? Using data, Covera Health aims to do just that in the radiology specialty. Ron Vianu: I think the tools are really improving year over year to a significant degree, but like anything else, the tools themselves are only as useful as how you apply them. You can have the most amazing tools that could understand very large datasets, but you know how you approach looking for solutions, I think can dramatically impact whether you yield anything useful. Ginette Methot: I’m Ginette, Curtis Seare: and I’m Curtis, Ginette: and you are listening to Data Crunch, Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company. If you're a business leader listening to our podcast and would like to move 10 times faster and be 10 times smarter than your competitors, we're running a webinar on February 13th where you can learn how to do this and more. Just go to datacrunchcorp.com/go to sign up today for free. If you're a subject matter expert in your field, like our guest today, and you're looking to understand data science and machine learning, brilliant.org is a great place to dig deeper. Their classes help you understand algorithms, machine learning concepts, computer science basics, and many other important concepts in data science and machine learning. The nice thing about brilliant.org is that you can learn in bite-sized pieces at your own pace. Their courses have storytelling, code writing and interactive challenges, which makes them entertaining, challenging, and educational. 
Sign up for free and start learning by going to brilliant.org/DataCrunch. And also the first 200 people that go to that link will get 20% off the annual premium subscription. Today we chat with Ron Vianu, the CEO of Covera Health. Let's get right to it. Curtis: What inspired you to get into what you're doing, uh, to start Covera Health? Where did the idea come from and what drives you? So if we could start there and learn a little bit about you and the beginnings of Covera Health, that would be great. Ron: Sure. Uh, and I, I guess it's important to state that, you know, I'm a problem solver by nature, and my entire professional career, I've been a serial entrepreneur building companies to solve very specific problems. And as it relates to Covera, the, the genesis of it was understanding that there were two problems in the market with respect to, uh, the healthcare space, which is where we're focused that were historically unsolved and there were no efforts really to solve them in, from my perspective, a data-driven way. And that was around understanding quality of physicians that is predictive to whether or not they'll be successful with individual patients as they walk through their practice. And so if you, and we're focused on the world of radiology, which today is highly commoditized and what that means is that there was a presumption that wherever you get an MRI or a CT study for some injury or illness, it doesn't matter where you go. It's more about convenience and price perhaps. Whereas what we understand given our research and the, the various things that we've published since our beginning is that one, it's like every other medical specialty. It's highly variable. Two, since radiology supports all other medical specialties in a, as a tool for diagnosis, diagnostic purposes, any sort of variability within that specialty has a cascading effect on patients downstream. And so for us, the beginning was, is this something that is solvable through data?

    Data Literacy with Ben Jones

    Play Episode Listen Later Dec 19, 2019 29:59


    We talk with Ben Jones, CEO of Data Literacy, who's on a mission to help everyone understand the language of data. He goes over some common data pitfalls, learning strategies, and unique stories about both epic failures and great successes using data in the real world. Ginette Methot: I’m Ginette, Curtis Seare: and I’m Curtis, Ginette: and you are listening to Data Crunch, Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company. It’s becoming increasingly important in our world to be data literate and to understand the basics of AI and machine learning, and Brilliant.org is a great place to dig deeper into this and related topics. Their classes help you understand algorithms, machine learning concepts, computer science basics, and many other important concepts in data science and machine learning. The nice thing about Brilliant.org is that you can learn in bite-sized pieces at your own pace. Their courses have storytelling, code-writing, and interactive challenges, which makes them entertaining, challenging, and educational. Sign up for free and start learning by going to Brilliant.org/DataCrunch, and also the first 200 people that go to that link will get 20% off the annual premium subscription. Curtis: Ben Jones is here with me on the podcast today. This is a couple months coming. Excited to have him on the show. He's well known in the data visualization community, he's done a lot of great work there. Uh, used to work for Tableau. Now he's off doing his own thing, has a company called Data Literacy, which is interesting. We're going to dig into that and also has a new book out called Avoiding Data Pitfalls. So all of this is really great stuff and we're happy to have you here, Ben. 
Before we get going, just give yourself a brief introduction for anyone who may not know you and we can go from there. Ben: Yeah, great. Thanks Curtis. You mentioned some of the highlights there. I uh, worked for Tableau for about seven years running the Tableau public platform, uh, in which time I wrote a book called Communicating Data with Tableau. And the fun thing was for me that launched kind of a teaching, um, mini side gig for me at the University of Washington, which really made me fall in love with this idea of just helping people get excited about working with data. Having that light bulb moment where they feel like they've got what it takes. And so that's what caused me to really want to leave Tableau and launch my own company Data Literacy at dataliteracy.com which is where I help people, you know, as I say, learn the language of data, right? Whether that's reading charts and graphs, whether that's exploring data and communicating it to other people through training programs to the public as well as working one on one with clients and such. So it's been an exciting year doing that. Also, other things about me, I live here in Seattle, I love it up here and go hiking and backpacking when I can and have three teenage boys all in high school. So that keeps me busy too. And it's been a fun week for me getting this book out and seeing it start to ship and seeing people get it. Curtis: Let's talk a little bit about that because the book, it sounds super interesting, right? Avoiding Data Pitfalls, and there are a lot of pitfalls that people fall into. So I'm curious what you're seeing, why you decided to write the book, how difficult of a process it was and then some of the insights that you have in there as well. Ben: Yeah, so I feel like the tools that are out there now are so powerful and way more so than when I was going to school in the 90s, and it's amazing what you can do with those tools. 
And I think it's also amazing how easy it is to mislead yourself. And so I started realizing that that's sometim...

    Social Media and Machine Learning

    Play Episode Listen Later Nov 21, 2019 22:30


    How do you build a comprehensive view of a topic on social media? Jordan Breslauer would say you let a machine learning tool scan the social sphere and add information as conversations evolve, with help from humans in the loop. Ginette Methot: I’m Ginette, Curtis Seare: and I’m Curtis, Ginette: and you are listening to Data Crunch, Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company. Ginette: Many of you want to gain a deeper understanding of data science and machine learning, and Brilliant.org is a great place to dig deeper into these topics. Their classes help you understand algorithms, machine learning concepts, computer science basics, probability, computer memory, and many other important concepts in data science and machine learning. The nice thing about Brilliant.org is that you can learn in bite-sized pieces at your own pace. Their courses have storytelling, code-writing, and interactive challenges, which makes them entertaining, challenging, and educational. Sign up for free and start learning by going to Brilliant.org slash Data Crunch, and also the first 200 people that go to that link will get 20% off the annual premium subscription. Let’s get into our conversation with Jordan Breslauer, senior director of data analytics and customer success at Social Standards. Jordan: My name is Jordan Breslauer. I'm the senior director of data analytics and customer success at Social Standards. I've always been a data geek as it pertains to sports. 
I think of Moneyball when I was younger, I always wanted to be kind of a the next Billy Bean and I, when I started working for sports franchises right after high school and early college days, I just realized that, that type of work culture is wasn't for me, but I was so, so into trying to answer questions with data that had no previously clear answer, you know? I loved answering subjective questions like, or what makes the best player or how do, how do I know who the best player is? And I thought what was always fun was to try and bring some sort of structured subjectivity to those sorts of questions through using data. And that's really what got me passionate about data in the first place. But then I just started to apply it to a number of different business questions that I always thought were quite interesting, which have a great deal of subjectivity. And that led me to Nielsen originally where my main question that I was answering on a day-to-day basis, what was, what makes a great ad? Uh, what I found though is that advertising at least, especially as it pertains to TV, is really where brands were moving away from and a lot of the real consumer analytics that people were looking for were trying to underpin people in their natural environment, particularly on social media. And I hadn't seen any company that had done it well. Uh, and I happened to meet social standards during my time at Nielsen and was truly just blown away with this ability to essentially take a large input of conversations that people were happening or happening, I should say, and bring some sort of structure to them to actually be able to analyze them and understand what people were talking about as it pertained to different types of topics. And so I think that's really what brought me here was the fascination with this huge amount of data behind the ways that people were talking about on social. 
And the fact that it had some structure to it, which actually allowed for real analytics to be put behind it. Curtis: It's a hard thing to do though. Right? You know, to answer this question of how do we extract real value or real insight from social media and you'd mentioned historically or up to this point, companies that that are trying to do that missed the mark.

    Deep Learning, Microwaves, and Bugs

    Play Episode Listen Later Nov 8, 2019 18:39


    Sometimes AI and deep learning are not only overkill, but also a subpar solution. Learn when to use them and when not. Diego from Northwestern's Deep Learning Institute discusses practical AI and deep learning in industry. He covers insights on how to train models well, the difference between textbook and real AI problems, and the problem of multiple explanations. Diego Klabjan: One aspect of the problem it has to have in order to be, to be amenable to AI is complexity, right? So if you have, if you have nice data with, I don't know, 20, 30 features that you can, quote, put in a spreadsheet, right? So then, then AI is going to be an overkill, and it's actually sort of not just going to be an overkill. It's going to be a subpar solution. Ginette Methot: I’m Ginette, Curtis Seare: and I’m Curtis, Ginette: and you are listening to Data Crunch, Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company. We’d like to hear what you want to learn on our future podcast episodes, and so we’re running a giveaway until our next podcast episode comes out. We’re giving away our book Simple Predictive Analytics. All you have to do is go on to LinkedIn and tag The Data Crunch Corporation in a post with your suggestion, and we’ll randomly pick a winner from those who submit. If you win and you’re in the US, we’ll send you a physical copy, and if you’re in another country, we’ll send you an electronic copy. Can’t wait to hear from you. Today, we chat with Professor Diego Klabjan, the director of the Master of Science in Analytics and director of the Deep Learning Lab at Northwestern University. Diego: My name is Diego Klabjan. So I'm a faculty at Northwestern University in the department of industrial engineering and management sciences. I actually spent my entire career in academia. 
So I graduated from Georgia Tech in '99, and then I spent six years at the University of Illinois Urbana-Champaign and got my tenure there. And then I was recruited here at Northwestern as a tenured faculty member a year later. So I've been at Northwestern for approximately 14 years. Yeah, so I'm the director of the master of science in analytics, actually founding director of the master of science in analytics, so I established the master's program back in 2010, and I've been directing it since then. And recently, I also became the director of the center for deep learning, which is a relatively new initiative at Northwestern. Sort of we, we were having discussions for the last year and a half, and about half a year ago, we officially kicked it off with a few founding members. So my expertise is in machine learning and deep learning. So I have, I run sort of a very big research program. So I advise more than 15 PhD students from a variety of, of departments and the vast majority of them do deep learning research. Yeah, so I started, I started deep learning, what, around six, seven years ago. So I was definitely not sort of one of the, one of the early or the earliest faculty members conducting, studying, being attached to deep learning. But I wasn't that late to the game either. Right. So I still, I still remember approximately six, seven years ago attending deep learning conferences with like 50 attendees, and now, now those conferences are like 5,000 people. Just astonishing. Curtis: That's crazy, how you've seen that grow. Diego: Yup. Um, yeah, and I'm also, so the last word is, ah, I'm also a founder of Opex Analytics, which is a consulting company. I no longer have much to do with the company, uh, but sort of have experience also on the business side. Curtis: Great. So this, uh, the Deep Learning Institute started about a year or two ago, is that right? Did I understand that right? Diego: Yeah, that's correct. I mean, so we,

    Potential Advantages of Blockchain for Data Scientists

    Play Episode Listen Later Oct 22, 2019 25:53


    Luciano Pesci is bullish on blockchain and data science. Since blockchain offers a complete historical record, no one can delete or alter prior information written into the record. He sees this characteristic as a massive advantage for data scientists.  Luciano Pesci: And the key for data scientists and leaders who are gonna oversee data sciences, you've got to get a narrow enough problem to demonstrate one quick win and I mean in 90 days. If in 90 days you can't come back to the organization and show, "we have made real progress on these metrics in your understanding so that you can make these decisions," they're not going to continue to do it. Ginette Methot: I’m Ginette, Curtis Seare: and I’m Curtis, Ginette: and you are listening to Data Crunch, Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company. Ginette: No matter what your position in a company is, knowing about data, how it works, and what it can do for you is vital to the success of your organization. Fortunately there are ways for you and those in your organization to learn about data. Brilliant dot org, an online educational resource, has on-demand classes in data basics that can help you understand this growing area, providing you with tools and the framework you need to break up complex concepts into bite-sized chunks. You can sign up for free, preview courses, and start learning by going to Brilliant.org/DataCrunch, and also the first 200 people that go to that link will get 20% off the annual premium subscription. Ginette: The CEO of Emperitas, Luciano Pesci, joins us today. Let’s get right into the episode. Curtis: What inspired you to get into data? What inspired you to to start the company you're working at now and how'd you get going? Luciano: All of it was a complete accident. 
Yeah, none of it, not the schooling, the business, none of it was intentional. Curtis: Okay, let's hear about it. Luciano: My first business was actually a recording studio and a record label, and I had signed, among other acts, my own band, and we got a management deal, and we went to LA. We started to tour with national acts, and I thought that was going to be my career path without a doubt, and so I didn't take the ACT/SAT at the time, barely graduated high school, and then the band fell apart. And I was like, "well, what am I going to do?" So I went back to school, had a transformative experience, got drawn into economics, and then within economics really found data. Curtis: And what drew you to economics? Luciano: I like studying people. I think it's the most complete picture of people. So there's a lot of other disciplines that sort of dive deeper when it comes to people's psychological characteristics, their behavioral components. But economics was about the entire system and how an individual functions within that bigger system. And the reason I got to data from that was that the key assumption of modern economics is perfect information. So this is usually where critics of what is called the classical model in economics come in and say, "well, you can't have perfect information, so therefore you can't have optimizing behavior." And one of the beautiful lessons of the last 20 years, especially with data science, is it might not be perfect information, but you can get really good information to make optimized choices. And so data represented that, that method of going into the real world and optimizing all these processes that we were learning about in the textbooks and at the abstract theory level. Curtis: Interesting. And that's, there's not a lot of places, if any, that I know of that teach that approach, right? Or have good coursework around that. Did you kind of figure this out on your own or how'd you, how'd you come to that?

    How to Predict World Events with Predata

    Play Episode Listen Later Oct 9, 2019 16:42


    There have been some spectacular fails when it comes to looking at Internet traffic, think Google Flu Trends; however, Predata, a company that helps people understand global events and market moves by interpreting signals in Internet traffic, has honed human-in-the-loop machine learning to get to the bottom of geopolitical risk and price movement. Predata uncovers predictive behavior by applying machine learning techniques to online activity. The company has built the most comprehensive predictive analytics platform for geopolitical risk, enabling customers to discover, quantify and act on dynamic shifts in online behavior. The Predata platform provides users with quantitative measurements of digital concern and predictive indicators for different types of risk events for any given country or topic. Dakota Killpack: Over the past few years, we’ve collected a very large annotated data set about human judgment for how relevant many, many pieces of web content are to various tasks. Ginette Methot: I’m Ginette, Curtis Seare: and I’m Curtis, Ginette: and you are listening to Data Crunch, Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company. Let’s jump into our episode today with the director of Machine Learning at Predata. Dakota: My name is Dakota Killpack and I'm the director of machine learning at Predata, and Predata is a company that, using machine learning to look at the, the spectrum of human behavior online, organizes it into useful signals about people's attention, and we use those to influence how people make decisions by giving them a factor of what people are paying attention to. Because attention is a scarce cognitive resource. 
People tend to pay attention only to very important things. If they're about to act in a way that might cause problems for our potential clients, they'll, they'll spend a lot of time online doing research, making preparations, and by unlocking this attention dimension to web traffic, we're able to give some unique insights to our clients. Curtis: Can we jump into maybe a concrete use case into what you're talking about just to frame and put some details around how someone might use that service? Dakota: Absolutely. So one example that I find particularly useful for revealing how attention works online is looking at what soybean farmers did in response to tariffs earlier this year. So knowing that the, they weren't going to get a very good price on soybeans at that particular moment, a lot of them were looking up how to store their grain online and purchasing these very long grain storage bags, purchasing some obscure scientific equipment needed to insert big needles into the bags to get a sample for testing the soybeans and moisture testing devices to make sure they wouldn't grow mold. And all of these webpages are things that tend to get very little traffic. And when we see an increase in traffic to all of them, at the same time, we know that a, a very influential group of individuals, namely farmers, is paying attention to this topic. Using that we're able to give early warning to our clients. Curtis: Sounds like looking for needles in a haystack of data. Right? So how do you determine what is a useful bit of information in the context of what your clients are looking for? Do they kind of have an idea of what they're looking for and then you'd go out and search for that or, or does your algorithm find anomalies in the data and then characterize those anomalies so that you can then report that back? How does it work? Dakota: It’s a mix of both. Because the, the Internet is such a rich and complex domain. 
It's, it's very dangerous to just look for anomalies at scale. There've been some high profile failures, most notably the Google Flu Trends
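
To make the soybean example concrete, here is a minimal Python sketch of the idea Dakota describes: a hand-picked group of normally quiet pages only counts as a signal when all of them see unusual traffic at the same time. This is not Predata's method. The function names, the z-score test, the threshold, and the page names in the usage note are all assumptions made for illustration.

```python
from statistics import mean, stdev


def spike_score(history, today):
    """Z-score of today's visits against a page's own recent history.

    history needs at least two daily counts (statistics.stdev requires it).
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Flat history: any increase is treated as an extreme spike.
        return float("inf") if today > mu else 0.0
    return (today - mu) / sigma


def joint_attention_signal(pages, threshold=3.0, min_share=1.0):
    """Check whether a curated group of pages spikes together.

    pages: dict mapping a page name to (history, today), where history is a
        list of past daily visit counts and today is the latest count.
    threshold: hypothetical z-score above which a page counts as spiking.
    min_share: fraction of pages that must spike for the group to fire.
    Returns (fired, names_of_spiking_pages).
    """
    if not pages:
        return False, []
    spiking = [
        name
        for name, (history, today) in pages.items()
        if spike_score(history, today) >= threshold
    ]
    return len(spiking) / len(pages) >= min_share, spiking
```

Calling `joint_attention_signal` with made-up entries such as `"grain-storage-bags"`, `"grain-probe"`, and `"moisture-tester"` fires only when every page is far above its own baseline. The group of pages is chosen by a person rather than discovered by scanning everything, which is in the spirit of the warning above about looking for anomalies at scale.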

    Structuring Your Data Science Dream Team

    Play Episode Listen Later Sep 26, 2019 15:30


    The way you organize your data science team will greatly affect your business’s outcome. This episode discusses different structures for a data science team, as well as top down versus bottom up approaches, how to get data science solutions into production organically, and how to be part of the business while remaining in contact with other data scientists on the team. Mark Lowe: Having lived through small scale, two people working, to large scale, thousands of people in your organization, the way that you organize the data science team has dramatic effect on its productivity. Ginette Methot: I’m Ginette, and I’m Curtis, and you are listening to Data Crunch, a podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company. Building effective data science processes is tough. Mode, the data science platform, has compiled three tips to make it a bit easier: don’t over plan, there’s no one process that fits everyone, and waste time. That’s right. Waste time. Read more at mode.com/dsp M O D E.com/D S P. Today we’re going to talk about effective ways you can organize your data science team, and we’ll hear lots of great insights from our guest. Let’s get to it. Mark: My name is Mark Lowe. I’m currently the senior principal data scientist here at Valassis. Curtis Seare: Describe just a little bit about what Valassis does. Mark: So we work with pretty much every major manufacturer retailer in the U.S. Our work kind of runs the gamut in terms of solving problems for them in terms of how do I influence customers. And so we manage a lot of print products that go reach every household, every week and of course a lot of digital products. So everything from display advertising, campaign, search campaign, social. Pretty much any distribution mechanism that can influence customers, we try to use those channels. 
Curtis: And in working on these problems, we talked a little bit earlier about what the approaches are for data science. Some people try to bin it in a software development kind of a role, an agile role, and how that usually doesn’t work for data science ’cause it’s more of an experimental type of a thing. Can you comment on its similarities and differences and how you should be approaching data science? Mark: I think that’s a great question. Honestly, if you, if you asked me 10 years ago if this was an interesting question, I would have found it very boring. But having, having lived through small-scale, two people working, to large scale, thousands of people in your organization, the way that you organize the data science team has dramatic effect on its productivity, and there’s no one size that fits all. Honestly, you kind of have to cater the organization of the data science team to where the company is. For example, the two common models that are deployed and, and we’ve, we’ve lived in both of them, is kinda thinking about data science as an internal consulting group. So I have a pool of data scientists. Stakeholders throughout the company come to me and ask, they say, “I have this problem. I think it needs data science,” and then the data science lead or team says, “Yes, we do need a data scientist working on that. Here’s a person with that specialty.” So kind of farming out individuals on the team to solve particular problems. So it’s a fairly centralized organization and that, you know, there’s a lot of benefits to that. One, you’ve got strong sense of community as a team. Oftentimes you’re very tightly organized together. You function as a data science unit. You can try to make sure that you’re putting the right skillset for the right problem. As you know, as you’ve talked to that, there’s, there is no one definition of data science, there’s no one skillset. So oftentimes the data science team has a mixture of skills across the team,

    The Hidden World of Data Science in Utilities

    Play Episode Listen Later Sep 19, 2019 19:05


    David Millar is a man bringing analytical solutions to an industry that historically has had little data. But with the explosion of smart devices, that is all changing, and the way utilities operate is as well. David Millar: The way that electricity markets work is that you have what's called the day ahead market. And so the day before, let's say one o'clock tomorrow, markets run, and this is a big optimization problem. Ginette Methot: I'm Ginette Curtis Seare: And I'm Curtis Ginette: And you are listening to Data Crunch, Curtis: A podcast about how applied data science, machine learning and artificial intelligence are changing the world. Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company. Ginette: The father of lean startup methodology once said “There are no facts inside the building so get the heck outside.” The utilities industry is no different. Sometimes the facts that’ll make your machine learning career are waiting just outside your office. Read more at mode.com/MLutilities. m o d e dot com slash M L utilities. Ginette: David Millar is a man bringing analytical solutions to an industry that historically has had little data. But with the explosion of smart devices, that's all changing, and the way utilities operate is as well. Let's get into it. David: I'm, ah, Dave Millar. I am the director of resource planning consulting at Ascend Analytics where I lead the research client consulting team. And so my team and I work with utilities primarily to help them make decisions using analytics, regarding their long-term power portfolio. So primarily we're looking at, we'll say, we're retiring a coal plant or retiring a gas plant. What would we replace it with? Renewable energy? Do we need batteries? How do we approach these questions using analytics in order to help us come up with the best solution going forward. 
Curtis: You had talked a little bit about, you sent me some notes about how the, the sector that you're in, the power sector, you know, is kind of slow moving, right? It's not known for these quick changes and innovations, but you are starting to see some things that, that are gonna change this fundamentally. And so if we could jump into that and, and then get your perspective, I'd love to hear about it. David: Yeah, the power sector basically didn't change from the time, once they figured out that we're going to use alternating current, it didn't really change much in the past hundred years; the model is essentially the same. You have big power stations that are far away from the load centers and then you have this transmission network, and the flow of electricity is really one direction, right, from, from the big power plants to your home. And technology is rapidly changing that, and the space is becoming both more digital and more decentralized. So, on the digital front, we, we actually have generation technologies that don't use anything, any spinning parts, right? So you have solar, solar power, and you have, now we're seeing more and more batteries being connected to solar. And so those are both digital technologies that are increasingly becoming this default energy source, wind or solar and batteries, and that's just because the cost of these has dropped dramatically over the past 10, it's really happened over the past 10 years. And so now renewables are at parity with the more conventional sources of electricity. So gas, power and natural gas power, coal power. Curtis: Is that in terms of like how much energy they're currently producing parity or just effectiveness or efficiency. What is that parity? David: Parity in terms of costs. 
So, you know, as renewables drop in costs, especially as batteries drop in costs, that means that when, when I look at a problem with my clients, we're comparing, technologies that essentially have the ability, similar attributes,

    The Good Fight against Shadow IT

    Play Episode Listen Later Sep 12, 2019 22:21


    Simeon Schwarz has been walking the data management tightrope for years. In this episode, he helps us see the hidden organizational and economic impacts that come from leading a data management initiative, and how to understand and overcome the inertia, fears, and status quo that hold good data management back. Simeon Schwarz: Fighting against shadow IT . . . you have to find a way to adopt it, you have to find a way to incorporate it, and you have to find a way to leverage it. You will never be able to completely eliminate it. Ginette Methot: I'm Ginette. Curtis Seare: And I'm Curtis. Ginette: And you are listening to Data Crunch, Curtis: A podcast about how applied data science, machine learning and artificial intelligence are changing the world. Ginette: This might come as a surprise to some, but... tools won’t build a data-driven culture. The right people will. Read more at mode.com/datadrivenculture. m o d e dot com slash data driven culture. Ginette: Today we speak with Simeon Schwarz. He’s been working in data management for over twenty years and owns his own consultancy, Data Management Solutions. Simeon: Being in the data management function, you're de facto seeing the life blood of how the business flows, how the, uh, where the information goes, how the decisions are made. Curtis: So have you been focused mainly in a, in a specific industry or have you spanned a lot in your career? Simeon: I started in telecom. I built the first cell phone carrier back in my home country. I worked in academia, in retail, ecommerce, and then 10 years in financial services, most recently, and now I do insurance. So a lot of different fields. Curtis: So you've run the gamut. That's interesting. And now that you've done this in several different fields, do you find that the principles and your approach is basically the same or, or is it different depending on the problems that you're trying to solve? Simeon: The approach is the same, and there are two parts to this. 
We'll talk about what's difficult in this role a little bit further in this conversation. The second part is you really need to understand the domain you're dealing with because, one, if we, if we're talking about data management in general, one of the key functions, one of the key challenges that you're going to be facing is establishing and building your credibility. Without knowledge of the domain, be it insurance or financial services or manufacturing or any other field, you simply can't have intelligent conversations with your stakeholders in a way that would lead to good conclusions. So you will absolutely have to know the domain, which is a large portion of your value. Curtis: So as you've gotten into a domain that maybe you weren't as familiar with in a data role, how did you overcome this need to understand the domain better? Simeon: Let's step back and talk about what data genuinely is right now and specifically talk about data management. You are running a data function, or sometimes called data services, because what used to be DBA teams or data analysts or various forms is really becoming a practice, and looking at it as a practice, you have a certain set of clients, they are paying you for the services, you have a certain amount of resources and you're trying to optimize those resources to serve your clients better. So what are the challenges that you're going to face in any data management role? So you're in this interesting balance between moving forward very rapidly as well as not destroying what already exists, not destroying the services that are already provided. People have to breathe, people have to be able to, to live. You can't disrupt too much the services that already exist, your reports, your, you know, your auditing work, your work with, you know, regulatory agencies. Anything else that the business needs to produce has to continue to happen. The people who are doing their jobs in the current way simil...

    Using Data to Design Tests People Don’t Hate

    Play Episode Listen Later Sep 4, 2019 18:51


    David Saben is on a mission to make taking tests less painful, and he’s using data to do it. In this episode, he’ll discuss reviving methods developed in 1979 to shorten tests and make them more effective, as well as how to use psychometrics to aid in the design and crafting of an effective test. David Saben: When I see my son, who's 11 years old, spending three days in testing when I know there's absolutely no reason for it, that you can do that in an hour. Ginette Methot: I'm Ginette Curtis Seare: And I'm Curtis Ginette: And you are listening to Data Crunch Curtis: A podcast about how applied data science, machine learning and artificial intelligence are changing the world. The father of lean startup methodology once said “There are no facts inside the building so get the heck outside.” The education industry is no different. Sometimes the facts that’ll make your machine learning career are waiting just outside your office. Read more at mode.com/mledu m o d e dot com slash M L e d u Ginette: Today we chat with David Saben, the CEO and president of Assessment Systems, an organization innovating psychometrics (the science of assessment). Dave: I originally started my career in telecommunications, uh, bringing voice and data services into institutions and to learning institutions. And then what I realized is, is that connecting universities and for profit schools, you know, connecting them online really created a huge opportunity for learning and really crossing barriers to learn and really meeting learners on their terms with online learning courses. And that kind of brought me through this, this journey with using technology to, to really make better decisions in learning and knowledge and how we do that effectively. And that has started about a 16 year career focused on that, using, using data, using e tools to make a better learning environment for everybody and make us more effective in the way that we, we gather information and retain information. 
And that's led, that's brought me, um, into several areas. One is in the learning sciences, is how do you, how do you deliver learning content more effectively, but also in the assessment side as well, where, how do you measure what folks are learning effectively and painlessly, and that's brought me on this, uh, this journey into the assessment industry and really making sure that every exam that's delivered in classrooms or whether it's a licensure exam is as fast and as fair as possible and using data to be able to do that. So really mitigating the risk of human bias when it comes to measuring a human's abilities, uh, which is, uh, which is a troublesome area, right? Curtis: Yeah. And now you say effective and, and painless. And I know most people hate taking tests, so, so tell me how you approach that. Dave: Yeah. Well, I think there's a lot of ways. I mean, I think one of the, one of the most important ways is that you make the test faster, right? You make, you know, in 1979, our chairman of Assessment Systems helped create a technology called computerized adaptive testing. What that does, it uses algorithms to gauge what you know and what you don't know and then basically tailors the content that you see; the next item you see gets progressively more difficult or progressively easier depending on your, your ability. And what that does is that reduces test time by about 50%. We see that with the ASVAB exam that's given to our service men and women to make their testing experience faster and fairer and really, and we're starting to see that really across the world with measurements. So really making those exams tailored to the person's ability, uh, which is really, really important. You know, what you don't want to do is you don't want to give one test that doesn't change to everyone 'cause that's really, really inefficient. You know, if I'm going through the test and I know I know the content really well,
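
As a rough illustration of the adaptive idea Dave describes, where the next item gets harder after a correct answer and easier after a wrong one, here is a minimal Python sketch. It is a simple staircase rule, not the algorithm used by Assessment Systems or the ASVAB; production adaptive tests typically rely on item response theory. The `Item` class, the difficulty scale, and the step sizes are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """A test question with a hypothetical difficulty score (higher = harder)."""
    name: str
    difficulty: float


def run_adaptive_test(items, answers_correctly, start=0.0, step=1.0, max_items=5):
    """Staircase-style adaptive test (illustrative only).

    items: pool of Item objects to draw from.
    answers_correctly: callable(Item) -> bool standing in for the test taker.
    Returns (ability_estimate, names_of_items_asked).
    """
    remaining = list(items)
    ability = start
    asked = []
    for _ in range(min(max_items, len(remaining))):
        # Ask the unused item whose difficulty is closest to the current estimate.
        item = min(remaining, key=lambda it: abs(it.difficulty - ability))
        remaining.remove(item)
        asked.append(item.name)
        # Correct answer: move the estimate up, so the next item is harder.
        # Wrong answer: move it down, so the next item is easier.
        if answers_correctly(item):
            ability += step
        else:
            ability -= step
        # Shrink the step so the estimate settles instead of oscillating.
        step = max(step / 2, 0.125)
    return ability, asked
```

With a pool of nine items spread from easy to hard, a strong test taker is routed toward the hard items and a weak one toward the easy items after only a few questions, which is how an adaptive test can be shorter than a fixed test that gives everyone every item.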

    Activating Analytics in Business and Government

    Play Episode Listen Later Aug 28, 2019 13:52


    Todd Jones: My name is Todd Jones. I'm the chief analytics officer here at WebbMason Analytics. We are a professional services firm helping our clients accelerate their analytic evolution. So I think my journey started about 10 years ago. Uh, I graduated from Princeton with a degree in operations research and financial engineering. So I could have basically taken two paths. One, I could have gone into the financial space, or the second path I could have taken was going into the analytics space, and I, and I chose the, the analytics space. I joined a very early company called Spry. When I joined, it was about four months old and primarily started off doing a lot of DOD contracting specific to analytics and data. And we eventually built that company to a pretty nice size. We expanded past the DOD space, got into commercial, started consulting with some large, uh, pharmaceutical companies, transportation companies, and really built that company up and then sold that in 2015. Curtis: When you sold that, is WebbMason the company that then bought Spry? Todd: Correct. So Spry was again, another professional services firm specializing in data and analytics. WebbMason historically has been a marketing firm and so they specialize in all aspects of marketing. And as you can imagine, analytics is definitely a big area of focus for them and their clients. And so they brought us in and about 20% of our revenue comes from marketing related activities through WebbMason and then 80% of our revenue still comes working with IT and analytics groups outside of the WebbMason portfolio. Curtis: Interesting. Okay. So there was some crossover there, but not as much as you might expect. Todd: Yeah, definitely some crossover without a doubt. So that was definitely beneficial. But you know, as, as I'm sure you can imagine with any acquisition, you learn a lot. 
And so we're in a great spot right now, and we're able to generate a very healthy stream of business independently, but then also find those synergies with WebbMason as it relates to the marketing activities. Curtis: Sure. That's awesome. So when you got started at Spry, ah, what, what was your role? What did, what did that look like? Todd: Yeah, so when I got started, most of my role at that time was consulting. So I was working directly with our stakeholders who at the time were within the Department of Defense. So I split my time between Crystal City, Virginia and the Pentagon. And really what we were trying to do was help them build a solution that gave them an enterprise view across the four military groups, specifically related to human resources. So if you think about it, when we, you know, when we fought World War II, you had, you know, one division, the Marines and the Navy out in the Pacific and then you had the Army in Europe and they, for the most part, fought separate campaigns. And then we started to get into Iraq and Afghanistan and all of a sudden all of these individuals started to really come together. And so you might look at a city block and you have the Air Force there, Army there, you know, Navy SEALs in the area. And so all of these groups now have to work very closely together. And one of the things that the DOD was trying to accomplish at that time was to start to get a better view of people across the different military branches. So, for example, if I need a particular skillset within a particular city block, can I get that skillset from the Navy? Can I get that skillset from the Army? Maybe the Marine Corps has that skillset. And so they needed a very, they needed a large enterprise view so that they could very easily and quickly start to develop these blended teams. And so that was definitely a combination of technology solutions as well as analytics solutions. 
And so we were consulting with individuals within the Pentagon to help them build that technology solution. Curtis: That's really interesting.

    Last-Mile Logistics Analytics—for Everyone Who Isn't Amazon

    Play Episode Listen Later Aug 21, 2019 23:35


    Today we speak with Professor Ram Bala, an expert in supply chain management analytics, particularly last-mile delivery. He has very interesting insights into how today’s supply chain is evolving. He talks about various methods and algorithms he uses, the specific challenges inherent in doing last-mile logistics and delivery, how pricing factors in, and how everyone is trying to catch up to Amazon. Ram Bala: Then there is this great opportunity to actually use the data effectively. But there is a long way to go in terms of coming up with the right algorithms, both on predictions as well as the optimization, to actually get this done in a meaningful way. And if you look at the landscape today in terms of industry, I would say very few companies are actually there yet. Right? I mean, Amazon obviously is a clear example of the leaders in the space, but everyone's trying to get there as well. Ginette Methot: I'm Ginette. Curtis Seare: And I'm Curtis. Ginette: And you are listening to Data Crunch. Curtis: A podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Intro: Today we speak with Professor Ram Bala, an expert in supply chain management analytics, particularly last-mile delivery. He has very interesting insights into how today’s supply chain is evolving. Ram Bala: My name is Ram Bala. I'm a professor at Santa Clara University as well as a data science leader at CH Robinson, which is the largest logistics marketplace in North America. I've been working with topics in supply chain-related data science, even before it was called data science, for the past 15 years. I got my Ph.D. in operations research and supply chain from UCLA, and, uh, I've been working on these problems both for companies as well as within the academic context, where I've been working on research problems. And more recently I think there's been a lot of excitement in this space. 
And that's where my involvement with both startups as well as larger companies has gone up, and I, I came into the CH Robinson fold as a consequence of an acquisition. So I was part of a startup that was working on last-mile logistics and how to, how to improve that. Curtis Seare: Got it. That's awesome. And the space that you're in is really interesting. Could you just contextualize for the audience the problem set that you're focused on? Ram: So I think one of the major things that has changed in logistics is the growth of e-commerce and also personal mobility. I mean, if you think about Uber. Logistics is a larger concept that covers moving both people and products, and what's really happened is that the, the availability of real-time data has had significant consequences on how we are able to predict as well as optimize how we move things, and that's then also raised the bar in terms of customer expectations. We expect to get a ride to go somewhere within five minutes, we expect to get a product within a day, and those expectations have been set by specific companies, say Uber in the case of personal mobility. In the case of products, it's Amazon, and having set the stage, everyone's now trying to be competitive with them, which means that in the product space, certainly all e-commerce companies as well as companies that were in brick and mortar are trying to achieve that same end goal, which is how do I get products to consumers quickly and at the same time not spend too much money? Right? That's the core problem. Now doing that is hard. It's become easier simply because we have real-time access to data in terms of location as well as, you know, where products are at any given point. But it is a hard problem to solve. Curtis: There's some intricacy in, you know, routing and pricing and kind of the interplay there. Can we dive into a little bit of those details? Ram: Absolutely. 
So I think uh, routing problems have been around ever since transportation's been around,

    Running a Successful Machine Learning Startup

    Play Episode Listen Later Aug 10, 2019 26:38


    Today, our guest, Alain Briancon, will talk to us about how to work with Fortune 500 companies and help them get quick value from their data, how to build a roadmap of incremental value during the data collection and analysis process, how they help predict and incentivize customer purchases, and how to dial in on an idea for successful data science software companies. Alain Briancon: Adding one more question to answer is always easy. The difficult part is what question can I remove and still provide insight. Ginette Methot: I'm Ginette. Curtis Seare: And I'm Curtis. Ginette: And you are listening to Data Crunch. Curtis: A podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Ginette: If you’re a Fortune 1000 company, and your team needs to be trained in Tableau, Statistics, Data Storytelling, or how to solve business problems with data, we’ll fly one of our expert trainers out to your site for a private group training. The most important investment a business can make is in its people, so head over to our site at datacrunchcorp.com and check out our training courses. Today, our guest, Alain Briancon, will talk to us about how to work with Fortune 500 companies and help them get quick value from their data, how to build a roadmap of incremental value during the data collection and analysis process, how they help predict and incentivize customer purchases, and how to dial in on an idea for successful data science software companies. Alain: My name is Alain Briancon. I am currently the VP of data science and chief technology officer for CEREBRI AI. CEREBRI AI is an AI company, as you could guess from the name. We are located in three cities: Austin, which is the corporate headquarters; Toronto, which is a hotbed of data science in North America; and Washington DC, where I work. 
What CEREBRI AI focuses on is developing a system to help manage both the strategic component as well as the tactical component of customer experience. This is my fifth startup. This is my third startup that involves data science and machine learning. Jean Belanger, who is the CEO of CEREBRI, is a friend of mine; now he's my boss. So I'm trying to work through that, and it took him about 19 years to convince me to join a startup, uh, with him. And this was the right opportunity because the kind of problems we are solving are very challenging. It has been a, an absolute blast, besides working with a great team and building it up. When I joined we were about 20 people. Now we're about 63 people, about 50 of them on the technical side. Half in data science, half in software. What has been fantastic is applying tricks and insight that I've gained over the years to, uh, help guide the data science side. The other thing also, which is fun, is we have a very pragmatic view of how to approach things and how to approach engagement with customers. Our customers are Fortune 500 customers; they are major banks. One of them is a central bank. Others are car makers, and we're working very hard into the telco business as well. And, uh, when you deal with such companies, first of all, there's a very interesting sales cycle in which data science and machine learning play a role at the right moment in time. But you have to also be humbled by the fact that you don't start on their side from a clean sheet. And I think that's one of the most interesting components of making things work: bringing data science and machine learning insight to companies who cannot afford, and should not afford, the "okay, let's start from scratch, let's share all of the data" and the like, and so this jujitsu between the business case that machine learning brings up and the underlying machine learning technology is one of the most fun elements of the work. Curtis: That's interesting. 
Let's, let's dig into that if we can. Can you give me a concrete example in CEREBRI AI how that works and spell out that concept for us?

    Executive Panel: How Can Data Science, ML, and AI Best Support Executive Goals

    Play Episode Listen Later Jul 26, 2019 43:15


    Today is a special episode. We welcome three executive guests from different organizations to share their experiences and insights about how data science can best support executive goals. Ginette Methot: I'm Ginette. Curtis Seare: And I'm Curtis. Ginette: And you are listening to Data Crunch. Curtis: A podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company. There's a lot going on here at Data Crunch. Just this last week we finalized the merger of Vault Analytics and Lightpost Analytics under the new banner of the Data Crunch Corporation, which improves our capabilities to serve our clients. Head over to datacrunchcorp.com to check out our training and consulting offerings. For our executive panel today, we'll be talking to Simon Lee, the chief analytics officer from Waiter; Fatma Kocer, who is the vice president of data science engineering at Altair; and Rollen Roberson, who is the president at Trianz. Curtis: So, welcome everyone to the executive panel. We are super excited to have you guys here. You are all executives at companies that are doing amazing things with data science. So the audience knows, again, the topic we're talking about today is how data science, machine learning, and AI can best support executive strategy and business goals. How, how does that function really work? Let's start maybe with Simon and then Fatma and then Rollen, if you could just give us a little introduction, and we'll get going from there. Simon Lee: Thanks. I'm Simon Lee. I'm actually kind of a mixed bag when it comes to data science and analytics. I've got about 20 years of experience using analytics and advanced algorithms, you know, in a whole bunch of different industries like transportation, for example, airline, rail, trucking, ocean carriers, printing, publishing, manufacturing, finance, and delivery. 
Delivery is where I'm currently at. Waiter is a restaurant food delivery company in small and mid-size markets. So probably a lot of people haven't heard of us because we're in the smaller communities, but we're trying to make a big splash. So yeah, that's who I am. Curtis: Awesome. Thanks for being here. Fatma Kocer: Hi, this is Fatma Kocer from Altair Engineering. I am a civil engineer by training, although I never got a chance to practice it. Um, my background is multidisciplinary design exploration and optimization. And I was in the auto industry before I joined Altair. Um, there, I've done several things throughout the 14 years that I've been here, but always keeping design exploration and optimization as the core of my responsibilities. And Altair is a global technology company. We provide software and solutions for product development, data intelligence, and high-performance computing. Our headquarters are in Troy, Michigan, where I'm speaking from, and we have offices in I think 25 countries now. So that would be me. Curtis: Great. Thanks for being here, Fatma, and, ah, Rollen. Rollen Roberson: Right. Thank you. Good morning. Rollen Roberson with Trianz. You know, for my own background, I'm similar to Simon. I'm kind of a mixed bag. I've been in the industry for 20-plus years, solely in the digital transformation space. Uh, working from startups and mid-level companies through global service integrators, uh, working with Trianz currently to really expand the growth and use of AI and IoT within the organization and our customer base. Trianz is a company that has 1,500-plus employees and global offices, mainly serving the upper-mid-tier and enterprise-level customer base, uh, solely focused on digital transformation and the use of those higher technologies for greater return on value.

How Can Data Science and AI Have an Impact On Your Business

Curtis: That's awesome.

    The Biggest Pitfalls of New Analytical Initiatives

    Play Episode Listen Later Jul 20, 2019 27:27


    Our guest Andrzej Wolosewicz has had years of experience helping companies define and build machine learning and analytical solutions that have a measurable impact on the business, and he shares with us his experience and expertise. He shares with us the biggest pitfalls he sees companies fall into over and over as they try to implement these initiatives. Andrzej Wolosewicz: The problem was there was a lot of activity every month that they were doing, but in terms of progressing their analytic capabilities, or really kind of being able to grow and be more effective, they weren't, they weren't able to do that. As the saying goes, they had a lot of action but not a lot of progress. Ginette Methot: I'm Ginette. Curtis Seare: And I'm Curtis, and you are listening to Data Crunch, a podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Andrzej Wolosewicz: My name is Andrzej Wolosewicz. I am currently the director of sales at HEXstream. We are a Chicago-based analytics and data consultancy. But this is kind of the, the latest step on my journey. So I actually started out coming straight out of college into a, a predictive modeling startup. And this would have been in the late nineties. Artificial intelligence at the time was, was a big buzzword, as it is today. And we were looking at being able to do fairly advanced modeling of systems, but actually looking at the data as being the model. So if you were looking at, uh, everything from a jet engine to the human body to complex refineries, we didn't necessarily understand all the nuances of how they ran, but we had all the data, and so we would use that data to build out those models. And then I ended up going from that to actually flipping into the, into the other side of the world, around program management. So, not so much doing the analysis, but understanding how the analytics and designs and all of those steps fit together to actually deliver a finished product. 
And so that was, that was very useful because it taught me that, hey, there's a lot more: you may find things that are interesting, but on the business side of the world you have all of the constraints that analysts may not always be aware of or, or may not, you know, really want to take into consideration, like budgets, schedule, things of that nature. And so I learned how to operate with that. And then, in another interesting twist of fate, I met somebody who knew somebody who was looking for somebody that could provide that line-of-business experience, but actually selling a business intelligence platform. Not necessarily that you knew how all the software worked, and, you know, if you click here, this happens, if you click here, that happens, but that you could sit across the table from somebody who was in a line of business and say, I understand the business problem you're having. I understand how to solve it, and here's how the technology can be applied. Because the, the reality is technology in and of itself will never solve a problem. It needs people, it needs processes, it needs the people to use it. My dad used to like to look at a rake and say, well, the yard's not going to rake itself, so the rake does the job, but it needs somebody to use it. After about five, six years actually selling and being involved with the BI platform, the opportunity to join HEXstream came up, and for me, this was kind of a combination of all of the past experiences because it gave me the opportunity to engage with clients and engage with our internal teams on what is it that you're trying to do. So going back to my first experience, what is the project? What is the model? What is the data that you're trying to work with and build? But then I also had to understand why that was relevant. Why would a client engage with a company like HEXstream to undertake a project? How is that project measured? There's a lot of things that over the years I've found people would love to do,

    Digital Credentials and Machine Learning Aim to Change How You Hire

    Play Episode Listen Later Jul 12, 2019 19:57


    Today we’re going to see how a clever idea and the skillful use of data is starting to disrupt how people get credentials. The use case here has the potential to remove gender and racial bias in the hiring process, help companies understand specific talent gaps in their workforce, and help learners find lucrative educational pathways they can take.

    How to Win Hearts and Minds as a Data Leader

    Play Episode Listen Later Jun 29, 2019 21:48


    Joe Kleinhenz talks about his journey from starting out in data all the way to becoming a leader in one of the largest insurance organizations in the United States. We'll learn about the importance of staying on top of technology, how to win hearts and minds of nontechnical folks, centralized versus decentralized team, pros and cons, how to hold effective conversations with stakeholders and how to go from individual contributor to leader. Joe Kleinhenz: The critical skills you bring to the table is the ability to break down complex ideas into ones that translate for nontechnical folks. Ginette Methot: I'm Ginette. Curtis Seare: And I'm Curtis. Ginette: And you are listening to Data Crunch—a podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company. One of the biggest challenges companies have in getting value from their data is finding the right talent. Good talent is scarce and building a top-tier team is hard if not impossible for some companies. If you are having this challenge try out our analytics as a service offering: we bring a fully equipped data science team to bear on your projects, on demand and with no long-term contract constraints. If you want to start seeing success for your data science efforts quickly and economically, head over to datacrunchcorp.com for more details. Today we'll be hearing about Joe Kleinhenz's journey from starting out in data all the way to becoming a leader in one of the largest insurance organizations in the United States. We'll learn about the importance of staying on top of technology, how to win hearts and minds of nontechnical folks, centralized versus decentralized team, pros and cons, how to hold effective conversations with stakeholders and how to go from individual contributor to leader. There's lots to unpack in this episode, so let's get to it. 
Curtis: If we could just start out just by talking about what got you interested in data in the first place, where your journey started, and we can go from there. Joe: I actually first started thinking about using math to predict future outcomes when I was a teenager. I read a book by Asimov called Foundation, and the whole premise of the book series was, um, using mathematics to predict the future. It was all science fiction stuff at that point, but that's certainly what first got me interested in it. Curtis: So it was a, it was a work of fiction that got you interested. Joe: Yeah, that captured my imagination. I didn't at that point even know, you know, data science was a thing, and as I got my path into the technology, within IT, I was doing business consulting for a while and got into data warehousing, and this was in the late nineties. From there, I ended up in a part of GE Financial that was doing a lot of direct marketing, and they had a group called database marketing, which was essentially the precursor for data scientists. They had predictive modelers, statisticians essentially, in there that were using, by today's standards, relatively simplistic tools like linear regression to build, you know, models predicting who would respond to direct-drip marketing offers. I used to joke with people that I ran a team of bad people that decided to call you at dinner with an offer. You can just have the here. Um, Curtis: And you made those people very effective at, at being bad, I assume. Joe: Yes. Yes. At that point there were very few restrictions on what you could do. We were even using credit data for some of the, the algorithms 'cause we were with credit card companies. Credit data at the time, there weren't the regulatory restrictions there are now, and it's incredibly predictive. When you combine that with recency-frequency data on purchasing behavior, you'd really kind of tune in on, you know, what someone would be interested in.

    Building Data Products that Work in the Health and Wellness Industry

    Play Episode Listen Later Jun 1, 2019 19:40


    Our guest today holds a PhD in organizational psychology and has been working on data products in the health and wellness space for over a decade. We cover a lot of ground in this interview: how to create data products that work, how to avoid the unexpected consequences of poorly designed data interventions, and the importance of ethnographic thinking in data science. We'll also talk about reducing friction in data collection, the coaching data product model, and surprising things we can learn when people's routine's are broken. From today's episode, you'll come away with a better understanding of how to build contextually relevant data products that make a difference in people's lives.

    The Road to a Data-Driven Culture in Your Organization

    Play Episode Listen Later May 1, 2019 24:17


    How do you whittle the murky business of creating a data-driven culture down to a proven process? Today we talk to a guest who has done this time and time again, helping companies transform their operations. He points out the small nuances and details about the process, like questions to ask to start on the right foot, critical feedback loops to put in place along the way, and how to overcome some of the most common problems that make people give up. Ginette: I’m Ginette. Curtis: And I’m Curtis. Ginette: And you are listening to Data Crunch. Curtis: A podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company. Now, let's jump into our interview with Ryan Deeds, VP of technology and data management at Assurex Global. Ryan Deeds: Uh, I think it's an interesting time in the whole, uh, data experience because I think so many people failed, you know, in the last decade or so that in these next couple of years everybody's now trying to look at root cause. And so culture actually is becoming important now, you know? 
And so that's kind of a cool thing. Curtis Seare: What do you mean by that, in terms of a lot of people have failed? Ryan Deeds: I think when you look at BI projects from 2003 to 2013, they were just, companies went through a litany of failures in trying to get data to a place that made sense, was easily accessible, and had good quality. Um, but they didn't address that. They just put the visualizations on top of kind of crappy data, and they did that over and over and over again. Um, and then finally it seems like, you know, in the last year or two years, we start really having a conversation about what has to happen inside an organization to make data usable. I mean, it's just like water, right? You can't just take water from a stream and start drinking it. You got to process it and clean it and make it valuable and make it worthy of consumption. And that's exactly the thing we got to do with data. Curtis Seare: Sure. Maybe we can dive into that as well, because you've had this experience taking a lot of companies through those steps, right? So what do you see as the major roadblocks? How do you start this process of helping people get their hands around "How do I get value from my data?" Ryan Deeds: So it's interesting. I kind of have, uh, you know, I've done this a lot, and so I have, uh, organizations that come to me and they say, hey, you know, we want to, we're ready to start leveraging data. Um, and the, the typical thing is there's just a lack of expectation of the time it takes. Um, and so I threw together like a timeline to try to help, uh, educate individuals on that, you know, and kind of like the steps that it would take to get to usable data, um, and the first is really a recognition that today we don't, you know, the organization that we're in is not effectively using data, um, as a, as a strategic advantage.

    Statistics Done Wrong—A Woeful Podcast Episode

    Play Episode Listen Later Mar 27, 2019 21:28


    Beginning: Statistics are misused and abused, sometimes even unintentionally, in both scientific and business settings. Alex Reinhart, author of the book "Statistics Done Wrong: The Woefully Complete Guide" talks about the most common errors people make when trying to figure things out using statistics, and what happens as a result. He shares practical insights into how both scientists and business analysts can make sure their statistical tests have high enough power, how they can avoid “truth inflation,” and how to overcome multiple comparisons problems. Ginette: In 2009, neuroscientist Craig Bennett undertook a landmark experiment in a Dartmouth lab. A high tech fMRI machine was used on test subjects, who were “shown a series of photographs depicting human individuals in social situations with a specified emotional valence” and asked “to determine what emotion the individual in the photo must have been experiencing.” Would it be found that different parts of the brain were associated with different emotional associations? In fact, it was. The experiment was a success. The results came in showing brain activity changes for the different tasks, and the p-value came out to 0.001, indicating a significant result. The problem? The only participant was a 3.8 pound 18-inch mature Atlantic salmon, who was “not alive at the time of scanning.” Ginette: I’m Ginette. Curtis: And I’m Curtis. Ginette: And you are listening to Data Crunch. Curtis: A podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company. Ginette: This study was real. It was real data, robust analysis, and an actual dead fish. It even has an official sounding scientific study name—”Neural correlates of interspecies perspective taking in the post-mortem Atlantic Salmon”. 
Craig Bennett did the experiment to show that statistics can be dangerous territory. They can be abused and misleading—whether or not the experimenter has nefarious intentions. Still, statistics are a legitimate and powerful tool to discover actual truths and find important insights, so they cannot be ignored. It becomes our task to wield them correctly, and to be careful when accepting or rejecting statistical assertions we come across. Today we talk to Alex Reinhart, author of the book “Statistics Done Wrong: The Woefully Complete Guide”. Alex is an expert on how to do statistics wrong. And incidentally, how to do them right. Alex: We end up using statistical methods in science and in business to answer questions, often very simple questions, of just “does this intervention or this treatment or this change that I made, does it have an effect?” Often in a difficult situation, because there are many things going on, you know, if you're doing a medical treatment there are many different reasons that people recover in different times, and there's a lot of variation, and it’s hard to predict these things. If you’re doing an A-B test on a website, your visitors are all different. Some of them will want to buy your product or whatever it is, and some of them won’t, and so there’s a lot of variation that happens naturally, and we’re always in the position of having to ask, “This change I made or intervention I did, does it have an effect, and can I distinguish that effect from all the other things that are going on?” And this leads to a lot of problems, so statistical methods exist to help you answer that question by seeing how much variation is there naturally, and this effect I saw, is it more than I would have expected had my intervention not worked or not done anything, but it doesn’t give you certainty. It gives us nice words, like “statistically significant,” which sounds important, but it doesn't give you certainty. 
You're often asking the question, “Is this effect that I’m seeing from my experim...
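The multiple comparisons problem mentioned in this episode's description can be illustrated with a small simulation. This is a sketch under stated assumptions and is not taken from the book or from the salmon study: it assumes every test is run on a true null hypothesis, so that each p-value is uniformly distributed between 0 and 1, and the function names and parameters are hypothetical.

```python
import random


def family_wise_error_rate(num_tests, alpha=0.05, num_trials=10_000, seed=0):
    """Estimate the chance of at least one false positive across num_tests tests.

    Assumes every null hypothesis is true, so each p-value is modeled as a
    uniform random draw on [0, 1).
    """
    rng = random.Random(seed)
    trials_with_false_positive = 0
    for _ in range(num_trials):
        # A "significant" result here is a false positive by construction.
        if any(rng.random() < alpha for _ in range(num_tests)):
            trials_with_false_positive += 1
    return trials_with_false_positive / num_trials


def bonferroni_threshold(num_tests, alpha=0.05):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / num_tests


if __name__ == "__main__":
    for n in (1, 5, 20):
        uncorrected = family_wise_error_rate(n)
        corrected = family_wise_error_rate(n, alpha=bonferroni_threshold(n))
        print(n, uncorrected, corrected)
```

For independent tests the uncorrected rate should track 1 - (1 - alpha)^n, so running 20 tests at alpha = 0.05 produces at least one spurious "significant" result in roughly 64% of experiments, while the Bonferroni-corrected rate stays near 5%.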

    Getting into Data Science

    Play Episode Listen Later Mar 1, 2019 22:51


    What does it take to become a data scientist? We speak with three people who have become data scientists in the last three years and find out what it takes, in their opinions, to land a data science job and to be prepared for a career in the field. Curtis: We’ve talked a lot in our recent episodes about all the interesting things you can do with data science, and we’ve only talked a little bit recently about what it actually takes to get into the field, which is a topic that a lot of you have reached out to us and asked us to cover in a more thorough way. So today, we’re taking a broader approach on this topic by talking to three data scientists who have become data scientists in the last three years. You’re going to be able to hear all the details of each of their three journeys, how they got started, how they landed their jobs, and what their best advice is for getting into the field, and this will give you a broad view about how to get into data science from three people who have actually done it. Ginette: I’m Ginette. Curtis: And I’m Curtis. Ginette: And you are listening to Data Crunch. Curtis: A podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Ginette: A Vault Analytics production. Ginette: Here at Data Crunch we’ve been hard at work developing a technology that allows executives and business leaders to gain insight from their data instantly—simply by talking to the air. We hook up your data to an Alexa device with custom skills built in to understand the questions you have about your business - and give you answers. Figure out sales forecasts, marketing performance, operational compliance, progress on KPIs, and more by just talking to Alexa. We are officially launching the product this week and have room for three initial customers—if you're interested, head over to datacrunchcorp.com/alexa or datacrunchpodcast.com/alexa (both work), and book some time to chat with us. 
We’ll assess if your company is a good fit, and if so, we look forward to working with you! Tyler Folkman: My name’s Tyler Folkman. I've gotten into data science by kind of a strange route, to be honest. I did my undergrad in economics, actually originally thinking to get into computer science, but for some reason, I had this thought that computer science was going to get outsourced; I don't know if that was a thing, but I think people back in the early 2000s were talking about computer science getting outsourced, so I thought about business, which ended up being economics, which I really liked, and then ended up doing economic consulting, which is, basically, in usually large litigation cases, lawyers hire economists to value damages, so for example, when Samsung and Apple were suing each other, I worked on the Samsung side to help value how much they might sue Apple for, for patent infringement, and a lot of that involves statistical analyses, data analytics, econometrics as economists would call it. And I got really interested in just this idea of data being a really powerful tool for making decisions and coming to conclusions, and so I started hearing about machine learning on the Internet, kind of dabbling with Python, which at the time, I was a Windows user, and it was a huge pain to get Python installed, but I kind of got it up and running, played around with things like scikit-learn, read some blogs, and really got into machine learning and found that it was really housed more in the computer science department at that time, and just kind of decided to apply to some computer science departments and was lucky to get in at University of Texas at Austin and do some studies there, join a machine learning lab and got to do some work at Amazon. Really got a really good set of experiences to kind of help me learn how to be both a programmer and a machine learning person, a little bit of statistics, and jumped straight from there over here to Ancestry and was luc...

    Automated Machine Learning with TransmogrifAI

    Play Episode Listen Later Jan 31, 2019 12:49


    Would you rather take a year to develop a proprietary algorithm for your company that has an accuracy of 95% or use an open source platform that takes a day to develop an algorithm that has nearly the same accuracy? In most business cases, you'd choose the latter. In this episode, we talk to Till Bergmann, who works on a team that developed TransmogrifAI, an open source project that helps you build models quickly.

    The Data Scientist's Journey with Nic Ryan

    Play Episode Listen Later Dec 28, 2018 19:34


    What does it take to become a data scientist? Nic Ryan has been in the field for over a decade and answered thousands of questions from people looking to get into the field. In this episode, he talks about his journey into data science and his experience mentoring aspiring data scientists, giving advice to both beginners and seasoned professionals. Nic Ryan: I think there's sometimes a problem in data science education, and what people find interesting is they tend to focus on the algorithms, which as you know from doing data science projects is really just the last little bit. There's tens or even sometimes hundreds of decision steps that are made until you get to that particular point. Ginette: I’m Ginette. Curtis: And I’m Curtis. Ginette: And you are listening to Data Crunch. Curtis: A podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Ginette: A Vault Analytics production. Curtis: Let’s introduce you to our guest: Nic Ryan. He is an experienced data scientist and LinkedIn influencer who has helped a lot of aspiring data scientists in their journey into the profession. He’s been part of many different data teams, small and large, in big companies and startups, and he wrote a book called, “The Data Scientist's Journey. The Guide for Aspiring Data Scientists,” which is based off the thousands of questions he’s been asked about becoming a data scientist. Nic: It started off with failure. Originally, I wanted to go over to the States to play basketball, so I’m a failed basketball player, and there’s a couple reasons why I didn’t make it: one is I wasn’t tall enough to be a small forward, which is a bit ironic. I’m only 6’2”, but probably the more important reason is I wasn’t very good, but I didn’t know that at the time, so I didn't get a scholarship to play basketball, but I did get a scholarship to do actuarial studies. So it’s not a bad backup plan. 
But from there, I ended up falling into more of the stats side of things, of insurance, so the statistical modeling, pricing, fire, and theft, I really enjoyed that kind of stuff, so over time, I did more of that. Did some of my post-grad actuarial exams, and I was doing some reading on the weekends and finding out more about stats and a bit about code and a bit about R, and what really did it for me was having an incredibly long train ride to get to work. It was a couple hours each way, and so this is of course, this is the era of MOOCs, and rather than just talking to people, I just ended up joining the MOOCs, and so, really enjoyed that, and this whole thing of data science has just kind of grown around me, and I ended up working for one of the banks and doing their credit scoring and consulting with different banks for a long period of time, and I got a call out of the blue to, a guy just gave me a plane ticket and said come talk to us. So I flew there, and they offered me what was really a head of data science role, so there was a team overseas and a couple teams in Australia doing data science, and yeah, we did some pretty awesome things with NLP and bank statements and built some pretty sophisticated risk models; it was probably best in the country at that time. It’s about 60 miles away from Sydney where I worked, and so it was a real opportunity. It was probably two hour door to door each way, and that was the other thing as well: that was a long time away from family, which wasn’t cool. I had a couple young kids. That’s part of the reason I have my own business now is that I’ve spent too much time away from my daughters. The result of it being I had a whole heap of dead time that I could either use or not use, and so I was able to teach myself code and teach myself some more stats and machine learning and stuff pretty quickly when you have a couple hours of dead time each day, you become pretty good, pretty quickly,

    Cutting-Edge Computational Chemistry Enabled by Deep Learning

    Play Episode Listen Later Nov 27, 2018 17:43


    Machine learning has become a bigger part of chemistry over the last two or three years. Industries need people trained in both fields, and it's taken time for them to make their way into this sector. Olexandr Isayev is at the forefront of that wave, and he talks to us about what he's done while melding deep learning and chemistry together and his vision of where he sees this field going with this new tech.

    Python and the Open Source Community

    Play Episode Listen Later Oct 24, 2018 24:50


    Python versus R. It's a heated debate. We won't solve this raging controversy today, but we will peek into the history of Python, particularly in the open source community surrounding it, and see how it came to be what it is today—a well used and flexible programming language. Travis Oliphant: Wes McKinney did a great job in creating Pandas . . . not just creating it but organized a community around it, which are two independent steps and both necessary, by the way. A lot of people get confused by open source. They sometimes think you just kind of going to get people together and open source emerges from the foam, but what ends up happening, I’ve seen this now at least eight, nine different times, both with projects I’ve had a chance and privilege to interact with, but also other people's projects. It really takes a core set of motivated people, usually not more than three. Ginette: I’m Ginette. Curtis: And I’m Curtis. Ginette: And you are listening to Data Crunch. Curtis: A podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Ginette: A Vault Analytics production. Ginette: This episode of Data Crunch is supported by Lightpost Analytics, a company helping bridge the last mile of AI: making data and algorithms understandable and actionable for a non-technical person, like the CEO of your company. Lightpost Analytics is offering a training academy to teach you Tableau, an industry-leading data visualization software. According to Indeed.com, the average salary for a Tableau Developer is above $50 per hour. If done well, making data understandable can create breakthroughs in your company and lead to recognition and promotions in your job. Go to lightpostanalytics.com/datacrunch to learn more and get some freebies. Here at Data Crunch, we love playing with artificial intelligence, machine learning, and deep learning, so we started a fun new side project. 
We just launched a new podcast that tests the boundaries of what can be done with Google’s cutting-edge deep learning speech generation algorithms. We use surprisingly human-like voices to host the podcast that reads all the unusual Wikipedia articles you haven’t had a chance to read yet, like chicken hypnosis, the history of an amusing German conspiracy theory, strange trends in Russian politics, and much more to come. It’s worth a listen to hear what this tech sounds like, and you’ll learn unique and bizarre trivia that you can share at your next dinner party. Search for a podcast called “Griswold the AI Reads Unusual Wikipedia Articles,” now found on all your favorite popular podcast platforms. Curtis: There has been a heated, ongoing debate about which programming language is better when working with machine learning and data analytics: Python or R, and while we won’t be wrestling with that particular question, we will overview a bit of history for both and then dive into significant history behind one of these languages, Python, with a major contributor to the language, a man who significantly influenced the way that data scientists use Python today. Ginette: As a very short historical background, Python came to the scene in 1991 when Guido van Rossum developed it. His language has developed a reputation as easy to use because its syntax is simple, it’s versatile, and it has a shallow learning curve. It’s also a general purpose language that is used beyond data analysis and great for implementing algorithms for production use. As for R, it followed shortly after Python. In 1995, Ross Ihaka and Robert Gentleman created it as an easier way to do data analysis, statistics, and graphic models, and it was mainly used in academia and research until more recently. It’s specifically aimed at statistics, and it has extensive libraries and a solid community. As a controversial side note, according to Gregory Piatetsky-Shapiro’s KDnuggets poll, late last year,

    Machine Learning, Big Data, and Your Family History

    Play Episode Listen Later Sep 26, 2018 21:10


    How can artificial intelligence, machine learning, and deep learning benefit your family? These technologies are moving into every field, industry, and hobby, including what some say is the United States' second most popular hobby, family history. Today, it's so much easier to trace your roots back to find out more about your progenitors. Tyler Folkman, senior manager at Ancestry, the leading family history company, describes to us how he and his team use convolutional neural networks, LSTMs, conditional random fields, and the like to more easily piece together the puzzle of your family tree. Ginette: Today we peek into an area rich in data that has lots of interesting AI and machine learning problems. Curtis: The second most popular hobby in the United States, some claim, is family history research. And whether that’s true or not, it has had a lot of growth recently. Personal DNA testing products have exploded in popularity over the past three years, but beyond this popular product, lots of people go a step further and start tracing their roots back to piece together the puzzle of their family tree. Today we’re going to dive into the data side of this hobby with the leading family history research company. Ginette: I’m Ginette. Curtis: And I’m Curtis. Ginette: And you are listening to Data Crunch. Curtis: A podcast about how data and prediction shape our world. Ginette: A Vault Analytics production. Ginette: This episode of Data Crunch is supported by Lightpost Analytics, a company helping bridge the last mile of AI: making data and algorithms understandable and actionable for a non-technical person, like the CEO of your company. Lightpost Analytics is offering a training academy to teach you Tableau, an industry-leading data visualization software. According to Indeed.com, the average salary for a Tableau Developer is above $50 per hour. 
If done well, making data understandable can create breakthroughs in your company and lead to recognition and promotions in your job. Go to lightpostanalytics.com/datacrunch to learn more and get some freebies. Tyler: My name's Tyler Folkman. Curtis: Who is a senior manager of data science at Ancestry. Tyler: As I look across Ancestry and family history, we almost have, like, every kind of machine learning problem you might want, I mean, probably not every kind, but we have genetically based machine learning problems on the DNA science side. We have search optimization because people need to search our databases. We have recommendation problems because we want to hint the best resources out to people or provide them. For example, if we have a hundred things we think might be relevant to a person, what order do we show them? So we use recommendation algorithms for that. We have a lot of computer vision problems because people upload pictures and a lot of our documents, if they're not like digitized yet, meaning that they’ve extracted the text, they might just be raw photos, or even just the things that are pictures uploaded, we want to understand what's in them, so is this a picture of a graveyard? Is it a family portrait? Is it an old photo? And so tons of computer vision stuff, natural language processing. On the business side, we have marketing problems just like any other business, like how do you optimize marketing spend? How do you optimize customer experience, customer flow? And so it's really a cool place because you really can get exposed to almost any type of problem you might be interested in. Curtis: So back in the 80s, before you could go easily find information on the Internet, genealogists had to spend a ton of time trekking around to libraries to try to find information on their ancestors. 
Ancestry saw a business opportunity and started selling floppy disks, and eventually CDs, full of genealogical resources for genealogists to easily access in their home. Tyler: And then they grew up through the Internet age and moved out ...

    Machine Learning Takes on Diabetes

    Play Episode Listen Later Aug 31, 2018 17:16


    When Bryan Mazlish's son was diagnosed with Type 1 diabetes, there were unexpected challenges. Managing diabetes on a day-to-day basis was tough, so he hacked into his son's insulin pump and continuous glucose monitor to create the world's first ambulatory real-world artificial pancreas. Now his mission is to make it available to everyone. Bryan Mazlish: A nice demo that we showed at Google I/O earlier this summer, where we showed our use case for one of their forthcoming APIs. We’re really at the vanguard of digital health medical device enterprise software, and it's an incredibly exciting but also challenging place to be. We're enthusiastic about the prospects for what we can do for a whole lot of people. Ginette: I’m Ginette. Curtis: And I’m Curtis. Ginette: And you are listening to Data Crunch. Curtis: A podcast about how data and prediction shape our world. Ginette: A Vault Analytics production. This episode of Data Crunch is brought to you by Lightpost Analytics, a company helping bridge the last mile of AI: Making data and algorithms understandable and actionable for a non-technical person, like the CEO of your company. Lightpost Analytics is offering a training academy to teach you Tableau, an industry-leading data visualization software. According to Indeed.com, the average salary for a Tableau Developer is above $50 per hour. If done well, making data understandable can create breakthroughs in your company and lead to recognition and promotions in your job. Go to lightpostanalytics.com/datacrunch to learn more and get some freebies. Curtis: Today we get to speak with a man who, after studying computer science at Harvard, went to start a stock-trading algorithm company on Wall Street until his life experienced a twist. 
Now he’s the president and co-founder of one of the leading digital health medical device enterprise software companies, which employs machine learning to customize and automate medicine intake, all because of an unexpected challenge that showed up in his life. Bryan: My name is Bryan Mazlish. I’m one of the founders of Bigfoot Biomedical. My background is in quantitative finance. I spent 20 years on Wall Street, first at a large investment bank and then about a decade running a fully automated trading business where we built algorithms to buy and sell stocks in a completely automated fashion, and it was about 6 or 7 years ago that my path took a change . . . Ginette: Bryan’s son was diagnosed with Type 1 diabetes, which Bryan says wasn’t entirely unexpected because his wife has the same disease. But what was unexpected was the intensity of managing the disease on a day-to-day basis. He was surprised by how antiquated the insulin management technology was. There wasn’t technology that could anticipate his son’s insulin needs and automatically give him the insulin he needed. Bryan: You have a need to take insulin just simply to live. This is something that needs to be delivered on a constant basis, 24 hours a day. You can take this in one of two ways: you can use an insulin pump that delivers this on a continuous basis, and you can also take a once-a-day injection, and the benefit of the pump is that you can vary that at different points in the day. When you take an injection, it lasts for up to 24 hours, and it doesn't have the same flexibility, but it does have the benefit of not having to wear a device to deliver the insulin. And that's just the baseline, on top of that you need to take insulin to offset meals, primarily carbohydrates and high glucose levels. 
So when you're going to sit down to eat breakfast, lunch, or dinner, or even a snack, you need to estimate the amount of carbohydrate and glucose impact of the meal that you're about to consume, and then dose that amount of insulin, either through an insulin pump or through an injection at that time. Ginette: Figuring out how much insulin to give yourself is tough.
