Data Journeys

Data Journeys is a podcast for aspiring Data Scientists by AJ Goldstein, where he interviews world-class Data Scientists about their learning journeys. The focus is on how they’ve bridged the gap between acquiring technical skills and creating real-world impact. In each episode, the goal is to equip…

AJ Goldstein


    • Dec 3, 2018 LATEST EPISODE
    • infrequent NEW EPISODES
    • 1h 5m AVG DURATION
    • 26 EPISODES


    Latest episodes from Data Journeys

    #25: Laura Noren: The Ethics of Data Science

    Dec 3, 2018 55:56


    Laura Noren is a data science ethicist and researcher currently working in cybersecurity at Obsidian Security in Newport Beach. She holds undergraduate degrees from MIT and a PhD from NYU, where she recently completed a postdoc in the Center for Data Science. Her work has been covered in The New York Times, Canada's Globe and Mail, and American Public Media's Marketplace program, as well as in numerous academic journals and at international conferences. Dr. Norén is a champion of open source software and those who write it.

    Enjoy the show!

    Show Notes:

    [3:55] Laura explains how she produces the Data Science Community Newsletter, covering things like how the Department of Defense just got billions in funding to do AI research. How do you incorporate humor into such rigorous coverage?
    [10:22] How can you distinguish signal from noise in choosing a news source?
    [12:13] When and how to control your biases in your work when in the heat of the moment.
    [14:05] Laura’s interest in data science began as an undergraduate at MIT, surrounded by people who build.
    [16:10] Sociology in the context of people who build, since people are the *actual* most complicated systems.
    [18:00] What important things define a profession?
    [19:30] What’s the difference between ethics and morals?
    [22:04] How ethics affects the field of data science, specifically.
    [25:35] The data science ethicist as a person who is a creator, and not just there to put up stop signs.
    [31:40] How can companies strike a balance between hard stops in a product and more negotiated unique messaging for customers to address ethical employees?
    [38:53] How can smaller companies that can’t afford a Chief Ethics Officer monitor and address ethical issues?
    [48:30] Techniques that can be used by individuals and organizations to identify and address ethical issues in a company.
    [50:00] How data scientists can navigate non-black-and-white ethical issues in their own work.
    [55:15] Laura’s recommendations for ethics 101: Data and Society, AI Now Institute, and OpenAI.
    [1:00:00] Laura ends with a call to action to start conversations on ethics with your colleagues.

    If you enjoyed this episode of Data Journeys, the best way to support the show is by leaving a review on iTunes and sharing on social media using the hashtag #datajourneys.

    Laura’s Twitter: https://twitter.com/digitalflaneuse?lang=en

    #24: Brian McFee: Music and Data Science

    Nov 27, 2018 59:10


    Dr. Brian McFee develops machine learning tools to analyze multimedia data. This includes recommender systems, image and audio analysis, similarity learning, cross-modal feature integration, and automatic annotation. As of Fall 2014, he is a data science fellow at the Center for Data Science at New York University. Previously, he was a postdoctoral research scholar in the Center for Jazz Studies and LabROSA at Columbia University.

    My conversation with Brian today focused on his research in music informatics and its many facets and applications. He talks about some of the methods he used during his dissertation, and I ask him for insight on how to get a recommender system to recommend stuff that you actually like.

    Here are some of the highlights of the show:

    [3:17] What came first for Brian, the data science or the music?
    [5:19] Of all the things he could have chosen to study, why did Brian choose music?
    [7:35] What is it like to be in a branch of data science that has become so closely tied with industry and well understood by the public?
    [9:37] How has Brian’s work expanded his own taste in music, and given him an appreciation of jazz?
    [12:00] Brian gives a brief history of the field of music informatics.
    [14:48] Where was the field when Brian wrote his dissertation, “More like this: Machine learning approaches to music similarity”?
    [17:15] How have the characteristics and features for making predictions about music evolved since then?
    [21:06] Why does the concept of genre generally irritate Brian, and what is the “David Bowie problem”?
    [26:20] How do you address the problem of subjectivity in the field when conducting research?
    [31:21] Is there a dilemma in trying to take a subjective art like music and quantify it as a science?
    [35:24] How can a recommender system accurately predict what kind of music a listener is looking for?
    [38:43] What can you do to train your Spotify recommendations?
    [42:33] How do you make the career decision whether to stay in academia or go into industry?
    [46:31] What kind of problems is Brian currently interested in solving?
    [49:00] What major life lessons can be taken away from work in machine learning?
    [50:00] Rapid fire questions.

    Enjoy the show!

    Find more at www.ajgoldstein.com/podcast/ep24

    #23: Wes McKinney - The Creator of Pandas

    Nov 19, 2018 60:10


    Wes McKinney is the creator and "Benevolent Dictator for Life" (BDFL) of the open-source pandas package for data analysis in Python, and has also authored two editions of the reference book Python for Data Analysis. Wes is also one of the co-creators of the Apache Arrow project, which is currently his main focus. Most recently, he is the founder of Ursa Labs, a not-for-profit open source development group in partnership with RStudio.

    He describes himself as a problem-solver, and is particularly interested in improving the usability of data tools for programmers, accelerating data access and in-memory data processing performance, and improving data system interoperability.

    In my conversation with Wes today, we focused on getting to know him on a more personal level, discussing his background and interests to get some insight into the living legend of open source he has become.

    [3:48] How did coming from four generations of newspapermen impact Wes’s upbringing?
    [6:00] What kind of hobbies was he interested in growing up, and what is the origin of his interest in computers?
    [11:08] How did he come to run a Goldeneye 007 world record website, and update and maintain it by hand?
    [16:10] Wes’s high school career as a mathlete, and how an early interest in math contributed to his approach to programming.
    [18:15] How Wes brings the rigor he learned in mathematics to software engineering.
    [19:50] How languages and math scratch the same itch for composition.
    [21:00] About learning enough German to complete a PHP programming internship in Munich.
    [23:00] How Wes’s experience using data in his first year working post-undergrad set him down the path to Pandas.
    [25:00] What went into his decision to take leave from grad school to build Pandas?
    [27:00] The legendary tweet where Wes expressed his sense of purpose and motivation in building Pandas.
    [29:52] Why Wes’s work is motivated by the desire to free up people’s time to realize their full potential.
    [30:51] Zero to One - Peter Thiel
    [31:40] Why is solving basic efficiency problems, like reading CSV files, so important?
    [34:12] How community management has played such a huge role in making Pandas so successful compared to other tools.
    [39:00] The importance of seeing peers in an open source project as people with good intentions and more than just a GitHub profile.
    [46:00] How do the incentives of an open source project influence prioritization in a project?
    [51:45] How Wes’s newest project, Ursa Labs, is tackling the problem of funding in open source software development.
    [56:20] Wes’s goals for Ursa Labs over the next five years.

    AJ’s Twitter: https://twitter.com/ajgoldstein393
    Wes’s Twitter: https://twitter.com/wesmckinn
    Wes’s personal website: http://wesmckinney.com
    Wes’s LinkedIn: https://www.linkedin.com/in/wesmckinn/

    #22: Mike Tamir: Identifying Fake News with the Head of Data Science at Uber ATG

    Nov 13, 2018 55:19


    Mike Tamir is the Head of Data Science at Uber ATG. He is a leader in data science, specializing in deep learning and distributed scalable machine learning, and he’s also a faculty member at UC Berkeley.

    Mike has led several teams of Data Scientists in the San Francisco Bay Area as Chief Data Scientist for InterTrust and Formation, Director of Data Sciences for MetaScale, and Chief Science Officer for Galvanize, where he oversaw all data science product development. He also created an MS degree program in Data Science in partnership with UNH.

    Mike began his career in academia, serving as a mathematics teaching fellow for Columbia University and graduate student at the University of Pittsburgh. His early research focused on developing the epsilon-anchor methodology for resolving both an inconsistency he highlighted in the dynamics of Einstein’s general relativity theory and the convergence of “large N” Monte Carlo simulations in Statistical Mechanics’ universality models of criticality phenomena.

    The focus of today’s conversation was his fake news detection AI project, FakerFact.

    Show notes:

    0:00 First, a life update from AJ. Read about his new opportunity in Portland on his blog.
    5:28 What is the evolutionary explanation for why a human’s capacity for careful, rational thought often takes a back seat to emotion? Explained in a comic on the project website.
    6:17 Emotions often win over rational thought, and as a result it can be difficult to think clearly on issues we’re passionate about.
    7:05 Why people should be aware of their emotional biases, even though it’s not our fault that we have them.
    7:50 Why Facebook deleted over a billion fake accounts recently, and why fake accounts, clickbait, blatantly false content, and other forms of fake news are everywhere on social media.
    9:10 What mechanisms can we put in place to counterbalance the parts of our nature that compel us to create and engage with content on an emotional level?
    9:51 Since a majority of our information is second-hand, how do we distinguish what’s really true?
    11:44 How did Mike become motivated to pursue this problem, on top of his full-time job at Uber ATG?
    12:45 How can we tackle “fake news” without censorship?
    16:40 Post-Walter Cronkite era, how do we create a sense of credibility and neutrality in our information?
    21:00 Why would it be a mistake if the algorithm learned to only classify right or left wing content as fake news?
    22:19 The algorithm only looks at the title and words on a page, not the URL.
    23:15 How Walt (the FakerFact AI) classifies different types of content: satire, journalism, etc.
    26:46 How do you strike the balance of entertainment and informativeness in content?
    31:10 What features and characteristics define each category of content that Walt identifies?
    36:16 What is Walt’s ideal use case?
    36:55 You can use the FakerFact Chrome extension to view the “nutrition facts” of the page you’re reading.
    37:42 How does research on run-on sentences and other grammatical choices help Walt understand and score an article?
    40:34 What techniques were used to train the Walt AI?
    42:41 A discussion on the use of wisdom of the crowds in algorithms.
    45:30 What makes it difficult to use the wisdom of the crowds when answers are too closely correlated (because of political affiliations or the news cycle)?
    46:47 Visit Humanetech.com for tips on regulating your daily notifications and escaping the “24-hour news cycle” to prevent media from controlling your emotions.
    50:15 Rapid fire questions!
    52:27 Mike’s advice to his 20-year-old self.
    52:40 What was his best investment in himself?
    53:18 The Deep Learning Book as a starting point for basic literacy in data science.
    53:20 Mike, like lots of guests on this show, makes a distinction between things he believes but couldn’t prove right now, and believing things for no good reason.

    Show Notes: https://ajgoldstein.com/podcast/ep22
    AJ’s Twitter: https://twitter.com/ajgoldstein393/
    Mike’s LinkedIn: https://www.linkedin.com/in/miketamir/
    Mike’s Twitter: https://twitter.com/MikeTamir

    #21: Frank Diana - The Future of AI - Predicting, Preparing, Thriving in Our Changing Future

    Aug 27, 2018 67:00


    Frank Diana is a recognized futurist, thought leader and frequent keynote speaker. He has served in various executive roles throughout his career and has over 30 years of leadership experience. Currently at Tata Consultancy Services, he is focused on leadership dialog in the context of our emerging future and its implications for business, society, governments, economies, and our environment. He blends a futurist perspective with a pragmatic, actionable approach, leveraging horizon scanning and storytelling to see possible futures and drive foresight into leadership deliberation.

    His leadership experience, obtained through various executive roles, connects practical realities with the need to focus on an emerging future filled with complexity and change. A strong ability to connect dots enables him to identify future scenarios quickly and broadly, and to see implications years into the future.

    The conversation with Frank centered around his research, which focuses on scanning the horizon for possible futures. We address common concerns about robots taking over the job market and eventually the world. We seek to understand what’s really true, what’s fear mongering, and what individuals and businesses should do to prepare for a world of change.

    Some topics we covered include:

    • How Frank became so interested in educating the public as a futurist through his early career.
    • The two tipping points that have occurred thus far in humanity and changed what it means to be human, and the coming third tipping point.
    • Some of the common fears that people have about the implications of advanced AI and robotics for the future.
    • How the shift to an automated society might cause initial elimination of jobs, but ultimately will allow more time for pursuit of creative, entrepreneurial endeavors.
    • The characteristics needed to succeed in a world of change, and what you personally should do to prepare for it.

    Show Notes: https://ajgoldstein.com/podcast/ep21
    AJ’s Twitter: https://twitter.com/ajgoldstein393/
    Frank’s LinkedIn: https://www.linkedin.com/in/fdiana
    Frank’s website: frankdiana.net

    #20: Kyle Polich: Skepticism and Simplifying Complex Topics with the Host of the Data Skeptic Podcast

    Aug 20, 2018 66:40


    Kyle Polich is the co-host of the incredibly popular Data Skeptic podcast, which he has been churning out since 2014. He studied computer science and focused on artificial intelligence in grad school. His general interests range from areas like statistics, machine learning, data viz, and optimization to data provenance, data governance, econometrics, and metrology.

    The Data Skeptic Podcast features conversations on topics related to data science, statistics, machine learning, and artificial intelligence. The podcast has two episode formats. One is a short-form format where Kyle explains complex data science concepts in a way that non-data scientists can understand. In these episodes he’s joined by his co-host and wife, Linh da Tran. The second is a long-form interview format where Kyle interviews experts in various data science and skepticism related arenas about their work.

    In this episode of the Data Journeys Podcast, I pick Kyle’s brain for patterns noticed and lessons learned through interviewing and teaching his way through nearly 400 episodes of the Data Skeptic Podcast. Some of the topics covered include:

    • How the Data Skeptic podcast became the only podcast to be endorsed by the Pope.
    • Kyle’s approach to teaching complex subject matter for entry-level comprehension.
    • What patterns and lessons Kyle has taken from interviewing nearly 400 guests on his show over the last four years.
    • Advice for listeners who are considering starting their own podcasts, colored by lessons Kyle has learned in his tenure.
    • Kyle and I get a little meta in trading lessons, best practices, and common experiences learned from our time hosting podcasts.

    Enjoy the show!

    Show Notes: https://ajgoldstein.com/podcast/ep20
    AJ’s Twitter: https://twitter.com/ajgoldstein393/
    Kyle’s LinkedIn: https://www.linkedin.com/in/kyle-polich-5047193

    #19: Emily Glassberg Sands: Equalizing Access to Rewarding Careers as Head of Data Science at Coursera

    Aug 13, 2018 74:02


    Emily Glassberg Sands is the Head of Data Science at Coursera - the largest online learning platform for higher education, with 35M learners from around the world. Her team leads the quantitative measurement, experimentation, and inference that informs Coursera’s product and business direction.

    Emily received her Bachelor’s degree in Economics from Princeton University, and then moved on to complete her Ph.D. in Economics at Harvard University. At Harvard, her research focused on experimental and applied methods to better understand labor markets and consumer decision-making.

    An economist by training, Emily loves using data to build better, smarter products that have a positive impact on society. In this interview, we discuss the insights Emily has found in her work at Coursera and how they can be applied to give everyone in the world equal access to education. We covered topics like:

    • Emily’s journey to a career in science, and how she went from Montessori school, to the farms of Montana, to an Econ Ph.D. at Harvard.
    • How she uses analytics to identify ways to improve the learning experience and provide high quality education worldwide.
    • Emily’s Ph.D. research on why companies hire based on referrals and how those hires performed compared to non-referral hires.
    • What is analytical creativity, how can it be developed, and why is it important for a multi-disciplinary team?
    • What Emily and her team at Coursera have discovered about the learning process and the confidence gap among learners.
    • The behaviours of successful learners and how to change your habits.

    Enjoy the show!

    Show Notes: https://ajgoldstein.com/podcast/ep19
    AJ’s Twitter: https://twitter.com/ajgoldstein393/
    Emily’s LinkedIn: https://www.linkedin.com/in/egsands/
    Emily’s Twitter: https://twitter.com/emilygsands

    #18: Dan Hammer: Democratizing Environmental Data at the White House, NASA, National Geographic, and More

    Aug 6, 2018 61:04


    Dan Hammer is an environmental economist and winner of the 2017 Pritzker Prize for the Environment. Currently he serves as a National Geographic Fellow and the co-founder of Earthrise Media, and throughout 2016, he was the Senior Policy Advisor to the U.S. Chief Technology Officer, Megan Smith, as part of the Obama Administration.

    Before arriving at the White House, Dan was the Presidential Innovation Fellow who released the first API listing for NASA. Prior to NASA, he was the Chief Data Scientist at the World Resources Institute, where he helped re-launch Global Forest Watch, an open-source project to monitor deforestation.

    After graduating from Swarthmore College in 2007 with high honors in mathematics and economics, and before receiving his PhD in environmental economics from the University of California, Berkeley, Dan was a Thomas J. Watson Fellow and traveled to Polynesia to build and race outrigger canoes. Today, among many other amazing mentors, he continues to work with Steve McCormick (former CEO of The Nature Conservancy) on web service infrastructure for environmental information.

    Dan has worked in the academic, private, and public sectors and has stayed true to a single mission across 17+ positions over 10 years. Our conversation takes a while to get to the prize and the White House.

    Some topics we covered include:

    • How the strong sense of safety he experienced in his childhood has supported all the risk-taking he now takes on in his career.
    • The lasting impact that mentors like Megan Smith, Steve McCormick, David Wheeler, and Arvind Subramanian have had on his career.
    • Where he sees the job of a data scientist (who knows what) ending, and that of a subject matter expert (who knows why) beginning.
    • The most meaningful moments of his experience at the White House, from working with a brilliant mentor to being in the Situation Room during the Flint Water Crisis.
    • How teaching math to inmates at San Quentin State Prison for 2 years catalyzed his path to the World Resources Institute and NASA.
    • Why -- across 17+ positions over the past 10 years -- democratizing scientific data and making it more accessible to the public has been THE consistent focus of his work.

    Enjoy the show!

    Show Notes: https://ajgoldstein.com/podcast/ep18/
    Dan’s Website: https://www.danham.me/r/about.html
    AJ’s Twitter: https://twitter.com/ajgoldstein393/

    #17: Decoding Healthy Meals & Learning the World of Food with Sivan Aldor-Noiman and Erik Andrejko @ Wellio

    Jul 30, 2018 54:00


    In this episode of the Data Journeys podcast, we have not one, but two guests!

    Sivan Aldor-Noiman and Erik Andrejko join me from Wellio: an intelligent platform that uses machine learning and behavioral science to help people plan, shop, prepare and enjoy healthy meals at home.

    Wellio is on a mission to decode how meals are prepared and enjoyed at home, both on an individual level in terms of people’s preferences and on a global level in terms of semantic & nutritional understanding of food.

    Quite interestingly, Sivan began her career in the Israeli army, serving as an instructor for an anti-tank missile unit. Afterwards, she transitioned to school and received her undergraduate degree in Industrial Engineering and a Master’s in Statistics from the Technion Israel Institute of Technology. At 26 years old she moved to the U.S. to complete a PhD in Statistics at The Wharton Business School at the University of Pennsylvania. Since then, she’s spent the majority of her career at The Climate Corporation, which -- despite what the name may suggest -- specializes in using data science to optimize farming practices.

    Erik graduated Summa Cum Laude in 2001 from Arizona State University’s undergraduate Computer Science program before receiving his PhD in Pure Mathematics from the University of Wisconsin-Madison in 2009. He and Sivan first started working together at The Climate Corporation, where he was the VP of Science and Director of Data Science, leading the implementation of large scale statistical machine learning within numerous domains, including climatology, agronomic modeling and geo-spatial applications.

    Today, Erik is the Co-Founder and Chief Technology Officer (CTO) at Wellio, while Sivan is the Head of Data Science and self-proclaimed Chief Fun Officer.

    In this conversation, we dive deep into Wellio’s artificial intelligence for eating, covering topics like:

    • What it means to be a passionate disagreeable giver and how this fits into Sivan being Wellio’s Chief Fun Officer
    • The problem that Wellio is solving and why it’s personally meaningful to Sivan and Erik
    • The approach Wellio takes toward building recommendation systems to understand home chefs and using computer vision techniques to semantically learn food
    • The most (and least) nutritious meals for your dollar, ranked by Wellio’s nutritional density model
    • Why Erik sees trust as the necessary gateway when transitioning from descriptive recommendations to prescriptive recommendations
    • How they learned at The Climate Corporation to help people who know nothing about data science (farmers) understand the power of it for their business

    Enjoy the show!

    Show Notes: https://ajgoldstein.com/podcast/ep17/
    Sivan’s LinkedIn: https://www.linkedin.com/in/sivan-aldor-noiman-297aa421/
    Erik’s LinkedIn: https://www.linkedin.com/in/erik-andrejko-0900a51b/
    AJ’s Twitter: https://twitter.com/ajgoldstein393/

    #16: Jim Guszcza: Chief Data Scientist at Deloitte Consulting

    Jul 23, 2018 57:08


    Jim Guszcza is the US Chief Data Scientist at Deloitte Consulting. Deloitte is the largest professional services network in the world in terms of revenue, with over 263,900 professionals globally.

    Jim has been with Deloitte since 2001, where he took the lead on using behavioural nudge tactics to effectively act on model indications and prompt behavior change. Since then, he has gained extensive experience in applying predictive analytics to a variety of public and private sector domains.

    In addition to his work with Deloitte, Jim is also an author and former professor at the University of Wisconsin-Madison business school. He received his PhD in Philosophy from the University of Chicago.

    In this interview, we explore how philosophy and nudge theory can be applied to change human behavior using data science. Some highlights include:

    • How growing up a sci-fi fan led Jim to pursue science and philosophy as a career.
    • The importance of a philosophy education and how it can train you to analyze issues you encounter in your career.
    • How philosophy can be used to humanize algorithms and aid in complex decision making processes by taking away individual bias.
    • How data science can uncover and expand ‘nudge theory’ to encourage certain human behaviors.
    • Where are the boundaries of using information uncovered by data science? When does it cross the line from actionable to information overload?

    Enjoy the show!

    Show Notes: https://ajgoldstein.com/podcast/ep16/
    Jim’s LinkedIn: https://www.linkedin.com/in/jim-guszcza-5330375/
    AJ’s Twitter: https://twitter.com/ajgoldstein393/

    #15: Travis Oliphant: Creating, Evolving, & Funding Open-Source Software

    Jul 16, 2018 65:32


    Travis Oliphant is the Founder & CEO of Quansight: a company that bridges open-source communities and innovative companies by growing talent, building technology, and discovering new products.

    For years Travis has been an indispensable contributor to data science’s open-source movement through many different outlets:

    • Founder, Director, & Former CEO @ Anaconda, Inc: a free and open source distribution of over 250 popular data science packages for Python and R, used by over 6 million users.
    • Founder, Chairman of the Board @ NumFOCUS Foundation: the world-renowned open-source community promoting open code development and reproducible scientific research.
    • President @ Enthought: a software company best known for the early development and maintenance of the SciPy stack.
    • Creator of NumPy, SciPy, Numba, & XND: all invaluable open-source Python libraries.

    Before founding Continuum Analytics (later renamed to Anaconda) in 2012, Travis received a Ph.D. from the Mayo Clinic and B.S. and M.S. degrees in Mathematics and Electrical Engineering from Brigham Young University, and spent nearly a decade thereafter as an Assistant Professor of Electrical and Computer Engineering at BYU before phasing out of academia to focus on creating open-source software to support industry.

    Naturally, our conversation focused on his work with creating, evolving, and funding open-source software, covering topics like:

    • How growing up in a bad neighborhood and being raised Mormon led him to an early fascination with computers and a laser-focus on helping people
    • What led him to take a leap from the stability of a 10-year academic career to trying to support a family while working on free software
    • Why open-source software has been such a centerpiece throughout his career, and the inflection points he’s experienced over the past 20 years
    • Why community-based, open-source models have been severely underfunded, and how Travis has managed to “monetize open source” with his newest company, Quansight
    • The open-source community’s recent shift “away from ‘local co-op’ and toward ‘big agriculture’”, and the challenges Travis is seeing pop up
    • The story behind how he first wrote the NumPy library (all by himself!) over 4-5 months of 60-70 hour workweeks before others started to see any potential
    • All of the amazing open-source projects he and his team need support with from brilliant data scientists like you :)

    Enjoy the show!

    Show Notes: https://ajgoldstein.com/podcast/ep15/
    Travis’ LinkedIn: https://www.linkedin.com/in/teoliphant/
    Travis’ Twitter: https://twitter.com/teoliphant
    AJ’s Twitter: https://twitter.com/ajgoldstein393/

    #14: Drew Conway – Applying Data Science to Where It’s Needed Most

    Jul 9, 2018 71:56


    Drew Conway is a world-renowned data scientist, entrepreneur, author, and speaker, perhaps best known for his infamous 2010 “Data Science Venn Diagram”.

    Today, Drew is the Founder & CEO of Alluvium: a company using machine learning and AI to turn massive streams of data produced by industrial operations companies into insights that bridge the gap between big data and human expertise. Designed with the goal of helping industrial operations become safer, more efficient and more profitable, the Alluvium platform makes industrial machine data meaningful and useful to the people who rely on it to make decisions that affect the stability of their operation.

    Before starting Alluvium in 2015, Drew helped start:

    • Data Gotham: an organization focused on supporting the NYC data community, with an annual conference bringing together people from all industries.
    • DataKind: a non-profit that brings high-impact organizations together with leading data scientists to use data science in the service of humanity. They enable data scientists and social changemakers to address tough humanitarian challenges together, ranging from education to poverty, health to human rights, and the environment to cities.

    After starting the conversation by exploring Drew’s early years, we focused most of the dialogue on his (quite frankly, brilliant) thought process for identifying the highest-impact, most-needed applications for data science across problem spaces.

    Some of my favorite talking points included:

    • Why “force of will” and a “tendency toward combativeness” were key to Drew’s early development and overcoming imposter syndrome
    • Lessons learned from his 4th grade teacher who told him he was bad at math and an AAU basketball coach who made his team find their way home from the outskirts of Las Vegas
    • The questions Drew asks executives who tell him they want to hire a data science team, and how he recommends they avoid being “seduced by the industry” and “return back to first principles”
    • Drew’s process for determining new applications for data science within various industries
    • The three-question mental model Drew used to identify Alluvium’s first major product offering: business problem → data available → human support
    • Alluvium’s team-building and hiring philosophy, and how it’s evolved from day one until today
    • The story behind DataKind, how he and his team decided which nonprofits to start working with, and the step-by-step process they took to test their assumptions

    Enjoy the show!

    Show Notes: https://ajgoldstein.com/podcast/ep14/
    Drew’s LinkedIn: https://www.linkedin.com/in/drew-conway-13b5b013/
    AJ’s Twitter: https://twitter.com/ajgoldstein393/

    #13: Anthony Scriffignano: Developing Cultural Awareness & Global Perspective Through International Data Science

    Jul 2, 2018 51:59


    Anthony Scriffignano is the Chief Data Scientist at Dun & Bradstreet: a publicly traded company that provides commercial data, analytics and insights about businesses through their database of more than 290 million business records worldwide. Most commonly known as D&B, the company was originally founded 176 years ago in the horse-and-buggy days of 1841, and today, is headquartered in Short Hills, NJ, with over $2.5 billion in assets. Anthony has been at D&B since 2002 and, with over 35 years of experience in information technologies, Big-4 management consulting, and international business, is well-regarded as an international thought-leader in data science. Today he leads a team of data scientists focused on advancing Dun & Bradstreet's core capabilities and IP globally. With extensive background in advanced algorithms and linguistics, he holds multiple patents and presents globally on data and technology trends, multilingual challenges in business identity, and artificial intelligence. In this conversation, we spoke quite a bit about the international lens through which Anthony views his work, covering topics like: How foreign language, cultural awareness, and learning to deeply understand people with very different backgrounds than his own has affected his work as a data scientist Anthony’s own definition of ‘trust’, how he’s seen it vary across cultures, and how this played out in “acceptance testing” at Dun & Bradstreet Why “technical expertise is necessary but not sufficient” and how his global perspective (he knows 7 languages & does quite a bit of travel!) 
has supplemented day-to-day work The bedrock foundations on which he sees the becoming of a data scientist today: People // Process // Technology // Mindset How he’s managed to stay focused on the ‘critical few’, rather than the ‘trivial many’ throughout his career, especially in an increasingly distracted world Why studying martial arts is the best investment he’s ever made in himself   Enjoy the show!   Show Notes: https://ajgoldstein.com/podcast/ep13/ Anthony’s LinkedIn: https://www.linkedin.com/in/anthony-scriffignano-ph-d-9165845/ AJ’s Twitter: https://twitter.com/ajgoldstein393/

    #12: Favio Vázquez - Data Science Polymath

    Play Episode Listen Later Jun 25, 2018 89:42


    Favio Vazquez is the Principal Data Scientist at OXXO: Mexico’s largest convenience store chain with over 17,500 locations. In addition to his work at OXXO, Favio brings new meaning to the term “polymath” by simultaneously holding 5 other related positions at AI companies in Mexico: Senior Data Scientist @ Raken Data Group Chief Data Scientist @ Iron AI Creator of the Ciencias y Datos online course Data Science Lecturer @ Afi Escuela de Finanzas Collaborator @ CosmoSIS With such a wide-ranging background, we covered a lot in this conversation, including: Favio’s experience growing up in Venezuela, the childhood influencers that played a core part in shaping him into who he is today Tricks like “frontloading” and “batching” that Favio uses to juggle many projects at once How individuals looking to get hired as a company’s first data scientist should explain the value that data science can provide, what language to use and terminology to avoid His approach for determining which questions would yield the most valuable return in applying data science to OXXO’s business The greatest similarities and differences between how data science is practiced in Mexico versus the United States When and how beginners should invest in learning advanced topics like big data processing (e.g. Apache Spark) and deep learning (e.g. neural networks) Enjoy the show!   Show Notes: https://ajgoldstein.com/podcast/ep11/ Favio’s LinkedIn: https://www.linkedin.com/in/faviovazquez/ AJ’s Twitter: https://twitter.com/ajgoldstein393/

    #11: Dan Wulin: International E-Commerce, Price Optimization, & Home-Good Product Recommendations

    Play Episode Listen Later Jun 18, 2018 65:06


    Dan Wulin is the Director of Data Science at Wayfair: an international e-commerce company specializing in home goods. Wayfair is a $5B company growing 40% year-over-year, with 10 million products and over 8,700 employees around the world. Their data science team is 80 people strong and growing fast, using econometrics to optimize prices, biostatistics to boost marketing, and computer vision to personalize product recommendations. Prior to joining Wayfair, Dan studied Math & Physics as an undergraduate at Columbia University and, thereafter, received his PhD in Theoretical Physics from the University of Chicago. Coming out of school, he worked as a consultant at Boston Consulting Group for a year in Chicago before transitioning to Wayfair in Boston. In this conversation, we cover a wide range of topics, including: His childhood obsession with text-based multiplayer RPGs like Gemstone Dan’s roots in academia, how studying the physics of superconductors taught him (painfully) how to break down complex problems in simple ways How a misalignment of his problem solving approach with standard consulting frameworks at BCG led him to Wayfair Dan’s first major effort at Wayfair, project Athena, that saved the company millions of dollars on Google ad spending His passion for setting up Wayfair’s data science team to be successful, how teams within the organization partner, as well as who and how they hire The computer-vision-driven approach Wayfair takes toward measuring people’s creative taste on subjective things like furniture preference Enjoy the show! Show Notes: https://ajgoldstein.com/podcast/ep11/ Dan’s LinkedIn: https://www.linkedin.com/in/dan-wulin-02169a2a/ AJ’s Twitter: https://twitter.com/ajgoldstein393/

    #10: Jure Leskovec - Chief Scientist of Pinterest

    Play Episode Listen Later Jun 11, 2018 60:31


    Jure Leskovec is the Chief Scientist of Pinterest, an $11 billion company hosting over 75 billion idea “pins” from its 175 million monthly users worldwide. Jure originally arrived at Pinterest in 2014 when his company, Kosei, was acquired after starting a “recommendation revolution” through smarter, personalized mobile ads.  When Jure is not “turning cameras into keyboards” at Pinterest -- Fast Company’s “2nd most innovative AI company” -- he can also be found fulfilling his responsibilities as: Associate Professor of Computer Science at Stanford University - where his research focuses on mining and modeling large social and information networks, including relationship graphs and chain effects in online community settings Investigator at the Chan-Zuckerberg Biohub - a multidisciplinary research organization on a mission to make all diseases preventable, manageable or curable by the year 2100 Some favorite topics we covered include: How being Pinterest’s Chief Scientist has affected his own social media use The story behind how he went from party conversation about Kosei to company acquisition by Pinterest in just 2 months How Pinterest makes recommendations to its users and thinks about the explore vs. exploit tradeoff of social media ads Moral considerations of echo chambers/filter bubbles/confirmation bias that he and his team take into account when serving Pinterest’s 200+ million users content each day Balancing short and long-term benefits at the individual (value to user), community (health of the content ecosystem), & company (ad revenue) level Recent research from Harvard, Stanford, UChicago, & Cornell investigating how machine learning can help criminal court judges make high-stakes decisions This turned out to be one of my most fascinating conversations yet, so I hope you enjoy it as much as I did!
Show Notes: https://ajgoldstein.com/podcast/ep10/ Jure’s Twitter: https://twitter.com/jure?lang=en AJ’s Twitter: https://twitter.com/ajgoldstein393/

    #9: Daniela Huppenkothen – Astronomy, Cosmology, & The Study of Space

    Play Episode Listen Later Jun 4, 2018 83:32


    Daniela Huppenkothen is the Associate Director at the DIRAC Institute of the University of Washington, home to a diverse team of researchers in astrophysics and cosmology.  Before arriving at DIRAC, she studied space & data science as a: Moore-Sloan Data Science Fellow of NYU - a multi-dimensional and independent research program with impact in several scientific domains PhD recipient at University of Amsterdam - where she researched high & low energy astrophysics and the relationship between the two communities, namely “magnetar bursts”. Blogger of Hackathons, Data Science, and Academia - Daniela takes on a number of issues and shares her wise life lessons along the way.   In this conversation, we cover a wide range of topics, including: The all-too-common (but remarkably important) mix-up between astrology and astronomy How data science has a place in astronomy, with telescopes more powerful than several hundred iPhones Frustrations with academia, a lack of validity when it comes to standardized testing, and how one learns to love getting harsh criticism How astronomy is an exciting opportunity to build teams who collaborate in unexpected ways, and tie together multiple backgrounds How Daniela’s academic outlook developed from open source - without necessarily realizing it in the moment Enjoy the show!   Show Notes: https://ajgoldstein.com/podcast/ep9/ Daniela’s Website: http://huppenkothen.org/ Daniela’s LinkedIn: https://www.linkedin.com/in/daniela-huppenkothen-0b44a97b/ AJ’s Twitter: https://twitter.com/ajgoldstein393/

    #8: Chris Albon – Connecting Africa to the Internet & Defending Public Discourse

    Play Episode Listen Later May 14, 2018 87:17


    Chris Albon is the Chief Data Scientist of BRCK: a Kenyan startup building a network dedicated to connecting Africa to the internet, and author of the Machine Learning with Python Cookbook. For years he’s been contributing to the data science world through so many different outlets:   Cofounder of New Knowledge AI - a social media platform focused on protecting companies from disinformation, fighting fake news, and defending public discourse Former host of Partially Derivative - a popular podcast mixing explorations into data science techniques with discussions with the field’s leading experts. Content creator of Machine Learning Flashcards  - simplified, easy to digest flashcards for otherwise-complex machine learning concepts.   Blog writer at chrisalbon.com - providing some of the best (and definitely most wide-ranging) technical notes out there on machine learning, statistics, deep learning, Python, and so much more. Our conversation went many places, including: How early childhood experiences (including his heritage in Zimbabwe) led him to focus his career so strictly on social impact, through political and humanitarian efforts How he’s established himself (and his blog) as an authority figure in data science & AI (with 523 resource links and counting) without having a “technical” degree The step-by-step process BRCK takes when incorporating technology into African communities & what he’s learned about challenging his own assumptions while doing so The initial inspiration behind starting the Partially Derivative podcast and why his aim was for it to be the “talk at the bar after the conference” What his newest book -- Machine Learning with Python Cookbook -- is all about, who it’s for, and the gap it addresses for the community Enjoy the show!   Show Notes: https://ajgoldstein.com/podcast/ep8/ Chris’ Twitter: https://twitter.com/chrisalbon AJ’s Twitter: https://twitter.com/ajgoldstein393/

    #7: Fernando Perez — Creating IPython, Founding NumFOCUS, & The Stories Behind It All

    Play Episode Listen Later May 7, 2018 52:22


    Fernando Perez is best known as the creator of IPython and co-founder of Project Jupyter: a set of open-source data science tools that some may consider to be the equivalent of the bat & ball to the sport of baseball. Today, you really can’t play the game of data science without Jupyter Notebooks and our guest today is one of Jupyter's leads and originators (see here for the rest of the amazing team). Fernando is also an Assistant Professor in Statistics at UC Berkeley, Researcher at the Berkeley Institute for Data Science, and Founding Board Member of the NumFOCUS foundation — the community that creates the SciPy stack, along with virtually every other notable open source data science tool out there. This conversation was recorded in-person with Fernando in his office on UC Berkeley’s campus, and it turned out to be the most humanizing, energizing, and down-to-earth interview I’ve had so far. Some of the many topics we covered include: what Fernando wanted to be while growing up in Medellin (Me-de-jean), Colombia the function that formal education played in his learning of data science the story behind IPython and Project Jupyter and its evolution over the past 10 years lessons learned about technical competence and human character from his mentors over the years what a “computational narrative” means to him and why its principles are key to data storytelling Fernando’s experience teaching a 650-student course (part of a pair of courses that are the largest of their kind) as part of the Berkeley Institute for Data Science Enjoy the show! Show Notes: https://ajgoldstein.com/podcast/ep7/  Fernando’s Twitter: https://twitter.com/fperez_org AJ’s Twitter: www.twitter.com/ajgoldstein393/

    #6: Andrew Ng — Globalizing Education, Disrupting Industries, & Generalizing Artificial Intelligence

    Play Episode Listen Later Apr 11, 2018 44:41


    Andrew Ng is an Adjunct Professor at Stanford University and nothing short of a giant in the data science, machine learning, and artificial intelligence world. For the past decade he’s been shaping the way we live and learn. Four recent examples include: Co-Founder of Coursera: an education platform that offers online courses from top universities across the world Chairman of the Board for Woebot: a chatbot that’s currently revolutionizing mental health care Creator of DeepLearning.AI: a series of specialization courses created to help beginners break into the field of AI Former Chief Scientist at Baidu AI Group: basically the Google of China This conversation was recorded in-person with Andrew in his office on Stanford’s campus in Palo Alto, California. We covered a ton of different topics, including: the goals of Andrew's new $175M AI Fund how he plans to revolutionize manufacturing strategies for ML practitioners to tighten their feedback loop what differentiates the best businesses like Amazon, Facebook, & Google from the rest in terms of their use of AI the lack of progress he's seeing toward artificial general intelligence and the so-called "technology singularity" why going to college for 4 years and coasting for 40 years makes no sense in today's rapidly changing world Enjoy the show!   Show Notes: https://ajgoldstein.com/podcast/ep6/  Andrew’s Twitter: https://twitter.com/AndrewYNg AJ’s Twitter: www.twitter.com/ajgoldstein393/

    #5: Eric Weber — Lifelong Learning & The Art of Staying Curious

    Play Episode Listen Later Mar 30, 2018 73:37


    Eric Weber is a Senior Data Scientist at LinkedIn, online educator, and epitome of a lifelong learner who’s created quite the following for himself through posts on LinkedIn’s platform. He holds a Bachelor’s degree in Mathematics from the University of Wisconsin, two Master’s degrees in Math and Business Analytics from Arizona State University and the University of Minnesota, and a PhD from ASU. And as if that’s all not enough, he’s obtained 31 course certificates in MOOCs from Coursera and DataCamp… nearly all of which are in the realm of Data Science. Naturally, this conversation was primarily focused on Data Science education through the lens of Eric’s experience as both a teacher & student. Some topics we covered include… Eric’s childhood obsessions, what he wanted to be growing up what attracted Eric to LinkedIn and how he landed the job the skills & aptitudes best suited for in-class vs. online learning strategies for approaching online & self-directed learning the best and worst advice Eric has heard or received ethical implications of product promotion the danger of “content farms” in maintaining authentic perspectives Enjoy the show!   Show Notes: https://ajgoldstein.com/podcast/ep5/ Eric’s LinkedIn: https://www.linkedin.com/in/eric-weber-060397b7/ AJ’s Twitter: www.twitter.com/ajgoldstein393/

    #4: Kirk Borne — NASA, Astrophysics, & The Evolution of Modern-Day Data Science

    Play Episode Listen Later Mar 30, 2018 77:19


    Kirk Borne is the Principal Data Scientist at Booz Allen Hamilton, PhD Astrophysicist, and world-renowned Big Data Influencer. In 2007 he launched the first-ever data science bachelor’s degree program as a Professor of Computational Science at George Mason University and in 2009 co-created the field of Astroinformatics to support his research. Since Kirk has been around long before the field was even called Data Science, we start this conversation by taking a 50-foot helicopter view of the field: where we’ve been over the past decade, where we are now, and how aspiring Data Scientists can prepare for what’s coming next. Other topics covered include… what it means to be “data literate” how organizations can promote a “culture of experimentation” the data science skills, talents, & aptitudes that have stood the test of time Kirk’s thought process behind creating the first data science bachelor’s degree program at GMU changes he would make to the program if he re-created it today …. and perhaps most notably: how Kirk was asked to brief the President of the United States on Data Mining after the terrorist attacks of 9/11 Enjoy!   Show Notes: https://ajgoldstein.com/podcast/ep4/ Kirk’s Twitter: https://twitter.com/KirkDBorne AJ’s Twitter: www.twitter.com/ajgoldstein393/

    #3: Kale Temple — Artificial Intelligence & Data Science Consulting

    Play Episode Listen Later Mar 13, 2018 112:13


    Kale Temple is a Data Scientist, Startup Advisor, & Guest Speaker at the University of Sydney, down-under in Australia. Most recently, he became the Founder & CEO of Intellify — an Artificial Intelligence consulting company based in Australia, where he and his team help companies grow exponentially, operate efficiently, and all-in-all become more data-driven in their decision making. In this conversation we spend a lot of time focusing on Data Science through the lens of consulting, covering topics like: how Kale has been able to master Machine Learning without a degree in Computer Science or Statistics the pros/cons of different data science career paths Kale’s strategies for landing consulting clients and building relationships that last his thought process behind starting companies how he packages up his technical skills into consulting projects tactics for identifying business pain-points where data science can have the biggest impact how to build a personal brand for yourself as a Data Scientist and maximize the business value you’re able to bring … and so so much more. Enjoy! Show Notes: www.ajgoldstein.com/podcast/ep3/ Kale's LinkedIn: https://www.linkedin.com/in/kaletemple/  AJ's Twitter: www.twitter.com/ajgoldstein393/ 

    #2: Kate Strachnyi — A Meta-Interview of 20 Amazing Data Scientists

    Play Episode Listen Later Mar 13, 2018 77:17


    In this episode of the Data Journeys podcast, I'm joined by Kate Strachnyi. Kate is a data visualization specialist at Deloitte consulting, author of the book “Journey to Data Scientist”, and creator of the Story by Data blog and YouTube channel. This conversation with Kate was a really special treat because her book, “Journey to Data Scientist”, is essentially a compilation of interviews that Kate herself conducted with 20 amazing Data Scientists. When she wanted to learn more about Data Science, as she says, she went “straight to the source”. So what we really have here is a meta-interview with Kate serving as the umbrella over 20 other people & stories — with backgrounds ranging from LinkedIn and Pinterest to Bloomberg and IBM. The focus of the conversation lies on the overarching themes and patterns she found throughout her own 20 interviews. In terms of topics — we covered so much — but some of my personal favorites included advice for building a project portfolio, use-cases for Data Science across industries & organization-size, the most overrated and underrated aspects of doing Data Science, how to properly evaluate job opportunities, and strategies for finding an environment where you can thrive. We also spend some time speaking about her extensive experience with the art of data storytelling, which she breaks down for beginners very nicely. Enjoy! Show Notes: www.ajgoldstein.com/podcast/ep2 Kate's LinkedIn: https://www.linkedin.com/in/kate-strachnyi-data/ AJ's Twitter: https://twitter.com/ajgoldstein393

    #1: Andrew Cassidy — Predicting Bombs, Learning Online, & Working Anywhere

    Play Episode Listen Later Mar 13, 2018 83:40


    In this pilot episode of the Data Journeys podcast, I'm joined by Andrew Cassidy. Andrew is a freelance Data Scientist, Software Engineer, & Online Educator. We originally met this past September, when he was my virtual mentor for Springboard’s Data Science Intensive bootcamp. In 2011 Andrew graduated from the University of Virginia with a bachelor’s degree in Systems Engineering, and afterwards, went to work as a Data Scientist for a company called Commonwealth Computer Research in Charlottesville, VA. While there, he was heavily involved in contracting work with the US military, and for one project used logistic regression to predict the likelihood of bombs in Afghanistan (which he talks about in detail in the interview here). After 4 years at Commonwealth, he decided to go back to school for a Master’s degree in Computer Science at Georgia Tech (which he interestingly enough completed almost entirely online while rock climbing and camping as a “digital nomad” throughout the United States). Finally, just this past summer after graduating from Georgia Tech, he decided to shift his attention to the freelance world, while also serving as an educator for online learning platforms like Springboard. It’s amazing all that Andrew has done at just 28 years old, and we cover so much of it in this conversation… from explosive battlegrounds in Afghanistan to the sunny mountains of Boulder, Colorado, we touch on topics like Andrew’s childhood, his principles for effectively learning Data Science, how he lands clients, the best/worst advice out there for up-and-comers, and even his strategies for dealing with overwhelm, lost focus, & the all-too-common imposter syndrome. We took a little while to get warmed up, so please be patient, but if you want to go straight to his stories about Afghanistan, feel free to skip to the 50:00 mark. Enjoy!
Show Notes: www.ajgoldstein.com/podcast/ep1 Andrew's LinkedIn: https://www.linkedin.com/in/andrew-cassidy-62b726b4/  AJ's Twitter: https://twitter.com/ajgoldstein393 

    #0: Welcome

    Play Episode Listen Later Mar 13, 2018 6:18


    Some background about the host, why he started this podcast, and what you can expect from the show.
