Podcast appearances and mentions of Matthew Rocklin

  • 12 PODCASTS
  • 15 EPISODES
  • 53m AVG DURATION
  • INFREQUENT EPISODES
  • Dec 14, 2022 LATEST

POPULARITY (chart omitted; x-axis: 2017 to 2024)



Latest podcast episodes about Matthew Rocklin

Open||Source||Data
Enabling Edge Workers, AI & ML, and The Future of Data Science with Matthew Rocklin

Dec 14, 2022 (44:14)


This episode features an interview with Matthew Rocklin, CEO of Coiled, the scalable Dask-based cloud platform. Prior to founding Coiled, Matthew worked on Dask at Anaconda and then NVIDIA, where his teams focused on accelerating Dask through parallel computing and GPUs. Matthew is an industry speaker, author, and founding member of Pangeo, whose mission is to develop open source analysis tools for ocean, atmosphere, and climate science.

In this episode, Sam sits down with Matthew to discuss enabling edge workers, the future of data science, and the revolution of AI and ML.

“There's all sorts of fun people using these tools and that's the most fun part of this job. You get to learn so much about so many different applications that are all so different and all so fascinating. You were thinking about all these different tools and technologies and I was talking to someone once, it's like, ‘Oh, it's like you're standing on the shoulders of giants.’ That's not quite right. There's lots of sort of normal size people all standing on each other's shoulders in like a massive pyramid. [...] Dask was designed to scale up an existing ecosystem. There's a legacy Python ecosystem that'll provide a layer of parallel computing on top of it. You can do that either by rewriting the whole thing, which is not feasible, or you can do it by talking to lots of people and getting them to integrate in interesting, fun ways. That's actually been the fun parts of Dask. I think I've probably talked to every major maintainer group ever. I have worked with them to find out the ways to get everything to work smoothly together. And that's super fun. There's an interesting sort of technical and social hacking that occurs, which I think Python has done pretty well at, historically. Which is why it has success.” – Matthew Rocklin

Episode Timestamps:
  • (00:58): What open source data means to Matthew
  • (03:29): Matthew's motivations behind Python
  • (18:58): How Matthew is enabling edge workers
  • (34:46): What the future of the Python data space looks like
  • (39:29): Matthew's advice for the technical data audience

Links:
  • LinkedIn - Connect with Matthew
  • Twitter - Follow Matthew
  • Visit Matthew's Website
  • Visit Dask
  • Dask Examples
  • Visit Coiled
  • SciPy Mission

Open Source Startup Podcast
E39: Coiled & Open Source Dask - Use Python for Ambitious Problems

Jun 21, 2022 (33:10)


Matthew Rocklin is Founder & CEO of Coiled, a company that sits on top of open-source Dask, which makes Python highly scalable for data scientists. Coiled makes Dask enterprise-ready, giving users faster cluster startup times, savings on cloud costs, and faster Python workloads. Coiled has raised $26M from investors including Bessemer and Costanoa. In this episode, we discuss the creation of Dask, the decision to start a company around it, the challenges that come with company building, and much more!

The Tech Blog Writer Podcast
2006: Coiled | Python for Data Science on the Cloud with Dask

Jun 13, 2022 (28:38)


Matthew Rocklin is the creator of Dask, the popular Python-native distributed computing library. Matthew is also a founder and CEO of Coiled, which helps teams run Dask in the cloud. The past decade has seen a boom in open source tools built specifically for data scientists. However, most of these projects struggle to gain wide adoption. Matthew is one of the few leaders in the developer tools space who knows how to build things that data scientists love. Matthew believes that for a data science tool to be successful, it must be human-centric. Data science is a diverse field. It should be easy for any data scientist to pick up a tool and quickly adapt it to their unique use case. Coiled is built on the success of Dask and is being used by thousands of data scientists and engineers in organizations like NASA, Capital One, Anthem Health, and the USAF to combat climate change, perform credit risk analysis, and manage supply chains. Matthew shares his entrepreneurial path and offers tips on building a human-centric developer tool that thousands of developers will love. We dare to explore the future of distributed computing.

Talk Python To Me - Python conversations for passionate developers
#300 Building a data science startup (panel)

Jan 22, 2021 (66:22)


You've heard that software developers and startups go hand-in-hand. But what about data scientists? Of course they do! But how do you turn your data science skill set into a data science business skill set? What are some of the areas ripe for launching such a business into? On this episode, I welcome back 4 prior guests who have all walked their own version of this path and are currently running successful Python-based Data Science startups:
  • Ines Montani from Explosion AI
  • Matthew Rocklin from Coiled
  • Jonathon Morgan from Yonder AI
  • William Stein from CoCalc

Links from the show

Ines Montani
  • Twitter: @_inesmontani
  • Explosion AI: explosion.ai

Matthew Rocklin
  • Twitter: @mrocklin
  • Coiled: coiled.io
  • Jobs @ Coiled: jobs.lever.co/coiled

Jonathon Morgan
  • Twitter: @jonathonmorgan
  • Yonder AI: yonder-ai.com

William Stein
  • Twitter: @wstein389
  • CoCalc: cocalc.com

Other
  • Talk Python Live Streams: talkpython.fm/youtube
  • Sentry Promo Code: TALKPYTHON2021

Sponsors
  • Sentry Error Monitoring, Code TALKPYTHON
  • Linode
  • Talk Python Training

AI Podcast in 26.1 Minutes
[1/2] Built from Open Source Software: Coiled Team Visits 26.1 AI Podcast

Dec 1, 2020 (26:21)


[Part 1 of 2] Listeners join in for a wonderful conversation in this episode. Our guests Matthew Rocklin and Hugo Bowne-Anderson are extending access to powerful distributed computing for more data users with their startup Coiled (https://coiled.io/). Data scientists with a two-minute download of Coiled’s software (https://cloud.coiled.io/) can scale their work to the cloud. During the episode we discuss how conversations with the open source community resemble the early customer conversations commonly used by entrepreneurs in a lean startup framework. Dask’s creator and Coiled founder Matthew describes his software design approach, which has a decidedly minimalist bent. A great benefit of Matt’s approach for users of popular Python libraries is a familiar interface when using Dask or Coiled to extend the power of popular PyData stack tools. Our conversation turns to how Coiled can extend more computation power to the many casual users of Python who are interested in solving data problems pragmatically without rebuilding a data factory every time. [Join us next week for Part 2]

MLOps.community
Scalable Python for Everyone, Everywhere // Matthew Rocklin // MLOps Meetup #37

Oct 19, 2020 (57:10)


Parallel Computing with Dask and Coiled

Python makes data science and machine learning accessible to millions of people around the world. However, historically Python hasn't handled parallel computing well, which leads to issues as researchers try to tackle problems on increasingly large datasets. Dask is an open source Python library that enables the existing Python data science stack (Numpy, Pandas, Scikit-Learn, Jupyter, ...) with parallel and distributed computing. Today Dask has been broadly adopted by most major Python libraries, and is maintained by a robust open source community across the world.

This talk discusses parallel computing generally, Dask's approach to parallelizing an existing ecosystem of software, and some of the challenges we've seen in deploying distributed systems. Finally, we also address the challenges of robustly deploying distributed systems, which ends up being one of the main accessibility challenges for users today. We hope that by the end of the meetup attendees will better understand parallel computing, have built intuition around how Dask works, and have the opportunity to play with their own Dask cluster on the cloud.

Matthew is an open source software developer in the numeric Python ecosystem. He maintains several PyData libraries, but today focuses mostly on Dask, a library for scalable computing. Matthew worked for Anaconda Inc for several years, then built out the Dask team at NVIDIA for RAPIDS, and most recently founded Coiled Computing to improve Python's scalability with Dask for large organizations. Matthew has given talks at a variety of technical, academic, and industry conferences. A list of talks and keynotes is available at https://matthewrocklin.com/talks. Matthew holds a bachelor’s degree from UC Berkeley in physics and mathematics, and a PhD in computer science from the University of Chicago.

Check out our posts here to get more context around where we're coming from:
  • https://medium.com/coiled-hq/coiled-dask-for-everyone-everywhere-376f5de0eff4
  • https://medium.com/coiled-hq/the-unbearable-challenges-of-data-science-at-scale-83d294fa67f8

Connect With Us ✌️
  • Join our Slack community: https://go.mlops.community/slack
  • Follow us on Twitter: @mlopscommunity
  • Sign up for the next meetup: https://go.mlops.community/register
  • Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
  • Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
  • Connect with Matthew on LinkedIn: https://www.linkedin.com/in/matthew-rocklin-461b4323/

MLOps.community
MLOps Coffee Sessions #14 Conversation with the creators of Dask // Hugo Bowne-Anderson and Matthew Rocklin

Oct 12, 2020 (56:42)


Dask

What is it?
  • Parallelism for analytics

What is parallelism?
  • Doing a lot at once by splitting tasks into smaller subtasks which can be processed in parallel (at the same time)
  • Distributing work across multiple machines and then combining the results
  • Helpful for CPU-bound work: doing a bunch of calculations on the CPU. The rate at which a process progresses is limited by the speed of the CPU.

Concurrency?
  • Similar, but things don’t have to happen at the same time; they can happen asynchronously and can overlap
  • Shared state
  • Helpful for I/O-bound work: networking, reading from disk, etc. The rate at which a process progresses is limited by the speed of the I/O subsystem.

Multi-core vs distributed
  • Multi-core is a single processor with 2 or more cores that can cooperate through threads (multithreading)
  • Distributed is across multiple nodes communicating via HTTP or RPC

Why is this hard?
  • Python has its challenges due to the GIL; other languages don't have this problem
  • Shared state can lead to potential race conditions, deadlocks, etc.
  • Coordinating work across the machines

For analytics?
  • Calculating some statistics on a large dataset can be tricky if it can’t fit in memory

// Show Notes
  • Coiled Cloud: https://cloud.coiled.io/
  • Coiled Launch Announcement: https://medium.com/coiled-hq/coiled-dask-for-everyone-everywhere-376f5de0eff4
  • OSS article: https://www.forbes.com/sites/glennsolomon/2020/09/15/monetizing-open-source-business-models-that-generate-billions/#2862e47234fd
  • Amish barn raising: https://www.youtube.com/watch?v=y1CPO4R8o5M
  • Message Passing Interface: https://en.wikipedia.org/wiki/Message_Passing_Interface

Connect With Us ✌️
  • Join our Slack community: https://go.mlops.community/slack
  • Follow us on Twitter: @mlopscommunity
  • Sign up for the next meetup: https://go.mlops.community/register
  • Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
  • Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
  • Connect with Matthew on LinkedIn: https://www.linkedin.com/in/matthew-rocklin-461b4323/

Timestamps:
  • 0:00 - Intro to Matthew Rocklin and Hugo Bowne-Anderson
  • 0:37 - Matthew Rocklin's Background
  • 1:17 - Hugo Bowne-Anderson's Background
  • 3:47 - Where did that inspiration come from?
  • 10:04 - Is there a close relationship between Best Practices and Tooling or are these two separate things?
  • 11:27 - Why is Data Literacy important with Coiled?
  • 14:46 - How do you think about the balance between enabling Data Science to have a lot of powerful compute?
  • 17:05 - Machine Learning as a space for tracking best practices experimentation
  • 19:32 - What makes Data Science so difficult?
  • 24:07 - How can a for-profit company complement Open Source Software (OSS)?
  • 29:40 - Amazon becoming a competitor with your own open-source technology (?)
  • 32:50 - How do you encourage more people to contribute and ensure quality?
  • 34:58 - Do you see Coiled operating within the DASK ecosystem?
  • 37:30 - What is DASK?
  • 39:19 - What should people know about parallelism?
  • 41:28 - Why is it so hard to put things back together?
  • 41:34 - Why does Python need a whole new tool to enable that? Or maybe some other tools as well?
  • 44:44 - Dynamic Task Scheduling as being useful to Data Scientists
  • 47:15 - Why is reliability in particular important in Data Science?
  • 52:27 - What's in store for DASK?
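The notes on parallelism in this episode (split the work into subtasks, then combine the results) can be illustrated with a small standard-library sketch. This is not Dask and not code from the episode; the names `chunks`, `partial_stats`, and `chunked_mean` are hypothetical, chosen only for illustration. It computes a mean by summarizing each chunk separately and combining the small per-chunk summaries.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def chunks(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        block = list(islice(iterator, size))
        if not block:
            return
        yield block


def partial_stats(block):
    """Summarize one chunk as a small (sum, count) pair."""
    return sum(block), len(block)


def chunked_mean(values, chunk_size=1000, max_workers=4):
    """Mean of `values`, computed chunk by chunk and then combined."""
    # Note: pool.map collects every chunk up front, so this shows the
    # split/combine structure rather than true out-of-core processing.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(partial_stats, chunks(values, chunk_size)))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    if count == 0:
        raise ValueError("cannot take the mean of empty data")
    return total / count


print(chunked_mean(range(1, 101), chunk_size=7))  # prints 50.5
```

Threads are used here only to keep the sketch short. Because of the GIL mentioned in the notes, threads in CPython generally do not speed up CPU-bound pure-Python work; processes or a distributed scheduler are the usual choice for that case, while threads suit I/O-bound tasks.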

Talk Python To Me - Python conversations for passionate developers
#285 Dask as a Platform Service with Coiled

Oct 9, 2020 (71:04)


If you're into data science, you've probably heard about Dask. It's a package that feels like familiar APIs such as Numpy, Pandas, and Scikit-Learn. Yet it can scale that computation across CPU cores on your local machine all the way to distributed grid-based computing in large clusters. While powerful, this may take some serious setup to execute in its full glory. That's why Matthew Rocklin has teamed up with Hugo Bowne-Anderson and others to launch a business to help Python-loving data scientists run Dask workloads in the cloud. And they are here to tell us about their open-source foundation business. And they must be on to something: between recording and releasing this episode, they raised $5M in VC funding.

Links from the show
  • Hugo on Twitter: @hugobowne
  • Matthew on Twitter: @mrocklin
  • Coiled: coiled.io
  • Coiled raised $5M in Sept: twitter.com
  • A brief history of dask article: coiled.io/blog
  • Coiled: Dask for Everyone, Everywhere: medium.com
  • The incredible growth of python: stackoverflow.blog
  • Growth updated (SO Trends current): insights.stackoverflow.com
  • Coiled Youtube channel: youtube.com
  • Snorkel package: pypi.org

Sponsors
  • Brilliant
  • Monday.com
  • Talk Python Training

The Python Podcast.__init__
Growing Dask To Make Scaling Python Data Science Easier At Coiled

Aug 10, 2020 (52:07)


Python is a leading choice for data science due to the immense number of libraries and frameworks readily available to support it, but it is still difficult to scale. Dask is a framework designed to transparently run your data analysis across multiple CPU cores and multiple servers. Using Dask lifts a limitation for scaling your analytical workloads, but brings with it the complexity of server administration, deployment, and security. In this episode Matthew Rocklin and Hugo Bowne-Anderson discuss their recently formed company Coiled and how they are working to make the use and maintenance of Dask in production easier. They share the goals for the business, their approach to building a profitable company based on open source, and the difficulties they face while growing a new team during a global pandemic.

Software Engineering Daily
Dask: Scalable Python with Matthew Rocklin

Apr 27, 2020 (61:07)


Python is the most widely used language for data science, and there are several libraries that are commonly used by Python data scientists including Numpy, Pandas, and scikit-learn. These libraries improve the user experience of a Python data scientist by giving them access to high level APIs. Data science is often performed over huge datasets.

Podcast – Software Engineering Daily
Dask: Scalable Python with Matthew Rocklin

Apr 27, 2020 (61:07)



Data – Software Engineering Daily
Dask: Scalable Python with Matthew Rocklin

Apr 27, 2020 (61:07)



Software Daily
Dask: Scalable Python with Matthew Rocklin

Apr 27, 2020


Python is the most widely used language for data science, and there are several libraries that are commonly used by Python data scientists including Numpy, Pandas, and scikit-learn. These libraries improve the user experience of a Python data scientist by giving them access to high level APIs.

Data science is often performed over huge datasets, and the data structures that are instantiated with those datasets need to be spread across multiple machines. To manage large distributed datasets, a library such as scikit-learn can use a system called Dask. Dask allows the instantiation of data structures such as a Dask dataframe or a Dask array.

Matthew Rocklin is the creator of Dask. He joins the show to talk about distributed computing with Dask, its use cases, and the Python ecosystem. He also provides a detailed comparison between Dask and Spark, which is also used for distributed data science.
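One way to picture a data structure that is split into pieces, as described above, is as a graph of small tasks over chunks. The toy sketch below uses only plain Python. The dictionary-of-tasks layout is loosely modeled on how Dask's documentation describes task graphs, but the `evaluate` function, the key names, and the chunk values are hypothetical and are not Dask's API or scheduler.

```python
from operator import add


def _is_key(graph, obj):
    """True if `obj` names another entry in the graph."""
    try:
        return obj in graph
    except TypeError:
        # Unhashable objects (such as lists) can never be graph keys.
        return False


def evaluate(graph, key, cache=None):
    """Compute `key` by first computing whatever it depends on."""
    if cache is None:
        cache = {}
    if key in cache:
        return cache[key]
    task = graph[key]
    if isinstance(task, tuple) and task and callable(task[0]):
        # A task is (function, arg1, arg2, ...); args may name other keys.
        func, *args = task
        resolved = [
            evaluate(graph, arg, cache) if _is_key(graph, arg) else arg
            for arg in args
        ]
        value = func(*resolved)
    else:
        value = task  # plain data, e.g. one chunk of a larger array
    cache[key] = value
    return value


# A six-element "array" stored as two chunks, plus the tasks to total it.
graph = {
    "chunk-0": [1, 2, 3],
    "chunk-1": [4, 5, 6],
    "sum-0": (sum, "chunk-0"),
    "sum-1": (sum, "chunk-1"),
    "total": (add, "sum-0", "sum-1"),
}

print(evaluate(graph, "total"))  # prints 21
```

In this picture the two `sum-*` tasks do not depend on each other, which is what would let a real scheduler run them on different cores or machines before combining the results.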

Data Engineering Podcast
Dask with Matthew Rocklin - Episode 2

Jan 22, 2017 (46:00)


There is a vast constellation of tools and platforms for processing and analyzing your data. In this episode Matthew Rocklin talks about how Dask fills the gap between a task oriented workflow tool and an in memory processing framework, and how it brings the power of Python to bear on the problem of big data.

The Python Podcast.__init__
Functional Python with Matthew Rocklin and Alexander Schepanovsky

Feb 29, 2016 (80:02)


What is functional programming, why would you want to use it, and how can you get started with it in Python? Our guests this week, Matthew Rocklin and Alexander Schepanovsky, help us understand all of that and more. Matthew and Alexander have each created their own Python libraries to make it easier to employ functional paradigms in your Python code. In this episode they help us understand the benefits that functional styles can have and the benefits that can be realized by trying them out for yourself.
