Bringing you the news about the future of Open Source
In this episode of Open Source Directions we were joined by Jeff Bezanson and Katie Hyatt who talk about the work they have been doing with Julia. Julia is a programming language that was designed from the beginning for high performance. Its programs compile to native code for multiple platforms via LLVM. Julia is dynamically typed, feels like a scripting language, and has good support for interactive use. Julia has a rich language of descriptive datatypes, and type declarations can be used to clarify and solidify programs. The language uses multiple dispatch as a paradigm, making it easy to express many object-oriented and functional programming patterns. It provides asynchronous I/O, debugging, logging, profiling, a package manager, and more.
In this episode of Open Source Directions we were joined by Aditya Mukhopadhyay who talked about the work he has been doing with RecallGraph. RecallGraph is a versioned-graph data store - it retains all changes that its data (vertices and edges) have gone through to reach their current state. It supports point-in-time graph traversals, letting the user query any past state of the graph just as easily as the present.
In this episode of Open Source Directions we were joined by Matthew Seal who talked about the work he has been doing with Jupyter and nteract. Matthew also discussed a particular topic: common Jupyter tools and their adoption for various use cases in the wild.
OpenTechResponse is the hub for information sharing and coordination between open source projects responding to an emergency or crisis situation.
Spyder is a powerful scientific environment written in Python, for Python, and designed by and for scientists, engineers and data analysts. It offers a unique combination of the advanced editing, analysis, debugging, and profiling functionality of a comprehensive development tool with the data exploration, interactive execution, deep inspection, and beautiful visualization capabilities of a scientific package.
Fortran is a compiled language, which means that once written, the source code must be passed through a compiler to produce a machine executable that can be run.
Apache Arrow is a cross-language development platform for in-memory data. It supports zero-copy streaming messaging and has support for a number of languages, including C, C++, Python, R, Rust, and many others.
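For a sense of the Python bindings, here is a minimal sketch using pyarrow to build an in-memory columnar table from a pandas DataFrame and convert it back; the column names and data are made up for illustration.

```python
# A minimal sketch of Arrow's columnar, in-memory tables via pyarrow.
import pandas as pd
import pyarrow as pa

df = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})

table = pa.Table.from_pandas(df)   # columnar Arrow table held in memory
print(table.schema)                # language-independent schema
print(table.to_pandas())           # and back to pandas again
```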
Jupyter Book lets you build an online book using a collection of Jupyter Notebooks and Markdown files. Its output is similar to the excellent Bookdown tool, and it adds extra functionality for people running a Jupyter stack.
Originally a port of the R package, pyjanitor has evolved from a set of convenient data cleaning routines into an experiment with the method chaining paradigm. Data preprocessing usually consists of a series of steps that involve transforming raw data into an understandable/usable format.
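As a rough illustration of that method-chaining style (assuming pyjanitor and pandas are installed; the column names are made up), a cleaning sequence might look like this:

```python
# Importing janitor registers extra cleaning methods on pandas DataFrames.
import pandas as pd
import janitor  # noqa: F401

raw = pd.DataFrame({
    "First Name": ["Ada", "Grace", None],
    "Hours Worked": [40, 38, None],
})

cleaned = (
    raw
    .clean_names()                 # "First Name" -> "first_name", etc.
    .remove_empty()                # drop rows/columns that are entirely empty
    .fillna({"hours_worked": 0})   # plain pandas steps chain right in
)
print(cleaned)
```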
Bokeh is an interactive visualization library for modern web browsers. It provides elegant, concise construction of versatile graphics, and affords high-performance interactivity over large or streaming datasets. Bokeh can help anyone who would like to quickly and easily make interactive plots, dashboards, and data applications. This is our second time visiting Bokeh, in preparation for the v2.0 release!
Lale is a Python library for semi-automated data science. Lale makes it easy to automatically select algorithms and tune hyperparameters of pipelines that are compatible with scikit-learn, in a type-safe fashion. If you are a data scientist who wants to experiment with automated machine learning, this library is for you! Lale adds value beyond scikit-learn along three dimensions: automation, correctness checks, and interoperability. For automation, Lale provides a consistent high-level interface to existing pipeline search tools including GridSearchCV, SMAC, and Hyperopt. For correctness checks, Lale uses JSON Schema to catch mistakes when there is a mismatch between hyperparameters and their types, or between data and operators. And for interoperability, Lale has a growing library of transformers and estimators from popular libraries such as scikit-learn, XGBoost, and PyTorch. Lale can be installed just like any other Python package and can be edited with off-the-shelf Python tools such as Jupyter notebooks.
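For context, the snippet below is plain scikit-learn, not Lale's own API; it shows the kind of pipeline-plus-hyperparameter search that Lale wraps behind a higher-level, type-checked interface.

```python
# Ordinary scikit-learn pipeline search, of the kind Lale automates.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

X, y = load_iris(return_X_y=True)
pipe = Pipeline([("pca", PCA()), ("clf", LogisticRegression(max_iter=1000))])

search = GridSearchCV(
    pipe,
    param_grid={"pca__n_components": [2, 3], "clf__C": [0.1, 1.0, 10.0]},
    cv=5,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```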
STUMPY is a powerful and scalable library that efficiently computes something called the matrix profile, which can be used for a variety of time series data mining tasks such as: pattern/motif (approximately repeated subsequences within a longer time series) discovery, anomaly/novelty (discord) discovery, shapelet discovery, semantic segmentation, density estimation, time series chains (temporally ordered set of subsequence patterns), and more!
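A minimal sketch of computing a matrix profile with STUMPY on a toy series; the series and window length here are arbitrary.

```python
# stumpy.stump(T, m) returns a matrix profile whose first column holds the
# z-normalized distance from each length-m subsequence to its nearest neighbor.
import numpy as np
import stumpy

T = np.random.rand(1_000)                       # a toy time series
m = 50                                          # subsequence (window) length
mp = stumpy.stump(T, m)

profile = mp[:, 0].astype(float)
motif_idx = np.argmin(profile)                  # most-repeated pattern (motif)
discord_idx = np.argmax(profile)                # most anomalous subsequence (discord)
print(motif_idx, discord_idx)
```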
Matplotlib is a Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python scripts, the Python and IPython shells, the Jupyter notebook, web application servers, and four graphical user interface toolkits.
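A small example of the typical pyplot workflow, producing a publication-style hardcopy figure.

```python
# Build a figure with the object-oriented Matplotlib API and save it to disk.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 200)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("amplitude")
ax.legend()
fig.savefig("sine.png", dpi=150)   # other hardcopy formats: .pdf, .svg, ...
```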
stdlib ("standard lib") is a standard library for JavaScript and Node.js, with an emphasis on numerical and scientific computing applications. The library provides a collection of robust, high performance libraries for mathematics, statistics, data processing, streams, and more and includes many of the utilities you would expect from a standard library.
Voilà turns Jupyter notebooks into standalone web applications. Unlike the usual HTML-converted notebooks, each user connecting to the Voilà tornado application gets a dedicated Jupyter kernel which can execute the callbacks to changes in Jupyter interactive widgets. By default, Voilà disallows execute requests from the front-end, preventing execution of arbitrary code, and it runs with the strip_sources option, which strips the input cells out of the rendered notebook.
The Econ-ARK project provides open-source toolkits for researchers trying to understand how economic and social outcomes result from the actions of heterogeneous individuals. The primary goals of the project are to make entry into the world of such modeling easy; to accelerate the development of this kind of modeling for policy-making and academic research; and to increase the openness, replicability, and interoperability of modeling tools.
Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique that can be used for visualisation similarly to t-SNE, but also for general non-linear dimension reduction. The algorithm is founded on three assumptions about the data: 1. The data is uniformly distributed on a Riemannian manifold; 2. The Riemannian metric is locally constant (or can be approximated as such); 3. The manifold is locally connected. From these assumptions it is possible to model the manifold with a fuzzy topological structure. The embedding is found by searching for a low dimensional projection of the data that has the closest possible equivalent fuzzy topological structure.
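A minimal sketch using the umap-learn package, which exposes the algorithm through a scikit-learn-style estimator; the digits dataset is just a convenient example.

```python
# Reduce 64-dimensional digit images to a 2-D embedding for visualization.
import umap
from sklearn.datasets import load_digits

X, _ = load_digits(return_X_y=True)
reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, n_components=2)
embedding = reducer.fit_transform(X)   # shape: (n_samples, 2)
print(embedding.shape)
```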
Panel provides tools for easily composing widgets, plots, tables, and other viewable objects and controls into control panels, apps, and dashboards. Panel works with visualizations from Bokeh, Matplotlib, HoloViews, and other Python plotting libraries, making them instantly viewable either individually or when combined with interactive widgets that control them. Panel works equally well in Jupyter Notebooks, for creating quick data-exploration tools, or as standalone deployed apps and dashboards, and allows you to easily switch between those contexts as needed.
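As a rough sketch (using Matplotlib as the plotting library and a made-up plotting function), a tiny Panel app that links a widget to a plot might look like this:

```python
# A minimal Panel app: a slider controlling a Matplotlib figure.
import numpy as np
import matplotlib.pyplot as plt
import panel as pn

pn.extension()

def sine_plot(frequency=1.0):
    fig, ax = plt.subplots(figsize=(4, 3))
    x = np.linspace(0, 2 * np.pi, 200)
    ax.plot(x, np.sin(frequency * x))
    plt.close(fig)      # let Panel render the figure; avoid duplicate display
    return fig

# pn.interact builds a slider for `frequency` and links it to the plot.
dashboard = pn.interact(sine_plot, frequency=(0.5, 5.0))
dashboard.servable()    # `panel serve this_script.py` deploys it as an app
```

The same object displays inline in a Jupyter notebook, which is what makes switching between notebook exploration and a deployed dashboard straightforward.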
OpenTeams brings together organizations using open source software with creators and maintainers of the software to facilitate and grow funding opportunities.
Vega is a declarative format for creating, saving, and sharing visualization designs. With Vega, visualizations are described in JSON and rendered as interactive views using either HTML5 Canvas or SVG.
Have a repository full of Jupyter notebooks? With Binder, open those notebooks in an executable environment, making your code immediately reproducible by anyone, anywhere.
The nteract project is an ecosystem of open source tools to enable people to build their own front-ends and workflows on top of the Jupyter ecosystem.
conda-forge is a community-led collection of recipes, build infrastructure, and distributions. Conda-forge currently builds conda packages for Linux, macOS, Windows, ARM, and POWER8 architectures. Conda-forge has 1400 members in its GitHub organization and more than 7000 repositories. The conda-forge channel has about 80 million downloads a month, and growing. Conda-forge is an official NumFOCUS project.
scikit-learn provides simple and efficient tools for data mining and data analysis which are accessible to everybody, and reusable in various contexts. It is built on NumPy, SciPy, and matplotlib.
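A minimal example of the common estimator workflow: fit on a training split, then score on a held-out test split.

```python
# Train a classifier on the wine dataset and report test accuracy.
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))   # mean accuracy on the test split
```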
xtensor provides an extensible expression system enabling lazy broadcasting, an API following the idioms of the C++ standard library, and tools to manipulate array expressions and build upon xtensor. xtensor containers are inspired by NumPy, the Python array programming library. Adaptors can easily be written to plug existing data structures into its expression system. xtensor requires a modern C++ compiler supporting C++14.
uarray is an array interface for Python with pluggable backends and a multiple-dispatch mechanism for defining downstream functions. CORRECTION: In the episode Hameer implied moving data from GPUs to CPUs won’t be a problem in PCIe 4.0. It’s actually in an Intel-proposed extension to PCIe 5.0.
Pyodide provides transparent conversion of objects between JavaScript and Python. When inside a browser, this means Python has full access to the Web APIs. While closely related to the Iodide project, Pyodide may be used standalone in any context where you want to run Python inside a web browser.
PyMC3 is a probabilistic programming package for Python that allows users to fit Bayesian models using a variety of numerical methods, most notably Markov chain Monte Carlo (MCMC) and variational inference (VI).
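A minimal sketch of fitting a simple Bayesian model with MCMC in PyMC3; the data and priors below are toy choices for illustration.

```python
# Estimate the mean and noise scale of normally distributed observations.
import numpy as np
import pymc3 as pm

data = np.random.normal(loc=2.0, scale=1.0, size=100)   # toy observations

with pm.Model():
    mu = pm.Normal("mu", mu=0.0, sigma=10.0)        # prior on the mean
    sigma = pm.HalfNormal("sigma", sigma=5.0)       # prior on the noise scale
    pm.Normal("obs", mu=mu, sigma=sigma, observed=data)
    trace = pm.sample(1000, tune=1000)              # NUTS, a form of MCMC

print(pm.summary(trace))
```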
TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML powered applications.
Chainer is a powerful, flexible and intuitive deep learning framework. Chainer supports CUDA computation; it only requires a few lines of code to leverage a GPU, and it runs on multiple GPUs with little effort. Chainer supports various network architectures including feed-forward nets, convnets, recurrent nets and recursive nets. It also supports per-batch architectures. Forward computation can include any Python control flow statements without sacrificing the ability to backpropagate, which makes code intuitive and easy to debug.
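A small, hedged sketch of Chainer's define-by-run style, where ordinary Python expressions build the graph that gradients flow back through.

```python
# The computation graph is recorded as plain Python code runs.
import numpy as np
from chainer import Variable

x = Variable(np.array([5.0], dtype=np.float32))
y = x ** 2 - 2 * x + 1          # arithmetic on Variables builds the graph
y.backward()                    # backpropagate through whatever actually ran
print(x.grad)                   # -> [8.] since dy/dx = 2x - 2 at x = 5
```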
Numba is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code. Numba translates Python functions to optimized machine code at runtime using the industry-standard LLVM compiler library. Numba-compiled numerical algorithms in Python can approach the speeds of C or FORTRAN. Users do not need to replace the Python interpreter, run a separate compilation step, or even have a C/C++ compiler installed. Applying one of the Numba decorators to a Python function is all that is needed.
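A minimal example of the decorator in action: the first call compiles the function for the argument types it sees, and subsequent calls reuse the machine code.

```python
# A plain Python loop compiled to machine code with Numba's @njit decorator.
import numpy as np
from numba import njit

@njit
def array_sum(a):
    total = 0.0
    for i in range(a.shape[0]):   # explicit loop, fast under Numba
        total += a[i]
    return total

x = np.random.rand(1_000_000)
print(array_sum(x))   # compiled on first call, fast thereafter
```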
ITK is an open-source, cross-platform system that provides developers with an extensive suite of software tools for image analysis. Developed through extreme programming methodologies, ITK employs leading-edge algorithms for registering and segmenting multidimensional data.
Project Jupyter exists to develop open-source software, open-standards, and services for interactive computing across dozens of programming languages. The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more.
The Spark Python API (PySpark) exposes the Spark programming model to Python.
Dask provides advanced parallelism for analytics, enabling performance at scale for the tools you love. Dask is open source and freely available. It is developed in coordination with other community projects like NumPy, pandas, and scikit-learn.
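A minimal sketch with dask.array, which mirrors the NumPy API while splitting work into chunks that are only executed on request.

```python
# Lazy, chunked array computation that runs in parallel on .compute().
import dask.array as da

x = da.random.random((10_000, 10_000), chunks=(2_000, 2_000))
result = (x + x.T).mean(axis=0)   # builds a task graph, nothing runs yet
print(result.compute()[:5])       # executes the graph across the chunks
```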
The aim of PyData/Sparse is to create sparse containers that implement the ndarray interface. Traditionally in the PyData ecosystem, sparse arrays have been provided by the scipy.sparse submodule. All containers there depend on and emulate the numpy.matrix interface. This means that they are limited to two dimensions and also do not work well in places where numpy.ndarray would work. PyData/Sparse is well on its way to replacing scipy.sparse as the de-facto sparse array implementation in the PyData ecosystem.
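A minimal sketch using the sparse package's COO container, which supports n dimensions and NumPy-style operations, unlike the 2-D scipy.sparse matrices.

```python
# A 3-D sparse array built from a mostly-zero NumPy array.
import numpy as np
import sparse

dense = np.zeros((100, 100, 100))
dense[0, 0, 0] = 1.0
dense[50, 50, 50] = 2.0

s = sparse.COO.from_numpy(dense)          # n-dimensional, ndarray-like
print(s.nnz, s.shape)                     # stored non-zeros and full shape
print(s.sum(axis=(0, 1)).todense()[:3])   # NumPy-style reductions work
```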
Datashader is a graphics pipeline system for creating meaningful representations of large datasets quickly and flexibly. Datashader breaks the creation of images into a series of explicit steps that allow computations to be done on intermediate representations. This approach allows accurate and effective visualizations to be produced automatically without trial-and-error parameter tuning, and also makes it simple for data scientists to focus on particular data and relationships of interest in a principled way.
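As a rough sketch of that pipeline (aggregate points onto a fixed-size canvas first, then shade the aggregate into an image), using made-up random data:

```python
# Datashader's two explicit steps: aggregate, then shade.
import numpy as np
import pandas as pd
import datashader as ds
import datashader.transfer_functions as tf

df = pd.DataFrame({
    "x": np.random.standard_normal(1_000_000),
    "y": np.random.standard_normal(1_000_000),
})

canvas = ds.Canvas(plot_width=400, plot_height=400)
agg = canvas.points(df, "x", "y")   # intermediate aggregate: counts per pixel
img = tf.shade(agg, how="log")      # colormap the aggregate into an image
img.to_pil().save("points.png")
```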
SciPy is open-source software for mathematics, science, and engineering. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ODE solvers, and more. SciPy provides many user-friendly and efficient numerical routines, such as routines for numerical integration and optimization. SciPy runs on all popular operating systems, is easy to use, and powerful enough to be depended upon by the world's leading scientists & engineers.
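A couple of those routines in a small example, using scipy.integrate and scipy.optimize.

```python
# Numerically integrate one function and minimize another.
import numpy as np
from scipy import integrate, optimize

# Integral of sin(x) over [0, pi]; the exact answer is 2.
value, error = integrate.quad(np.sin, 0, np.pi)

# Minimize a simple quadratic; the minimum is at x = 3.
result = optimize.minimize_scalar(lambda x: (x - 3) ** 2)

print(value, result.x)
```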
GeoViews is a Python library that makes it easy to explore and visualize geographical, meteorological, and oceanographic datasets, such as those used in weather, climate, and remote sensing research. GeoViews is built on the HoloViews library for building flexible visualizations of multidimensional data. GeoViews adds a family of geographic plot types based on the Cartopy library, plotted using either the Matplotlib or Bokeh packages. With GeoViews, you can now work easily and naturally with large, multidimensional geographic datasets, instantly visualizing any subset or combination of them, while always being able to access the raw data underlying any plot.
Intake appeals to different groups of users but is useful to all of them, acting as a common platform that smooths the progression of data from developers and providers to users.
CuPy is a NumPy-compatible array library for GPU-accelerated computing with NVIDIA CUDA. CuPy's interface is highly compatible with NumPy; in most cases it can be used as a drop-in replacement. Blog Post: https://quansight.github.io/Episode-5-CuPy/
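A hedged sketch of the drop-in style, assuming a CUDA-capable GPU and a CuPy build matching the installed CUDA version.

```python
# Move an array to the GPU, compute with the NumPy-like API, copy back.
import numpy as np
import cupy as cp

x_cpu = np.random.rand(10_000, 10_000)
x_gpu = cp.asarray(x_cpu)               # copy host array to the GPU

y_gpu = cp.linalg.norm(x_gpu, axis=1)   # same call signature as numpy.linalg.norm
y_cpu = cp.asnumpy(y_gpu)               # copy the result back to the host
print(y_cpu[:3])
```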
This episode features Bokeh, which is a web-based and interactive visualization library for Python. Its goal is to provide elegant, concise construction of versatile graphics, and to extend this capability with high-performance interactivity over very large or streaming datasets. Bokeh can help anyone who would like to quickly and easily create interactive plots, dashboards, and data applications.
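A minimal sketch of the bokeh.plotting API, writing a standalone interactive HTML page.

```python
# Build a line plot and save it as a self-contained, interactive HTML file.
import numpy as np
from bokeh.plotting import figure, output_file, show

x = np.linspace(0, 10, 200)
p = figure(title="Damped oscillation", x_axis_label="t", y_axis_label="y")
p.line(x, np.exp(-x / 5) * np.cos(2 * x), line_width=2)

output_file("oscillation.html")   # standalone HTML with pan/zoom/hover tools
show(p)
```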