Data – Software Engineering Daily

Technical interviews about software topics.

Latest episode: Aug 5, 2022
New episodes: infrequent
Average duration: 53m
Episodes: 214

Latest episodes from Data – Software Engineering Daily

Faking Data Using Tonic.ai with Ian Coe and Adam Kamor

Aug 5, 2022 · 41:54

Guests: Ian Coe (CEO) and Adam Kamor (Head of Engineering). Companies that gather data about their users have an ethical obligation and legal responsibility to protect the personally identifiable information in their dataset. Ideally, developers working on a software application wouldn't need access to production data. Yet without high-quality example data, many technology groups stumble on avoidable … The post Faking Data Using Tonic.ai with Ian Coe and Adam Kamor appeared first on Software Engineering Daily.

data faking tonic software engineering daily

Lakehouse Data Stack with Raj Bains

Apr 12, 2022 · 66:30

As companies move to Spark and a Lakehouse architecture, they are realizing that the data tools are lagging way behind. You need to be a programmer to effectively use Spark and Airflow. There are some low-code ETL tools, but is that enough? Companies want to treat their data pipelines like mission-critical apps. They want DevOps …

data companies spark stack etl airflow software engineering daily raj bains

RudderStack Engineering with Soumaydeb Mitra

Mar 16, 2022 · 52:34

Customer data pipelines power the backend of many successful web platforms. In a customer data pipeline, data is collected from sources such as mobile apps and cloud SaaS tools, transformed and munged using data engineering, stored in data warehouses, and piped to analytics, advertising platforms, and data infrastructure. RudderStack is an open source customer data …

engineering saas mitra software engineering daily

Apache Hudi with Vinoth Chandar

Mar 9, 2022 · 43:03

The data lake architecture has become broadly adopted in a relatively short period of time. In a nutshell, that means data in its raw format stored in cloud object storage. Modern software and data engineers have no shortage of options for accessing their data lake, but that list shrinks quickly if you care about features …

modern apache hudi software engineering daily

Couchbase Architecture with Ravi Mayuram

Jan 28, 2022 · 58:42

Couchbase is a distributed NoSQL cloud database. Since its creation, Couchbase has expanded into edge computing, application services, and most recently a database-as-a-service called Capella. Couchbase started as an in-memory cache and needed to be rearchitected to be a persistent storage system. In this episode, I interview Ravi Mayuram, SVP Products and Engineering at Couchbase …

engineering architecture ravi capella nosql couchbase software engineering daily

Trifacta with Joe Hellerstein

Dec 21, 2021 · 33:50

If you haven't encountered a data quality problem, then you haven't yet worked on a large enough project. Invariably, a gap exists between the state of raw data and what an analyst or machine learning engineer needs to solve their problem. Many organizations needing to automate data preparation workflows look to Trifacta as a solution.

invariably trifacta software engineering daily

MemGraph with Dominik Tomicevic

Dec 10, 2021 · 42:37

Relational databases have been a fixture of software applications for decades. They are highly tuned for performance and typically offer explicit guarantees like transactional consistency. More recently, there's been a figurative Cambrian explosion of other-than-relational databases. Simple key-value stores or counters were an early win in this space. Managing a graph data structure is …

simple managing relational software engineering daily

Amplemarket with João Batalha

Dec 9, 2021 · 32:05

The lifeblood of most companies is their sales departments. When you're selling something other than a commodity, it's typically necessary to carefully groom the onboarding experience for inbound future customers. Historically, companies approached this in a one-size-fits-all manner, giving all customers a common experience. In today's data-driven age, a better experience can be provided that …

historically batalha software engineering daily

Metaplane with Kevin Hu

Nov 24, 2021 · 37:49

Application observability is a fairly mature area. Engineering teams have a wide selection of tools they can choose to adopt, and a significant amount of thought leadership and philosophy already exists to guide how you manage your application. That application is going to persist data. As you scale up, your system is invariably going to experience …

engineering application software engineering daily

Risk and Compliance with Terry O’Daniel

Nov 23, 2021 · 58:08

Consumers are increasingly becoming aware of how detrimental it can be when companies mismanage data. This demand has fueled regulations, defined standards, and applied pressure to companies. Modern enterprises need to consider corporate risk management and regulatory compliance. In this interview, I speak with Terry O'Daniel, Director of Engineering (Risk & Compliance) at Instacart. Sponsorship …

director risk modern compliance consumers instacart terry o software engineering daily

#FREEZUCK | ?masked…?… !(DOCTORS)!!

Nov 10, 2021 · 2:45

Software Engineering Daily invites Owen Frank Davis, Paul Davis, Kyle Davis, and Robbie Davis for a joint interview on the subject of reproduction and teething, as well as Lisch fascitis. Aledade and Kubernetes are both inconsiderate ideas for navigation. They need improvements in infrastructure. I prefer Dominaria to New Phyrexia (though I can get by …

doctors masked kubernetes paul davis dominaria kyle davis software engineering daily lisch aledade new phyrexia

Scalable Streaming Video with Amit Mishra

Nov 10, 2021 · 40:26

The internet is a layer cake of technologies and protocols. At a fundamental level, the internet runs on the TCP/IP protocol. It's a packet-based system. When your browser requests a file from a web server, that server chops up the file into tiny pieces known as packets and puts them on the network labeled …

amit scalable mishra tcp ip streaming video software engineering daily

Observability Using Honeycomb.io with Christine Yen

Nov 8, 2021 · 43:10

It does not matter if it runs on your machine. Your code must run in the production environment and it must do so performantly. For that, you need tooling to better understand your application’s behavior under different circumstances. In the earliest days of software development, all we had were logs, which are still around and …

honeycomb observability software engineering daily christine yen

Location-Based Experiences Using Foursquare with Ankit Patel

Nov 3, 2021 · 48:24

The manner in which users interact with technology has rapidly switched to mobile consumption. The devices almost all of us carry with us at all times open endless opportunities for developers to create location-based experiences. Foursquare became a household name when they introduced social check-ins. Today they're a location data platform. Ankit Patel is the …

experiences location patel foursquare ankit software engineering daily

Datadog with Omri Sass and Hugo Kaczmarek

Oct 28, 2021 · 43:22

Modern business applications are complex. It’s not enough to have raw logs or some basic telemetry. Today’s enterprise organizations require an application performance monitoring solution or APM. Today’s applications are complex distributed systems whose performance depends on a wide variety of factors. Every single line of code can affect production and teams need insights into …

modern sass apm omri datadog kaczmarek software engineering daily

Infrastructure as Code with Christian Tragesser

Oct 8, 2021 · 43:52

Infrastructure as Code is an approach to machine provisioning and setup in which a programmer describes the underlying services they need for their projects. However, this infrastructure code doesn't compile a binary artifact like traditional source code. The successful completion of running the code signals that the servers and other components described in the configuration …

code infrastructure software engineering daily

Modern Data Infrastructure and Tools with Leigh Marie Braswell

Oct 5, 2021 · 47:57

The first industrial deployments of machine learning and artificial intelligence solutions were bespoke by definition and often had brittle operating characteristics. Almost no one builds custom databases, web servers, or email clients. Yet technology groups today often consider developing homegrown ML and data solutions in order to solve their unique use cases. Today's modern data …

tools modern ml braswell data infrastructure software engineering daily

Git Scales for Monorepos with Derrick Stolee

Oct 1, 2021 · 53:58

A monorepo is a version control management strategy in which all your code is contained in one potentially large but complete repository. The monorepo is in stark contrast to an alternative approach in which software teams independently manage microservices or deliver software as libraries to be imported in other projects.

scales software engineering daily monorepo

Faking Data Using Tonic.ai with Ian Coe and Adam Kamor

Sep 29, 2021 · 50:23

Companies that gather data about their users have an ethical obligation and legal responsibility to protect the personally identifiable information in their dataset. Ideally, developers working on a software application wouldn't need access to production data. Yet without high-quality example data, many technology groups stumble on avoidable problems. Organizations need a solution to protect privacy …

data companies organizations faking tonic software engineering daily

DBT: Data Build Tool with Tristan Handy

Sep 28, 2021 · 44:56

Applications write data to persistent storage like a database. The most popular database query language is SQL, which has many similar dialects. SQL is expressive and powerful for describing what data you want. What you do with that data requires a solution in the form of a data pipeline. Ideally, these analytical workflows can follow …

data tool applications handy sql software engineering daily

No Code Process Automation at Axiom with Yaseer Sheriff

Sep 24, 2021

Tedious, repetitive tasks are better handled by machines. Unless these tasks truly require human intelligence, repetitive tasks are often good candidates for automation. Implementing process automation can be challenging and technical. Increasingly, engineers are seeking out tools and platforms to facilitate faster, more reliable automation. In this episode I talk to Yaseer Sheriff, Co-Founder and …

sheriffs implementing no code axiom tedious process automation software engineering daily

LinearB with Dan Lines

Sep 21, 2021 · 45:40

A developer's core deliverables are individual commits and the pull requests they aggregate into. While the number of lines of code written alone may not be very informative, in total, the code and metadata about the code found in tracking systems present a rich dataset with great promise for analysis and productivity optimization insights. LinearB …

lines software engineering daily

Modern Data Stacks Optimized by Mozart Data with Peter Fishman and Dan Silberman

Sep 14, 2021 · 50:57

Modern companies leverage dozens or even hundreds of software solutions to solve specific needs of the business. Organizations need to collect all these disparate data sources into a data warehouse in order to add value. The raw data typically needs transformation before it can be analyzed. In many cases, companies develop homegrown solutions, thus reinventing …

data modern organizations mozart stacks optimized fishman silberman software engineering daily

Instabase with Anant Bhardwaj

Sep 7, 2021 · 48:09

Instabase is a technology platform for building automation solutions. Users deploy it onto their own infrastructure and can leverage the tools offered by the platform to build complex workflows for handling tasks like income verification and claims processing. In this episode we interview Anant Bhardwaj, founder of Instabase. He describes Instabase as an operating system.

users bhardwaj anant software engineering daily

Data Discovery with Shinji Kim

Aug 27, 2021 · 87:25

Shinji Kim is Founder and CEO of Select Star. In this episode we discuss data discovery and more. This interview was also recorded as a video podcast. Check out the video on the Software Daily YouTube channel. Sponsorship inquiries: sponsor@softwareengineeringdaily.com

ceo founders data discovery shinji software engineering daily

InfluxData: Time-Series Data with Russ Savage

Aug 19, 2021 · 43:55

Time series data are simply measurements or events that are tracked, monitored, downsampled, and aggregated over time. This could be server metrics, application performance monitoring, network data, sensor data, events, clicks, trades in a market, and many other types of analytics data (influxdata.com). The platform InfluxData is designed for building and operating time series applications.

time savage russ time series software engineering daily influxdata

Druid: Event-Driven Data with Eric Tschetter

Aug 16, 2021 · 48:05

Whether sending messages, shopping in an app, or watching videos, modern consumers expect information and responsiveness to be near-instant in their apps and devices. From a developer's perspective, this means clean code and a fast database. Apache Druid is a database built to power real-time analytic workloads for event-driven data, like user-facing applications, streaming, and …

data druid event driven software engineering daily

DaaS with Auren Hoffman

Aug 13, 2021 · 107:58

Auren Hoffman is the CEO of SafeGraph. In this episode we discuss data as a service and more. This interview was also recorded as a video podcast. Check out the video on the Software Daily YouTube channel. Sponsorship inquiries: sponsor@softwareengineeringdaily.com

ceo daas safegraph software engineering daily auren hoffman

Reverse ETL: Operationalizing Data Warehouses with Tejas Manohar

Aug 2, 2021 · 53:44

Enterprise data warehouses store all company data in a single place to be accessed, queried, and analyzed. They're essential for business operations because they support managing data from multiple sources, provide context, and have built-in analytics tools. While keeping a single source of truth is important, easily moving data from the warehouse to other applications …

enterprise reverse tejas data warehouses operationalizing manohar software engineering daily

Prophecy: Apple of Data Engineering with Raj Bains

Jul 28, 2021 · 51:51

Prophecy is a complete Low-Code Data Engineering Platform for the Enterprise. Prophecy enables all your teams on Apache Spark with a unique low-code designer. While you visually build your Dataflows – Prophecy generates high-quality Spark code on Git. Then, you can schedule Spark workflows with Prophecy's low-code Airflow. Not only that, Prophecy provides end-to-end visibility …

apple prophecy spark enterprise git data engineering airflow apache spark software engineering daily raj bains

Pulsar Rerevisted with Enrico Olivelli

Jul 26, 2021 · 48:21

In the previous episode, Pulsar Revisited, we discussed how the company DataStax has added to their product stack Astra Streaming, their cloud-native messaging and event streaming service that's built on top of Apache Pulsar. We discussed Apache Pulsar and the added features DataStax offers like injecting machine learning into your data streams and viewing real-time …

enrico pulsars datastax software engineering daily apache pulsar

CockroachDB: Distributed Databases and Containerization with Spencer Kimball

Jul 21, 2021 · 58:52

In 2003, Google developed a robust cluster management system called Borg. This enabled them to manage clusters with tens of thousands of machines, moving them away from virtual machines and firmly into container management. Then, in 2014, they open sourced a version of Borg called Kubernetes, or K8s. Now, in 2021, CockroachDB is a distributed …

google databases borg distributed kubernetes k8s containerization cockroachdb software engineering daily spencer kimball

Imply Infra: Big Data Analysis and Real-World Examples with Jad Naous

Jul 19, 2021 · 38:23

Big data analytics is the process of collecting data, processing and cleaning it, then analyzing it with techniques like data mining, predictive analytics, and deep learning. This process requires a suite of tools to operate efficiently. Data analytics can save companies money, drive product development, and give insight into the market and customers. The company …

data big data real world data analysis infra imply software engineering daily

Better Stack: A New DevOps Experience with Juraj Masar

Jul 15, 2021 · 45:21

DevOps has shortened the development life cycle for countless applications and is embraced by companies around the world. But managing and monitoring multiple environments is still a major pain point, particularly when companies need to mix cloud and legacy systems. Knowing when services go down and quickly pinpointing the cause is essential for continuous development.

stack devops juraj software engineering daily masar

Data Science on AWS: Implementing AI and ML Pipelines on AWS with Chris Fregly

Jul 14, 2021 · 47:03

Data science is an interdisciplinary field that combines strong technical skills with industry knowledge to perform a large range of jobs. Data scientists solve business questions with hands-on work cleaning and analyzing data, building machine learning models and applying algorithms, and generating dynamic visuals and tools to understand the world from the data it generates.

data data science aws pipelines software engineering daily implementing ai

Data Lineage: Understanding Data Lineage at Scale with Julien Le Dem

Jul 12, 2021 · 59:03

Big Data has exploded over the past decade as cloud computing and more efficient hardware made scaling essentially limitless. Products like Uber revolve entirely around analyzing data to provide rides. According to an EMC/IDC study, there was approximately 5.2TB of data for every person in 2020. That estimate was made before the transition to remote work, …

uber scale products big data 2tb software engineering daily data lineage

Text Blaze: Text Shortcuts with Scott Fortmann-Roe

Jul 3, 2021 · 46:00

There are over 4 billion people using email. Many people using email for business communicate quick questions to colleagues, send repetitive, template-based information to potential customers and freshly hired employees, and repeat a lot of the same phrases. We actually repeat phrases in a lot of written formats. How often do you copy and paste …

shortcuts software engineering daily

LayerCI with Colin Chartier

Jul 2, 2021 · 42:47

Continuous integration is a coding practice where engineers deliver incremental and frequent code changes to create higher quality software and collaborate more. Teams attempting to continuously integrate new code need a consistent and automated pipeline for reviewing, testing, and deploying the changes. Otherwise change requests pile up in the queue and nothing gets integrated efficiently.

continuous chartier software engineering daily

Meltano: ELT for DataOps with Douwe Maan

Jul 1, 2021 · 57:02

ELT is a process for copying data from a source system into a target system. It stands for “Extract, Load, Transform” and starts with extracting a copy of data from the source location. It's loaded into the target system like a data warehouse, and then it's ready to be transformed into a usable format for …

transform load extract maan elt dataops douwe software engineering daily

Uber Data Science with Kevin Novak

Jun 24, 2021 · 50:17

Uber is one of many examples we've discussed on this show that has changed the world with big data analysis. With over 8 million users, 1 billion Uber trips and people driving for Uber in over 400 cities and 66 countries, Uber has redefined an entire industry in a very short time frame. It's difficult …

uber data science novak software engineering daily

Axiom Browser Automation with Yaseer Sheriff

Jun 23, 2021 · 38:21

The quantity and quality of a company's data can mean the difference between a major success or major failure. Companies like Google have used big data from their earliest days to steer their product suite in the direction consumers need. Other companies, like Apple, didn't always use big data analytics to drive product design, but …

google apple companies automation sheriffs browsers axiom software engineering daily

StreamSets: DataOps and Smart Pipelines with Arvind Prabhakar

Jun 17, 2021 · 49:34

The company StreamSets is enabling DataOps practices in today's enterprises. StreamSets is a data engineering platform designed to help engineers design, deploy, and operate smart data pipelines. StreamSets Data Collector is a codeless solution for designing pipelines, triggering CDC operations, and monitoring data in flight. StreamSets Transformer uses Apache Spark to generate insights about your …

smart cdc pipelines arvind dataops apache spark software engineering daily streamsets

Blissfully: Comprehensive IT Management with Aaron White

Jun 16, 2021 · 48:35

Delivering SaaS products involves a lot more than just building the product. SaaS management involves customer relationship management, licensing, renewals, maintaining software visibility, and the general management of the technology portfolio. The company Blissfully helps businesses manage their SaaS products from within a complete IT platform with organization, automation, and security built in. The Blissfully …

management saas comprehensive blissfully aaron white software engineering daily

Stemma: Understanding Big Data with Mark Grover

Jun 15, 2021 · 40:33

Amundsen was started at Lyft and is the leading open-source data catalog with the fastest-growing community and the most integrations. Amundsen enables you to search your entire organization by text search, see automated and curated metadata, share context with coworkers, and learn from others by seeing most common queries on a table or frequently …

big data lyft grover amundsen software engineering daily stemma

Data Exploration with a New Python Library with Doris Lee

May 27, 2021 · 41:11

Data exploration uses visual exploration to understand what is in a dataset and the characteristics of the data. Data scientists explore data to understand things like customer behavior and resource utilization. Some common programming languages used for data exploration are Python, R, and Matlab. Doris Jung-Lin Lee is currently a Graduate Research Assistant at the …

data library exploration python matlab graduate research assistant software engineering daily

Firebolt: Data Warehouses with Eldad Farkash

May 25, 2021 · 57:59

Cloud data warehouses are databases hosted in cloud environments. They provide typical benefits of the cloud like flexible data access, scalability, and performance. The company Firebolt provides a cloud data warehouse built for modern data environments. It decouples storage and compute to operate on top of existing data lakes like S3. It computes orders of …

cloud s3 data warehouses eldad firebolt software engineering daily

Preset: Visualizing Big Data with Srini Kadamati

May 20, 2021 · 46:53

Apache Superset is an open-source, fast, lightweight and modern data exploration and visualization platform. It can connect to any SQL-based data source through SQLAlchemy at petabyte scale. Its architecture is highly scalable and it ships with a wide array of visualizations. The company Preset provides a powerful, easy to use data exploration and visualization …

big data visualizing sql preset srini software engineering daily sqlalchemy

ClickHouse: Data Warehousing with Robert Hodges

May 17, 2021 · 39:37

Columnar databases store and retrieve columns of data rather than rows of data. Each block of data in a columnar database stores up to 3 times as many records as row-based storage. This means you can read data with a third of the power needed in row-based data, among other advantages. The company Altinity is …

hodges data warehousing software engineering daily clickhouse

Apache Hudi: Large Scale Data Systems with Vinoth Chandar

May 13, 2021 · 46:06

Apache Hudi is an open-source data management framework used to simplify incremental data processing and data pipeline development. This framework more efficiently manages business requirements like data lifecycle and improves data quality. Some common use cases for Hudi are record-level insert, update, and delete, simplified file management and near real-time data access, and simplified CDC …

apache large scale data systems hudi software engineering daily

Akita: Application Programming Interfaces with Jean Yang

May 12, 2021 · 50:14

An application programming interface, API for short, is the connector between two applications. For example, a user interface that needs user data will call an endpoint, like a special URL, with request parameters and receive the data back if the request is valid. Modern applications rely on APIs to send data back and forth to …

modern application programming api apis interfaces akita software engineering daily jean yang

Nextmv: Optimization in Fluid Work Environments with Carolyn Mooney

May 11, 2021 · 48:47

The traveling salesman problem is a classic challenge of finding the shortest and most efficient route for a person to take given a list of destinations. This is one of many real-world optimization problems that companies encounter. How should they schedule product distribution, or promote product bundles, or define sales territories? The answers to these …

optimization environments fluid mooney software engineering daily
