Thiersch, JD, speaks with Alex Lirtsman, founder and CEO of CorralData, to explore how medical spas can unlock real-time, HIPAA-compliant insights without changing their existing systems. CorralData integrates everything from your EMR to marketing, payroll, and finance systems, giving med spas the ability to uncover actionable insights that drive profitability, patient retention, and scalable growth. Listen for strategies to ask smarter questions of your data, including:
- Integrating all of your existing platforms to get actionable insights from your data
- How multi-location practices, med spa rollups, and private equity develop playbooks
- Navigating HIPAA and BAAs with AI companies to create secure data analysis tools
- Using reverse ETL to optimize for high lifetime value patients and boost profitability
- The questions you can ask your data with conversational AI and large language models
- CorralData's tailor-made solutions for Advanced MedAesthetic Partners, and more!
-- Music by Ghost Score
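To make the reverse-ETL idea above concrete, here is a minimal sketch of the pattern: read a modeled segment out of the analytics warehouse and push it back into an operational marketing tool. This is illustrative only, not CorralData's implementation; the warehouse table, LTV threshold, and CRM endpoint are hypothetical, and a real med spa deployment would need a BAA and de-identification rules before moving any patient data.

```python
# Reverse ETL sketch: warehouse -> operational tool (all names hypothetical).
import os
import requests
import snowflake.connector

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
)

# Pull patients whose modeled lifetime value clears a threshold.
cur = conn.cursor()
cur.execute(
    "SELECT patient_id, email, lifetime_value "
    "FROM analytics.patient_ltv WHERE lifetime_value > 5000"
)

# Sync each high-LTV patient into the marketing platform's audience,
# so campaigns run where the team already works instead of in the warehouse.
for patient_id, email, ltv in cur:
    requests.post(
        "https://api.example-crm.com/v1/audiences/high-ltv/members",
        json={"external_id": patient_id, "email": email, "ltv": float(ltv)},
        timeout=10,
    )
```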
We all talk about #AI, but what good is it if your models are powered by stale, outdated data? In Episode 99 of Great Things with Great Tech, Deepti Srivastava, founder and CEO of Snow Leopard and former founding PM of Google Spanner, calls out the broken state of enterprise AI. With decades of experience in distributed systems and data infrastructure, Deepti unveils how Snow Leopard is redefining how AI applications are built by tapping into live, real-time data from SQL and APIs without the need for ETL or pipelines. Instead of relying on static snapshots or disconnected data lakes, Snow Leopard's #agentic platform queries native sources like PostgreSQL, Snowflake, and Salesforce on demand, putting AI directly in the critical decision path.
In This Episode, We Cover:
- Deepti's journey from building Spanner at Google to founding Snow Leopard AI
- Why most enterprise AI fails due to reliance on stale data and outdated pipelines
- How Snow Leopard federates live data across SQL and APIs with zero ETL
- The limitations of vector databases in structured, real-time business use cases
- Why putting AI in the critical path of business decisions unlocks real value
Snow Leopard is a U.S.-based technology company founded in 2023 by Deepti Srivastava and headquartered in San Francisco, California. Snow Leopard specializes in building a platform that enables the development of production-ready AI applications by leveraging live business data. The company's approach focuses on real-time data retrieval directly from sources like SQL databases and APIs, eliminating the need for traditional ETL processes and data pipelines. This allows for more accurate and timely AI-driven business decisions.
PODCAST LINKS
Great Things with Great Tech Podcast: https://gtwgt.com
GTwGT Playlist on YouTube: https://www.youtube.com/@GTwGTPodcast
Listen on Spotify: https://open.spotify.com/show/5Y1Fgl4DgGpFd5Z4dHulVX
Listen on Apple Podcasts: https://podcasts.apple.com/us/podcast/great-things-with-great-tech-podcast/id1519439787
EPISODE LINKS
Snow Leopard Web: https://www.snowleopard.ai/
Deepti Srivastava on LinkedIn: https://www.linkedin.com/in/thedeepti/
Snow Leopard on LinkedIn: https://www.linkedin.com/company/snow-leopard-ai/
GTwGT LINKS
Support the Channel: https://ko-fi.com/gtwgt
Be on #GTwGT: Contact via Twitter/X @GTwGTPodcast or visit https://www.gtwgt.com
Subscribe to YouTube: https://www.youtube.com/@GTwGTPodcast?sub_confirmation=1
Great Things with Great Tech Podcast Website: https://gtwgt.com
SOCIAL LINKS
Follow GTwGT on Social Media:
Twitter/X: https://twitter.com/GTwGTPodcast
Instagram: https://www.instagram.com/GTwGTPodcast
TikTok: https://www.tiktok.com/@GTwGTPodcast
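As a rough illustration of the live-retrieval pattern described above (a sketch, not Snow Leopard's actual code), the snippet below answers a question by querying the native sources at request time rather than reading a pre-built pipeline copy. The Postgres DSN, CRM endpoint, and field names are all assumptions.

```python
# Zero-ETL sketch: fetch fresh answers from native sources on demand.
import psycopg2
import requests

def open_orders_for_account(account_name: str) -> dict:
    # 1. Live lookup in the operational database; no snapshot or lake copy.
    with psycopg2.connect("dbname=sales host=localhost") as conn:
        with conn.cursor() as cur:
            cur.execute(
                "SELECT id, total FROM orders "
                "WHERE account = %s AND status = 'open'",
                (account_name,),
            )
            orders = cur.fetchall()

    # 2. Live lookup against a SaaS API (e.g., a CRM) for the same account.
    account = requests.get(
        "https://api.example-crm.com/accounts",
        params={"name": account_name},
        timeout=10,
    ).json()

    # 3. Hand both answers to the AI application, fresh as of right now.
    return {"account": account, "open_orders": orders}
```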
If you look closely at the Mona Lisa, Leonardo da Vinci's famous painting on display at the Louvre, one detail is immediately intriguing: she has neither eyebrows nor eyelashes. A face of incredible precision, a gaze that seems almost alive... but a completely bare brow. How can this absence be explained?
A Renaissance fashion?
For a long time, the missing eyebrows were thought to simply reflect the fashion of the era. In early sixteenth-century Italy, some aristocratic women plucked their eyebrows (and sometimes their hairline) to expose the forehead, which was then considered a sign of beauty and nobility. By this hypothesis, Mona Lisa (or Lisa Gherardini, according to the prevailing theory) may have followed this aesthetic trend.
But this explanation does not entirely hold up: other portraits of women from the same period clearly show eyebrows, even if thin or subtle. And would Leonardo da Vinci, known for his obsession with realism, really have deliberately omitted such a detail?
A gradual disappearance
The most credible explanation today rests on the painting's material history. The Mona Lisa is more than 500 years old, and over the centuries it has undergone restorations, cleanings, and varnishings that may have worn away its finest details.
A scientific study carried out by the specialist Pascal Cotte in 2004, using multispectral reflectography, revealed that Leonardo had originally painted eyebrows and eyelashes, very fine and delicate. But these details appear to have vanished over time, through natural wear of the paint layer or overly aggressive restorations. In short, the eyebrows were there, but they faded away over the centuries.
An effect that heightens the mystery
Paradoxically, the absence of eyebrows also contributes to the mystery and ambiguity of the Mona Lisa's face. Her indefinable expression, that blend of smile and neutrality, is reinforced by the lack of facial lines that would normally frame the gaze. This softness adds to the timeless, enigmatic character of a painting that has fascinated for centuries.
In short: the Mona Lisa probably had eyebrows, painted with Leonardo da Vinci's characteristic finesse. But time, restorations, and varnish erased them. This forgotten detail has become a key element of her mystery.
Hosted by Acast. See acast.com/privacy for more information.
✨ Heads up! This episode features a demonstration of the SnapLogic UI and its AI Agent Creator towards the end. For the full visual experience, check out the video version on the Spotify app! ✨
(Episode Summary)
Tired of tangled data spread across multiple clouds, on-premise systems, and the edge? In this episode, MongoDB's Shane McAllister sits down with Peter Ngai, Principal Architect at SnapLogic, to explore the future of data integration and management in today's complex tech landscape.
Dive into the challenges and solutions surrounding modern data architecture, including:
- Navigating the complexities of multi-cloud and hybrid cloud environments.
- The secrets to building flexible, resilient data ecosystems that avoid vendor lock-in.
- Strategies for seamless data integration and connecting disparate applications using low-code/no-code platforms like SnapLogic.
- Meeting critical data compliance, security, and sovereignty demands (think GDPR, HIPAA, etc.).
- How AI is revolutionizing data automation and providing faster access to insights (featuring SnapLogic's Agent Creator).
- The powerful synergy between SnapLogic and MongoDB, leveraging MongoDB both internally and for customer integrations.
- Real-world applications, from IoT data processing to simplifying enterprise workflows.
Whether you're an IT leader, data engineer, business analyst, or simply curious about cloud strategy, iPaaS solutions, AI in business, or simplifying your data stack, Peter offers invaluable insights into making data connectivity a driver, not a barrier, for innovation.
-
Keywords: Data Integration, Multi-Cloud, Hybrid Cloud, Edge Computing, SnapLogic, MongoDB, AI, Artificial Intelligence, Data Automation, iPaaS, Low-Code, No-Code, Data Architecture, Data Management, Cloud Data, Enterprise Data, API Integration, Data Compliance, Data Sovereignty, Data Security, Business Automation, ETL, ELT, Tech Stack Simplification, Peter Ngai, Shane McAllister.
In a new season of the Oracle University Podcast, Lois Houston and Nikita Abraham dive into the world of Oracle GoldenGate 23ai, a cutting-edge software solution for data management. They are joined by Nick Wagner, a seasoned expert in database replication, who provides a comprehensive overview of this powerful tool. Nick highlights GoldenGate's ability to ensure continuous operations by efficiently moving data between databases and platforms with minimal overhead. He emphasizes its role in enabling real-time analytics, enhancing data security, and reducing costs by offloading data to low-cost hardware. The discussion also covers GoldenGate's role in facilitating data sharing, improving operational efficiency, and reducing downtime during outages. Oracle GoldenGate 23ai: Fundamentals: https://mylearn.oracle.com/ou/course/oracle-goldengate-23ai-fundamentals/145884/237273 Oracle University Learning Community: https://education.oracle.com/ou-community LinkedIn: https://www.linkedin.com/showcase/oracle-university/ X: https://x.com/Oracle_Edu Special thanks to Arijit Ghosh, David Wright, Kris-Ann Nansen, Radhika Banka, and the OU Studio Team for helping us create this episode. --------------------------------------------------------------- Episode Transcript: 00:00 Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we'll bring you foundational training on the most popular Oracle technologies. Let's get started! 00:25 Nikita: Welcome to the Oracle University Podcast! I'm Nikita Abraham, Team Lead: Editorial Services with Oracle University, and with me is Lois Houston: Director of Innovation Programs. Lois: Hi everyone! Welcome to a new season of the podcast. This time, we're focusing on the fundamentals of Oracle GoldenGate. Oracle GoldenGate helps organizations manage and synchronize their data across diverse systems and databases in real time. And with the new Oracle GoldenGate 23ai release, we'll uncover the latest innovations and features that empower businesses to make the most of their data. Nikita: Taking us through this is Nick Wagner, Senior Director of Product Management for Oracle GoldenGate. He's been doing database replication for about 25 years and has been focused on GoldenGate on and off for about 20 of those years. 01:18 Lois: In today's episode, we'll ask Nick to give us a general overview of the product, along with some use cases and benefits. Hi Nick! To start with, why do customers need GoldenGate? Nick: Well, it delivers continuous operations, being able to continuously move data from one database to another database or data platform in an efficient, high-speed manner, and it does this with very low overhead. Almost all the GoldenGate environments use transaction logs to pull the data out of the system, so we're not creating any additional triggers, and there's very little overhead on that source system. GoldenGate can also enable real-time analytics. Being able to pull data from all these different databases and move them into your analytics system in real time can improve the value that those analytics systems provide. Being able to do real-time statistics and analysis of that data within those high-performance custom environments is really important. 02:13 Nikita: Does it offer any benefits in terms of cost? Nick: GoldenGate can also lower IT costs. A lot of times people run these massive OLTP databases, and they are running reporting in those same systems.
With GoldenGate, you can offload some of the data or all the data to low-cost commodity hardware where you can then run the reports on that other system. So, this way, you can get back that performance on the OLTP system, while at the same time optimizing your reporting environment for those long running reports. You can improve efficiencies and reduce risks. Being able to reduce the amount of downtime during planned and unplanned outages can really be a big benefit to the overall operational efficiencies of your company. 02:54 Nikita: What about when it comes to data sharing and data security? Nick: You can also reduce barriers to data sharing. Being able to pull subsets of data, or just specific pieces of data out of a production database and move it to the team or to the group that needs that information in real time is very important. And it also protects the security of your data by only moving the information that they need and not the entire database. It also provides extensibility and flexibility, being able to support multiple different replication topologies and architectures. 03:24 Lois: Can you tell us about some of the use cases of GoldenGate? Where does GoldenGate truly shine? Nick: Some of the more traditional use cases of GoldenGate include use within the multicloud fabric. Within a multicloud fabric, this essentially means that GoldenGate can replicate data between on-premise environments, within cloud environments, or hybrid, cloud to on-premise, on-premise to cloud, or even within multiple clouds. So, you can move data from AWS to Azure to OCI. You can also move between the systems themselves, so you don't have to use the same database in all the different clouds. For example, if you wanted to move data from AWS Postgres into Oracle running in OCI, you can do that using Oracle GoldenGate. We also support maximum availability architectures. And so, there's a lot of different use cases here, but primarily geared around reducing your recovery point objective and recovery time objective. 04:20 Lois: Ah, reducing RPO and RTO. That must have a significant advantage for the customer, right? Nick: So, reducing your RPO and RTO allows you to take advantage of some of the benefits of GoldenGate, being able to do active-active replication, being able to set up GoldenGate for high availability, real-time failover, and it can augment your active Data Guard and Data Guard configuration. So, a lot of times GoldenGate is used within Oracle's maximum availability architecture platinum tier level of replication, which means that at that point you've got lots of different capabilities within the Oracle Database itself. But to help eke out that last little bit of high availability, you want to set up an active-active environment with GoldenGate to really get true zero RPO and RTO. GoldenGate can also be used for data offloading and data hubs. Being able to pull data from one or more source systems and move it into a data hub, or into a data warehouse for your operational reporting. This could also be your analytics environment too. 05:22 Nikita: Does GoldenGate support online migrations? Nick: In fact, a lot of companies actually get started in GoldenGate by doing a migration from one platform to another. Now, these don't even have to be something as complex as going from one database like a DB2 on-premise into an Oracle on OCI; they can even be simple migrations.
A lot of times doing something like a major application or a major database version upgrade is going to take downtime on that production system. You can use GoldenGate to eliminate that downtime. So this could be going from Oracle 19c to Oracle 23ai, or going from application version 1.0 to application version 2.0, because GoldenGate can do the transformation between the different application schemas. You can use GoldenGate to migrate your database from on premise into the cloud with no downtime as well. We also support real-time analytic feeds, being able to go from multiple databases, not only those on premise, but being able to pull information from different SaaS applications inside of OCI and move it to your different analytic systems. And then, of course, we also have the ability to stream events and analytics within GoldenGate itself. 06:34 Lois: Let's move on to the various topologies supported by GoldenGate. I know GoldenGate supports many different platforms and can be used with just about any database. Nick: This first layer of topologies is what we usually consider relational database topologies. And so this would be moving data from Oracle to Oracle, Postgres to Oracle, Sybase to SQL Server, a lot of different types of databases. So the first architecture would be unidirectional. This is replicating from one source to one target. You can do this for reporting. If I wanted to offload some reports into another server, I can go ahead and do that using GoldenGate. I can replicate the entire database or just a subset of tables. I can also set up GoldenGate bidirectionally, and this is how I'd set up GoldenGate for something like high availability. So in the event that one of the servers crashes, I can almost immediately reconnect my users to the other system. And that almost immediately depends on the amount of latency that GoldenGate has at that time. So a typical latency is anywhere from 3 to 6 seconds. So after that primary system fails, I can reconnect my users to the other system in 3 to 6 seconds. And I can do that because as GoldenGate's applying data into that target database, that target system is already open for read and write activity. GoldenGate is just another user connecting in and issuing DML operations, and so it makes that failover time very low. 07:59 Nikita: Ok… If you can get it down to 3 to 6 seconds, can you bring it down to zero? Like zero failover time? Nick: That's the next topology, which is active-active. And in this scenario, all servers are read/write all at the same time and all available for user activity. And you can do multiple topologies with this as well. You can do a mesh architecture, which is where every server talks to every other server. This works really well for 2, 3, 4, maybe even 5 environments, but when you get beyond that, having every server communicate with every other server can get a little complex. And so at that point we start looking at doing what we call a hub and spoke architecture, where we have lots of different spokes. At the end of each spoke is a read/write database, and then those communicate with a hub. So any change that happens on one spoke gets sent into the hub, and then from the hub it gets sent out to all the other spokes. And through that architecture, it allows you to really scale up your environments. We have customers that are doing up to 150 spokes within that hub architecture.
Within active-active replication as well, we can do conflict detection and resolution, which means that if two users modify the same row on two different systems, GoldenGate can actually determine that there was an issue with that and determine which user wins or which row change wins, which is extremely important when doing active-active replication. And this means that if one of those systems fails, there is no downtime when you switch your users to another active system because it's already available for activity and ready to go. 09:35 Lois: Wow, that's fantastic. Ok, tell us more about the topologies. Nick: GoldenGate can do other things like broadcast, sending data from one system to multiple systems, or many to one as far as consolidation. We can also do cascading replication, so when data moves from one environment that GoldenGate is replicating into another environment that GoldenGate is replicating. By default, we ignore all of our own transactions. But there's actually a toggle switch that you can flip that says, hey, GoldenGate, even though you wrote that data into that database, still push it on to the next system. And then of course, we can also do distribution of data, and this is more like moving data from a relational database into something like a Kafka topic or a JMS queue or into some messaging service. 10:24 Raise your game with the Oracle Cloud Applications skills challenge. Get free training on Oracle Fusion Cloud Applications, Oracle Modern Best Practice, and Oracle Cloud Success Navigator. Pass the free Oracle Fusion Cloud Foundations Associate exam to earn a Foundations Associate certification. Plus, there's a chance to win awards and prizes throughout the challenge! What are you waiting for? Join the challenge today by visiting oracle.com/education. 10:58 Nikita: Welcome back! Nick, does GoldenGate also have nonrelational capabilities? Nick: We have a number of nonrelational replication events and topologies as well. This includes things like data lake ingestion and streaming ingestion, being able to move data and data objects from these different relational database platforms into data lakes and into these streaming systems where you can run analytics on them and run reports. We can also do cloud ingestion, being able to move data from these databases into different cloud environments. And this is not only just moving it into relational databases with those clouds, but also their data lakes and data fabrics. 11:38 Lois: You mentioned a messaging service earlier. Can you tell us more about that? Nick: Messaging replication is also possible. So we can actually capture from things like messaging systems like Kafka Connect and JMS, replicate that into a relational database, or simply stream it into another environment. We also support NoSQL replication, being able to capture from MongoDB and replicate it onto another MongoDB for high availability or disaster recovery, or simply into any other system. 12:06 Nikita: I see. And is there any integration with a customer's SaaS applications? Nick: GoldenGate also supports a number of different OCI SaaS applications. And so a lot of these different applications like Oracle Financials Fusion, Oracle Transportation Management, they all have GoldenGate built under the covers and can be enabled with a flag so that you can actually have that data sent out to your other GoldenGate environment. So you can actually subscribe to changes that are happening in these other systems with very little overhead.
And then of course, we have event processing and analytics, and this is the final topology or flexibility within GoldenGate itself. And this is being able to push data through data pipelines, doing data transformations. GoldenGate is not an ETL tool, but it can do row-level transformation and row-level filtering. 12:55 Lois: Are there integrations offered by Oracle GoldenGate in automation and artificial intelligence? Nick: We can do time series analysis and geofencing using the GoldenGate Stream Analytics product. It allows you to actually do real time analysis and time series analysis on data as it flows through the GoldenGate trails. And then that same product, the GoldenGate Stream Analytics, can then take the data and move it to predictive analytics, where you can run ML on it, or ONNX or other Spark-type technologies and do real-time analysis and AI on that information as it's flowing through. 13:29 Nikita: So, GoldenGate is extremely flexible. And given Oracle's focus on integrating AI into its product portfolio, what about GoldenGate? Does it offer any AI-related features, especially since the product name has “23ai” in it? Nick: With the advent of Oracle GoldenGate 23ai, it's one of the two products at this point that has the AI moniker at Oracle. Oracle Database 23ai also has it, and that means that we actually do stuff with AI. So the Oracle GoldenGate product can actually capture vectors from databases like MySQL HeatWave, Postgres using pgvector, which includes things like AlloyDB, Amazon RDS Postgres, Aurora Postgres. We can also replicate data into Elasticsearch and OpenSearch, or if the data is using vectors within OCI or the Oracle Database itself. So GoldenGate can be used for a number of things here. The first one is being able to migrate vectors into the Oracle Database. So if you're using something like Postgres, MySQL, and you want to migrate the vector information into the Oracle Database, you can. Now one thing to keep in mind here is a vector is oftentimes like a GPS coordinate. So if I need to know the GPS coordinates of Austin, Texas, I can put in a latitude and longitude and it will give me the GPS coordinates of a building within that city. But if I also need to know the altitude of that same building, well, that's going to be a different algorithm. And GoldenGate and replicating vectors is the same way. When you create a vector, it's essentially just creating a bunch of numbers behind the scenes, kind of like those same GPS coordinates. The dimension and the algorithm that you use to generate that vector can be different across different databases, and if they differ, the actual meaning of that data changes. And so GoldenGate can replicate the vector data as long as the algorithm and the dimensions are the same. If the algorithm and the dimensions are not the same between the source and the target, then you'll actually want GoldenGate to replicate the base data that created that vector. And then once GoldenGate replicates the base data, it'll actually call the vector embedding technology to re-embed that data and produce that numerical formatting for you. 15:42 Lois: So, there are some nuances there… Nick: GoldenGate can also replicate and consolidate vector changes or even do the embedding API calls itself. This is really nice because it means that we can take changes from multiple systems and consolidate them into a single one. We can also do the reverse of that too. A lot of customers are still trying to find out which algorithms work best for them.
How many dimensions? What's the optimal use? Well, you can now run those in different servers without impacting your actual AI system. Once you've identified which algorithm and dimension is going to be best for your data, you can then have GoldenGate replicate that into your production system and we'll start using that instead. So it's a nice way to switch algorithms without taking extensive downtime. 16:29 Nikita: What about in multicloud environments? Nick: GoldenGate can also do multicloud and N-way active-active Oracle replication between vectors. So if there's vectors in Oracle databases, in multiple clouds, or multiple on-premise databases, GoldenGate can synchronize them all up. And of course we can also stream changes from vector information, including text as well into different search engines. And that's where the integration with Elasticsearch and OpenSearch comes in. And then we can use things like NVIDIA and Cohere to actually do the AI on that data. 17:01 Lois: Using GoldenGate with AI in the database unlocks so many possibilities. Thanks for that detailed introduction to Oracle GoldenGate 23ai and its capabilities, Nick. Nikita: We've run out of time for today, but Nick will be back next week to talk about how GoldenGate has evolved over time and its latest features. And if you liked what you heard today, head over to mylearn.oracle.com and take a look at the Oracle GoldenGate 23ai Fundamentals course to learn more. Until next time, this is Nikita Abraham… Lois: And Lois Houston, signing off! 17:33 That's all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We'd also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.
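The re-embedding pattern Nick describes can be sketched in a few lines. Assuming a Postgres target with pgvector and a sentence-transformers model standing in for "the vector embedding technology" (both assumptions for illustration, not what GoldenGate ships), the idea is: replicate the base text, then regenerate the vector with the target's own algorithm, since vectors produced under a different model or dimension are not comparable.

```python
# Sketch: re-embed replicated base data on the target side (names assumed).
import psycopg2
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # the target's embedding model

def apply_replicated_row(conn, doc_id: int, body: str) -> None:
    # Re-embed with the target's algorithm instead of copying source vectors.
    vec = model.encode(body).tolist()
    literal = "[" + ",".join(str(x) for x in vec) + "]"  # pgvector text format
    with conn.cursor() as cur:
        cur.execute(
            "UPDATE documents SET body = %s, embedding = %s::vector WHERE id = %s",
            (body, literal, doc_id),
        )
    conn.commit()
```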
"So you want trusted data, but you want it now? Building this trust really starts with transparency and collaboration. It's not just technology. It's about creating a single governed view of data that is consistent no matter who accesses it, " says Errol Rodericks, Director of Product Marketing at Denodo.In this episode of the 'Don't Panic, It's Just Data' podcast, Shawn Rogers, CEO at BARC US, speaks with Errol Rodericks from Denodo. They explore the crucial link between trusted data and successful AI initiatives. They discuss key factors such as data orchestration, governance, and cost management within complex cloud environments. We've all heard the horror stories – AI projects that fail spectacularly, delivering biased or inaccurate results. But what's the root cause of these failures? More often than not, it's a lack of focus on the data itself. Rodericks emphasises that "AI is only as good as the data it's trained on." This episode explores how organisations can avoid the "garbage in, garbage out" scenario by prioritising data quality, lineage, and responsible AI practices. Learn how to avoid AI failures and discover strategies for building an AI-ready data foundation that ensures trusted, reliable outcomes. Key topics include overcoming data bias, ETL processes, and improving data sharing practices.TakeawaysBad data leads to bad AI outputs.Trust in data is essential for effective AI.Organisations must prioritise data quality and orchestration.Transparency and collaboration are key to building trust in data.Compliance is a responsibility for the entire organisation, not just IT.Agility in accessing data is crucial for AI success.Chapters00:00 The Importance of Data Quality in AI02:57 Building Trust in Data Ecosystems06:11 Navigating Complex Data Landscapes09:11 Top-Down Pressure for AI Strategy11:49 Responsible AI and Data Governance15:08 Challenges in Personalisation and Compliance17:47 The Role of Speed in Data Utilisation20:47 Advice for CFOs on AI InvestmentsAbout DenodoDenodo is a leader in data management. The award-winning Denodo Platform is the leading logical data management platform for transforming data into trustworthy insights and outcomes for all data-related initiatives across the enterprise, including AI and self-service. Denodo's customers in all industries all over the world have delivered trusted AI-ready and business-ready data in a third of the time and with 10x better performance than with lakehouses and other mainstream data platforms alone.
Ever wondered how companies like Amazon or Pinterest deliver lightning-fast image search? Dive into this episode of MongoDB Podcast Live with Shane McAllister and Nenad, a MongoDB Champion, as they unravel the magic of semantic image search powered by MongoDB Atlas Vector Search!
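As a taste of what that looks like in practice, here is a minimal sketch of semantic image search using MongoDB's $vectorSearch aggregation stage: embed the query text with a CLIP-style model, then search precomputed image embeddings. The index name, collection, and field names are placeholders you would adapt to your own Atlas setup.

```python
# Semantic image search sketch (index/collection/field names assumed).
from pymongo import MongoClient
from sentence_transformers import SentenceTransformer

client = MongoClient("mongodb+srv://...")      # your Atlas connection string
images = client["gallery"]["images"]
model = SentenceTransformer("clip-ViT-B-32")   # CLIP maps text and images into one space

def search_images(query: str, k: int = 5) -> list:
    query_vector = model.encode(query).tolist()
    return list(images.aggregate([
        {"$vectorSearch": {
            "index": "image_vector_index",  # Atlas Vector Search index name
            "path": "embedding",            # field holding image embeddings
            "queryVector": query_vector,
            "numCandidates": 100,           # ANN candidate pool before top-k
            "limit": k,
        }},
        {"$project": {"url": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]))
```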
What happens when you hand off your Power BI output to ChatGPT and ask it to make sense of your world? You might be surprised. This week, Rob shares a deeply personal use case. One that ties together two major themes we've been exploring: Gen AI is reshaping the way we think about dashboards. To get real value out of AI, you need more than just data. You need metadata. And yes, that kind of metadata—the kind you create in Power BI when you translate raw data into something meaningful. Along the way, we revisit the old guard of data warehousing. The mighty (and now dusty?) ETL priesthood. And we uncover a delicious little irony about how the future of data looks a lot like its past, just with better tools and smarter questions. The big twist? We're all ETL now. But the "T" might not mean what you think it does anymore. Listen now to find out how a few rows of carefully modeled data, a table visual, and one really good AI assistant changed the game. For Rob and, just possibly, for all of us. Also in this episode: Blind Melon – Change (YouTube) The Data Warehouse Toolkit Raw Data Episode - The Human Side of Data: Using Analytics for Personal Well-Being
In this engaging episode of the HVAC School Podcast, host Bryan sits down with Jesse from NAVAC to dive deep into the evolving landscape of refrigeration technology, focusing primarily on the transition to A2L refrigerants. The conversation offers a refreshingly pragmatic approach to addressing industry concerns about these new, mildly flammable refrigerants, dispelling myths and providing practical insights for HVAC technicians.
The discussion begins by addressing the most pressing question for many technicians: Do you need to buy all new tools to work with A2L refrigerants? Jesse from NAVAC provides a nuanced response, emphasizing that while there are currently no regulations mandating new equipment, the company has proactively developed tools that are safety-certified and compatible with the new refrigerant types. They explore the intricacies of safety certifications like UL and CSA, explaining the differences between UL Listed and UL Verified, and highlighting the importance of intrinsically safe equipment, especially for tools like vacuum pumps and recovery machines.
NAVAC's approach goes beyond mere product promotion, with Jesse positioning himself as an educator first. The podcast delves into the technical details of A2L refrigerants, challenging common misconceptions and providing context about their flammability. Bryan and Jesse draw parallels with previous refrigerant transitions, noting how technicians were initially skeptical about R-410A but eventually adapted. They emphasize the importance of best practices, proper training, and understanding the actual risks associated with these new refrigerants, rather than succumbing to fear-based narratives.
The episode also showcases NAVAC's latest technological innovations, including smart probes, a Bluetooth scale, a smart valve for charging and recovery, and an advanced vacuum pump with a one-touch oil testing feature. These tools represent the company's commitment to improving technician efficiency and safety, with features that address real-world challenges faced by HVAC professionals.
Key Topics Covered:
A2L Refrigerants
- Myths and misconceptions about flammability
- Comparison with previous refrigerant transitions
- Safety considerations and best practices
Safety Certifications
- Differences between UL Listed and UL Verified
- Importance of intrinsically safe equipment
- CSA and ETL certifications
NAVAC's New Tools
- Smart probes with Bluetooth connectivity
- Advanced vacuum pump with automatic oil testing
- Flex manifold with digital accuracy and analog feel
- Battery-operated pumps with improved run times
Industry Trends
- Preparation for A2L and future refrigerant transitions
- Regulatory changes and efficiency standards
- Importance of technician education and adaptation
Additional Insights:
- No current regulations require new tools for A2L refrigerants
- Proper training and best practices are crucial
- Technicians should focus on understanding new technologies
- Safety is about awareness and proper procedures, not fear
Have a question that you want us to answer on the podcast? Submit your questions at https://www.speakpipe.com/hvacschool.
Purchase your tickets or learn more about the 6th Annual HVACR Training Symposium at https://hvacrschool.com/symposium.
Subscribe to our podcast on your iPhone or Android. Subscribe to our YouTube channel. Check out our handy calculators here or on the HVAC School Mobile App for Apple and Android.
Companies have crucial data stored across multiple systems in their organization. And the bigger the company, the more systems there are. Sales, marketing, finance, ERP, inventory, contract management, billing, and service delivery are just some of the systems and data types that tell the story of the business. Many times your CRM by itself can't show all of that!
Because of this, many companies have started setting up a centralized data warehouse like Snowflake or Redshift to pull in data from their CRM and other systems, so they can run more advanced, centralized reporting across it all. But if you want to set this up, where do you start? How do you manage it? How do you protect it? How do you keep it maintained? How do you actually derive value from all the hard work of implementing it?
I invited Ryan Severns on to talk through all of these questions and more, starting with:
- Why set up a CRM data warehouse infrastructure in the first place?
- How do you build the integration?
- What important considerations do you need to plan for?
We dig deeper into the details from there, talking through topics like ETL tooling, dbt, data governance, cross-platform analytics, building effective business intelligence systems, and more.
If you're ready to level up your customer data skills and advance to the next level of your RevOps hero journey, this episode is for you! Give it a watch and a like. And hit that subscribe button so that you'll always get notified of future episodes of The RevOps Hero Podcast as well.
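For readers who want the shape of the thing before pressing play, here is a toy sketch of the extract-transform-load loop the episode walks through. The CRM endpoint and warehouse schema are hypothetical; in practice you would lean on an ETL tool (Fivetran, Airbyte, etc.) plus dbt for the modeling layer rather than hand-rolled scripts.

```python
# Toy ETL sketch: CRM API -> warehouse staging table (all names hypothetical).
import requests
import snowflake.connector

# Extract: fetch deals from the CRM's API.
deals = requests.get(
    "https://api.example-crm.com/v1/deals", timeout=30
).json()["results"]

# Transform: keep only the fields reporting needs, normalize the amounts.
rows = [
    (d["id"], d["stage"], round(float(d["amount"]), 2), d["closed_at"])
    for d in deals
]

# Load: land rows in a staging table for dbt (or similar) to model later.
conn = snowflake.connector.connect(account="...", user="...", password="...")
conn.cursor().executemany(
    "INSERT INTO staging.crm_deals (id, stage, amount, closed_at) "
    "VALUES (%s, %s, %s, %s)",
    rows,
)
```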
Easily access and unify your data for analytics and AI—no matter where it lives. With OneLake in Microsoft Fabric, you can connect to data across multiple clouds, databases, and formats without duplication. Use the OneLake catalog to quickly find and interact with your data, and let Copilot in Fabric help you transform and analyze it effortlessly. Eliminate barriers to working with your data using Shortcuts to virtualize external sources and Mirroring to keep databases and warehouses in sync—all without ETL. For deeper integration, leverage Data Factory's 180+ connectors to bring in structured, unstructured, and real-time streaming data at scale. Maraki Ketema from the Microsoft Fabric team shows how to combine these methods, ensuring fast, reliable access to quality data for analytics and AI workloads. ► QUICK LINKS: 00:00 - Access data wherever it lives 00:42 - Microsoft Fabric background 01:17 - Manage data with Microsoft Fabric 03:04 - Low latency 03:34 - How Shortcuts work 06:41 - Mirroring 08:10 - Open mirroring 08:40 - Low friction ways to bring data in 09:32 - Data Factory in Microsoft Fabric 10:52 - Build out your data flow 11:49 - Use built-in AI to ask questions of data 12:56 - OneLake catalog 13:36 - Data security & compliance 15:10 - Additional options to bring data in 15:42 - Wrap up ► Link References Watch our show on Real-Time Intelligence at https://aka.ms/MechanicsRTI Check out Open Mirroring at https://aka.ms/FabricOpenMirroring ► Unfamiliar with Microsoft Mechanics? As Microsoft's official video series for IT, you can watch and share valuable content and demos of current and upcoming tech from the people who build it at Microsoft. • Subscribe to our YouTube: https://www.youtube.com/c/MicrosoftMechanicsSeries • Talk with other IT Pros, join us on the Microsoft Tech Community: https://techcommunity.microsoft.com/t5/microsoft-mechanics-blog/bg-p/MicrosoftMechanicsBlog • Watch or listen from anywhere, subscribe to our podcast: https://microsoftmechanics.libsyn.com/podcast ► Keep getting this insider knowledge, join us on social: • Follow us on Twitter: https://twitter.com/MSFTMechanics • Share knowledge on LinkedIn: https://www.linkedin.com/company/microsoft-mechanics/ • Enjoy us on Instagram: https://www.instagram.com/msftmechanics/ • Loosen up with us on TikTok: https://www.tiktok.com/@msftmechanics
In this exciting episode, we sit down with Oleg Ovanesyan, Principal Product Manager, and Prithvi Khosla, Product Manager, to discuss the recently announced data workspace feature in Microsoft Dataverse. Oleg and Prithvi delve into the key aspects of the data workspace, including its AI-assisted mapping capabilities, seamless integration with external data sources, and enhanced data management processes. They also share insights on how this feature simplifies data migration, maintains security role privileges, and supports a no-ETL, no-copy approach to data import and export. Learn more about data workspace here: https://learn.microsoft.com/en-us/power-platform/release-plan/2024wave2/data-platform/data-workspace
Points of Interest
01:00 – 01:45 – Guest Introduction: Marcel welcomes back his co-founder, Ben Zittlau, highlighting his expertise in data operations and agency growth strategies.
03:35 – 05:50 – Scaling Data Operations: Lessons from Jobber: Ben shares insights from his experience at Jobber, detailing the challenges of building and scaling a data operations team in a fast-growing company.
08:43 – 13:14 – Why Agencies Struggle to Get Insights from Their Data: Discussion on how agencies collect vast amounts of data across multiple tools but fail to derive meaningful insights due to fragmentation and inconsistency.
13:15 – 16:05 – The Myth of a "Single Source of Truth": Marcel and Ben challenge the common belief that pushing all data into a single platform solves reporting issues, highlighting the reality of messy operational data.
19:26 – 21:48 – The Limitations of All-in-One Software Solutions: Exploring why all-in-one agency management tools often fail to deliver on their promise of seamless reporting and data integration.
24:43 – 27:44 – The Hidden Costs of Locking into a Single Platform: Discussion on how agencies become "trapped" by software providers, making it difficult to switch tools without major operational disruptions.
30:29 – 35:53 – How to Integrate Data Without Sacrificing Flexibility: A deep dive into the challenges of stitching data from various tools while maintaining adaptability and historical accuracy.
35:54 – 41:14 – Accuracy vs. Precision: Why Clean Data is a Myth: Why agencies should focus on broader trends instead of pursuing impossible data perfection, and how to handle data inconsistencies effectively.
41:15 – 44:15 – A Modern Data Approach: Extract, Transform, Load (ETL): Introduction to the ETL process, which allows agencies to clean and transform data before reporting, improving reliability and flexibility.
50:14 – 52:00 – Lessons from Finance: What Agencies Can Learn from Accounting: Marcel compares data operations to bookkeeping, explaining how structured financial workflows can serve as a model for better agency data management.
Show Notes
Connect with Ben via LinkedIn
Free Toolkit
Parakeeto Foundations Course
Love the Podcast
Leave us a review here.
Once you have your data stored in OneLake, you'll be ready to start transforming it to improve its usability, accuracy, and efficiency. In this episode of the podcast, Belinda Allen takes us on a delightful journey through Data Flows, Power Query, Azure Data Factory (Pipelines), and discusses the merits of shortcuts. We also learn about a handy way to manually upload a table if you have some static data you need to update. There are many tools and techniques that can be used for data ingestion and transformations. And while choosing among the options we discuss will come down to individual preference, there are pros and cons to each. One of the blessings and curses of Fabric is that there are many ways of achieving the same result, so what you choose may depend on the nature of the data you have and your goals, but might also be dictated by personal experience. We hope you enjoyed this conversation with Belinda on ingesting and transforming data in Microsoft Fabric. If you have questions or comments, please send them our way. We would love to answer your questions on a future episode. Leave us a comment and some love ❤️ on LinkedIn, X, Facebook, or Instagram. The show notes for today's episode can be found at Episode 284: The Four-Letter Word ETL - Data Movement. Have fun on the SQL Trail!
I had never heard of data debt until I saw this article on the topic. In reading it, I couldn't help thinking that most everyone has data debt, it creates inefficiencies, and it's unlikely we'll get rid of it. And by the way, it's too late to get this under control. I somewhat dismissed the article when I saw this: "addressing data debt in its early stages is crucial to ensure that it does not become an overwhelming barrier to progress." I know it's a barrier, as I assume most of you also know, but it's also not stopping us. We keep building more apps, databases, and systems, and accruing more data debt. Somehow, most organizations keep running. The description of debt might help here. How many of you have inconsistent data standards, where you might define a data element differently in different databases? Maybe you have duplicated data that is slow to update (think ETL/warehouses), maybe you have different ways of tracking a completed sale in different systems. Maybe you even store dates in different formats (int, string, or something weirder). How many of you lack some documentation on what the columns in your databases mean? Maybe I should ask the reverse, where the few of you who have complete data dictionaries can raise your hands. Read the rest of Data Debt
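The date-format flavor of data debt is easy to make concrete. Below is a small sketch of the shim every consumer ends up writing when the "same" date arrives as an int in one system, a string in another, and a timestamp in a third (the specific formats are assumptions for illustration):

```python
# Normalizing the same date stored three different ways across systems.
from datetime import date, datetime

def normalize_date(value) -> date:
    """Coerce the mixed date representations found across systems."""
    if isinstance(value, int):        # e.g., 20240131 stored as an int key
        return datetime.strptime(str(value), "%Y%m%d").date()
    if isinstance(value, str):        # '2024-01-31' or a full ISO timestamp
        return datetime.fromisoformat(value).date()
    if isinstance(value, datetime):   # check datetime first: it subclasses date
        return value.date()
    if isinstance(value, date):
        return value
    raise TypeError(f"unsupported date representation: {value!r}")

# Three systems, three encodings of the same day:
assert (normalize_date(20240131)
        == normalize_date("2024-01-31")
        == normalize_date(datetime(2024, 1, 31, 9, 30)))
```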
In this episode my guest is Can Göktuğ Özdem, founder of Datrick. Göktuğ is a data engineer who has just returned from Silicon Valley; while talking about his experiences there, I heard things I wanted to share with you, so we recorded this episode. On this latest trip, Göktuğ witnessed first-hand the intensity of activity in the AI space. Describing the events and the general atmosphere, he emphasized how impressive the rapid development in this field is. We discussed the effects of the shift from blockchain, the hot topic of his visit two years ago, to AI, and what the future may hold. We talked about how AI and data are an inseparable pair, and about the critical importance of the right data for AI applications. We covered the role of data engineering and key concepts like ETL. We discussed what might change in the business world with the rise of AI agents, which professions could be affected, and how companies can adapt to this change. The prediction that "whoever has the best AI will win" lays bare the scale of the competition to come. Göktuğ also talked about the dynamic ecosystem in Silicon Valley, the events, hackathons, and sponsorships, and I asked him what Turkey could learn from it. Finally, Göktuğ offered valuable advice to young people about their careers and futures. He stressed that instead of blindly chasing trends, they should listen to their inner voice, make long-term plans, and not be afraid of taking risks. If you're curious about AI, technology trends, entrepreneurship, and the Silicon Valley ecosystem, this episode is for you.
(02:14) Why he went to Silicon Valley
(06:00) How he planned the trip
(08:00) The difference between America and Europe
(09:40) The advantages of being in Silicon Valley
(11:57) What Turkey can learn (the hackathon example)
(15:54) AI agents
(19:52) Cleaning and using data
(21:57) The latest developments in AI
(27:20) Datrick's future plans
(28:04) Göktuğ's formula for creating value
Can Göktuğ Özdem's LinkedIn profile: https://www.linkedin.com/in/goktugozdem/
Support the show
News includes the official release of Elixir 1.18.0 with enhanced type system support, José Valim's retrospective on Elixir's progress in 2024, LiveView Native's significant v0.4.0-rc.0 release with a new networking stack, ExDoc v0.36's introduction of swup.js for smoother page navigations, the announcement of a new Elixir conference called Goatmire in Sweden, and more! Show Notes online - http://podcast.thinkingelixir.com/235 (http://podcast.thinkingelixir.com/235) Elixir Community News https://elixir-lang.org/blog/2024/12/19/elixir-v1-18-0-released/ (https://elixir-lang.org/blog/2024/12/19/elixir-v1-18-0-released/?utm_source=thinkingelixir&utm_medium=shownotes) – Official Elixir 1.18.0 release announcement https://github.com/elixir-lang/elixir/blob/v1.18/CHANGELOG.md (https://github.com/elixir-lang/elixir/blob/v1.18/CHANGELOG.md?utm_source=thinkingelixir&utm_medium=shownotes) – Changelog for Elixir 1.18.0 release https://bsky.app/profile/david.bernheisel.com/post/3leetmgvihk2a (https://bsky.app/profile/david.bernheisel.com/post/3leetmgvihk2a?utm_source=thinkingelixir&utm_medium=shownotes) – Details about upcoming Elixir 1.19 type checking capabilities for protocols https://bsky.app/profile/josevalim.bsky.social/post/3ldyphlun4c2z (https://bsky.app/profile/josevalim.bsky.social/post/3ldyphlun4c2z?utm_source=thinkingelixir&utm_medium=shownotes) – José Valim's retrospective on Elixir's progress in 2024, highlighting type system improvements and project releases https://github.com/liveview-native/live_view_native/releases (https://github.com/liveview-native/live_view_native/releases?utm_source=thinkingelixir&utm_medium=shownotes) – LiveView Native v0.4.0-rc.0 release announcement https://x.com/liveviewnative/status/1869081462659809771 (https://x.com/liveviewnative/status/1869081462659809771?utm_source=thinkingelixir&utm_medium=shownotes) – Twitter announcement about LiveView Native release https://github.com/liveview-native/live_view_native/blob/main/CHANGELOG.md (https://github.com/liveview-native/live_view_native/blob/main/CHANGELOG.md?utm_source=thinkingelixir&utm_medium=shownotes) – Changelog for LiveView Native v0.4.0-rc.0 https://bsky.app/profile/josevalim.bsky.social/post/3le25qqcfh22x (https://bsky.app/profile/josevalim.bsky.social/post/3le25qqcfh22x?utm_source=thinkingelixir&utm_medium=shownotes) – ExDoc v0.36 release announcement introducing swup.js for navigation https://github.com/swup/swup (https://github.com/swup/swup?utm_source=thinkingelixir&utm_medium=shownotes) – Swup.js GitHub repository https://swup.js.org/ (https://swup.js.org/?utm_source=thinkingelixir&utm_medium=shownotes) – Swup.js documentation https://swup.js.org/getting-started/demos/ (https://swup.js.org/getting-started/demos/?utm_source=thinkingelixir&utm_medium=shownotes) – Swup.js demos showing page transition capabilities https://github.com/hexpm/hexdocs/pull/44 (https://github.com/hexpm/hexdocs/pull/44?utm_source=thinkingelixir&utm_medium=shownotes) – Pull request for cross-package function search in ExDoc using Typesense https://github.com/elixir-lang/ex_doc/issues/1811 (https://github.com/elixir-lang/ex_doc/issues/1811?utm_source=thinkingelixir&utm_medium=shownotes) – Related issue for cross-package function search feature https://bsky.app/profile/tylerayoung.com/post/3lejnfttgok2u (https://bsky.app/profile/tylerayoung.com/post/3lejnfttgok2u?utm_source=thinkingelixir&utm_medium=shownotes) – Announcement of parameterized_test v0.6.0 with improved failure messages
https://hexdocs.pm/phoenix_test/changelog.html#0-5-1 (https://hexdocs.pm/phoenix_test/changelog.html#0-5-1?utm_source=thinkingelixir&utm_medium=shownotes) – phoenix_test v0.5.1 changelog with new assertion helpers https://x.com/germsvel/status/1873732271611469976 (https://x.com/germsvel/status/1873732271611469976?utm_source=thinkingelixir&utm_medium=shownotes) – Twitter announcement about phoenix_test updates https://x.com/ElixirConf/status/1873445096773111848 (https://x.com/ElixirConf/status/1873445096773111848?utm_source=thinkingelixir&utm_medium=shownotes) – Announcement of new ElixirConf US 2024 videos https://www.youtube.com/playlist?list=PLqj39LCvnOWbW2Zli4LurDGc6lL5ij-9Y (https://www.youtube.com/playlist?list=PLqj39LCvnOWbW2Zli4LurDGc6lL5ij-9Y?utm_source=thinkingelixir&utm_medium=shownotes) – YouTube playlist of ElixirConf US 2024 talks https://x.com/TylerAYoung/status/1873798040525693040 (https://x.com/TylerAYoung/status/1873798040525693040?utm_source=thinkingelixir&utm_medium=shownotes) – Recommendation for David's ETL talk at ElixirConf https://goatmire.com/ (https://goatmire.com/?utm_source=thinkingelixir&utm_medium=shownotes) – New Elixir conference "Goatmire" announced in Sweden https://bsky.app/profile/lawik.bsky.social/post/3ldougsbvhk2s (https://bsky.app/profile/lawik.bsky.social/post/3ldougsbvhk2s?utm_source=thinkingelixir&utm_medium=shownotes) – Lars Wikman's announcement about Goatmire conference Do you have some Elixir news to share? Tell us at @ThinkingElixir (https://twitter.com/ThinkingElixir) or email at show@thinkingelixir.com (mailto:show@thinkingelixir.com) Find us online - Message the show - Bluesky (https://bsky.app/profile/thinkingelixir.com) - Message the show - X (https://x.com/ThinkingElixir) - Message the show on Fediverse - @ThinkingElixir@genserver.social (https://genserver.social/ThinkingElixir) - Email the show - show@thinkingelixir.com (mailto:show@thinkingelixir.com) - Mark Ericksen on X - @brainlid (https://x.com/brainlid) - Mark Ericksen on Bluesky - @brainlid.bsky.social (https://bsky.app/profile/brainlid.bsky.social) - Mark Ericksen on Fediverse - @brainlid@genserver.social (https://genserver.social/brainlid) - David Bernheisel on Bluesky - @david.bernheisel.com (https://bsky.app/profile/david.bernheisel.com) - David Bernheisel on Fediverse - @dbern@genserver.social (https://genserver.social/dbern)
What goes into a successful business and technology podcast like ThinkCast? Gartner's premiere podcast boasts more than 10,000 subscribers in 130 plus countries, and it has been downloaded more than 700,000 times. ThinkCast co-founder and former host Scott Smith is here to describe the keys to producing a great business podcast. Today on ETL, he goes over how to define your audience, form a strategy for success, and pick the right host for your podcast. Scott is an expert storyteller who worked at Gartner for over 26 years, beginning as managing editor for Talking Technology, the IT research and consulting firm's audio news magazine. He founded ThinkCast in 2016 and hosted it for three years. “Blueprint” by Jahzzar is licensed under CC BY-SA 4.0. Music set to dialogue. https://freemusicarchive.org/music/Jahzzar/Ashes_1206/blueprint/ https://freemusicarchive.org/music/Jahzzar/ https://creativecommons.org/licenses/by-sa/4.0/
Dave and Johnny run Estuary, a data integration company focused on real-time ETL and ELT. We're also friends, so we decided to have a chat. In this episode, we chat about the current state of the data integration space, running a startup while raising kids, and much more. Estuary
Deepti Srivastava is the Founder and CEO of Snow Leopard. We dive into Snow Leopard's innovative approach to data integration, exploring its live data access model that bypasses traditional ETL pipelines to offer real-time data retrieval directly from source systems.Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/Subscribe: Apple • Spotify • Overcast • Pocket Casts • AntennaPod • Podcast Addict • Amazon • RSS.Detailed show notes - with links to many references - can be found on The Data Exchange web site.
We have a full slate of upcoming events: AI Engineer London, AWS Re:Invent in Las Vegas, and now Latent Space LIVE! at NeurIPS in Vancouver and online. Sign up to join and speak!
We are still taking questions for our next big recap episode! Submit questions and messages on Speakpipe here for a chance to appear on the show!
We try to stay close to the inference providers as part of our coverage, as our podcasts with Together AI and Replicate will attest. However, one of the most notable pull quotes from our very well received Braintrust episode was the guest's view that open source model adoption has NOT gone very well and is actually declining in relative market share terms (it is of course increasing in absolute terms):
Today's guest, Lin Qiao, would wholly disagree. Her team of PyTorch/GPU experts is wholly dedicated to helping you serve and finetune the full stack of open source models from Meta and others, across all modalities (Text, Audio, Image, Embedding, Vision-understanding), helping customers like Cursor and HubSpot scale up open source model inference both rapidly and affordably.
Fireworks has emerged after its successive funding rounds with top-tier VCs as one of the leaders of the Compound AI movement, a term first coined by the Databricks/Mosaic gang at Berkeley AI and adapted as "Composite AI" by Gartner.
Replicating o1
We are the first podcast to discuss Fireworks' f1, their proprietary replication of OpenAI's o1. This has become a surprisingly hot area of competition in the past week as both Nous Forge and Deepseek r1 have launched competitive models.
Full Video Podcast
Like and subscribe!
Timestamps
* 00:00:00 Introductions
* 00:02:08 Pre-history of Fireworks and PyTorch at Meta
* 00:09:49 Product Strategy: From Framework to Model Library
* 00:13:01 Compound AI Concept and Industry Dynamics
* 00:20:07 Fireworks' Distributed Inference Engine
* 00:22:58 OSS Model Support and Competitive Strategy
* 00:29:46 Declarative System Approach in AI
* 00:31:00 Can OSS replicate o1?
* 00:36:51 Fireworks f1
* 00:41:03 Collaboration with Cursor and Speculative Decoding
* 00:46:44 Fireworks quantization (and drama around it)
* 00:49:38 Pricing Strategy
* 00:51:51 Underrated Features of Fireworks Platform
* 00:55:17 Hiring
Transcript
Alessio [00:00:00]: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.
Swyx [00:00:11]: Hey, and today we're in a very special studio inside the Fireworks office with Lin Qiao, CEO of Fireworks. Welcome. Yeah.
Lin [00:00:20]: Oh, you should welcome us.
Swyx [00:00:21]: Yeah, welcome. Yeah, thanks for having us. It's unusual to be in the home of a startup, but it's also, I think our relationship is a bit unusual compared to all our normal guests. Definitely.
Lin [00:00:34]: Yeah. I'm super excited to talk about very interesting topics in that space with both of you.
Swyx [00:00:41]: You just celebrated your two-year anniversary yesterday.
Lin [00:00:43]: Yeah, it's quite a crazy journey. We circle around and share all the crazy stories across these two years, and it has been super fun. All the way from we experienced the Silicon Valley bank run to we deleted some data that shouldn't be deleted operationally. We went through a massive scale where we actually are busy getting capacity to, yeah, we learned to kind of work with it as a team with a lot of brilliant people across different places to join a company.
It has really been a fun journey.Alessio [00:01:24]: When you started, did you think the technical stuff would be harder, or the bank run and the people side? There are a lot of amazing researchers that want to do companies, and the assumption is that the hardest thing will be building the product, and then you have all these different other things. So, what has surprised you the most?Lin [00:01:42]: Yeah, to be honest with you, my focus has always been on the product side and then, after that, on go-to-market. And I didn't realize the rest would be so complicated: operating a company and so on. But because I don't overthink it, I just manage it, and it gets done. I somehow don't think about it too much, solve whatever problem comes our way, and it works.Swyx [00:02:08]: So let's, I guess, start at the pre-history, the initial history of Fireworks. You ran the PyTorch team at Meta for a number of years, and we previously had Soumith Chintala on, and I think we're all very interested in the history of GenAI. Maybe not that many people know how deeply involved FAIR and Meta were prior to the current GenAI revolution.Lin [00:02:35]: My background is deep in distributed systems and database management systems. I joined Meta on the data side, and I saw this tremendous amount of data growth, which cost a lot of money, and we were analyzing what was going on. And it was clear that AI was driving all this data generation. It was a very interesting time, because when I joined, Meta was finishing the mobile-first transition and then starting AI-first. And there's a fundamental reason for that sequence: mobile-first gave a full range of user engagement that had never existed before. All this user engagement generated a lot of data, and this data powers AI. The entire industry was following this same transition. When I saw that AI was powering all this data generation, and looked at where our AI stack was, there was no software, no hardware, no people, no team. I wanted to dive in and help this movement. So when I started, it was a very interesting industry landscape. There was a proliferation of AI frameworks happening in the industry. But all those AI frameworks focused on production, and they used a very particular way of defining the graph of the neural network, and then used that to drive model iteration and productionization. PyTorch was completely different: its creator was himself the user of his own product. He basically said, researchers face so much pain using existing AI frameworks, they're really hard to use, and I'm going to do something different for myself. And that's the origin story of PyTorch. PyTorch actually started as the framework for researchers. It didn't care about production at all. And as it grew in adoption, well, the interesting part of AI is that research sits at the top of the funnel for production. There are so many researchers across academia and industry; they innovate, they put their results out there in open source, and that powers the downstream productionization. So it was brilliant for Meta to establish PyTorch as a strategy to drive massive adoption in open source, because Meta internally is a PyTorch shop. It creates a flywheel effect. So that's kind of the strategy behind PyTorch.
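A quick aside for readers who have never touched it: the "researcher-first" design Lin credits for PyTorch's win is its define-by-run style, where the graph is just the ordinary Python you execute. A minimal sketch in standard PyTorch (nothing Fireworks-specific):

```python
# A tiny define-by-run example: the model's "graph" is just the Python
# you run, so you can print, branch, and debug like normal code.
import torch

model = torch.nn.Linear(4, 1)                      # a one-layer toy model
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(8, 4)                              # fake batch of inputs
y = torch.randn(8, 1)                              # fake targets

for step in range(3):
    loss = torch.nn.functional.mse_loss(model(x), y)
    opt.zero_grad()
    loss.backward()        # autograd differentiates the ops just executed
    opt.step()
    print(step, loss.item())
```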
But when I took on PyTorch, it was at a cusp: Meta had established PyTorch as the framework for both research and production. No one had done that before. And we had to rethink how to architect PyTorch so it could really sustain production workloads: the stability, reliability, low latency, all these production concerns that were never a concern before. Now they were. We actually had to adjust its design and make it work for both sides. And that took us five years, because Meta has so many AI use cases: all the way from ranking and recommendation powering the business top line, whether ranking the newsfeed or ranking videos; to site integrity, detecting bad content automatically using AI; to all kinds of effects, translation, image classification, object detection, all of this. And also AI running on the server side, on mobile phones, on AR/VR devices, a wide spectrum. By the end, we had basically managed to support AI ubiquitously across Meta. But interestingly, through open source engagement, we worked with a lot of companies, and it became clear to us that the industry was starting its AI-first transition. Of course, Meta's hyperscale always runs ahead of the industry. And it felt like when we started this AI journey at Meta, with no software, no hardware, no team, many companies we engaged with through PyTorch were feeling that same pain. That's the genesis of why we felt, hey, if we create Fireworks and support the industry going through this transition, it will have a huge amount of impact. Of course, the problems the industry faces are not the same as Meta's. Meta is so big, right? So it's skewed towards extreme scale and extreme optimization, and the industry will be different. But we felt we had the technical chops and we'd seen a lot, and we looked to drive that. So yeah, that's how we started.Swyx [00:06:58]: When you and I chatted about the origins of Fireworks, it was originally envisioned more as a PyTorch platform, and then later became much more focused on generative AI. Is that fair to say? What was the customer discovery here?Lin [00:07:13]: Right. So I would say our initial blueprint was that we should build a PyTorch cloud, because PyTorch is a library, and there was no SaaS platform to enable AI workloads.Swyx [00:07:26]: Even in 2022, it's interesting.Lin [00:07:28]: I would not say absolutely none; cloud providers had some of that, but it was not a first-class citizen, right? In 2022, TensorFlow was still massively in production. And this was all pre-GenAI, with PyTorch getting more and more adoption. But there was no PyTorch-first SaaS platform in existence. At the same time, we are a very pragmatic set of people. We really wanted to make sure that from the get-go we got really, really close to customers: we understand their use cases, we understand their pain points, we understand the value we deliver to them. So we wanted to take a different approach. Instead of building a horizontal PyTorch cloud, we wanted to build a verticalized platform first. And then we talked with many customers. Interestingly, we started the company in September 2022, and in October-November, OpenAI announced ChatGPT. And then boom, when we talked with many customers, they were like, can you help us work on the GenAI aspect? Of course, there were some open source models. They weren't as good at that time, but people were already putting a lot of attention there. So we decided that if we were going to pick a vertical, we would pick GenAI.
The other reason is that all GenAI models are PyTorch models. So that's another reason. We believed that, because of the nature of GenAI, which generates a lot of human-consumable content, it would drive a lot of consumer- and developer-facing application and product innovation. Guaranteed. We're just at the beginning of this. Our prediction was that for those kinds of applications, inference is much more important than training, because inference scale is proportional to, at the upper limit, the world population, while training scale is proportional to the number of researchers. Of course, each training run can be very expensive. Although PyTorch supports both inference and training, we decided to laser-focus on inference. So yeah, that's how we got started. And we launched our public platform in August last year. When we launched, it was a single product: a distributed inference engine with a simple, OpenAI-compatible API and many models. We started with LLMs and then added a lot more models. Fast forward to now, we are a full platform with multiple product lines. I'd love to dive deep into what we offer, but it's been a very fun journey over the past two years.Alessio [00:09:49]: What was that transition like? You started focused on PyTorch, with people wanting to understand the framework and get it live, and now maybe most people that use you don't really know much about PyTorch at all; they're just trying to consume a model. From a product perspective, what were some of the decisions early on? Right in October-November, were you just like, hey, most people just care about the model, not about the framework, we're going to make it super easy? Or was it more a gradual transition to the model librarySwyx [00:10:16]: you have today?Lin [00:10:17]: Yeah. So our product decisions are all based on who our ICP is. And one thing I want to acknowledge here is that GenAI technology is disruptive. It's very different from AI before GenAI; it's a clear leap forward. Before GenAI, companies that wanted to invest in AI had to train from scratch. There was no other way. There was no foundation model; it didn't exist. So that meant, to start, first hire a team capable of crunching data. There's a lot of data to crunch, right? Because training from scratch, you have to prepare a lot of data. And then they need GPUs to train on, and then you start to manage GPUs. So it becomes a very complex project. It takes a long time, and not many companies can actually afford it. GenAI is a very different game right now, because foundation models exist. You don't have to train from scratch anymore, and that makes AI much more accessible as a technology. An app developer or product manager, even a non-developer, can interact with GenAI models directly. So our goal is to make AI accessible to all app developers and product engineers. That's our goal. Getting them into building models doesn't make any sense anymore with this new technology; building easy, accessible APIs is what matters most. Early on, when we got started, we decided we were going to be OpenAI-compatible. It's just very easy for developers to adopt this new technology, and we manage the underlying complexity of serving all these models.Swyx [00:11:56]: Yeah, OpenAI has become the standard. Even as we're recording today, Gemini announced that they have OpenAI-compatible APIs. Interesting.
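For context on what "OpenAI-compatible" buys developers in practice: the standard OpenAI client can be pointed at another provider by swapping the base URL. A hedged sketch; the base URL and model id below follow Fireworks' public documentation conventions at the time of writing, but treat them as illustrative rather than authoritative:

```python
# Sketch: pointing the standard OpenAI client at an OpenAI-compatible
# provider by swapping the base URL. Values below are illustrative.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",  # per Fireworks docs; verify before use
    api_key="YOUR_FIREWORKS_API_KEY",
)

resp = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p1-8b-instruct",  # illustrative model id
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(resp.choices[0].message.content)
```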
So you just need to drop in the new endpoint, and then you have everyone falling in line.Lin [00:12:09]: That's interesting, because we are working very closely with Meta as one of our partners. Meta, of course, has been very generous, donating many very, very strong open source models, with more expected to come. But they have also announced Llama Stack, which basically standardizes the upper-level stack built on top of Llama models. They don't just want to give out models and have you figure out what the upper stack is. They instead want to build a community around the stack and build a new standard. I think there are interesting dynamics in play in the industry right now: whether it standardizes around OpenAI, because they are creating the top of the funnel, or standardizes around Llama, because it is the most used open source model. So I think it's a lot of fun working at this time.Swyx [00:13:01]: I've been a little bit more doubtful on Llama Stack; I think you've been more positive. Basically it's just like the Meta version of whatever Hugging Face offers, you know, or TensorRT, or vLLM, or whatever the open source opportunity is. But to me, it's not clear that just because Meta open-sources Llama, the rest of Llama Stack will be adopted. And it's not clear why I should adopt it. So I don't know if you agree.Lin [00:13:27]: It's very early right now. That's why I work very closely with them and give them feedback. The feedback to the Meta team is very important, so they can use it to continue to improve the models and also improve the higher-level stack. I think the success of Llama Stack heavily depends on community adoption, and there's no way around that. And I know the Meta team would like to work with a broader set of the community. But it's very early.Swyx [00:13:52]: One thing that, after your Series B, so you raised from Benchmark, and then Sequoia. I remember, being close to you for at least your Series B announcements, you started betting heavily on this term of Compound AI. It's not a term that we've covered very much in the podcast, but I think it's definitely getting a lot of adoption from Databricks and Berkeley people and all that. What's your take on Compound AI? Why is it resonating with people?Lin [00:14:16]: Right. So let me give a little bit of context on why we even considered that space.Swyx [00:14:22]: Because like pre-Series B, there was no message, and now it's like on your landing page.Lin [00:14:27]: It's been a very organic evolution from when we first launched our public platform. We were a single product: a distributed inference engine, where we did a lot of innovation, customized CUDA kernels and low-level kernels running on different kinds of hardware, distributed disaggregated inference execution, and all kinds of caching. So that's one product line: the fastest, most cost-efficient inference platform. Because we wrote PyTorch code, we basically have a special PyTorch build for that, together with the custom kernels we wrote. And then, as we worked with many more customers, we realized, oh, the distributed inference engine, our design, was one size fits all. We wanted to have this inference endpoint where everyone comes in and, no matter what form and shape of workload they have, it just works for them. So that's great. But the reality is, we realized, all customers have different kinds of use cases. The use cases come in all different forms and shapes.
And the end result is that the data distribution in their inference workload doesn't align with the data distribution in the training data for the model. That's a given, actually, if you think about it, because researchers have to guesstimate what is important and what's not important in preparing data for training. Because of that misalignment, a lot of quality, latency, and cost improvement is left on the table. So we decided to invest heavily in a customization engine, which we announced as Fire Optimizer. Fire Optimizer basically helps users navigate a three-dimensional optimization space across quality, latency, and cost. It's a three-dimensional curve, and even within one company, different use cases want to land in different spots. We automate that process for our customers. It's very simple: you have your inference workload, you inject it into the optimizer along with your objective function, and we spit out an inference deployment config and the model setup. It's your customized setup. So that is a completely different product; that product thinking is one size fits one. And on top of that, we provide a huge variety of state-of-the-art models, hundreds of them, starting from text, with large state-of-the-art language models. That's where we started. And as we talked with many customers, we realized, oh, audio and text are very, very close. Many of our customers started to build assistants, all kinds of assistants, using text, and they immediately wanted to add audio: audio in, audio out. So we support transcription, translation, speech synthesis, text-audio alignment, all different kinds of audio features. It's a big announcement; you should have heard it by the time this is out. Vision and text are also very close to each other, because a lot of information doesn't live in plain text. A lot of information lives in multimedia formats: images, PDFs, screenshots, and many other formats. So oftentimes, to solve a problem, we need a vision model first to extract information, then use a language model to process it and send out the results. So vision is important. We also support various vision models specialized in processing different kinds of sources and extraction. And we're also going to announce a new API endpoint that supports uploading various kinds of multimedia content, extracting very accurate information from it, and feeding that into the LLM. And of course, we support embeddings, because embeddings are very important for semantic search, for RAG, and all of this. In addition, we also support text-to-image generation models, image-to-image, and we're adding text-to-video to our portfolio as well. So it's a very comprehensive model catalog built on top of Fire Optimizer and the distributed inference engine. But then, talking with more customers solving business use cases, we realized one model is not sufficient to solve their problem. And it's very clear, because, for one, models hallucinate. Many customers, when they start this GenAI journey, think it's magical: GenAI is going to solve all their problems magically. But then they realize, oh, this model hallucinates. It hallucinates because it's not deterministic, it's probabilistic. It's designed to always give you an answer, but based on probabilities, so it hallucinates.
And that's actually sometimes a feature, for creative writing, for example. Sometimes it's a bug, because, hey, you don't want to give out misinformation. And different models have different specialties. To solve a problem, you want to decompose your task into multiple small, narrow tasks and have an expert model solve each task really well. And of course, the model doesn't have all the information. It has limited knowledge, because the training data is finite, not infinite. So the model oftentimes doesn't have real-time information, and it doesn't know any proprietary information within the enterprise. It's clear that in order to really build a compelling application on top of GenAI, we need a compound AI system. A compound AI system basically has multiple models across modalities, along with APIs, whether public APIs or internal proprietary APIs, storage systems, database systems, and knowledge bases, working together to deliver the best answer.Swyx [00:20:07]: Are you going to offer a vector database?Lin [00:20:09]: We actually heavily partner with several big vector database providers. Which is your favorite? They are all great in different ways. But it's public information: MongoDB is our investor, and we have been working closely with them for a while.Alessio [00:20:26]: When you say distributed inference engine, what do you mean exactly? Because when I hear your explanation, it's almost like you're centralizing a lot of the decisions through the Fireworks platform on quality and whatnot. What do you mean by distributed? Is it that you have GPUs in a lot of different clusters, so you're sharding the inference across the same model?Lin [00:20:45]: So first of all, we run across multiple GPUs. But the way we distribute across multiple GPUs is unique: we don't distribute the whole model monolithically across GPUs. We chop it into pieces and scale the pieces completely differently based on where the bottleneck is. We are also distributed across regions; we have been running in North America, EMEA, and Asia. We have regional affinity for applications, because latency is extremely important. We also do global load balancing, because a lot of applications quickly scale to a global population, and at that scale, different continents wake up at different times, and you want to load balance across them. And we also manage various hardware SKUs from different hardware vendors. Different hardware designs are best for different types of workload, whether long context, short context, or long generation, so different workloads are best fitted to different hardware SKUs, and we can even distribute a workload across different hardware. So the distribution really runs through the full stack.Swyx [00:22:02]: At some point, we'll show on YouTube the image that Ray, I think, has been working on, with all the different modalities that you offer. To me, it's basically: you offer the open source version of everything that OpenAI typically offers. Actually, if you do text-to-video, you will be a superset of what OpenAI offers, because they don't have Sora. Is that Mochi, by the way? Mochi. Mochi, right?Lin [00:22:27]: Mochi. And there are a few others. I will say, the interesting thing is, we're betting on the open source community continuing to proliferate. This is literally what we're seeing.
There are amazing video generation companies and amazing audio companies. Across the board, the innovation is off the charts, and we are building on top of that. I think that's the advantage we have compared with a closed-source company.Swyx [00:22:58]: I want to restate the value proposition of Fireworks for people who are comparing you versus a raw GPU provider like RunPod or Lambda or anything like those: you create the developer experience layer, and you also make it easily scalable, or serverless, or available as an endpoint. And then, I think for some models you have custom kernels, but not all models.Lin [00:23:25]: Almost all models. For all large language models, all the Llama models, and the VLMs. Almost all the models we serve.Swyx [00:23:35]: And that is called FireAttention. I don't remember the speed numbers, but apparently much better than vLLM, especially on a concurrency basis.Lin [00:23:44]: So FireAttention is mostly specific to language models, but for other modalities we also have customized kernels.Swyx [00:23:51]: And I think the typical challenge for people is understanding that that has value, and then there are other people who are also offering open-source models. Your moat is your ability to offer a good experience for all these customers. But if your existence is entirely reliant on people releasing nice open-source models, other people can also do the same thing.Lin [00:24:14]: So I would say we build on top of the open-source model foundation. That's the foundation we build on top of. But we look at the value prop from the lens of application developers and product engineers: they want to create new UX. What's happening in the industry right now is people are thinking about completely new ways of designing products. And I'm talking to so many founders; it's just mind-blowing. They help me understand that the existing way of doing PowerPoint, the existing way of coding, the existing way of managing customer service, is actually putting a box around our heads. For example, PowerPoint: with PowerPoint generation, we always need to think about how to fit our storytelling into this format of one slide after another, and juggle design together with what story to tell. But the most important thing is the storytelling line, right? So why don't we create a space that is not limited to any format? Those kinds of new product UX designs, combined with automated content generation through GenAI, are the new thing that many founders are doing. What are the challenges they're facing? Let's go from there. One is, again, that a lot of products built on top of GenAI are consumer-, prosumer-, and developer-facing, and they require an interactive experience. It's just the kind of product experience we've all gotten used to, and our desire is for faster and faster interaction; otherwise, nobody wants to spend the time, right? And that requires low latency. The other thing is that the nature of consumer-, prosumer-, and developer-facing products is that your audience is very big. You want to scale up to product-market fit quickly, but if you lose money at small scale, you'll go bankrupt quickly. So it's actually a big contrast: I actually have product-market fit, but when I scale, I scale myself out of business. That's kind of a funny way to think about it. So having low latency and low cost is essential for those new applications and products to survive and really become a generational company.
So that's the design point for our distributed inference engine and Fire Optimizer. Fire Optimizer, you can think about as a feedback loop: the more you feed your inference workload to our inference engine, the more we help you improve quality, lower latency further, and lower your cost. It basically gets better. And we automate that, because we don't want you as an app developer or product engineer to think about how to figure out all these low-level details. It's impossible, because you're not trained to do that at all. You should keep your focus on product innovation. And then compound AI: we actually feel a lot of the pain as app developers and engineers, because there are so many models. Every week, there's at least one new model coming out.Swyx [00:27:09]: Tencent had a giant model this week. Yeah, yeah.Lin [00:27:13]: I saw that. I saw that.Swyx [00:27:15]: It's like 500 billion parameters.Lin [00:27:18]: So they're like, should I keep chasing this or should I forget about it? And which model should I pick to solve which kind of sub-problem? How do I even decompose my problem into those smaller problems and fit models to them? I have no idea. And then there are two ways to think about this design; I think I've talked about this in the past. One is imperative, as in you figure out how to do it: you give developers tools to dictate how to do it. Or you build a declarative system, where a developer tells you what they want to do, not how. These are two completely different designs. So the analogy I want to draw is, in the data world, the database management system is a declarative system, because people use databases through SQL. SQL is a way of saying, what do you want to extract out of the database? What kind of result do you want? But you don't figure out which nodes it runs on, how many nodes to run on top of, how your data is laid out on disk, or which index to use. You don't need to worry about any of that. The database management system figures it out, generates the best plan, and executes on it. So databases are declarative, and that makes them super easy: you just learn SQL, the semantic meaning of SQL, and you can use it. On the imperative side, there are a lot of ETL pipelines, where people design DAG systems with triggers, with actions, and you dictate exactly what to do, and if it fails, how to recover. So that's an imperative system. We have seen a range of systems in the ecosystem go different ways. I think there's value in both; I don't think one is going to subsume the other. But we are leaning more into the philosophy of the declarative system, because from the lens of app developers and product engineers, that's the easiest for them to integrate.Swyx [00:29:07]: I understand that's also why PyTorch won, right? This is one of the reasons. Ease of use.Lin [00:29:14]: Focus on ease of use, and then let the system take on the hard challenges and complexities. We extend that thinking into the current system design. So another announcement: our next declarative system is going to appear as a model with extremely high quality. And this model is inspired by OpenAI's o1 announcement. You should see it by the time we announce this, or soon.Alessio [00:29:46]: Trained by you.Lin [00:29:47]: Yes.Alessio [00:29:48]: Is this the first model that you trained? It's not the first.Lin [00:29:52]: We actually have trained a model called FireFunction. It's a function-calling model.
It's our first step into compound AI systems, because a function-calling model can dispatch a request to multiple APIs. We have a pre-baked set of APIs the model has learned, and you can also add additional APIs through configuration to let the model dispatch accordingly. So we have a very high quality function-calling model that's already released. We actually have three versions, and the latest version is very high quality. But now we're taking a further step: you don't even need to use a function-calling model. You use the new model we're going to release, and it will solve a lot of problems, approaching very high OpenAI quality. So I'm very excited about that.Swyx [00:30:41]: Do you have any benchmarks yet?Lin [00:30:43]: We have a benchmark. We're going to release it hopefully next week. We just put our model on LMSYS, and people are guessing: is this the next Gemini model, or a MADIS model? People are guessing. That's very interesting. We're watching the Reddit discussion right now.Swyx [00:31:00]: I have to ask more questions about this. When OpenAI released o1, a lot of people asked whether it's a single model or a chain of models. Noam and basically everyone on the Strawberry team was very insistent that what they did, the reinforcement learning and chain of thought, cannot be replicated by a whole bunch of open source model calls. Do you think that's wrong? Have you done the same amount of work on RL as they have, or was it a different direction?Lin [00:31:29]: I think they take a very specific approach, and the caliber of the team is very high. So I do think they are the domain experts in doing what they are doing. But I don't think there's only one way to achieve the same goal. We're in the same direction in the sense that the quality scaling law is shifting from training to inference; on that, I fully agree with them. But we're taking a completely different approach to the problem. All of that is because, of course, we didn't train the model from scratch; we build on the shoulders of giants. The models we currently have access to are getting better and better. The future trend is that the gap between the open source models and the closed-source models is just going to shrink to the point where there's not much difference, and then we're on a level playing field. That's why I think our early investment in inference, and all the work we've done around balancing quality, latency, and cost, pays off: we have accumulated a lot of experience, and that empowers us to release this new model that is approaching o1 quality.Alessio [00:32:39]: I guess the question is, what do you think the gap to catch up will be? Because I think everybody agrees that open source models will eventually catch up. And I think with Llama 3, then with Llama 3.2 and 3.1 405B, we closed the gap. And then o1 just reopened the gap so much, and it's unclear. Obviously, you're saying your model will have...Swyx [00:32:57]: We're closing that gap.Alessio [00:32:58]: But you think in the future, it's going to be months?Lin [00:33:02]: So here's the thing that's happened. There are public benchmarks; they are what they are. But in reality, open source models in certain dimensions are already on par with, or beat, closed source models. For example, in the coding space, open source models are really, really good. And in function calling, FireFunction is also really, really good.
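To make the dispatch idea concrete, here is a hedged sketch of function calling through an OpenAI-compatible interface. The get_weather tool is hypothetical, and the FireFunction model id is illustrative; check Fireworks' docs for current names:

```python
# Sketch: the model chooses which API to dispatch to and returns
# structured arguments, rather than free text.
import json
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",  # as in the earlier sketch
    api_key="YOUR_FIREWORKS_API_KEY",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical API for the model to call
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="accounts/fireworks/models/firefunction-v2",  # illustrative model id
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
)

call = resp.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
# The app would now invoke the real API and feed the result back to the model.
```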
So it's all a matter of whether you build one model to solve all problems, and you want to be the best at solving all problems, or, in the open source domain, it's going to specialize. All these different model builders specialize in certain narrow areas, and it's logical that they can be really, really good in those narrow areas. And our prediction is that with specialization, there will be a lot of expert models that are really, really good, and even better than one-size-fits-all closed source models.Swyx [00:33:55]: I think this is the core debate that I am still not 100% decided on either way, in terms of compound AI versus normal AI, because you're basically fighting the bitter lesson.Lin [00:34:09]: Look at human society, right? We specialize, and you feel really good about someone specializing in doing something really well, right? And that's how we evolved from ancient times: we were all generalists, we did everything, and now we heavily specialize in different domains. So my prediction is that in the AI model space, that will happen also. Except for the bitter lesson.Swyx [00:34:30]: You get short-term gains by having specialists, domain specialists, and then someone just needs to train a 10x bigger model with 10x more compute, 10x more data, whatever the current scaling law is, and then it supersedes all the individual models because of some generalized intelligence slash world knowledge. I think that is the core insight of the GPTs, the GPT-1, 2, 3 generations. Right.Lin [00:34:56]: But the training scaling law works because you have an increasing amount of data to train on, and you can throw a lot of compute at it. I think on the data side, we're approaching the limit, and the only way to increase it is with synthetically generated data. And then there's the question of what the secret sauce there is, right? Because if you have a very good large model, you can generate very good synthetic data and then continue to improve quality. So that's why I think OpenAI is shifting from the training scaling law intoSwyx [00:35:25]: inference scaling law.Lin [00:35:25]: And it's test time and all of this. So I definitely believe that's the future direction, and that's where we are really good: doing inference.Swyx [00:35:34]: A couple of questions on that. Are you planning to share your reasoning traces?Lin [00:35:39]: That's a very good question. We are still debating.Swyx [00:35:43]: Yeah.Lin [00:35:45]: We're still debating.Swyx [00:35:46]: I would say, for example, it's interesting that, for example, with SWE-Bench, if you want to be considered for the ranking, you have to submit your reasoning traces. And that has actually disqualified some of our past guests: Cosine was doing well on SWE-Bench, but they didn't want to leak those traces. That's also why you don't see o1-preview on SWE-Bench, because they don't submit their reasoning traces. And obviously, it's IP. But also, if you're going to be more open, then that's one way to be more open. So your model is not going to be open source, right? It's going to be an endpoint that you provide. Okay, cool. And then pricing, also the same as OpenAI, just kind of based on...Lin [00:36:25]: Yeah, this is... I actually don't have that information yet. Everything is going so fast, we haven't even thought about that. Yeah, I should be more prepared.Swyx [00:36:33]: I mean, this is live. You know, it's nice to just talk about it as it goes live. Any other things that you want feedback on, or that you're thinking through?
It's kind of nice to just talk about something when it's not decided yet. About this new model: it's going to be exciting. It's going to generate a lot of buzz. Right.Lin [00:36:51]: I'm very excited to see how people are going to use this model. There's already a Reddit discussion about it, with people asking very deep mathematical questions, and the model got them right, which is surprising. And internally, we're also asking the model to generate what AGI is, and it generates a very complicated DAG of a thinking process. So we're having a lot of fun testing this internally. But I'm more curious: how will people use it? What kinds of applications will they try and test it on? That's where we'd really like to hear feedback from the community, and also feedback to us: what works well, what doesn't work well, what works well but surprises them, and what they think we should improve on. That kind of feedback will be tremendously helpful.Swyx [00:37:44]: Yeah. So I've been a production user of o1-preview and o1-mini since launch. I would say there are very, very obvious jumps in quality, so much so that they make Claude and the previous state-of-the-art look bad. It's really that stark a difference. The number one piece of feedback, or feature request, is that people want control over the budget. Because right now, o1 kind of decides its own thinking budget, but sometimes you know how hard the problem is, and you want to actually tell the model: spend two minutes on this, or spend some dollar amount. Maybe it's time, maybe it's dollars; I don't know what the unit of budget is. That makes a lot of sense.Lin [00:38:27]: So we actually thought about that requirement, and at some point we will need to support it. Not initially, but it makes a lot of sense.Swyx [00:38:38]: Okay. So that was a fascinating overview of just the things that you're working on. First of all, I realized that, I don't know if I've ever given you this feedback, but I think you guys are one of the reasons I agreed to advise you. Because I think when you first met me, I was kind of dubious. I was like: who are you? There's Replicate. There's Together. There's Lepton. You're in very, very competitive fields. Like, why will you win? And the reason I actually changed my mind was I saw you guys shipping. I think your surface area is very big, and the team is not that big. No. We're only 40 people. Yeah. And now here you are trying to compete with OpenAI and everyone else. What is the secret?Lin [00:39:21]: I think the team. The team is the secret.Swyx [00:39:23]: Oh boy. So there's nothing I can just copy. You just... No.Lin [00:39:30]: I think we all come from a very aligned culture, because most of our team came from Meta.Swyx [00:39:38]: Yeah.Lin [00:39:38]: And many startups. So we really believe in results. One is results, and second is customers. We're very customer-obsessed, and we don't want to drive adoption for the sake of adoption. We really want to make sure we understand that we are delivering a lot of business value to the customer, and we really value their feedback. So we'll wake up at midnight and deploy some model for them, shuffle some capacity for them, and yeah, over the weekend, no-brainer.Swyx [00:40:15]: So yeah.Lin [00:40:15]: That's just how we work as a team. And the caliber of the team is really, really high as well. So, as a plug: we're hiring. We're expanding very, very fast.
So if you are passionate about working on the most cutting-edge technology in the GenAI space, come talk with us. Yeah.Swyx [00:40:38]: Let's talk a little bit about that customer journey. I think one of your more famous customers is Cursor. We were the first podcast to have Cursor on, and obviously since then they have blown up; cause and effect are not related. But you guys especially worked on a fast-apply model, where you were one of the first people to use speculative decoding in a production setting. Maybe just talk about what went on behind the scenes of working with Cursor?Lin [00:41:03]: I will say Cursor is a very, very unique team. I think the unique part is that the team has very high technical caliber; there's no question about it. But while many companies building coding copilots will say, "I'm going to build the whole entire stack because I can," they are unique in that they seek partnership. Not because they cannot; they're fully capable, but they know where to focus. That, to me, is amazing. And of course, they want to find a best-in-class partner. So we spent some time working together. They push us very aggressively, because for them to deliver a high-caliber product experience, they need the latency. They need the interactivity, but also high quality at the same time. So we actually expanded our product features quite a lot as we supported Cursor. And they are growing so fast; we massively scaled quickly across multiple regions, and we developed a pretty intense inference stack, almost similar to what we did for Meta. I think that's a very, very interesting engagement, and through it a lot of trust was built. They realized, hey, this is a team they can really partner with, and can go big with. That comes back to our being really customer-obsessed. All the engineers working with them spend an enormous amount of time syncing with them and discussing. We're not big on meetings, but we're like a Slack channel that's always on. You almost feel like we're working as one team. So I think that's a real highlight.Swyx [00:42:38]: Yeah. For those who don't know, Cursor is a VS Code fork, but most of the time people will be using closed models. Like, I actually use a lot of Sonnet. So you're not involved there, right? It's not like you host Sonnet or have any partnership with it. You're involved where Cursor's smaller, house-brand models are concerned, right?Lin [00:42:58]: I don't know what I can say beyond the things they have said.Swyx [00:43:04]: Very obviously, the dropdown in Cursor is GPT-4o, right? So I assume that the Cursor side is the Fireworks side, and on the other side, they're calling out to the others. Just kind of curious. And then, do you see any more opportunity on the... You know, I think you made a big splash with 1,000 tokens per second. That was because of speculative decoding. Is there more to push there?Lin [00:43:25]: We push a lot. Remember when I mentioned Fire Optimizer? We have a unique automation stack that is one size fits one. We actually deployed it to Cursor early on, and basically optimized for their specific workload, and there's a lot of juice to extract out of there. And we saw success with that product; it can be widely adopted. So that's why we started a separate product line called Fire Optimizer. Speculative decoding is just one approach, and speculative decoding here is not static.
We actually wrote a blog post about it. There are so many different ways to do speculative decoding: you can pair a small model with a large model in the same model family, or you can have EAGLE heads, and so on. There are different trade-offs in which approach you take; it really depends on your workload. And then, given your workload, we can align the EAGLE heads, or Medusa heads, or the small-big model pair much better to extract the best latency reduction. All of that is part of the Fire Optimizer offering.Alessio [00:44:23]: I know you mentioned some of the other inference providers. I think the other question that people always have is around benchmarks: you get different performance on different platforms. How should people think about it? People are like, hey, Llama 3.2 is X on MMLU, but maybe using speculative decoding you go down a different path, and maybe some providers run a quantized model. How much should people care about the delta between all the magic that you do and what a raw model...Lin [00:44:57]: Okay, so there are two big development cycles. One is experimentation, where they need fast iteration. They don't want to think about quality; they just want to experiment with the product experience and so on. So that's one. And then it looks good, and they move to post-product-market-fit scaling, where quality becomes really important, and latency and all the other things become important too. During the experimentation phase, just pick a good model. Don't worry about anything else. Make sure you're even generating the right solution for your product; that's the focus. Then, post product-market fit, that three-dimensional optimization curve starts to kick in, across quality, latency, and cost: where should you land? To me, it's purely a product decision. For many products, if you choose lower quality but better speed and lower cost, and it doesn't make a difference to the product experience, then you should do it. So that's why I think inference is part of validation. Validation doesn't stop at offline evals; it continues through A/B testing, through inference. And that's where we offer various different configurations for you to test which setting is best. So this is traditional product evaluation: product evaluation should also take your new model versions and different model setups into consideration.Swyx [00:46:22]: I want to specifically talk about what happened a few months ago with some of your major competitors. I mean, all of this is public. What is your take on what happened? And maybe you want to set the record straight on how Fireworks does quantization, because I think a lot of people may have outdated perceptions, or they didn't read the clarification post on your approach to quantization.Lin [00:46:44]: First of all, it was a surprise to us that, without any notice, we got called out.Swyx [00:46:51]: Specifically by name, which is normally not what...Lin [00:46:54]: Yeah, in a public post, with a certain interpretation of our quality. So I was really surprised. And it's not a good way to compete, right? We want to compete fairly. And oftentimes when one vendor gives out results, the interpretation from another vendor is always extremely biased. So we actually refrain from doing any of that, and we happily partner with third parties to do the most fair evaluation. So we were very surprised.
And we don't think that's a good way to figure out the competitive landscape. So we reacted. When it comes to quantization and its interpretation, we actually wrote a very thorough blog post. Because, again, no one size fits all: we have various different quantization schemes. We can quantize very different parts of the model, from weights to activations to cross-GPU communication, and they can use different quantization schemes or stay consistent across the board. And again, it's a trade-off, a trade-off across the three dimensions of quality, latency, and cost. For our customers, we actually let them find the best optimized point, and we have a very thorough evaluation process to pick that point. But for self-serve, there's only one point to pick; there's no customization available. So of course, based on what we hear from many customers, we have to pick one point. And in the end, AA later published a quality measure, and we actually looked really good. So what I mean is: I will leave the evaluation of quality or performance to third parties, and work with them to find the most fair benchmark. I think that's a good approach, a good methodology. But I'm not a fan of the approach of calling out specific namesSwyx [00:48:55]: and critiquing other competitors in a very biased way. It happens in databases as well. I think you're the more politically correct one, and then Dima is the more... Something like this. It's you on Twitter.Lin [00:49:11]: It's like the Russian... We partner. We play different roles.Swyx [00:49:20]: Another one that I wanted to... just the last one on the competition side. There's a perception of price wars in hosting open source models, and we talked about the competitiveness in the market. Do you aim to make margin on open source models? Oh, absolutely, yes.Lin [00:49:38]: But I think pricing really needs to correlate with the value we're delivering. If the value is limited, or there are a lot of people delivering the same value, there's no differentiation and there's only one way to go: down, through competition. If I take a big step back, we're more comparable with the closed model providers' APIs, right? And the closed model providers' cost structure is even more interesting, because we don't bear any training costs. We focus on inference optimization, and that's where we continue to add a lot of product value. So that's how we think about the product. But the closed-source API providers, the model providers, bear a lot of training costs, and they need to amortize those training costs into inference. So that creates very interesting dynamics: if we match pricing there, how they are going to make money is very, very interesting.Swyx [00:50:37]: So for listeners: OpenAI's 2024 numbers are reportedly $4 billion in revenue, $3 billion in training compute, $2 billion in inference compute, $1 billion in research compute amortization, and $700 million in salaries. So that is like...Swyx [00:50:59]: I mean, a lot of R&D.Lin [00:51:01]: Yeah, and Meta is basically like, make it zero. So those are very, very interesting dynamics we're operating within. But coming back to inference: as I mentioned, our product is a platform. We're not just a single-model-as-a-service provider, like many other inference providers offering a single model.
We have our optimizer to highly customize for your inference workload, and we have a compound AI system that significantly simplifies your path to high quality, low latency, and low cost. So those are all very different from other providers.Alessio [00:51:38]: What do people not know about the work that you do? I guess people are like, okay, Fireworks, you run models very quickly, you have the function-calling model. Is there any underrated part of Fireworks that more people should try?Lin [00:51:51]: Yeah, actually, one user posted on x.com that Fireworks lets him upload a LoRA adapter and serve it at the same cost as the base model. Nobody else provides that. That's because we have something special: we rolled out multi-LoRA last year, actually. We've had this function for a long time, and many people have been using it, but it's not well known that, oh, if you fine-tune your model, you don't need to use on-demand. If your fine-tune is a LoRA, you can upload your LoRA adapter and we deploy it as if it's a new model. Then you get your endpoint and can use it directly, but at the same cost as the base model. So I'm happy that user is marketing it for us. He discovered that feature, but we've had it since last year. So I think the feedback to me is: we have a lot of very, very good features, as Sean just mentioned. I'm the advisor to the company,Swyx [00:52:57]: and I didn't know that you had released speculative decoding.Lin [00:53:02]: We've also had prompt caching since way back last year. We have many, yeah. So I think that is one of the underrated features, and if you're a developer using our self-serve platform, please try it out.Swyx [00:53:16]: The LoRA thing is interesting, because the reason people add additional cost to it is not that they feel like charging people: in normal LoRA serving setups, there is a cost to loading those weights and dedicating a machine to that inference. How come you can avoid it?Lin [00:53:36]: Yeah, so this is our technique called multi-LoRA. We basically have many LoRA adapters share the same base model, which significantly reduces the memory footprint of serving. One base model can sustain a hundred to a thousand LoRA adapters, and all these different LoRA adapters direct their traffic to the same base model, where the base model dominates the cost. That's how we can keep the tokens-per-dollar, per-million-token pricing the same as the base model.Swyx [00:54:13]: Awesome. Is there anything that you want to request from the community, or that you're looking for model-wise or tooling-wise, that you think someone should be working on?Lin [00:54:23]: Yeah, so we really want to get a lot of feedback from application developers who are starting to build on GenAI, or have already adopted it, or are starting to think about new use cases: try Fireworks first, and let us know what works really well for you, what your wishlist is, and what sucks, right? What is not working for you that we should continue to improve. And for our new product launches, we typically launch to a small group of people first. Usually we launch on our Discord first, to have a set of people use it first. So please join our Discord channel. We have a lot of communication going on there.
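A back-of-the-envelope sketch of why the multi-LoRA approach Lin describes above can be priced at the base-model rate. All numbers here are illustrative assumptions, not Fireworks' actual configuration:

```python
# Illustrative memory arithmetic for multi-LoRA serving.
BASE_PARAMS = 8e9          # assume an 8B-parameter base model
BYTES_PER_PARAM = 2        # fp16/bf16 weights
HIDDEN = 4096              # width of the adapted projection matrices (assumed)
RANK = 8                   # a typical LoRA rank (assumed)
ADAPTED_MATRICES = 32 * 4  # assume 32 blocks x 4 adapted projections

base_gb = BASE_PARAMS * BYTES_PER_PARAM / 1e9
# each adapted matrix adds two low-rank factors: (HIDDEN x RANK) and (RANK x HIDDEN)
adapter_params = ADAPTED_MATRICES * 2 * HIDDEN * RANK
adapter_mb = adapter_params * BYTES_PER_PARAM / 1e6

print(f"base model weights: ~{base_gb:.0f} GB")
print(f"one LoRA adapter:   ~{adapter_mb:.1f} MB")
print(f"100 adapters:       ~{100 * adapter_mb / 1000:.1f} GB on top of one base copy")
```

Under these assumptions, each adapter is a few tens of megabytes against a roughly 16 GB base model, which is why hundreds of fine-tunes can ride on a single copy of the base weights.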
Again, you can also give us feedback there. We'll be starting office hours for you to directly talk with our DevRel and engineers and exchange notes.Alessio [00:55:17]: And you're hiring across the board?Lin [00:55:18]: We're hiring across the board. We're hiring front-end engineers, cloud infrastructure engineers, back-end system optimization engineers, and applied researchers, researchers who have done post-training and a lot of fine-tuning, and so on.Swyx [00:55:34]: That's it. Thank you. Thanks for having us. Get full access to Latent Space at www.latent.space/subscribe
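As a coda to the speculative decoding discussion in the episode above: real systems (EAGLE or Medusa heads, small-big model pairs) verify all draft tokens in one batched forward pass of the large model, but the accept/reject control flow can be sketched with toy stand-ins:

```python
# Toy accept/reject loop for draft-and-verify speculative decoding.
# The stand-in functions below only illustrate the control flow.
import random

def draft_tokens(prefix, k):
    # stand-in for a small, fast draft model proposing k tokens
    return [random.choice("abcd") for _ in range(k)]

def target_accepts(prefix, token):
    # stand-in for the large model agreeing with one draft token
    return random.random() < 0.7

def speculative_decode(prefix, steps=5, k=4):
    out = list(prefix)
    for _ in range(steps):
        for tok in draft_tokens(out, k):
            if target_accepts(out, tok):
                out.append(tok)                     # accepted draft token: nearly free
            else:
                out.append(random.choice("abcd"))   # rejected: sample from target instead
                break                               # discard the rest of this draft
    return "".join(out)

print(speculative_decode("a"))
```

The speedup comes from the large model validating several cheap draft tokens per expensive forward pass, which is why the best draft strategy depends on the workload, as Lin notes.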
Data drives the world we live in, but have you ever wondered how data flows around the tech world? Or what the difference is between a data engineer, a data scientist, and a data analyst? Well, this is the episode for you! Come learn all about the basics of data engineering and key concepts like data pipelines, data storage, and the infamous ETL process. And as you'll hear in the episode, closets and Cluedo are essential to explaining all of these topics! New episodes come out fortnightly on Wednesday morning (NZT). Where to Find Us: Instagram TikTok The Hot Girls Code Website Sponsored by: Trade Me Jobs
Arv is the Director of Product at GroupBy. He is a passionate entrepreneur currently responsible for product management. Arv has over eight years of experience in the oil and gas industry, including five years with WorleyParsons.In This Conversation We Discuss: [00:45] Intro[01:21] Driving revenue with better product discovery[02:40] Scaling search for retailers with high SKU counts[03:31] Adapting to modern, conversational search queries[05:58] Better product details for user-friendly browsing[08:30] Avoiding AI errors with human-guided review[09:55] Guiding users with smart sorting options[11:46] Product variations using smart automation[13:02] Shopper trust through respectful customization[14:09] Automating reorders with helpful reminders[15:05] Preventing bounce rates with smart stock management[16:32] Clear shipping times to boost purchase confidence[17:11] Advanced analytics for more effective action[18:34] Using best-in-class apps as your brand grows[20:17] Recognizing evolving trends in ecommerce technology[21:24] Product search with AI-driven recommendations[23:01] GroupBy for enhancing your retail experienceResources:Subscribe to Honest Ecommerce on YouTubeProduct discovery platform powered by Google Cloud Vertex AI search for retail groupbyinc.com/Follow Arv Natarajan linkedin.com/in/arvnatarajan/If you're enjoying the show, we'd love it if you left Honest Ecommerce a review on Apple Podcasts. It makes a huge impact on the success of the podcast, and we love reading every one of your reviews!
Philippe Noël is Co-Founder & CEO of ParadeDB, the modern Elasticsearch alternative built on Postgres. They're purpose-built for heavy, real-time workloads, and their open source project, also called paradedb, has over 6K stars on GitHub. ParadeDB has raised $2M from investors including General Catalyst & YC. In this episode, we dig into the benefits of connecting search directly to the database (i.e., no ETL), the types of users and use cases that really benefit from ParadeDB (e-commerce, FinTech, etc.), the decision to focus on Postgres, making adoption super easy, Philippe's learnings as a second-time founder & more!
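For a flavor of what "no ETL" means here: search queries issued as plain SQL against the Postgres instance that already holds the data, instead of syncing rows to an external search cluster. A hedged sketch follows; the bm25 index and @@@ operator track ParadeDB's documented pg_search extension at a high level, but exact syntax can vary by version, and the table and connection string are hypothetical:

```python
# Illustrative only: BM25 full-text search inside Postgres via ParadeDB.
import psycopg

with psycopg.connect("dbname=shop user=postgres") as conn:  # hypothetical DSN
    # build a BM25 index over the existing table (syntax per ParadeDB docs; verify)
    conn.execute("""
        CREATE INDEX IF NOT EXISTS items_search
        ON items USING bm25 (id, description)
        WITH (key_field = 'id');
    """)
    # relevance-ranked search with no data movement out of Postgres
    rows = conn.execute(
        "SELECT id, description FROM items "
        "WHERE description @@@ 'wireless keyboard' LIMIT 5;"
    ).fetchall()
    for row in rows:
        print(row)
```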
Matt Welsh is a technical leader at Aryn AI, an AI-powered ETL system for RAG frameworks, LLM-based applications, and vector databases. In this episode, we explore how AI is revolutionizing programming and software development.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Overcast • Pocket Casts • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes, with links to many references, can be found on The Data Exchange website.
Greetings, SaaS CFO enthusiasts! In this episode, we have the pleasure of hosting Paul Dudley, the insightful co-founder and CEO of Streamkap. Paul shares his intriguing journey through the SaaS landscape, from his early days as an analyst at Quid to scaling BrightEdge's sales in Europe, and eventually founding Streamkap in 2022. Paul dives deep into the world of streaming ETL, explaining how Streamkap is revolutionizing data movement by offering real-time data processing solutions. We talk about the company's unique value propositions, ideal customer profiles, and pragmatic go-to-market strategies. Paul reveals how Streamkap tackles complex data problems by making streaming data much more accessible and faster than traditional batch processing. Real-world examples, such as real-time data needs in logistics and payments, help illustrate the practical applications of Streamkap's technology. Paul also outlines Streamkap's growth trajectory, funding rounds, and plans for the future. From early customer engagements to raising pre-seed and seed rounds, Paul delves into the company's strategic decisions and lessons learned along the way. Listen in as we explore the exciting advancements in real-time data and gain valuable insights from Paul's extensive experience in the SaaS domain. Don't miss this enlightening episode of The SaaS CFO Podcast! Show Notes: 00:00 Streaming ETL for real-time data integration. 04:02 Shift to affordable, real-time streaming data systems. 07:59 Focus: Mid-market, small enterprises, not startups. 12:14 Launched data transfer product; secured customers, scaled. 15:30 Customer feedback is crucial for prioritization decisions. 18:28 Focus on integrating data into existing architectures. 20:44 New content, customer focus, product transformations. Links: SaaS Fundraising Stories: https://www.thesaasnews.com/news/streamkap-secures-3-3-million-in-funding Paul Dudley's LinkedIn: https://www.linkedin.com/in/pauldudley/ Streamkap's LinkedIn: https://www.linkedin.com/company/streamkap/ Streamkap's Website: https://streamkap.com/ To learn more about Ben check out the links below: Subscribe to Ben's daily metrics newsletter: https://saasmetricsschool.beehiiv.com/subscribe Subscribe to Ben's SaaS newsletter: https://mailchi.mp/df1db6bf8bca/the-saas-cfo-sign-up-landing-page SaaS Metrics courses here: https://www.thesaasacademy.com/ Join Ben's SaaS community here: https://www.thesaasacademy.com/offers/ivNjwYDx/checkout Follow Ben on LinkedIn: https://www.linkedin.com/in/benrmurray
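To ground the batch-versus-streaming contrast Paul draws, here is a generic, self-contained sketch of the two pipeline shapes. This is not Streamkap's actual API; all names are illustrative:

```python
# Generic sketch: batch ETL wakes on a schedule and moves everything since
# the last watermark; streaming ETL applies each change as it arrives.
import time

def transform(rows):
    return [{**r, "loaded_at": time.time()} for r in rows]

def batch_etl(source_rows, watermark):
    # one bulk load, e.g. hourly: destination lags by up to a full interval
    pending = [r for r in source_rows if r["ts"] > watermark]
    return transform(pending)

def streaming_etl(change_events, destination):
    # per-event apply, e.g. from a CDC log: destination lags by seconds
    for event in change_events:
        destination.extend(transform([event]))

rows = [{"id": i, "ts": i} for i in range(5)]
print(batch_etl(rows, watermark=2))   # everything since ts=2, all at once
dest = []
streaming_etl(rows, dest)              # row by row, as changes occur
print(dest)
```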
MSDW is previewing Community Summit North America 2024 with a new series of quick podcast episodes featuring exhibitors. In this episode, we speak with Daniel Cai, founder and managing director of KingswaySoft. Daniel tells us about KingswaySoft's newest offering, their JDBC Driver Pack for Dynamics 365. It first launched in July 2024 with Dynamics 365 CRM and Dataverse support, which they called JDBC 2024 release wave 1. At Community Summit they will release wave 2 for this product family to support D365 F&O and Business Central. We discuss some of the most important use cases for the JDBC Driver Pack and how Summit attendees can engage with the KingswaySoft team at booth 300 to talk data management, analytics, ETL, and more. More information: KingswaySoft JDBC Drivers
Host Dave Sobel welcomes Ori Rafael, CEO and co-founder of Upsolver, to discuss the emerging concept of lake house architecture in data management. The conversation begins with an exploration of how lake houses compare to traditional data warehouses and data lakes. Ori explains that a lake house is essentially a modern data warehouse architecture that allows customers to manage their own data layers, providing flexibility and control over their data storage and processing. Ori delves into the evolution of data management architectures, highlighting the transition from on-premise data warehouses to cloud-managed solutions. He discusses the challenges faced by database administrators (DBAs) in the past, such as vendor lock-in and the limitations of traditional data warehouses. The lake house model addresses these issues by decoupling storage and compute, enabling organizations to utilize multiple query engines and platforms without being tied to a single vendor. The discussion also touches on the significant advantages of lake house architecture, particularly in terms of cost reduction and operational efficiency. Ori emphasizes that organizations can save a substantial portion of their data warehouse budgets by eliminating the need for expensive ETL processes tied to specific warehouse vendors. Additionally, the ability to leverage various engines for analytics and AI applications empowers businesses to innovate without the constraints of traditional data management systems. As the conversation progresses, Ori highlights the importance of optimizing storage for improved query performance and efficiency. He explains how Upsolver manages the file system layer to ensure that organizations can achieve performance levels comparable to traditional warehouses while maintaining high storage efficiency. The episode concludes with a discussion on the evolving role of data engineers, emphasizing the need for them to transition from developers to platform managers, enabling greater independence and efficiency in data operations. All our Sponsors: https://businessof.tech/sponsors/ Do you want the show on your podcast app or the written versions of the stories? Subscribe to the Business of Tech: https://www.businessof.tech/subscribe/ Looking for a link from the stories? The entire script of the show, with links to articles, is posted in each story on https://www.businessof.tech/ Support the show on Patreon: https://patreon.com/mspradio/ Want our stuff? Cool Merch? Wear "Why Do We Care?" - Visit https://mspradio.myspreadshop.com Follow us on: LinkedIn: https://www.linkedin.com/company/28908079/ YouTube: https://youtube.com/mspradio/ Facebook: https://www.facebook.com/mspradionews/ Instagram: https://www.instagram.com/mspradio/ TikTok: https://www.tiktok.com/@businessoftech Bluesky: https://bsky.app/profile/businessoftech.bsky.social
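The decoupling Ori describes can be seen in miniature below: the "storage" is plain Parquet files in object storage, and the "compute" is whatever engine you point at them, here DuckDB. This is a hedged sketch, not Upsolver's product; the bucket, path, and column names are hypothetical.

```python
# Illustrative sketch of decoupled storage and compute: open-format files
# in S3 queried directly by an interchangeable engine (DuckDB here).
# Bucket, path, and column names are placeholders.
import duckdb  # pip install duckdb

con = duckdb.connect()
con.execute("INSTALL httpfs;")  # extension that enables s3:// reads
con.execute("LOAD httpfs;")
con.execute("SET s3_region='us-east-1';")

result = con.sql("""
    SELECT customer_id, SUM(amount) AS total
    FROM read_parquet('s3://my-lake/orders/*.parquet')
    GROUP BY customer_id
    ORDER BY total DESC
    LIMIT 10
""")
print(result)  # swap DuckDB for another engine without moving the data
```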
On Monday, September 16, François Sorel welcomed Anne-Gabrielle Dauba-Pantanacce, spokesperson for Netflix in France and VP of Communications & PR for Netflix France & Southern Europe; Jean-David Blanc, entrepreneur and founder of AlloCiné and Molotov; Guillaume Boutin, co-founder of SensCritique; Frédéric Krebs, French entrepreneur and investor, former marketing director of 20th Century Fox France and Gaumont Pathé, and former managing director of AlloCiné; and Léa Benaim, BFM Business journalist, on Tech & Co, the daily show on BFM Business. Catch the show Monday through Thursday, and listen again as a podcast.
Most of us will have more than one job in our career. In fact, many of us will likely find a new job in the next five years. I hope I'm not in that group, but I recognize that it's a possibility. We never know when our situation, or our employer's situation, will change. That is one reason I recommend you keep your resume up to date and continue to work on improving your skills. I saw an office hours short recently from Brent Ozar, in which someone asked him whether they should apply for a job even though they didn't meet all of the requirements or know all of the desired technologies. Brent recommended the person apply, and his reasoning was that a DBA (or other data pro) often gets asked to do a variety of tasks in an organization. The DBA job crosses lots of boundaries and may involve working on Active Directory issues, reporting, ETL, and more. When a DBA leaves a job, the organization looks for a replacement who can handle that same wide variety of things. Read the rest of Fifty Percent
I've heard of Kafka before. I know it's an Apache project, and you can download it or read more at https://kafka.apache.org/. I knew it was a way of moving data around, some sort of ETL tool. It's really more of a messaging and queueing system, the kind of tool that seems like a great idea but that everyone struggles to work with, and one that seemed complex. The overview is that Kafka is "a distributed system consisting of servers and clients that communicate via a high-performance TCP network protocol. It can be deployed on bare-metal hardware, virtual machines, and containers in on-premise as well as cloud environments." Read the rest of A Kafka Introduction
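If that overview feels abstract, a tiny example may help. Below is a minimal sketch of Kafka's publish/subscribe model using the kafka-python client; the broker address, topic name, and message payload are placeholders.

```python
# A tiny taste of Kafka's publish/subscribe model using the kafka-python
# client (pip install kafka-python). Assumes a broker on localhost:9092
# and a topic named 'events'; both are placeholders.
from kafka import KafkaProducer, KafkaConsumer

# Producer: append a message to the 'events' topic.
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("events", b"order_created:1001")
producer.flush()

# Consumer: read messages from the beginning of the topic.
consumer = KafkaConsumer(
    "events",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,  # stop iterating after 5s of silence
)
for message in consumer:
    print(message.topic, message.offset, message.value)
```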
On this episode of ETL ... 00:00 | E N T E R T H E L A B • PRESENTED BY BROWARD VINTAGE 03:20 | FORT THRIFTING AT GOODWILL - *TECHNICAL DIFFICULTIES* 12:15 | ELVIS' STUDIO 54 PARTY VIBES - NEXT WEEK ELVIS IN NYC, FORT IN LA 16:20 | NY YANKEES ARE TIED FOR THE BEST TEAM IN BASEBALL 18:00 | UFC 305 REVIEW - UPCOMING UFC 306 AT THE SPHERE 42:30 | NFL PRESEASON - TUA INTERVIEW ABOUT FORMER HC FLORES 49:00 | NY GIANTS START TO THE PRESEASON - DANIEL JONES - NABERS IS A DAWG 54:15 | FLAG FOOTBALL IN THE NEXT OLYMPICS - SHOULD NFL PLAYERS PLAY? 58:50 | NOAH LYLES CALLS OUT TYREEK HILL 1:01:40 | TATUM & TKACHUK IN ST LOUIS AS CHAMPS 1:06:00 | THE DNC IS LIVE IN CHICAGO THIS WEEK - POST MALONE NEW ALBUM 1:09:00 | LONG LEGS MOVIE • NFL FANTASY DRAFT COMING UP 1:12:20 | EX I T T H E L A B, PEACE...1! Vibin' with Elvis Escobar & Fort Sama • This episode is Presented by Broward Vintage • Subscribe, follow and engage with us on all of our social platforms listed below... • YouTube Channel | ENTER THE LAB • Instagram | @enterthelab • Twitter | @EnterTheLab_ • TikTok | @enterthelab_
In this episode, we sit down with Ori Rafael, CEO and Co-founder of Upsolver, to explore the rise of the lakehouse architecture and its significance in modern data management. Ori breaks down the origins of the lakehouse and how it leverages S3 to provide scalable and cost-effective storage. We discuss the critical role of open table formats like Apache Iceberg in unifying data lakes and warehouses, and how ETL processes differ between these environments. Ori also shares his vision for the future, highlighting how Upsolver is positioned to empower organizations as they navigate the rapidly evolving data landscape.
On this episode of ETL ... 00:00 | E N T E R T H E L A B • PRESENTED BY BROWARD VINTAGE 02:55 | FORT MOVING UP AT PUBLIX - FATHA LEE MERCH AVAILABLE NOW! 14:50 | ELVIS' CAMPING WEEKEND 21:10 | FORT GOING TO GREEN BAY FOR DOLPHINS vs PACKERS IN NOV. 26:45 | WRAPPING UP THE OLYMPICS - USA MEN'S BBALL (CURRY VIDEO) 38:10 | PITBULL STADIUM AT FIU - MR. 305 45:00 | NBA NEWS - NBA ON NBC, OG SONG IS BACK (VIDEO) 51:40 | NFL NEWS - GIANTS & DOLPHINS UPDATES - BROWNS NEW POTENTIAL DOME 1:02:00 | UFC 305 PREVIEW - CANELO TAKING 80% OF EVERYTHING (VIDEO) 1:11:55 | WHAT'S ON IN THE LAB - INDUSTRY ON HBO - INTERSTELLAR MOVIE 1:16:05 | EX I T T H E L A B, PEACE...1! Vibin' with Elvis Escobar & Fort Sama • This episode is Presented by Broward Vintage • Subscribe, follow and engage with us on all of our social platforms listed below... • YouTube Channel | ENTER THE LAB • Instagram | @enterthelab • Twitter | @EnterTheLab_ • TikTok | @enterthelab_
Shuveb Hussain is co-founder of Unstract, a no-code platform that uses large language models to extract structured data from unstructured documents, allowing users to build API endpoints and ETL pipelines to automate document processing workflows. Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/ Subscribe: Apple • Spotify • Overcast • Pocket Casts • AntennaPod • Podcast Addict • Amazon • RSS. Detailed show notes - with links to many references - can be found on The Data Exchange web site.
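Unstract itself is a no-code product, but the underlying pattern, prompting an LLM to emit structured JSON from messy text, can be sketched in a few lines. The snippet below uses the OpenAI client directly and is not Unstract's API; the model name and invoice schema are illustrative assumptions.

```python
# Hand-rolled sketch of the pattern Unstract productizes: ask an LLM to
# turn an unstructured document into structured JSON. This is NOT
# Unstract's API; model and schema are illustrative.
import json
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

invoice_text = "Invoice #4821 from Acme Corp, dated 2024-07-01, total $1,250.00"

response = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Extract invoice_number, vendor, date, "
                                      "and total from the text as JSON."},
        {"role": "user", "content": invoice_text},
    ],
)
print(json.loads(response.choices[0].message.content))
```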
Daniel Jaye co-founded Engage in 1995, a pioneer in bringing database marketing to the internet. A competitor of DoubleClick, Engage built arguably the largest database of pseudonymous profiles at the time, and Daniel and his team created innovative technologies for ETL, large-scale analytics, and behavioral targeting. Daniel was also the man behind much…
On this episode of ETL ... 00:00 | E N T E R T H E L A B • PRESENTED BY BROWARD VINTAGE *ELVIS' AUDIO CLEARS UP AFTER 4:30 MIN MARK* 04:30 | FORT TRYNA SURVIVE A HURRICANE HITTING SAVANNAH, GA 12:15 | ALGERIA'S IMANE KHELIF'S VICTORIES IN THE OLYMPICS AMID GENDER CONTROVERSY 25:45 | THE OLYMPIC GAMES & OTHER NEWS FROM THE WEEK 35:00 | NOAH LYLES WINS GOLD IN TRACK - DWAYNE WADE LOVES PAINTED NAILS 44:45 | THE START OF THE NFL PRESEASON - THE NEW KICKOFF RULES DEBUT 49:20 | UFC FIGHT NIGHT RECAP • SUMMERSLAM RECAP 1:04:20 | YANKEES BACK IN 1ST PLACE - JUDGE GETTING THE BONDS TREATMENT 1:05:15 | FORT MONOLOGUES ABOUT REALITY OF THE NEXT PRESIDENTIAL ELECTION 1:12:00 | HOUSE OF DRAGON SEASON FINALE WAS.... (NO SPOILERS) 1:14:00 | EX I T T H E L A B, PEACE...1! Vibin' with Elvis Escobar & Fort Sama • This episode is Presented by Broward Vintage • Subscribe, follow and engage with us on all of our social platforms listed below... • YouTube Channel | ENTER THE LAB • Instagram | @enterthelab • Twitter | @EnterTheLab_ • TikTok | @enterthelab_
On this episode of ETL ... 00:00 | E N T E R T H E L A B • PRESENTED BY BROWARD VINTAGE 04:00 | FORT STARTED A JUICE CLEANSE - ELVIS DOING SOME YARD WORK 21:10 | WHITE DUDES FOR KAMALA (VIDEO) 25:40 | THE CONTROVERSIAL OLYMPICS OPENING CEREMONY (VIDEO) 30:40 | NBA ACCEPTS THE NBC OFFER OVER THE TNT OFFER - THE MEN'S OLYMPIC TEAM 36:55 | UFC 304 REVIEW IN THE UK - JON JONES NEXT FIGHTS 55:05 | UFC FIGHT NIGHT THIS SATURDAY - WWE SUMMERSLAM 57:20 | SEASON FINALE OF HOUSE OF DRAGON THIS SUNDAY 1:00:00 | EX I T T H E L A B, PEACE...1! Vibin' with Elvis Escobar & Fort Sama • This episode is Presented by Broward Vintage • Subscribe, follow and engage with us on all of our social platforms listed below... • YouTube Channel | ENTER THE LAB • Instagram | @enterthelab • Twitter | @EnterTheLab_ • TikTok | @enterthelab_
On this episode of ETL ... 00:00 | E N T E R T H E L A B • PRESENTED BY BROWARD VINTAGE 02:30 | THE LAB TALKS THE LIFE OF SALES - https://www.kay.com/videos/a/daniel-fort/286449?s=creatable 16:35 | JAKE PAUL KO's MIKE PERRY - WHAT'S NEXT FOR JAKE PAUL? 41:00 | DREW WILLIAMS ENTERS THE LAB WITH HIS TAKE ON PAUL vs TYSON 45:40 | UFC 304 PREVIEW 54:55 | NFL TOP 100 DROPS & THE TOP 100 ATHLETES IN SPORT FOR THE 21st CENTURY - LET'S DEBATE THE TOP 10 1:09:10 | DREW WILLIAMS POPS BACK INTO THE LAB 1:14:30 | THE NEW CLIPPERS INTUIT DOME IS SICK! (VIDEO) 1:20:00 | THE REPUBLICAN CONVENTION WAS ELECTRIC (VIDEO) 1:24:40 | NFL TRAINING CAMP HAS BEGUN • THE BOYS FINALE 1:27:05 | EX I T T H E L A B, PEACE...1! Vibin' with Elvis Escobar & Fort Sama • This episode is Presented by Broward Vintage • Subscribe, follow and engage with us on all of our social platforms listed below... • YouTube Channel | ENTER THE LAB • Instagram | @enterthelab • Twitter | @EnterTheLab_ • TikTok | @enterthelab_
Our guests this week hold special meaning to ETL, as they have been the physical therapy providers for the kids of our ETL team members and have been cheerleaders for them every step of the way. In this episode, Rianna Silverstein, DPT and Kaitlyn Evers, DPT share their perspectives on their experience treating children with trisomy 18, discuss what to look for in a PT, explain the purpose of DMI, and speak to the power of intensive therapy. We hope you will feel encouraged by hearing the way these two amazing therapists champion the trisomy community and the abilities of children with a trisomy diagnosis. Disclaimer: The thoughts and opinions shared in this episode are personal perspectives and experience and are NOT meant to replace the guidance of your child's licensed healthcare and therapy providers. Please consult with your child's therapists and doctors for care specific to their needs. Extra To Love is a non-profit organization that aims to improve the lives of people with Trisomy 18 and Trisomy 13 by supporting their families. Through Extra To Love: A Trisomy Podcast, we hope affected families will be empowered, connected, supported and educated by hearing personal stories from parents and healthcare providers. To receive support or learn more about our mission, visit www.extratolove.org Follow us on socials! https://www.facebook.com/extratolove • https://www.instagram.com/extratolove • https://www.instagram.com/extratolovepodcast
On this episode of ETL ... 00:00 | E N T E R T H E L A B • PRESENTED BY BROWARD VINTAGE 02:55 | COLOMBIANS RAID THE HARD ROCK FOR THE COPA AMERICA (VIDEO) • SPAIN BEATS ENGLAND IN THE EURO's 14:40 | THE FAILED ASSASSINATION ATTEMPT ON DONALD TRUMP (VIDEO) 31:35 | MLB HOME RUN DERBY GOING ON LIVE - PREDICTIONS 35:15 | FORT GOT A PS5 - COLLEGE FOOTBALL 2K4 OUT NOW 43:45 | REED SHEPPARD - ARE THE ROCKETS THE NEW AGE GS WARRIORS? 51:40 | MAYOR ADAMS IN NYC INTRODUCES A NEW GARBAGE CAN... (VIDEO) 56:05 | THE MLB HOME RUN DERBY NATIONAL ANTHEM (VIDEO) 59:30 | EX I T T H E L A B, PEACE...1! Vibin' with Elvis Escobar & Fort Sama • This episode is Presented by Broward Vintage • Subscribe, follow and engage with us on all of our social platforms listed below... • YouTube Channel | ENTER THE LAB • Instagram | @enterthelab • Twitter | @EnterTheLab_ • TikTok | @enterthelab_
TestTalks | Automation Awesomeness | Helping YOU Succeed with Test Automation
Today, we're diving deep into the complexities of ETL testing with our esteemed guests, Chris Thompson and Mike Calabrese from RTTS. Register now for the webinar on how AI helps with ETL testing: testguild.me/aietl With over two decades of combined experience in data QA and automated testing, Chris and Mike guide us through the intricacies of an actual data warehouse project plan. They uncover the challenges in data mappings, explore different transformation types, and discuss the critical role of a comprehensive mapping document. Chris and Mike also highlight the importance of rigorous data testing to prevent costly errors and ensure accurate decision-making. Join us as we explore the critical aspects of ETL testing and data project planning, and gain invaluable insights from two industry veterans who have seen it all. This is a must-listen episode for anyone involved in data testing and looking to optimize their data transformation processes. I also recommend you check out our upcoming webinar with RTTS on AI in ETL Testing. Register now!
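As a flavor of what rigorous data testing looks like in practice, here is a minimal, hedged sketch of two classic checks from a warehouse test plan: row-count reconciliation and column-level comparison between source and target. Connection strings and table names are placeholders; production suites generalize checks like these across an entire mapping document.

```python
# Minimal sketch of two classic ETL checks: row-count reconciliation and a
# column-level source-vs-target comparison. Connection strings and table
# names are placeholder assumptions.
import pandas as pd
from sqlalchemy import create_engine  # pip install sqlalchemy pandas

source = create_engine("postgresql://user:pass@src-host/sales")
target = create_engine("postgresql://user:pass@dw-host/warehouse")

src_df = pd.read_sql("SELECT order_id, amount FROM orders", source)
tgt_df = pd.read_sql("SELECT order_id, amount FROM fact_orders", target)

# Check 1: row counts must reconcile.
assert len(src_df) == len(tgt_df), f"Row counts differ: {len(src_df)} vs {len(tgt_df)}"

# Check 2: values must match after the (trivial, in this sketch) transformation.
merged = src_df.merge(tgt_df, on="order_id", suffixes=("_src", "_tgt"))
mismatches = merged[merged["amount_src"] != merged["amount_tgt"]]
assert mismatches.empty, f"{len(mismatches)} rows failed value comparison"
print("ETL checks passed")
```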
On this episode of ETL ... 00:00 | E N T E R T H E L A B • PRESENTED BY BROWARD VINTAGE 02:35 | RECAPPING THE ETL's 4TH OF JULY WEEK OFF - ELVIS' MYRTLE BEACH VACATION 11:00 | THE TRUMP & BIDEN DEBATE.... (COMEDIC PURPOSES ONLY) 16:00 | ELVIS SPENDS $350+ AT BUC-EE's IN TOTAL 22:00 | NBA DRAFT REVIEW, SIGNINGS & TRADES - BRONNY THE #55 PICK 38:20 | THE MEN'S USA BBALL OLYMPIC TEAM REPORTS TO CAMP - EXPECTATIONS & ROSTER 47:30 | UFC 303 RECAP - MASVIDAL vs DIAZ, WHO WON? 1:02:20 | MLB ALL STAR SELECTIONS - YANKEES STINK - GIANTS ON HARDKNOCKS 1:07:30 | WHAT's ON IN THE LAB - RIGHTEOUS GEMSTONES, HOUSE OF DRAGON, THE BOYS, THE BEAR 1:12:55 | SKETCH O.F. LEAK - ETL SUPPORTS SKETCH 1:20:55 | EX I T T H E L A B, PEACE...1! Vibin' with Elvis Escobar & Fort Sama • This episode is Presented by Broward Vintage • Subscribe, follow and engage with us on all of our social platforms listed below... • YouTube Channel | ENTER THE LAB • Instagram | @enterthelab • Twitter | @EnterTheLab_ • TikTok | @enterthelab_
Have you ever wondered what the future of data management looks like? In this episode, we dive into the world of data lakehouses with Ori Rafael, the CEO and co-founder of Upsolver. Ori shares his insights on why the lakehouse is poised to be the next big thing in data, and how Upsolver is at the forefront of this revolutionary architecture. A data lakehouse is not just a buzzword; it's a transformative approach that decouples storage, metadata management, and compute. Ori explains how this separation allows for greater flexibility and significant cost savings compared to traditional data warehouses. By leveraging object storage like S3, open-source Iceberg for metadata, and various compute engines, lakehouses reduce vendor lock-in and provide the ability to use specialized engines for different workloads, such as AI. We explore the key advantages of lakehouses, including cost reduction, flexibility, and avoiding vendor lock-in. However, transitioning to a lakehouse architecture is not without its challenges. Ensuring performance parity with data warehouses and managing data access controls are significant hurdles. Ori discusses how Upsolver is tackling these challenges head-on, providing ETL solutions and lake management capabilities that optimize data lakes for performance and interoperability. The episode also delves into the trends shaping the future of data management. With the rapid adoption of open lakehouses and Iceberg emerging as the standard, enterprises are moving away from traditional data warehouses and legacy data lakes. Ori provides a glimpse into how open source catalogs with governance capabilities are evolving, paving the way for more robust and scalable data management solutions. We wrap up the conversation by asking Ori a fun question: If he could have a private breakfast or lunch with anyone in the business, VC funding, or tech world, who would it be and why? You never know, the person he mentions might just be listening! Join us for this insightful discussion on the future of data management and discover why the lakehouse is the next big thing in the industry. Be sure to find out more about Upsolver and their innovative solutions by visiting their website or connecting with their team online.
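To ground the Iceberg discussion, here is a small, hedged sketch of reading an Iceberg table through its open metadata layer with pyiceberg, independent of any single warehouse engine. The catalog configuration, namespace, and table name are assumptions; pyiceberg also supports Glue and other catalog types.

```python
# Sketch of querying an Apache Iceberg table via its open metadata layer,
# using pyiceberg (pip install pyiceberg). The REST catalog endpoint and
# table name are hypothetical.
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "demo",
    **{"type": "rest", "uri": "http://localhost:8181"},  # assumed REST catalog
)
table = catalog.load_table("analytics.events")

# Scan with a predicate; only the relevant data files in object storage
# are read, regardless of which engine does the reading.
df = table.scan(row_filter="event_date >= '2024-01-01'").to_pandas()
print(df.head())
```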
On this episode of ETL ... 00:00 | E N T E R T H E L A B • PRESENTED BY BROWARD VINTAGE 00:45 | GAME 7 STANLEY CUP - PANTHERS vs OILERS PREGAME 19:25 | CELTICS WIN THEIR 18TH BANNER - NBA FINALS RECAP 25:55 | RYAN GARCIA & JIMMY BUTLER PLAYING POKER (VIDEO) 32:35 | JUSTIN TIMBERLAKE DUI GETS WEIRD - BILL BELICHICK DATING A 24 YR OLD 36:45 | TRUMP & BIDEN DEBATE THIS THURSDAY 38:35 | JJ REDICK IS THE NEW HC OF THE LA LAKERS 45:50 | RIP WILLIE MAYS • YANKEE INJURIES 47:20 | COPA AMERICA HAS BEGUN - USA WINS 48:15 | UFC 303 QUICK PREVIEW • KENDRICK LAMAR AMAZON CONCERT 50:20 | EX I T T H E L A B, PEACE...1! Vibin' with Elvis Escobar & Fort Sama • This episode is Presented by Broward Vintage • Subscribe, follow and engage with us on all of our social platforms listed below... • YouTube Channel | ENTER THE LAB • Instagram | @enterthelab • Twitter | @EnterTheLab_ • TikTok | @enterthelab_
On this episode of ETL ... 00:00 | E N T E R T H E L A B • PRESENTED BY BROWARD VINTAGE 01:15 | FORT OFFICIALLY MOVES INTO HIS NEW SPOT IN SAVANNAH, GA 10:40 | PANTHERS UP 3-1 ON THE OILERS - CELTICS UP 3-1 ON THE MAVS 29:15 | YANKEES BEST TEAM IN MLB - DOUBLE BAGS AT 1st BASE, NEW RULE 41:30 | UPCOMING UFC CARD UPDATES - UFC 303 47:30 | JAKE PAUL NEW FIGHT vs MIKE PERRY - GERVONTA DAVIS KO's FRANK MARTIN 56:00 | THE 2024 EURO's HAVE BEGUN 58:15 | WHAT's ON IN THE LAB - HOUSE OF DRAGON, THE BOYS, FARGO 1:02:20 | EX I T T H E L A B, PEACE...1! Vibin' with Elvis Escobar & Fort Sama • This episode is Presented by Broward Vintage • Subscribe, follow and engage with us on all of our social platforms listed below... • YouTube Channel | ENTER THE LAB • Instagram | @enterthelab • Twitter | @EnterTheLab_ • TikTok | @enterthelab_
WBSRocks: Business Growth with ERP and Digital Transformation
Consulting companies have generated trillions of dollars by orchestrating the movement of data and crafting ETL code. This was essential to safeguard operational systems from the performance impact of analytical workloads: operational systems handled the transactional heavy lifting, while analytical workloads focused on study and decision science. Snowflake revolutionized this landscape by introducing technology that eliminated the need to copy data for analytical purposes. This not only resulted in instant cost savings but also empowered companies to explore real-time analytical workloads. Despite Snowflake being a top-tier technology, organizations sometimes employ it for purposes it wasn't designed for, either finding it unnecessarily expensive or failing to achieve the desired business outcomes. To understand Snowflake's place in the architecture and how it compares with other enterprise software technologies, one must examine its positioning and evaluate its performance against alternative solutions. In today's episode, we invited a panel of industry experts for a live discussion on LinkedIn to conduct an independent review of Snowflake's capabilities. We covered a lot of ground, including where Snowflake might be a fit in the enterprise architecture and where it might be overused. Finally, the panel analyzes many data points to help understand the core strengths and weaknesses of Snowflake. For more information on growth strategies for SMBs using ERP and digital transformation, visit our community at wbs.rocks or elevatiq.com. To ensure that you never miss an episode of the WBS podcast, subscribe on your favorite podcasting platform.
On this episode of ETL ... 00:00 | E N T E R T H E L A B • PRESENTED BY BROWARD VINTAGE 01:25 | QUICK SHOUTOUTS & CATCHING UP IN THE LAB - FORT ROCKING HIS YANKEE CAP 12:25 | MORE NEWS THAT FAUCI LIED • TRUMP FOUND GUILTY ON 34 COUNTS 23:00 | UPDATE: SCOTTIE SCHEFFLER CHARGES DROPPED 25:40 | FATHA LEE ENTER THE LAB... 39:45 | BREAKING DOWN THE NBA FINALS - MAVERICKS vs CELTICS 59:15 | FATHA LEE EXITS THE LAB... 1:07:35 | CAITLIN CLARK, WNBA DRAMA CONTINUES 1:16:00 | DARREN WALLER RETIREMENT MUSIC VIDEO • JUSTIN JEFFERSON GETS HIS BAG 1:24:50 | THE NY YANKEES ARE THE BEST TEAM IN BASEBALL 1:32:00 | REAL MADRID WIN UEFA, 15x - NOW THEY GET MBAPPE • PANTHERS vs OILERS IN THE STANLEY CUP FINALS 1:35:25 | UFC 302 REVIEW • JAKE PAUL vs MIKE TYSON POSTPONED 1:42:55 | GUY IN COURT FOR SUSPENDED LICENSE ON SKYPE GETS CAUGHT 1:48:15 | EX I T T H E L A B, PEACE...1! Vibin' with Elvis Escobar & Fort Sama • This episode is Presented by Broward Vintage • Subscribe, follow and engage with us on all of our social platforms listed below... • YouTube Channel | ENTER THE LAB • Instagram | @enterthelab • Twitter | @EnterTheLab_ • TikTok | @enterthelab_
WBSRocks: Business Growth with ERP and Digital Transformation
The assertion that a data platform can serve as an ERP is not new. Many companies grapple with discerning the disparity between the two. The distinction is further muddled as data platforms become software-defined, emulating ERP functions. While data platforms traditionally reside in the architecture atop the transactional layer, their role should be confined to study, analysis, simulations, and providing recommendations. They are by no means a substitute for operational systems. So, how does Palantir measure up against other solutions in the market? In today's episode, we invited a panel of industry experts for a live discussion on LinkedIn to conduct an independent review of Palantir's capabilities. We covered a lot of ground, including Palantir's overlap with other platforms such as ERP, supply chain suites, data integration, and ETL platforms. We also analyzed its current positioning and why it is essentially a custom development platform for the decision science layer, despite its claims of being a replacement for ERP and other transactional software. For more information on growth strategies for SMBs using ERP and digital transformation, visit our community at wbs.rocks or elevatiq.com. To ensure that you never miss an episode of the WBS podcast, subscribe on your favorite podcasting platform.