POPULARITY
Try Keboola
In this episode, Lois Houston and Nikita Abraham continue their deep dive into Oracle GoldenGate 23ai, focusing on its evolution and the extensive features it offers. They are joined once again by Nick Wagner, who provides valuable insights into the product's journey. Nick talks about the various iterations of Oracle GoldenGate, highlighting the significant advancements from version 12c to the latest 23ai release. The discussion then shifts to the extensive new features in 23ai, including AI-related capabilities, UI enhancements, and database function integration. Oracle GoldenGate 23ai: Fundamentals: https://mylearn.oracle.com/ou/course/oracle-goldengate-23ai-fundamentals/145884/237273 Oracle University Learning Community: https://education.oracle.com/ou-community LinkedIn: https://www.linkedin.com/showcase/oracle-university/ X: https://x.com/Oracle_Edu Special thanks to Arijit Ghosh, David Wright, Kris-Ann Nansen, Radhika Banka, and the OU Studio Team for helping us create this episode. ----------------------------------------------------------------- Episode Transcript: 00:00 Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we'll bring you foundational training on the most popular Oracle technologies. Let's get started! 00:25 Lois: Hello and welcome to the Oracle University Podcast! I'm Lois Houston, Director of Innovation Programs with Oracle University, and with me is Nikita Abraham, Team Lead: Editorial Services. Nikita: Hi everyone! Last week, we introduced Oracle GoldenGate and its capabilities, and also spoke about GoldenGate 23ai. In today's episode, we'll talk about the various iterations of Oracle GoldenGate since its inception. And we'll also take a look at some new features and the Oracle GoldenGate product family. 00:57 Lois: And we have Nick Wagner back with us. Nick is a Senior Director of Product Management for GoldenGate at Oracle. Hi Nick! I think the last time we had an Oracle University course was when Oracle GoldenGate 12c was out. I'm sure there's been a lot of advancements since then. Can you walk us through those? Nick: GoldenGate 12.3 introduced the microservices architecture. GoldenGate 18c introduced support for Oracle Autonomous Data Warehouse and Autonomous Transaction Processing Databases. In GoldenGate 19c, we added the ability to do cross endian remote capture for Oracle, making it easier to set up the GoldenGate OCI service to capture from environments like Solaris, Spark, and HP-UX and replicate into the Cloud. Also, GoldenGate 19c introduced a simpler process for upgrades and installation of GoldenGate where we released something called a unified build. This means that when you install GoldenGate for a particular database, you don't need to worry about the database version when you install GoldenGate. Prior to this, you would have to install a version-specific and database-specific version of GoldenGate. So this really simplified that whole process. In GoldenGate 23ai, which is where we are now, this really is a huge release. 02:16 Nikita: Yeah, we covered some of the distributed AI features and high availability environments in our last episode. But can you give us an overview of everything that's in the 23ai release? I know there's a lot to get into but maybe you could highlight just the major ones? Nick: Within the AI and streaming environments, we've got interoperability for database vector types, heterogeneous capture and apply as well. Again, this is not just replication between Oracle-to-Oracle vector or Postgres to Postgres vector, it is heterogeneous just like the rest of GoldenGate. The entire UI has been redesigned and optimized for high speed. And so we have a lot of customers that have dozens and dozens of extracts and replicats and processes running and it was taking a long time for the UI to refresh those and to show what's going on within those systems. So the UI has been optimized to be able to handle those environments much better. We now have the ability to call database functions directly from call map. And so when you do transformation with GoldenGate, we have about 50 or 60 built-in transformation routines for string conversion, arithmetic operation, date manipulation. But we never had the ability to directly call a database function. 03:28 Lois: And now we do? Nick: So now you can actually call that database function, database stored procedure, database package, return a value and that can be used for transformation within GoldenGate. We have integration with identity providers, being able to use token-based authentication and integrate in with things like Azure Active Directory and your other single sign-on for the GoldenGate product itself. Within Oracle 23ai, there's a number of new features. One of those cool features is something called lock-free reservation columns. So this allows you to have a row, a single row within a table and you can identify a column within that row that's like an inventory column. And you can have multiple different users and multiple different transactions all updating that column within that same exact row at that same time. So you no longer have row-level locking for these reservation columns. And it allows you to do things like shopping carts very easily. If I have 500 widgets to sell, I'm going to let any number of transactions come in and subtract from that inventory column. And then once it gets below a certain point, then I'll start enforcing that row-level locking. 04:43 Lois: That's really cool… Nick: The one key thing that I wanted to mention here is that because of the way that the lock-free reservations work, you can have multiple transactions open on the same row. This is only supported for Oracle to Oracle. You need to have that same lock-free reservation data type and availability on that target system if GoldenGate is going to replicate into it. 05:05 Nikita: Are there any new features related to the diagnosability and observability of GoldenGate? Nick: We've improved the AWR reports in Oracle 23ai. There's now seven sections that are specific to Oracle GoldenGate to allow you to really go in and see exactly what the GoldenGate processes are doing and how they're behaving inside the database itself. And there's a Replication Performance Advisor package inside that database, and that's been integrated into the Web UI as well. So now you can actually get information out of the replication advisor package in Oracle directly from the UI without having to log into the database and try to run any database procedures to get it. We've also added the ability to support a per-PDB Extract. So in the past, when GoldenGate would run on a multitenant database, a multitenant database in Oracle, all the redo data from any pluggable database gets sent to that one redo stream. And so you would have to configure GoldenGate at the container or root level and it would be able to access anything at any PDB. Now, there's better security and better performance by doing what we call per-PDB Extract. And this means that for a single pluggable database, I can have an extract that runs at that database level that's going to capture information just from that pluggable database. 06:22 Lois And what about non-Oracle environments, Nick? Nick: We've also enhanced the non-Oracle environments as well. For example, in Postgres, we've added support for precise instantiation using Postgres snapshots. This eliminates the need to handle collisions when you're doing Postgres to Postgres replication and initial instantiation. On the GoldenGate for big data side, we've renamed that product more aptly to distributed applications in analytics, which is really what it does, and we've added a whole bunch of new features here too. The ability to move data into Databricks, doing Google Pub/Sub delivery. We now have support for XAG within the GoldenGate for distributed applications and analytics. What that means is that now you can follow all of our MAA best practices for GoldenGate for Oracle, but it also works for the DAA product as well, meaning that if it's running on one node of a cluster and that node fails, it'll restart itself on another node in the cluster. We've also added the ability to deliver data to Redis, Google BigQuery, stage and merge functionality for better performance into the BigQuery product. And then we've added a completely new feature, and this is something called streaming data and apps and we're calling it AsyncAPI and CloudEvent data streaming. It's a long name, but what that means is that we now have the ability to publish changes from a GoldenGate trail file out to end users. And so this allows through the Web UI or through the REST API, you can now come into GoldenGate and through the distributed applications and analytics product, actually set up a subscription to a GoldenGate trail file. And so this allows us to push data into messaging environments, or you can simply subscribe to changes and it doesn't have to be the whole trail file, it can just be a subset. You can specify exactly which tables and you can put filters on that. You can also set up your topologies as well. So, it's a really cool feature that we've added here. 08:26 Nikita: Ok, you've given us a lot of updates about what GoldenGate can support. But can we also get some specifics? Nick: So as far as what we have, on the Oracle Database side, there's a ton of different Oracle databases we support, including the Autonomous Databases and all the different flavors of them, your Oracle Database Appliance, your Base Database Service within OCI, your of course, Standard and Enterprise Edition, as well as all the different flavors of Exadata, are all supported with GoldenGate. This is all for capture and delivery. And this is all versions as well. GoldenGate supports Oracle 23ai and below. We also have a ton of non-Oracle databases in different Cloud stores. On an non-Oracle side, we support everything from application-specific databases like FairCom DB, all the way to more advanced applications like Snowflake, which there's a vast user base for that. We also support a lot of different cloud stores and these again, are non-Oracle, nonrelational systems, or they can be relational databases. We also support a lot of big data platforms and this is part of the distributed applications and analytics side of things where you have the ability to replicate to different Apache environments, different Cloudera environments. We also support a number of open-source systems, including things like Apache Cassandra, MySQL Community Edition, a lot of different Postgres open source databases along with MariaDB. And then we have a bunch of streaming event products, NoSQL data stores, and even Oracle applications that we support. So there's absolutely a ton of different environments that GoldenGate supports. There are additional Oracle databases that we support and this includes the Oracle Metadata Service, as well as Oracle MySQL, including MySQL HeatWave. Oracle also has Oracle NoSQL Spatial and Graph and times 10 products, which again are all supported by GoldenGate. 10:23 Lois: Wow, that's a lot of information! Nick: One of the things that we didn't really cover was the different SaaS applications, which we've got like Cerner, Fusion Cloud, Hospitality, Retail, MICROS, Oracle Transportation, JD Edwards, Siebel, and on and on and on. And again, because of the nature of GoldenGate, it's heterogeneous. Any source can talk to any target. And so it doesn't have to be, oh, I'm pulling from Oracle Fusion Cloud, that means I have to go to an Oracle Database on the target, not necessarily. 10:51 Lois: So, there's really a massive amount of flexibility built into the system. 11:00 Unlock the power of AI Vector Search with our new course and certification. Get more accurate search results, handle complex datasets easily, and supercharge your data-driven decisions. From now through May 15, 2025, we are waiving the certification exam fee (valued at $245). Visit mylearn.oracle.com to enroll. 11:26 Nikita: Welcome back! Now that we've gone through the base product, what other features or products are in the GoldenGate family itself, Nick? Nick: So we have quite a few. We've kind of touched already on GoldenGate for Oracle databases and non-Oracle databases. We also have something called GoldenGate for Mainframe, which right now is covered under the GoldenGate for non-Oracle, but there is a licensing difference there. So that's something to be aware of. We also have the OCI GoldenGate product. We are announcing and we have announced that OCI GoldenGate will also be made available as part of the Oracle Database@Azure and Oracle Database@ Google Cloud partnerships. And then you'll be able to use that vendor's cloud credits to actually pay for the OCI GoldenGate product. One of the cool things about this is it will have full feature parity with OCI GoldenGate running in OCI. So all the same features, all the same sources and targets, all the same topologies be able to migrate data in and out of those clouds at will, just like you do with OCI GoldenGate today running in OCI. We have Oracle GoldenGate Free. This is a completely free edition of GoldenGate to use. It is limited on the number of platforms that it supports as far as sources and targets and the size of the database. 12:45 Lois: But it's a great way for developers to really experience GoldenGate without worrying about a license, right? What's next, Nick? Nick: We have GoldenGate for Distributed Applications and Analytics, which was formerly called GoldenGate for big data, and that allows us to do all the streaming. That's also where the GoldenGate AsyncAPI integration is done. So in order to publish the GoldenGate trail files or allow people to subscribe to them, it would be covered under the Oracle GoldenGate Distributed Applications and Analytics license. We also have OCI GoldenGate Marketplace, which allows you to run essentially the on-premises version of GoldenGate but within OCI. So a little bit more flexibility there. It also has a hub architecture. So if you need that 99.99% availability, you can get it within the OCI Marketplace environment. We have GoldenGate for Oracle Enterprise Manager Cloud Control, which used to be called Oracle Enterprise Manager. And this allows you to use Enterprise Manager Cloud Control to get all the statistics and details about GoldenGate. So all the reporting information, all the analytics, all the statistics, how fast GoldenGate is replicating, what's the lag, what's the performance of each of the processes, how much data am I sending across a network. All that's available within the plug-in. We also have Oracle GoldenGate Veridata. This is a nice utility and tool that allows you to compare two databases, whether or not GoldenGate is running between them and actually tell you, hey, these two systems are out of sync. And if they are out of sync, it actually allows you to repair the data too. 14:25 Nikita: That's really valuable…. Nick: And it does this comparison without locking the source or the target tables. The other really cool thing about Veridata is it does this while there's data in flight. So let's say that the GoldenGate lag is 15 or 20 seconds and I want to compare this table that has 10 million rows in it. The Veridata product will go out, run its comparison once. Once that comparison is done the first time, it's then going to have a list of rows that are potentially out of sync. Well, some of those rows could have been moved over or could have been modified during that 10 to 15 second window. And so the next time you run Veridata, it's actually going to go through. It's going to check just those rows that were potentially out of sync to see if they're really out of sync or not. And if it comes back and says, hey, out of those potential rows, there's two out of sync, it'll actually produce a script that allows you to resynchronize those systems and repair them. So it's a very cool product. 15:19 Nikita: What about GoldenGate Stream Analytics? I know you mentioned it in the last episode, but in the context of this discussion, can you tell us a little more about it? Nick: This is the ability to essentially stream data from a GoldenGate trail file, and they do a real time analytics on it. And also things like geofencing or real-time series analysis of it. 15:40 Lois: Could you give us an example of this? Nick: If I'm working in tracking stock market information and stocks, it's not really that important on how much or how far down a stock goes. What's really important is how quickly did that stock rise or how quickly did that stock fall. And that's something that GoldenGate Stream Analytics product can do. Another thing that it's very valuable for is the geofencing. I can have an application on my phone and I can track where the user is based on that application and all that information goes into a database. I can then use the geofencing tool to say that, hey, if one of those users on that app gets within a certain distance of one of my brick-and-mortar stores, I can actually send them a push notification to say, hey, come on in and you can order your favorite drink just by clicking Yes, and we'll have it ready for you. And so there's a lot of things that you can do there to help upsell your customers and to get more revenue just through GoldenGate itself. And then we also have a GoldenGate Migration Utility, which allows customers to migrate from the classic architecture into the microservices architecture. 16:44 Nikita: Thanks Nick for that comprehensive overview. Lois: In our next episode, we'll have Nick back with us to talk about commonly used terminology and the GoldenGate architecture. And if you want to learn more about what we discussed today, visit mylearn.oracle.com and take a look at the Oracle GoldenGate 23ai Fundamentals course. Until next time, this is Lois Houston… Nikita: And Nikita Abraham, signing off! 17:10 That's all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We'd also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.
Avsnitt 141 med Ted Solomon på Ctrl Digital där vi djupdyker i begreppet förstapartsdata och hur marknadsförare bör arbeta med det. Vi pratar om allt från vad förstapartsdata är och hur Ted brukar definiera begreppet. Till vilka olika typer av förstapartsdata det finns. Ted förklarar också varför den här typen av data är viktigare än någonsin för marknadsförare. Och vad han ser hindrar många företag från att arbeta med det. Du får dessutom höra om: Olika sätt att samla in förstapartsdata Vad som krävs för att samla och hantera den Vanliga myter kring förstapartsdata Vad det innebär att aktivera den här datan Och exempel på vad det innebär i praktiken Om gästen Ted Solomon är vd och grundare av Ctrl Digital. Han är i grunden digital analytiker och fokuserar idag på datainsamling och aktivering av förstapartsdata. Han har även lång erfarenhet av att bygga upp measurement-avdelningar på stora bolag som Entercard och Klarna. Ctrl Digital är en specialistbyrå inom digital analys och teknisk mätning. Där hjälper han och hans team företag att samla in, strukturera och aktivera sin data, framför allt inom försäljning och marknadsföring. Tidsstämplar 00:00 Introduktion 02:34 Förstapartsdata vs andra- och tredjepartsdata 06:24 State of förstapartsdata 2025 13:24 Myter och hur företag arbetar idag 22:21 Business case kring förstapartsdata 26:15 Samla in och hantera på rätt sätt 31:58 Vikten av analys och tydligt mål 37:36 Komma igång med aktivering av datan 42:36 IKEA Family och exempel på aktivering 47:21 Teds tips och råd till marknadsförare Länkar Ted Solomon (LinkedIn-profil) Ctrl Digital (webbsida) Migrera, kom igång och få ut mer av Google Analytics 4 – Ted Solomon #104 (poddavsnitt) Funnel (verktyg) Supermetrics (verktyg) Google Tag Manager (GTM) (verktyg) Google BigQuery (verktyg) CDP Institute (resurs) RFM Segmentation Using Bigquery - Ctrl Digital (artikel) Building a Marketing Data Warehouse in BigQuery - Ctrl Digital (artikel) First-Party Data Activation Playbook - Think with Google (whitepaper) IKEA-caset med förstapartsdata - Loyalty Point (case) Gartner: Marketing Budgets: Benchmarks for CMOs in the Era of Less (resurs) Confetti (samarbete) Digitalenta (samarbete)
Bayer's Data Evolution with AlloyDBThe Big Themes:Data complexity and intelligent agriculture: Bayer Crop Science is addressing agriculture's complex data challenges. The company integrates data such as satellite imagery, weather conditions, soil data, and IoT device inputs, to drive innovation in seed development and farming practices. By leveraging cloud technologies like AlloyDB, Bayer's teams can support the future of farming, despite challenges posed by climate change and rising global food demand.Integrating BigQuery for comprehensive analytics: To further enhance its data-driven insights, Bayer integrates Google BigQuery alongside AlloyDB for extensive data analysis. BigQuery serves as the central analytics warehouse, receiving billions of phenotypic data points for in-depth modeling and decision-making. During harvest season, Bayer can quickly access and analyze comprehensive datasets, enabling better decisions across production and supply chains.Harvest season demands and system resilience: During harvest season, Bayer Crop Science faces intense pressure as high volumes of data flow in, requiring real-time analysis and decision-making. The peak demand period sees a sharp increase in read and write operations, making it essential for Bayer's data system to function seamlessly. AlloyDB played a crucial role in handling these spikes by providing low-latency data processing and high availability.The Big Quote: “Climate change is a new challenge. You see some of these forecasts coming out of academia that yields will go down by 30% — that will arrest this great trend that we've seen continually increasing over the last 100 years. We need to solve for that, and that's going to take new types of data and new approaches and these types of things."
لو خطر في بالك قبل كده ليه عندنا كل قواعد البيانات دي, و ليه فيه منهم انواع مختلفة DBMS, NOSQL و غيرهم, طيب الناس اللي بتشتغل على الحاجات دي ايه التحديات اللي بيواجهوها, و ايه التخصص ده و ايه المتطلبات بتاعته. Ahmed Ayad is a SQL Engineer by trade, a database guy by education and training, and data dude by passion. I am currently an Engineering Director of the Managed Storage and Workload Management team in Google #BigQuery, building the best large scale enterprise data warehouse on the planet. My team owns the core parts of BigQuery involved in managing user data, metadata catalog, streaming and batch ingestion, replication, resource management and placement, physical sharding, and structured lake analytics. Over the years we have: - Grew data under management by several orders of magnitude. - Grew BigQuery's global footprint to more than 20+ regions and counting. - Enabled the hyper scaling of data analytics for a Who's Who list of Fortune 500 users, both Enterprise and Cloud-native. I am passionate about building cool technologies at scale, and the effective teams that create them. Things I did in previous professional lives: - I have shipped components in SQL Server product since SQL Server 2008. Worked on the Performance Data Collector, Policy Based Management, AlwaysOn, The Utility Control Point, SQL Azure stack from the backend to the middle-tier and Portal, SQL Server Agent, SQL Server Optimizer, and SQL Server Management Tools. - Did Database research in the areas of Data Mining, Query Optimization, and Data Streaming.
A founding engineer on Google BigQuery and now at the helm of MotherDuck, Jordan Tigani challenges the decade-long dominance of Big Data and introduces a compelling alternative that could change how companies handle data. Jordan discusses why Big Data technologies are an overkill for most companies, how MotherDuck and DuckDB offer fast analytical queries, and lessons learned as a technical founder building his first startup. Watch the episode with Tomasz Tunguz: https://youtu.be/gU6dGmZzmvI Website - https://motherduck.com Twitter - https://x.com/motherduck Jordan Tigani LinkedIn - https://www.linkedin.com/in/jordantigani Twitter - https://x.com/jrdntgn FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (00:56) What is the Small Data? (06:56) Marketing strategy of MotherDuck (08:39) Processing Small Data with Big Data stack (15:30) DuckDB (17:21) Creation of DuckDB (18:48) Founding story of MotherDuck (24:08) MotherDuck's community (25:25) MotherDuck of today ($100M raised) (33:15) Why MotherDuck and DuckDB are so fast? (39:08) The limitations and the future of MotherDuck's platform (39:49) Small Models (42:37) Small Data and the Modern Data Stack (46:47) Making things simpler with a shift from Big Data to Small Data (50:04) Jordan Tigani's entrepreneurial journey (58:31) Outro
In this episode of Fundraising Today and the Go Beyond Fundraising podcast, we explore how nonprofits can use Google Analytics 4 (GA4) to boost their fundraising efforts. Emil Isaakov, a data analyst with Allegiance Group + Pursuant, walks us through the significant changes that GA4 brings, such as event-based tracking and predictive metrics. Emil also outlines the steps nonprofits can take to optimize their websites for better conversions. He touches on additional integrations with Google Ads and Google BigQuery that can help organizations build on and store the data they're getting through GA4. In addition, we'll look at GA4 through the lens of a few specific examples: How nonprofits can determine which communication channels are giving them the biggest bank for their buck, and how they can use GA4 to grow their email newsletter.
Join us as we welcome the Data Cloud Queen herself, Danielle Larregui. Get ready to witness the groundbreaking power of Einstein 1 Studio as Danielle unveils its transformative capabilities within the Salesforce Data Cloud. Discover how developers can effortlessly create AI models using a no-code or low-code approach directly with their Data Lake data. We'll explore the practicality of generating predictions, integrating external AI platforms, and leveraging built-in tools for assessing prediction accuracy. Brace yourself for the standout feature of 'Bring Your Own Model,' which allows seamless, real-time data sharing without the need for ETL processes. We'll discuss the availability of Snowflake's integration and the potential that lies with Google BigQuery. Imagine how these integrations can revolutionize your external data management, from segmentation to identity resolution. Stay tuned to learn how Data Cloud Enrichment could further enhance your Salesforce CRM by leveraging the power of Data Cloud data. Show Highlights: Introduction of Einstein 1 Studio and Model Builder within Salesforce Data Cloud for creating AI models using no-code or low-code approaches. How the "Bring Your Own Model" feature enables real-time data sharing with Salesforce Data Cloud without ETL processes. How Data Cloud Enrichment allows Salesforce CRM records to be updated with Data Cloud data. Remote Data Cloud, which could unify data management for organizations with multiple Salesforce instances. Ability to use predictions made by AI models in Salesforce flows, Apex classes, and reporting within Data Cloud. Links: Bring Your Google Vertex AI Models To Data Cloud - https://developer.salesforce.com/blogs/2023/11/bring-your-google-vertex-ai-models-to-data-cloud Use Model Builder to Integrate Databricks Models with Salesforce - https://developer.salesforce.com/blogs/2024/03/use-model-builder-to-integrate-databricks-models-with-salesforce
Prof. Dr. Hannes Mühleisen is a creator of the DuckDB database management system and Co-founder and CEO of DuckDB Labs. Jordan is co-founder and chief duck-herder at MotherDuck, a startup building a serverless analytics platform based on DuckDB. MLOps podcast #202 with Hannes Mühleisen, Co-Founder & CEO of DuckDB Labs and Jordan Tigani, Chief Duck-Herder at MotherDuck, Small Data, Big Impact: The Story Behind DuckDB. // Abstract Navigate the intricacies of data management with Jordan Tagani and Hannes Mühleisen, the creative geniuses behind DuckDB and MotherDuck. This deep dive unravels the game-changing principles behind DuckDB's creation, tackling the prevailing wisdom to passionately fill the gap for smaller data set management. Let's also discover MotherDuck's unique focus on providing an unprecedented developer experience and its innovative edge in visualization and data delivery. This episode is teeming with enlightening discussions about managing community feedback, funding, and future possibilities that should not be missed for any tech enthusiasts and data management practitioners. // Bio Hannes Mühleisen Prof. Dr. Hannes Mühleisen is a creator of the DuckDB database management system and Co-founder and CEO of DuckDB Labs, a consulting company providing services around DuckDB. Hannes is also Professor of Data Engineering at Radboud Universiteit Nijmegen. His' main interest is analytical data management systems. Jordan Tigani Jordan is co-founder and chief duck-herder at MotherDuck, a startup building a serverless analytics platform based on DuckDB. He spent a decade working on Google BigQuery, as a founding engineer, book author, engineering leader, and product leader. More recently, as SingleStore's Chief Product Officer, Jordan helped them build a cloud-native SaaS business. Jordan has also worked at Microsoft Research, the Windows Kernel team, and at a handful of star-crossed startups. His biggest claim to fame is predicting world cup matches using machine learning with a better record than Paul the Octopus. // MLOps Jobs board https://mlops.pallet.xyz/jobs // MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links Websites: https://duckdb.org/ https://motherduck.com/ --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Hannes on LinkedIn: https://www.linkedin.com/in/hfmuehleisen/ Connect with Jordan on LinkedIn: https://www.linkedin.com/in/jordantigani/ Timestamps: [00:00] Hannes and Jordan's preferred coffee [01:30] Takeaways [03:43] Swaggers in the house! [07:13] Duck DB's inception [09:38] Jordan's background [12:28] Simplify Developer Experience [17:54] Big Data Shift [26:01] Creation of MotherDuck [30:58] Duck DB and MotherDuck Partnership [31:57] Incentive Alignment Concerns [37:46] Building an incredible developer experience [43:38] User Testing Lab [47:18] Setting a higher standard [49:22] The moments before the moment [52:18] Gathering feedback and talking to the community [54:30] MotherDuck Features [1:00:19] Cloud Innovation for MotherDuck [1:02:41] ML Engineers and DuckDB [1:08:03] Wrap up
SQL is Dead, Long Live SQL!Fast jede Applikation hat irgendeine Form von persistenter Datenhaltung. Oft in Form einer Datenbank. Der Platzhirsch bei Datenbanken sind Systeme, die sich mit der Structured Query Language (kurz SQL) abfragen lassen. MySQL, PostgreSQL, Oracle, MSSQL Server, sqlite, Google BigQuery und so weiter.Die coolen Kids haben vielleicht irgendeine Form von NoSQL-Datenbank im Einsatz. Aber auch da kommt man nicht um SQL herum.Für die meisten Entwickler*innen ist SQL ein alter Hut. SELECT * FROM Tabelle WHERE foo = bar GROUP BY id. Das haben wohl die meisten gelernt und damit kommt man schon sehr weit. Doch war es das mit den Möglichkeiten von SQL? Klare Antwort: Nein.Die Sprache wird weiterentwickelt, bekommt moderne Features und hat weitaus mehr zu bieten als manch einer denkt. Und darüber sprechen wir in dieser Episode mit dem SQL-Experten Markus Winand.Wir sprechen über Modernes SQL, die verschiedenen SQL Standards, ORMs und die Trennung von “Daten-Logik” und “Application-Logik” sowie über eine “Can I Use”-Platform für SQL Features.Bonus: Warum die Übernahme von MySQL durch Oracle das Beste war, was MySQL passieren konnte.Das schnelle Feedback zur Episode:
MONEY FM 89.3 - Prime Time with Howie Lim, Bernard Lim & Finance Presenter JP Ong
A switchboard is known as a piece of equipment that directs all phone calls made to and from a particular building and area. But what about a switchboard that automatically helps enterprises direct and organise data into dashboards for business leaders to carry out data-driven projects on revenue operations, deal plannings, and campaign performance? Well, that's exactly what our guest Switchboard does. Founded in 2014 by the former Google BigQuery employees, Switchboard provides a platform for revenue and marketing teams to launch and maintain data projects at scale without the need for specialised data engineering teams. The firm said its platform can make it 10 times faster for firms to implement data projects as compared to hiring a team of data engineers and managing vast amounts of data on their own. Its software serves clients from a wide range of industries ranging from retail, technology, franchise operations and financial services. This includes a number of notable players such as Spotify and Orangetheory Fitness. But what do clients in these industries require and how does Switchboard plug the gaps? Meanwhile, Switchboard had in July this year received US$7 million in a Series A funding round led by GFT Ventures and Quest Venture Partners. But what opportunities do investors see in them? On Under the Radar, The Evening Runway's finance presenter Chua Tian Tian posed these questions to Ju-kay Kwek, CEO, Switchboard AND Joseph Lee, Partner and Head of Investor Relations, GFT Ventures.See omnystudio.com/listener for privacy information.
Per chi fa DIgital Marketing oramai non si può prescindere da Google BigQuery.I vantaggi sono troppi anche per varie limitazioni a Google Analytics e Google Search Console.Ma specialmente per le nuove cose che stanno arrivando!FastLetter: Una fonte buona dalla quale aggiornarsihttps://giorgiotaverniti.substack.com/-------------------------------INDICE E FONTI01:24 Storage dei Contenuti02:11 Analisi Avanzate02:42 La proprietà dei dati03:11 L'uso dell'AI sui dati03:57 Recupero dei dati storici L'articolo su Relevant di Denishttps://relevant.searchon.it/bigquery-5-ragioni/-------------------------------Per seguire FastForward:Su YouTube: iscriviti e attiva la campanella
AWS Morning Brief for the week of October 10, 2023 with Corey Quinn. Links: Sponsor re:Quinnvent Amazon DataZone is now generally available Amazon EC2 Hibernate now supports more operating systems Lambda test events are now available in AWS SAM CLI Simplify data transfer: Google BigQuery to Amazon S3 using Amazon AppFlow Coming November 2023: A new analysis experience on Amazon QuickSight Implement auto-increment with Amazon DynamoDB The Future of Personal Digital Records: Unlocking Security and Efficiency through Blockchain and Smart Contracts Slack elevates media pipeline with AWS Elemental MediaConvert and Amazon Transcribe Integrate multiple Microsoft Entra ID tenants with AWS IAM Identity Center Building high-throughput satellite data downlink architectures with AWS Ground Station WideBand DigIF and Amphinicy Blink SDR Save the Date: Join AWS at the Reality Capture Network Conference, Oct 17 – Oct 19, 2023
What's up folks, today we'll be joined by various martech pros sharing their opinions on the topic of warehouse-native martech.The landscape of marketing technology architecture has been undergoing – what you might call – a seismic shift and many don't even realize it. In this transformation, there's a remarkable development - warehouse-native marketing technology, an innovative breakthrough that promises to reshape the entire industry for the better, but comes with plenty of questions and skepticism. Here's today's main takeaway: As we navigate the potential transformation to warehouse-native martech, the single most critical action is to prioritize achieving high-quality, well-structured data; it's the golden key to unlocking the full potential of these emerging tools and strategies.This episode explores the various facets of warehouse-native martech and its viability, pulling in insights from industry experts, piecing together a comprehensive view of this groundbreaking shift.What are warehouse native martech or connected apps?In Dec 2021, Snowflake introduced a new term, 'connected applications'. Unlike traditional managed SaaS applications, connected applications process data on the customers' data warehouse, giving customers control over their data Benefits include preventing data silos, removing API integration backlog, enabling custom analytics, upholding data governance policies, improving SaaS performance, and facilitating actionable reporting In other words, instead of making a copy of your DWH like most CDPs ad MAPs do today, everything lives on top of the DWH and you don't have to pay for copying your db.Some companies solving this for product analytics are Rakam, Indicative, and Kubit. Census and Hightouch are also doing this, being warehouse-native activation tools sitting on top of a DWH and don't store any of your data. Some Messaging companies solving this use case natively on the cloud warehouse are Vero, Messagears, and Castled.Revolutionized Data Handling in Customer Engagement PlatformsIndia Waters currently leads growth and technology partnerships at MessageGears. She explains how her company's differentiation comes from its unique handling of customer data.Unlike competitors such as Salesforce Marketing Cloud or Oracle, which require a copy of customer data to live within their tool, MessageGears directly taps into modern data warehouses like Snowflake or Google BigQuery. This unique approach is born out of the inefficiency and high costs of older platforms that necessitate copying and moving data into multiple marketing tools.India vividly portrayed the challenge this old approach creates, imagining the confusion and resource consumption of working with out-of-date data across numerous tools. By not having to have a copy of customer data, MessageGears solves this problem for big companies, eliminating waste and creating a more coherent understanding of the customer's journey. Clients like OpenTable, T-Mobile, and Party City can now work with the most up-to-date data, using it as a source of truth for better analytics and customer experiences.Reflecting on how MessageGears had to become thought leaders in this approach, India acknowledged that it took time for the industry to understand and accept this innovative method. But as awareness has grown, the approach is now seen as a logical and necessary step in the evolution of customer data handling.Takeaway: MessageGears' refusal to follow the traditional path of copying customer data into its tools is a game-changer in the world of customer engagement platforms. By plugging directly into modern data warehouses, they've solved a problem that has plagued big companies, enabling them to use the most up-to-date data for insights and experiences. The industry has evolved, and MessageGears is leading the way with an approach that makes sense for today's data-driven world.Rethinking User Database Size Pricing in MartechWhile MG has been around since 2011, more and more startups are waking up to the idea of directly accessing brands' first-party data instead of relying on cloud data syncs. We also chatted with Arun Thulasidarhan, CEO & Co-founder at Castled.io. They're a warehouse-native customer engagement platform that sits directly on top of cloud data warehouses. Arun and his team set out to disrupt traditional martech to fix some of the fundamental problems as it relates to the significant disconnect between the number of users a company pays to store in their database and the actual value derived from them.He emphasized that having millions of users doesn't necessarily translate to substantial revenue or value, especially for smaller B2C companies. He critically questioned whether traditional pricing models based on user database size were really delivering value for businesses. Arun then went on to explain how Castled.io approaches this differently, choosing a more logical and direct connection between cost and benefit. Unlike other martech firms that charge based on customer numbers, Castled.io bases its pricing on the number of team members using the tool. Arun argues that this is a more accurate reflection of the value a company gets from the service, as more marketers using the tool likely means a more substantial investment in the platform. He also touched on how they handle data look-back periods and the importance of data retention for retargeting and reengagement. With traditional systems, data engineers might have to wait for months, while with Castled.io, the data is readily available in the data warehouse. The integration of data warehousing and marketing tools, according to Arun, is the future of martech pricing – something he sees as a "no-brainer." Takeaway: Traditional martech pricing models have significant inconsistencies, often failing to align the number of customers with the real value obtained. Castled.io challenges this paradigm by pricing their services based on the number of team members using the tool and ensuring that data retention aligns with business needs. This more logical and direct approach may be an essential step forward for the martech industry, promoting fairness and value over mere numbers.Aligning Pricing Metrics with Customer NeedsMessageGears and Castled.io's groundbreaking approach in martech isn't merely an isolated occurrence. It's part of a broader trend that calls for a deliberate rethinking of pricing metrics within the industry. This movement emphasizes the alignment of price with real value and accessibility. It's worth highlighting the intricacies of selecting the right pricing metric. We spoke with Dan Balcauski, a SaaS pricing expert who highlights that it's not just about being innovative; it's about making choices that truly resonate with customer needs and market demands. Dan delved into the complexities of pricing metrics and how they can be used to either aid or hinder competitive differentiation. Though he admitted that his knowledge of the specific market wasn't extensive, he was able to break down the various facets of pricing strategy, sharing an intriguing case study to illustrate his point.Dan emphasized the importance of choosing a pricing metric that aligns with customers' business requirements and the perceived value of the product. This metric, according to him, must balance fairness, predictability, and operational ease for both the buyer and the seller.He highlighted the example of Rolls Royce's innovative approach to jet engine pricing, where they chose to charge "power by the hour" instead of selling the engines outright. This usage-based model aligned the interests of the buyer and seller, streamlining many ancillary aspects such as maintenance and replacement.However, Dan also warned against unnecessarily complex or "cute" pricing metrics. He stated that success in implementing innovative pricing strategies likely comes easier to industry leaders or highly innovative products. Trying to be different just for the sake of it can lead to confusion and additional costs in educating the market.Takeaway: In the world of martech, warehouse-native pricing changes are a nuanced subject. As Dan's analysis reveals, the successful implementation of a pricing strategy requires a careful balance of alignment with customer needs, perceived value, predictability, and operational efficiency. Innovative approaches can bring success, but they must be implemented thoughtfully and with a true understanding of the market. Being different for its own sake may lead to complexity without adding real value.The Undeniable Movement Towards a Universal Data LayerBefore getting into the weeds of the viability of this shift, let's get the lowdown from one of the most respected voices in martech. You guessed, we're talking about Scott Brinker. The Martech Landscape creator, the VP of Platform ecosystem at Hubspot and the acclaimed force behind chiefmartec.com, hailed universally as the martech world's ultimate wellspring of knowledge and insight.Scott sees a clear trend in martech towards consolidating data across the company into the warehouse, and making that data accessible across various applications. He doesn't hesitate to point out that this is a bit different from being truly warehouse-native, which raises questions about the architecture layers and the way data interacts operationally with the warehouse.On the exciting side, Scott highlights the robust experimentation in the field. However, he's keen to identify the challenges too, such as the need to rationalize data that is inherently messy when consolidated into data lakes and warehouses. The sheer volume and complexity of data require layers of interpretation and structuring, something that individual martech products often provide.Scott also highlights the performance dimension, noting that while technological advances have improved the read/write performance of data in a warehouse, there are still cases where millisecond differences in performance can have critical impacts on user experience or search engine rankings. He sees the need for operational databases fine-tuned to specific engagements as a continued necessity in the martech architecture.In the end, Scott recognizes the undeniable movement towards a universal data layer where martech companies are being driven to contribute and leverage data from the warehouse. However, he doesn't see it as something that will entirely replace all localized and context-specific databases in the immediate future.Takeaway: Scott provides a balanced and insightful perspective on the warehouse-native approach in martech, seeing it as an interesting and evolving aspect but not a complete solution. He emphasizes that while consolidation and accessibility of data are crucial, the complex nature of data, performance considerations, and the need for specific databases mean that the warehouse-native concept is still more of a developing direction rather than an established end point in the martech landscape.The Necessity of Cloning Data in Warehouse Native MartechAs we talk about shifts in the data management landscape, Pini Yakuel, the CEO of Optimove (Marketing Automation Platform) provides a practical example of these changes, discussing how they're CDP component is built on top of Snowflake.Pini dived into the subject of warehouse native martech with a keen eye for architectural details. He spoke candidly about the convenience of copying data from one place to another and the efficiency of Snowflake, allowing for a seamless client experience. A clear advocate for this technology, Pini mentioned how companies can leverage Snowflake to have data easily accessible without having to move it around. The snowflake-to-snowflake data mirroring, for instance, eliminates the need for ETL, providing a significant advantage.However, Pini didn't shy away from the challenges either. The same technology that enables quick data processing doesn't necessarily translate into fast response times for user experience. For instance, Snowflake, being an analytical data warehouse in the cloud, may not respond quickly enough for UX requirements.Pini concluded with an optimistic note about the future, mentioning that Snowflake and BigQuery are emerging as significant players. But, he also acknowledged that the need to have copies of data close for certain operations still exists, leaving room for technology to evolve further.Takeaway: While warehouse native martech, especially through platforms like Snowflake, offers incredible convenience and has been a game-changer, it's essential to recognize the need for closer data positioning in some cases. The current landscape is promising, but the future might hold even better ways to copy and utilize data without hindrance.The Misguided Myth of Zero Data CopyWhether it's technically possible or not, not everyone is on board with the notion of zero copy data, using martech without ever needing to copy data in any of your tools. Enter Michael Katz, CEO and co-founder at mParticle, the leading independent and packaged customer data platform.When asked about the concept of zero data copy and why he considers it misguided, Michael passionately dove into the core of the argument. He began by highlighting that copying data's implication of creating inefficiency, particularly in terms of access cost, is fundamentally flawed. In his view, the cost of storage is negligible compared to the cost of computation, a fact well understood in the industry. Hence, creating duplicate copies of data doesn't significantly change the overall cost structure.Michael then went on to emphasize that it's been demonstrated time and time again that replicating data brings tremendous efficiency for various uses and applications. He further expanded on his argument by noting that the belief in zero data copy not only misleads but also directs individuals and companies down a path of solving non-existent problems. He remarked that the focus should be on minimizing costs to maximize resources for growth, not chasing an illusion of efficiency.Adding another layer to his argument, Michael revealed the dirty secret behind many reverse ETL companies, citing a persistent churn problem. These companies, he pointed out, offer what appears to be an "easy button" solution, but when the button is pressed, things turn out to be far from easy.Takeaway: Michael's debunking of the zero data copy concept is a compelling reminder that chasing illusions can lead to more harm than good. The true focus should be on understanding the problem at hand and allocating resources wisely, rather than getting lost in the allure of simplified solutions that often prove ineffective. This insight urges us to be more discerning in evaluating the effectiveness and underlying motives of the tools and strategies we adopt in the world of martech.Solving the Puzzle of Compute Charges in the Cloud Data WarehouseMany industry experts agree with Michael that one of the biggest hurdles for warehouse native martech is computing charges and creating a load on your DWH/Snowflake that can add up quickly. Here's what Arun from Castled.io had to say about his solution for this compute challenge.When asked about how to tackle the prevalent problem of compute charges in existing cloud data warehouses, Arun clearly outlined the importance of addressing this issue. In his view, it's more than just a concern about expenses; it's an integral part of deciding to have a data warehouse, which still holds great value to many.Arun dove into the core of the problem, explaining that once a data warehouse has been implemented, businesses often aim to not only enable data analytics but also marketing, where significant investments are made. This decision leads to one of the major reasons behind the compute charges: hiring bulk analytic engineers, many lacking the necessary experience to write optimal SQL queries.Arun's perspective on the solution is straightforward and rooted in his experience. For him, once the data is collected in the data warehouse, the most scalable model involves using warehouse-native applications like Castled.io. These applications reduce the charges by running all kinds of load tests to ensure minimal and optimal expenditure. Arun emphasized the care taken to ensure that even a minor filter change doesn't lead to unnecessary extra charges.Takeaway: Arun's insights highlight a common yet overlooked aspect of cloud data warehouse management: compute charges. By understanding the root causes and adopting warehouse-native applications, companies can not only minimize these charges but also maximize the value and efficiency of their data warehouses. His approach illustrates a thoughtful and scalable way to ensure that technology investments align with financial considerations.Is Warehouse Martech More Beneficial for Cloud Providers Than Customers?Despite hearing this solution on compute charges and the benefits of zero copy data, Michael Katz, CEO of mParticle held firm on his stance going back to the value to customers.Michael began by laying out a common structure of the marketing tech stack, mentioning different components such as analytics, customer engagement platforms, experimentation tools, and customer support services like Zendesk. In this context, he highlighted that between five and ten different categories could be observed across most martech stacks.Michael then questioned the real beneficiaries of building everything natively on a Cloud Data Warehouse. He argued that such an approach seems to favor the data warehouse provider rather than delivering genuine value to the customer. Moreover, he expressed skepticism about the notion that having all vendors run their own compute cycles on the data warehouse would necessarily lead to cost savings. He pointed out that while theoretically possible, no one has conducted a side-by-side comparison to prove that assumption.Further, Michael emphasized that whether dealing with providers like Snowflake or mParticle, everyone is in essence reselling cloud compute, either with a markup or bundled into services. The assumption of inherent cost savings, he asserted, doesn't stand up to scrutiny, and the claim that avoiding the creation of multiple copies of data will automatically save money is not necessarily true.Takeaway: Michael's examination of the warehouse native approach reveals that what might seem like a cost-saving strategy on the surface might not deliver real benefits to the customers. This insight warns against blindly accepting theoretical advantages without concrete evidence, encouraging a more nuanced understanding of how value is truly generated in the martech world.Why Zero Data Copy in Martech is Not a Black-and-White IssueMichael's scrutiny of the warehouse native approach invites a broader conversation about adaptability and tailored solutions in martech. It challenges the standard view, paving the way for alternative methods that don't cling to conventional wisdom. Recognizing that one approach doesn't fit every scenario, some companies are proposing a hybrid approach and shaping the conversation around customization and efficiency.In this camp is Tamara Gruzbarg, VP Customer Strategy at ActionIQ – an enterprise Customer Data Platform. When asked about the widespread arguments dismissing zero data copy as a flawed concept, Tamara offered a thoughtful perspective. She didn't outright reject the notion, but rather emphasized the importance of not viewing it in black and white terms. In her view, the concept of zero data copy isn't necessarily something that will work for everyone in the immediate future, but that doesn't mean the industry shouldn't be moving in this direction.Tamara continued to explain that once sufficient work has been done to create a robust data environment within a client's internal structure, there's a real opportunity to leverage that investment. It's about using the data from its original location to minimize costs, rather than insisting on either 0% or 100% adherence to a zero copy or fully composable CDP model.Speaking from her experience at ActionIQ, she emphasized the value of creating a "future-proof" environment where different components from the best vendors or internal solutions can be utilized. This approach allows for adaptability, not locking into a rigid framework, and instead opting for a path that works for the individual needs of a company, with the capacity to optimize over time.Takeaway: Tamara's insight sheds light on the nuanced reality of the zero data copy debate. Rather than clinging to absolutes, she encourages a more flexible approach that aligns with the individual needs and future directions of a company. Her focus on creating a future-proof environment underscores the importance of adaptability and optimization in the ever-changing martech landscape, without falling prey to rigid ideologies.Warehouse Native Martech Impacting Enterprise More Than SMBsThe push for flexibility and optimization in data handling hints at a wider trend affecting large enterprises. This focus on warehouse native solutions aligns more closely with the complex needs of large organizations than with SMBs, setting the stage for a broader industry shift that some experts continue to explore.One of these experts is Wyatt Bales a senior exec at BlueprintX, an enterprise focused-martech and growth agency. When asked about the potential future of martech being warehouse native, Wyatt presented a comprehensive view on the subject. He emphasized that this path is indeed the way forward for enterprises, defining these as organizations with 10,000 employees or more. Wyatt agreed that traditional tools, such as duplicated databases and interfaces for marketing automation, are being replaced by more sophisticated and flexible solutions.He shared insights from current projects, where customers are rethinking their approach and moving towards more direct communication through APIs and delivery services. This transition, according to Wyatt, is not only efficient but also resonates with the changing needs of enterprise clients.However, he didn't see this trend affecting the Small and Medium Business (SMB) sector in the same way. The traditional path of migrating from simpler tools like MailChimp to more advanced platforms like Marketo still holds relevance for SMBs. Wyatt predicts an emerging trend where SMB markets might see the integration of work management tools, such as Asana, with marketing automation platforms. This would provide an end-to-end solution that meets the specific needs of smaller businesses.Wyatt also highlighted the importance of adaptability in skillsets, particularly within the context of warehouse-native solutions. Emphasizing the value of SQL knowledge, he discussed how organizational decisions and structures are changing, affecting even hiring and staff positioning. The future, according to Wyatt, is not only about mastering specific tools but also having the ability to talk about cloud storage, integrations, and other technological advancements. He stressed the importance of versatility in skillsets, particularly in a landscape that is rapidly shifting towards warehouse native solutions.Takeaway: The future of martech is clearly leaning towards warehouse native solutions for enterprises, reflecting a desire for flexibility, efficiency, and direct control. However, this shift is not universal, and Wyatt points out that SMBs will continue to have different needs and paths. The landscape is evolving, and success will depend on adaptability, both in technology and in the skillsets of those navigating this complex ecosystem.API Connections Versus Warehouse Native ApproachThis being more impactful for enterprise is an argument that's echoed by MessageGears when talking about the difference between APIs integrations and the warehouse native approach. Here's India Waters from MessageGears (again).India described the contrasting experiences of these two models by focusing on the real-world implications. She broke down the seemingly straightforward task of setting up individual APIs for real-time data access, especially in small to medium-sized businesses.The problem, India explained, lies in the constantly changing environment. Whether it's adding new fields or updating existing ones, the complexity of these tasks grows exponentially. When businesses try to synchronize tools like SalesLoft, Salesforce Pardot, or even something as specific as demand-based sales tools, the complexity doesn't just double; it becomes an almost unmanageable challenge. Imagine a company like Best Buy or Home Depot, with countless customers and enormous volumes of first-party data. The complexity becomes a daunting puzzle.India's solution through MessageGears provides a refreshing perspective. By allowing businesses to view their modern data warehouse without the burden of storing data, the approach untangles the web of syncing, matching, and complying with new data privacy laws. India expressed a frustration with those who still don't get this new approach, highlighting how the warehouse native model renders concerns like HIPAA compliance almost irrelevant.Takeaway: India's insights shed light on the intricacies of API connections versus the warehouse native approach. Her detailed explanation helps us understand how even simple tasks can become a tangled web as business grows. By adopting innovative solutions like MessageGears, businesses can bypass these complexities, align with modern data privacy laws, and efficiently manage their data, demonstrating a forward-thinking approach to the technological future.Does Warehouse Native Martech Replace Reverse ETL Tools?Some of the emerging tools to replace API integrations are called reverse ETL, basically pushing data from your warehouse to your business tools. Some of the startups solving this are Hightouch and Census. The question though is, does warehouse native martech (sitting on top of the warehouse) also replace the need for reverse ETL solutions. Just like you might prefer using a bridge to cross a river rather than paying a ferry. Here's Arun again from Castled:When asked about whether warehouse native can replace reverse ETL tools, Arun provided a perspective that goes beyond a simple yes or no. His insights highlight the intricate balance between technology and purpose.Arun explained that while warehouse native solutions can indeed eliminate the need for reverse ETL pipelines, it's essential to understand why a business would want to do so. The motivation to adopt warehouse native shouldn't be solely to eliminate reverse ETL; otherwise, the solution may fall short. With companies like Customer.io actively incorporating reverse ETL into their systems, a mere desire to remove reverse ETL isn't enough.Arun's approach emphasizes the problem-solving capabilities of the warehouse native approach. If there are tangible limitations in existing tools, and if a warehouse native solution can solve those problems, then the path becomes clear. But starting on this path just to eliminate reverse ETL, without considering the broader issues, would be a mistake.Takeaway: Arun's insights underscore the importance of aligning technology with genuine needs. Warehouse native solutions offer the ability to bypass reverse ETL, but this shouldn't be the sole driver. Businesses need to identify real challenges that can be addressed by warehouse native solutions, creating a synergy between technological innovation and problem-solving. Anything less is a fleeting pursuit that's likely to fall short.Established Platforms vs Warehouse Native Marketing Automation Obviously reverse ETL platforms are going to have some hot takes about this question. One of them is Tejas Monahar, the co-founder and co-CEO of Hightouch, a reverse ETL tool that's taken a controversial stance against the packaged CDP, claiming that it's dead and that they can replace it.Tejas noted that while ease of warehouse native tools are on the rise, he doesn't envision them taking over established platforms like Salesforce Marketing Cloud or Iterable. To Tejas, these tools can't replicate the variety of channels and functions available in existing martech solutions.Tejas explained that marketers need to utilize their data across all channels, and solutions like Hightouch make this process simple. He was unafraid to share that he's not bullish on the trend of warehouse native marketing tools dominating the space, as they do not address the unique needs of marketers. This includes all sorts of concerns that a Customer Engagement or ESP platform handles, not related to the data warehouse, such as data quality, governance, privacy, and identity risk.However, Tejas clarified that his stance does not mean there's no room for new businesses in martech like warehouse native. On the contrary, he sees a wealth of opportunities to build in this field, especially with localization and integration. What he doesn't foresee is a platform shift that replaces giants like Salesforce and Adobe. The focus should be on integrating the data and marketing sides of the business, and Hightouch is positioned as an ideal solution for this.Key Takeaway: Warehouse native tools and CDPs are growing, but Tejas argues that they will not replace the multifaceted capabilities of existing martech providers. While they may add some new functionality, their integration with traditional platforms seems more likely. The focus, he believes, should be on how marketers can use data effectively across all channels, and he sees Hightouch as the perfect solution to bridge the gap between data and marketing needs.Effortless Data Movement and Apps as Lightweight Caches on Core WarehousesNot all reverse ETL vendors have a negative view on warehouse native approach though. Boris Jabes, the co-founder and CEO at Census, another Reverse ETL tool, has a different perspective.When Boris was asked about the future of warehouse native martech and its potential to replace reverse ETL, his response not only highlighted a promising vision but also revealed Census's pioneering role in the field.Boris acknowledged the attraction of a world where warehouse native martech diminishes fragmentation and promotes consistency in written data. He was quick to point out that Census has been a trailblazer in this domain, adopting a warehouse native solution even before the term was coined. This, he said, is a testament to the company's innovation and leadership in the space.He detailed Census's offerings, such as the Audience Hub, a segmentation experience native to the warehouse. These solutions not only reflect Census's deep understanding of warehouse native systems but also underscore the company's commitment to letting marketers activate data without hassle.However, Boris also emphasized the challenges and necessities of this path. Perfect data in the warehouse is key. Understanding the relationships between different sets of data, customizing relationships, and validating data before use are all integral to making warehouse native martech work seamlessly.Boris's vision culminated in the anticipation of a world where data movement systems are no longer a concern, and every application becomes a lightweight cache on the core data warehouse. Though he believes in this future, he cautioned that it may take time to come to fruition and urged companies to focus on transforming and modeling user data.Key Takeaway: Boris's insights cast a spotlight on the potential of warehouse native martech, with Census leading the way before it was even a recognized term. His vision of applications as lightweight caches on core data warehouses paints a compelling future. Yet, it's grounded in the reality that clean, well-structured data and a deep understanding of relationships between data sets are crucial to making this dream a reality. The path is laid out; the journey, according to Boris, requires focus, innovation, and a commitment to quality.Closing Thought on Warehouse Native MartechThe shifting tides of warehouse-native technology are promising but they come with a fair share of skepticism. This shift is not just a simple tool swap, but a nuanced evolution requiring careful understanding and strategic decision-making, shaped by a company's unique needs and data maturity. Is zero data copy really achievable?Does it save costs for the customer? Or does it benefit the cloud warehouse companies?How long will local database copies be a requirement?Can compute charges be solved with higher quality queries?Will warehouse native martech affect more enterprise or startup companies?Does warehouse native martech replace the need for reverse ETL pipelines?Yet, amid the complexity, and all the questions, a promise shines through - a future of reduced data pipelines, seamless integration, and more efficient, direct data access. The challenge, as well as the opportunity, lies in the journey towards that future, a journey fueled by the symbiosis of pioneering tools and clean data.You heard it here first folks: As we navigate the transformation to warehouse-native martech, the single most critical action is to prioritize achieving high-quality, well-structured data; it's the golden key to unlocking the full potential of these emerging tools and strategies.✌️--Intro music by Wowa via UnminusCover art created with MidjourneyMusic generated by Mubert https://mubert.com/render
Mark Rittman is joined in this final episode of the current series of Drill to Detail by returning guest Jordan Tigani, Co-Founder & CEO at Motherduck to talk about the journey from big data to small data and bringing hybrid cloud execution to DuckDB.Big Data is DeadTeach Your Duckdb To FlyAnnouncing Motherduck: Hybrid Execution Scales Duckdb From Your Laptop Into The CloudDrill to Detail Ep.64 ‘Google BigQuery, BI Engine and the Future of Data Warehousing' with Special Guest Jordan Tigani
Mach5 Search is a slide-in, cloud-native replacement for Elasticsearch and OpenSearch that immediately saves up to 90% in operating cost. Mach5 Search can run on top of Google BigQuery, Snowflake, and Databricks, or natively on Object Stores in all the major clouds. Vinayak Borkar is the CEO and Co-Founder of Mach5 Software and he joins The post Cloud Native Search with Vinayak Borkar appeared first on Software Engineering Daily.
Mach5 Search is a slide-in, cloud-native replacement for Elasticsearch and OpenSearch that immediately saves up to 90% in operating cost. Mach5 Search can run on top of Google BigQuery, Snowflake, and Databricks, or natively on Object Stores in all the major clouds. Vinayak Borkar is the CEO and Co-Founder of Mach5 Software and he joins The post Cloud Native Search with Vinayak Borkar appeared first on Software Engineering Daily.
Mach5 Search is a slide-in, cloud-native replacement for Elasticsearch and OpenSearch that immediately saves up to 90% in operating cost. Mach5 Search can run on top of Google BigQuery, Snowflake, and Databricks, or natively on Object Stores in all the major clouds. Vinayak Borkar is the CEO and Co-Founder of Mach5 Software and he joins The post Cloud Native Search with Vinayak Borkar appeared first on Software Engineering Daily.
This episode features an interview with Yullia Tkachova, Co-founder and CEO of Masthead Data, an observability platform that catches anomalies in Google BigQuery in real-time. She holds degrees in Management Information Systems, Math, Statistics, and Marketing. Prior to Masthead, Yuliia designed complex BI products and solutions powered by ML and utilized by Fortune 500 companies.In this episode, Sam and Yuliia discuss how ML is shaping the future of data analytics, caring about users, and the fundamental human right to privacy.-------------------“We map those errors and anomalies on lineage, helping to understand what upstreams and downstreams are affected, what business users are affected. And that actually speeds up all the troubleshooting from hours to minutes. And this is the ultimate goal where we deliver. Because again, my belief that if you don't have this lineage piece was mapped anomalous in errors, it's not observability. It's monitoring. [...] What is also very unique to us, because Masthead operates on logs, it's triggered by logs. So, we do support streaming data. Unlike SQL-first solutions, as you can guess. We don't have to run SQL queries to see if they're anomalous, we're triggered by logs. And this is also what sets us apart.” – Yuliia Tkachova-------------------Episode Timestamps:(01:14): What got Yuliia excited about math and statistics(11:31): The basic human right to privacy(18:21): What open source data means to Yuliia(28:00): Yuliia's reason for building a solution focused on privacy and security(38:09): One question Yuliia wishes to be asked(42:21): Yuliia's advice for the audience(44:46): Backstage takeaways with executive producer, Audra Montenegro-------------------Links:LinkedIn - Connect with YuliiaVisit Masthead Data
Joining Mark Rittman for this 101st Episode Special for the Drill to Detail Podcast is Tristan Handy, CEO and Founder of dbt Labs talking about what went right at RJ Metrics, how the Analyst Collective led to today's community around the open-source dbt project and his personal journey from being in the lab building Fishtown Analytics to CEO of today's hottest data analytics startup … and why he secretly wishes he was Mark (according to Mark).Ep.100 Special ‘Past, Present and Future of the Modern Data Stack' with Special Guests Keenan Rice, Stewart Bryson and Jake Stein (Drill to Detail Podcast)My $2.6 Billion Ecosystem Fail: an RJMetrics Post Mortem (Bob Moore)How Best-in-Class eCommerce Businesses Achieve 230% Growth (2x eCommerce)Introducing the RA Warehouse dbt Framework : How Rittman Analytics Does Data Centralization using dbt, Google BigQuery, Stitch and Looker (Rittman Analytics Blog)Goodbye RJMetrics, Hello Fishtown Analytics (Tristan Handy)Ep.33 'Building Out Analytics Functions in Startups' With Special Guest Tristan Handy (Drill to Detail Podcast)Analytics is a Trade (Tristan Handy)Analyst Collective website (via the Internet Archive)Building a Mature Analytics Workflow: The Analyst Collective Viewpoint (SlideShare)Fishtown Analytics : Frequently-Asked Questions (via the Internet Archive)Ep 41: dbt Labs + Transform join forces on metrics w/ Nick Handel + Drew Banin (Analytics Engineering Podcast via Spotify)
Neste episódio conversamos com Hugo Baraúna, criador e mantenedor da newsletter Elixir Radar e cofundador da Plataformatec (a empresa onde Elixir nasceu). Escute esta entrevista no YouTube em https://youtu.be/P59Au1a4DAo Hugo Baraúna https://br.linkedin.com/in/hugobarauna https://github.com/hugobarauna https://twitter.com/hugobarauna https://hugobarauna.com/ Elixir Radar https://elixir-radar.com/ Elixir Patterns book https://elixirpatterns.dev/ Elixir Ecosystem 2020 responses data https://github.com/hugobarauna/elixir-ecosystem-2020-reponses-data TDD e BDD na prática: Construa aplicações Ruby usando RSpec e Cucumber https://www.amazon.com.br/TDD-BDD-pr%C3%A1tica-Construa-aplica%C3%A7%C3%B5es-ebook/dp/B08FPVS5TW/ Respondendo a Elixir Ecosystem Survey 2020 https://www.youtube.com/watch?v=SVNy2RD5SY0 What's new in Livebook 0.7 https://www.youtube.com/watch?v=lyiqw3O8d_A How to query and visualize data from Google BigQuery using Livebook https://www.youtube.com/watch?v=F98OWdigCjY O que é um sabático? Pra que serve e o que você aprende com ele? https://hugobarauna.com/o-que-e-um-sabatico-pra-que-serve-e-o-que-voce-aprende-com-ele/ O case da Plataformatec com o Elixir - Hugo Baraúna https://www.youtube.com/watch?v=XnEAllPTNWw Elixir Code Smells https://anchor.fm/elixiremfoco/episodes/15--Elixir-Code-Smells-com-Lucas-Vegi-UFV-e-Marco-Tulio-Valente-UFMG-e1jb1bb Hugo Baraúna: Cofundador da Plataformatec https://anchor.fm/policast/episodes/Hugo-Barana-Cofundador-da-Plataformatec-e1pa4mt Inside Elixir Radar with Hugo Baraúna https://podcast.thinkingelixir.com/15 Nosso canal é https://www.youtube.com/@ElixirEmFoco Associe-se à Erlang Ecosystem Foundation em https://bit.ly/3Sl8XTO. O site da fundação é https://bit.ly/3Jma95g. Nosso site é https://elixiremfoco.com. Estamos no Twitter em @elixiremfoco https://twitter.com/elixiremfoco. Nosso email é elixiremfoco@gmail.com. --- Send in a voice message: https://podcasters.spotify.com/pod/show/elixiremfoco/message
Googles aktuelles Product Reviews Update bezieht erstmals neben Englisch auch weitere Sprachen ein, zu denen auch Deutsch gehört. Was bedeutet das für deutschsprachige Websites? Wie schafft es das neue Bing, in seinen Chatantworten auch aktuelle Informationen und Ereignisse zu berücksichtigen? Zwei neue Features für die Google Search Console gab es in dieser Woche. So können jetzt große Datenmengen nach Google BigQuery exportiert werden. Ein 'noindex' kann per Redirect auf andere Seiten übertragen werden.
Universal Analytics is sunsetting in July 2023, and its replacement, Google Analytics 4, isn't exactly getting a warm reception. For digital marketers, SEOs, analysts, and basically anyone else who got used to GA3 over the past decade, it's a bitter pill to swallow.Ok, I'll confess: I've been a bit further behind on Google Analytics 4 than I wanted. Like many marketers, I struggle to balance martech innovation against the reality of my day-to-day life. I admit I had been procrastinating on learning GA4, but no more.I've spent the last few months going as deep as I can on GA4 and, by extension Google Tag Manager. I'm not going to sit here and tell you that GA4 is Google's gift to digital marketing – I think it's still an immature platform.I am going to tell you GA4 is getting a much worse rap that it deserves precisely because so many marketers have been deep in GA3/UA for so long. Changing habits is tough, and GA4 makes it more challenging because of a new interface, not too mention a completely different approach to web analytics. No big deal - you can learn all this in a Sunday afternoon, right?Yeah, that's going to be tough.Today I'm going to give a procrastinator's guide to GA4. If you're expecting me to complain about GA3, this episode isn't for you. We'll mourn the loss of GA3, briefly, but then move on to making the most of this situation. I don't think GA4 is all bad – actually, GA4 is pretty slick and I think if it weren't for the contrast to its predecessor, many folks would be pretty happy with GA4. – – – Alright JT, it's great to be back behind the mic. We're starting off with a fun one here. I'll admit I've been out of touch with top of funnel reporting and analytics for the last couple years so I'm excited to learn about GA4. There's rightfully been a lot of noise since its release in October 2020… maybe we can start there actually, the Google decision. Google has basically said that they are making the switch from Universal Analytics (UA) to Google Analytics 4 (GA4) in order to provide users with “more advanced tracking for digital marketers” But aside from new features like automated events, cross-device reporting and BQ support, there's a lot more behind the decision to make the switch.Why is Google making the switch from UA to GA4?Needs attribution: Lawsuits in EU where UA used as evidence Privacy regulations End of 3rd party cookies, rise of first party cookies Single-page applications Event-based measurement So October 14, 2020: This was the date when Google officially announced GA4 and began rolling it out to users. What dates should marketers be aware about when it comes to the “forced switch from UA?”What are the important dates and why are they importantJuly 1, 2023, data collection stops. 6 months later, you won't be able to access your dataYou've got 6 months to move to GA4 or another web analytics solution or you'll be flying blind… You need a solution for your historic data (excel, bigquery, or API)Sounds like it's time to put down that Netflix remote, grab a cup of coffee, and dive into the exciting world of GA4!It seems like such a big hurdle… JT, how can marketers start to learn GA4?How do I learn GA4There's going to be a few layers to learning GA4. Let's break it out into 2 roles: Web Developer, implementation Digital marketer or web analyst For web developers or implementers, GA4 can be installed directly on your website by inserting the code directly onto each page. This isn't new. I think what is new is that GA4 is much more closely tied with Google Tag Manager. It is absolutely the recommended way to install and configure GA4. There's a whole episode or series about Google Tag Manager we could do, but the short of it is that GTM gives you a huge toolset to do tons of cool stuff: event tracking, sending additional data through dataLayer, and modifying your implementation without having to directly modify your website.If you're not already using GTM, GA4 should push you to start using it.For digital marketers and analysts, the task is about getting used to the new interface, migrating configuration settings from GA3, and making a habit of pulling reports from GA4. The big hurdle here is matching up the data from both tools – because I've never actually seen both tools give the same number.I think this is what people are most unprepared for: the new reporting paradigm and definitions. Things like users have modified definitions, in no small part because GA4 is better at tracking individual users and corrects known errors in GA3. However, whenever a disparity in the numbers arise, much hand-wringing and gnashing of teeth ensues…So getting it installed and playing around with new features is one of the first things folks should be doing. Data history and collection is important.These new features are more powerful and are said to help you better understand and optimize your digital marketing efforts… JT, what are some of the new features that excite you the most compared to UA vs GA4?What is different between GA3 & GA4Bounce rate, conversion tracking, user definitions;Event-based approach, more akin to product analytics tools, and, frankly, this is better for modern web (problem: vast majority of sites aren't on modern tech) User Interface Data collection and real-time data Data retention So gone are the days of needing to manually set up event tracking codes for specific things like we had to do in GTM? No, still more than enough in GTM. Enhanced Measurement gives us some events out of the box that seem to mostly work for some websites. Events are much better in GA4 – can send custom parametersOne thing a lot of folks mention is improved cross-device reporting, have you dived into this? How is Google associating traffic from multiple devices to unique users?I'm more of a Redshift guy than Big Query these days but I do feel like the switch to GA4 is also pushing many users to adopting Big Query right? GA4 includes native support for BigQuery integration, which allows you to connect your GA4 data with other data sources in Google BigQuery.JT what do you like the most about GA4 so far? Is it the Conversion Probability report or the Customer Lifetime Value report? Or just the new UI and design? What does Jon like about GA4?It might seem like putting lipstick on a pig, but I kind of like GA4. Maybe I'm just coping a bit or being obstinately positive, but change is the name of martech. This isn't the first time I've had to switch tools against my will, and it won't be the last.Everything is a tool, and GA4 is no different. Events are customizable and don't have to send same parameters/fields as UA. You can send anything which is powerful when looking at custom data. Conversion events are much more accurate (citation) Reports are much more customizable and better looking Machine learning to surface insights Some of the coolest ML insights come in the form of predicting the likelihood that a user will convert on your website or app. This is based on their behavior and other factors. So theoretically, your business can better identify high-potential users and tailor your marketing efforts accordingly.Do you know what this looks like practically? Can you push segments of these users to BQ then Hubspot and send custom emails or better yet, to your product and surface different offerings?So like we said, there are many ways to learn GA4, including online courses, tutorials, and guides. Start reading through the documentation and tutorials provided by GoogleInstall it and play around… what else? Time to panic?How do we prepare for the inevitable? Is it time to panic?Absolutely - it's the best time to panic. So get it out of your system. No matter what I say or you say, GA3 will be sunset and GA4 will be your default option.There are obviously two parts to this. First is the implementation – if you haven't set up GA4 then you need to get that set up as soon as possible. Data retention for custom reports maxes out at 14 months, so you'll want to collect some of that historical data.You'll want to make sure that GA4 is fully configured to track events and conversions you were tracking previously. This could be an opportunity to clean up events and even send more data. The second is reporting or, probably more accurately, the human side of the equation. The discrepancy in numbers from each system is difficult to articulate, particularly to your management or executive team. The improved accuracy of user and conversion tracking can make for pretty different numbers. Typically, I've seen lower numbers of users but higher conversions, which jives with what GA4 is supposed to do.I think there's some marketing therapy and education that needs to happen for a lot of teams. Explaining GA4 is sort of like I can explain it to you, but I can't understand it for you. I admit it's taken a bit for the differences to sink in, and I'm still not sure I can confidently explain them.This is going to be our challenge moving forward – why are these numbers different, and how do I explain this to my executive team?Unfortunately, the messenger is almost always shot first! This is one of the worst feelings ever, using one tool you uncover a big win, maybe it's a conversion rate lift coming from a specific subset of pages you created or recently optimized.You shout it from all public Slack channels only to find out that someone else using a different tool doesn't see the same lift at all… and thus ensues the debate… who's got the accurate data?Conclusion – Teaser of next GA4 episodeAll right, this was a fairly surface level conversation on GA4 – in a future episode, however, we're going to get a lot deeper into GA4, the cool shit. I like a lot of these features, and rather than mourning UA, I'm going to look forward.In conclusion, GA4 is here and there's nothing you can do to hold onto UA. Changing habits is tough, and GA4 makes it more challenging because of a new interface, not to mention a completely different approach to web analytics. You should move on to making the most of this situation and start playing around with it… maybe you'll agree with JT and find that GA4 is pretty slick and I think if it weren't for the contrast to its predecessor, many folks would be pretty happy with GA4. ✌️--Intro music by Wowa via UnminusCover art created with DALL-E
Block199.io Interview For our very first guest podcast, Rantum spoke with ElBarto_crypto a Dune wizard and crypto data scientist, working on research at Block119.io. Recently, he's researched and published on topics like NFT Hodler segmentation, airdrops, and NFT wash trading. How to check if a collection is made of hodlers or flippers Check wash trading volume for collections & platforms What tools el_barto uses to use to analyze collections What topics he plans to research next Rough Transcript [00:00:00] Rantum: All right. Uh, here we are. I, this is ran. Andrew, here I am with our very first guest. We've talked about this on the, if you've listened for a little while, we've talked about this on the intro for a long time, that we would have guests, we've, uh, yet to actually do that and. Well, that's up until today. Uh, we've had, uh, we have a data sign, another data scientist joining us today. [00:01:06] He actually reached out his Alberto Crypto as he goes by on Twitter. On Twitter. He reached out, uh, and asked to, to be a guest on the podcast. And, you know, I, I felt like we really needed to, to actually do it. So here we are. Welcome to the show, El Baro Crypto . [00:01:24] Elbarto Crypto: Gne. Well, thanks for having me. I've realized in life if you just asked people enough, they either get sick of you or they let you do something. [00:01:30] So in this case, I'm glad to be the latter. [00:01:33] Rantum: Yeah, yeah. Really excited to actually do this. You know, it's a good needy, you know, little instigator to, to make it happen. . Perfect. I love it. So tell me a little bit about yourself. How'd you get where you are? How'd you get into crypto? [00:01:48] Elbarto Crypto: Yeah, so I am actually a data scientist by trade. [00:01:51] I worked in marketing analytics for probably a little longer than a decade, doing everything from segmentation, machine learning just writing basic SQL queries. Got to spend a lot of time doing customer analytics for very large brands, uh, in the marketing space and. Sort of caught the Bitcoin bug in like end of 2014 ish and you couldn't do really anything at that point. [00:02:17] You could just send Bitcoin to each other and that was, that was fun. . And then yeah, Ethereum ICO came. Um, once again, people still just, you know, sending Ether around, maybe trading on XX Ether Delta, these crazy, um, underground exchanges. Uh, but once for, for a while, you know, really the only thing you could do is, is not get caught up in a scam or an ICO scam. [00:02:39] And then, uh, right around 2019, I believe it was, um, dune Analytics launched and that was like sort of. . That was sort of like the great equalizer for data scientists because now all of a sudden you have this platform that you could tap into to query the blockchain, you know, for free, honestly. And that was like the, the most beautiful thing. [00:02:58] You could just easily share careers with people, share dashboards with people. Then this Analytics community, sort of, joined around that. And then other products like SEN launched, um, flip side crypto, a lot of these, you know, data open blockchain on, on BigQuery. And so now all of a sudden you, you know, you didn't have to worry about data engineering running your own validator, running your own node. [00:03:18] You could just. Query data and you build, you know, beautiful data sets. And, uh, I think a lot of things like in the marketing analytics background, definitely applied to crypto just in terms of, you know, user retention who's actually using products, machine learning, things like that. So yeah, absolutely. [00:03:34] It's been a very natural transition to, uh, to analyzing data. And now, uh, now. It's gotten crazy. analyze a lot of data on Yeah. Yeah. , it's, it's a lot of fun though. Meet a lot of the people like yourself. So, yeah, [00:03:46] Rantum: similar background. I was, I did a lot in, in e-commerce, analytics, marketing, uh, for this and, you know, fem way into crypto. [00:03:53] And, and it's, it's, it's really exciting having access to all of the, the data as opposed to, you know, just what the, uh, the one company that you're working with, uh, provides. Right. [00:04:03] Elbarto Crypto: Yeah. I mean, that's such a good. Like I, I think people don't realize like how special this data set is if you go work for you know, like a big enterprise company, even as a data scientist, like, well one, like you probably don't get access to data right away. [00:04:17] You probably have to get approved. You have to, you know, build the trust of people to start analyzing data. Um, here from day one, you know, it's, it's open right here. You can even, I think Google BigQuery even gives you $300 free. Uh, so you can just tap in immediately and. Securing data. It's really, uh, it's really amazing. [00:04:35] Rantum: Yeah. Yeah. How did you. , what was, what did you get started with when you started on Dune? I mean, you saw that it was available, there's free data. What, what were the first projects that you got interested [00:04:46] Elbarto Crypto: in? Yeah, so the first thing was just really looking at like what, who's using, uh, who's using the platform? [00:04:52] And as many, many people know, my dog who's in the background, he's, uh, he's, he's running from farm to farm is what he's doing. Uh, many people, uh, I think at first it was just creating, you know, The raw blockchain data itself to really like, understand what's going on. I think, I think one of the biggest problems people face is there is this open data set, but immediately they, um, get inundated with these, uh, very specific contract calls and they have to, um, decode data and all of a sudden you're dealing with OS and hashes. [00:05:25] And I think at first people get really like overwhelmed. And I remember looking. , um, dune for the first time and just seeing all these contract abstractions, like, I think it, it must have started with Ave back then, but just thinking like, what is going on here? Like how this is, this seems impossible. So I think it definitely takes a lot of patience, but at first it was just, you know, how many people are using the Ethereum blockchain? [00:05:47] You know how many people a day are using it? How many people used it last month and now are using it this month again? So it was really, for me, it was really just basic before, um, really diving into like more more of the abstractions around like ave and then compound and really seeing that lending side and then really just trying to visualize like who's making big D trades, you know, just very basic, like, hey, just let's just sort dex trades by u SD volume and like, let's just see who's buying things. [00:06:14] So I think like, yeah, at first it was like super basic because honestly it, it's very overwhelming when you first look at it. Yeah, absolutely. [00:06:20] Rantum: I mean, having all of the data available is, is great. And it's also overwhelming, right? Having all the choices. Exactly. [00:06:29] Elbarto Crypto: Everything's thrown at you at once. It's uh, yeah, it's a lot of our. [00:06:34] Rantum: How did you, you know, or 2019 it was, you know, a lot of D trades, you know, then we got into, you know, the kind of defi after that and then, you know, saw, you know, NFTs come, you know, big in, you know, in 2021. I, you know, it's is when it was, you know, really took off. How did you, uh, What, what did, how did you start working with the N F T data? [00:06:56] Um, and how'd you find that different than working with some of the, uh, the early token data? Yeah, [00:07:01] Elbarto Crypto: I like, the reason I like N F T data a lot is because, um, the same token, i e d and the same, um, the transfers across trade. So I think one of the biggest problems you have with Dex trades and analyzing like defi data, , um, people can immediately go anonymous. [00:07:19] Like, if I make huge Dex trade I can then just go back into Binance or just send my phones back on a Coinbase or and then it just becomes like an, an empty hole. Like, well, what happens with that person? What happens with that token? But for n n T data, since it's all on chain and you can't really, and each token is, is essentially a, a unique token, it really lends to some interesting analytics around. [00:07:42] how long people hold NFTs. So I think that's what one of the things that drew me to it was, it's understanding the nuances are obviously like very complicated, like with lost shells, which we'll probably talk about later. But from the very beginning, it made sense to me that like ID one is always ID one and it always corresponds to the ID one. [00:08:03] So it, it really lends to some beautiful, like long-term a. , um, with NFT [00:08:08] Rantum: drill. Yeah. That, that non, that non fungible aspect really takes exactly [00:08:12] Elbarto Crypto: way, right, . Yeah, exactly. I think people take, I, I think, I think people don't realize like how unique that is. Uh, and definitely allows for, for interesting transac for interesting analysis. [00:08:22] Rantum: Yeah. I mean, I, I've definitely found that interesting. When you look at NFT collections, I mean, you can't look at trades as, you can't look at every trade as the same. I mean, there are. oftentimes reasons that things are trading at, at much, maybe something significantly higher than the floor. You know, sometimes there aren't, and you know, we know there's a lot of wash trading out there. [00:08:42] Um, have you found that, have you found that aspect difficult to decipher, you know, what is real versus, uh, sort of the wash trade or how much do you [00:08:52] Elbarto Crypto: Yeah. Uh, shout out to Hilda today for, for introducing a new, uh, uh, wash trading flag. So at first, like I remember this is like really dating myself. So like, well let's, you know, starting with like the looks rare, um, that was like when, when people first like, oh man, yeah, that, um, wash trade concept of like, hey, volume is exploding, but in reality like people are just nibis and terraforms. [00:09:19] Yeah, exactly. So I think it's like, yeah, I think it. It's so interesting because you, you, you can do very basic, you can do very good analytics, just the basic, you know, SQL group buy statements, and this is, this is Data Pod Science podcast. We'll just go right into and nerding it out. But you can do very like, easy analytics with, you know, group buy day, you know, what's the volume of n ft sales? [00:09:41] And I think a lot of a lot of mainstream publications like to just look at like, okay, um, volume is down 80. . But in reality, like it's obviously much more different than that. So, we now know that, uh, a lot of volume all looks rare was really just people farming will look for a token just mindlessly trading tokens. [00:09:57] And so from there it, you know, it'll, it, it forced people to sort of think a little deeper in terms of, you know, a trade is not just a trade and you know, what is an actual wash sale trade or and now I think. . It's interesting because blur almost, it, it forces you to rethink about analytics once again, because we know like what a, a wash trade is now. [00:10:18] But now there's like these very interesting, um, analytics going on about people, you know. Taking this one step further and doing all sorts of crazy placing orders like slightly above the floor price when they not be a wash sale or maybe swapping out NFTs. So like let's say if you own a board ape and you just want to farm, you don't really care what ape you own and you just kind of wanna farm the the blur sale, you can sort of just. [00:10:42] Sell eight, buy another one, sell that one, buy another one. You're not really changing anything. Even taking slate laws [00:10:48] Rantum: to Exactly, yeah. Of the blurred to eventual, blurred [00:10:51] Elbarto Crypto: token . Exactly. In the hopes of getting a $50,000 airdrop. Uh, so it's interesting now, like, as these N F T models evolve for, for revenue for these exchanges, like then doing the due diligence of analytics to, to figuring out, you know, what's. [00:11:09] Rantum: Yeah, it, it's, you know, we see these incentives and, I mean, we've seen all these, uh, sort of vampire attacks on open sea and different ways to try to maybe disincentivize people from gaming, the systems, and it seems like. Collectors or, you know, maybe not, and maybe those, I shouldn't really call those people the collectors. [00:11:28] These are the, the people out there. There's somebody that's going to figure out how to, to make the most of the system. And, you know, it's, it's a tough one when you to, to, to look ahead and realize what people are going to [00:11:38] Elbarto Crypto: do, . Oh, absolutely. Yeah. And I think it's, it, it's, I don't wanna say it's fa as an analytics person, but like it definitely makes you think, and it definitely challenges you to say, , how can I use data to, to understand what, what's really going [00:11:53] Rantum: on? [00:11:55] Yeah. I mean, it, it is, I mean, all of these different crypto charts, there's always, you know, looks all steady until there's a serious inflection point and something drastic changes. And, and you see this all the time. I mean, you know, and in just the volume that you see among opens sea and blur and, and looks rare and, you know, you, you sort of need to know the story behind why these, these things are changing so drastically. [00:12:17] Elbarto Crypto: Yeah, exactly. And I think that makes you a better analytics person too. Like I, you know, if, if you're looking at break into this sector, and you know, like on job interviews and things like that. Like definitely bringing up points of like, well, why do you think um, hey, this data looks like the way it is. Is it sustainable? [00:12:33] Exactly to your point. Like, is it sustainable? Like, what's really going on here? I think, do you really have a ability to differentiate, like in this space, if you can bring these, you know, differentiated analytics. So can you tell me [00:12:46] Rantum: a little bit about block 1 99 and then the N F T Hudler segmentation that you've been working? [00:12:53] Elbarto Crypto: Yeah. So, um, we are essentially just a research firm. Uh, we do research for, you know, individuals for protocols, uh, all across the stack. And so a while back someone was essentially wanted to understand, you know, what, you know, how people, how different N F T projects are and what their users look like. [00:13:15] So for example, like if you have an N F T project where. , most of the holders are people that you know, are these larger whales? Maybe just like trading in and out of things, you know, what does the long-term engagement really look like versus an N F T project where a lot of people. we're in it from the beginning. [00:13:37] And, um, they're true loyalists like of a project. And so what they were trying to understand was, you know, should I, you know, invest in this N F T project because are are the people that are really long term holders. So the way to do that, and, and it's really a, a beauty of the blockchain is you, you can, uh, you can essentially just take all the holders of a certain n ft collection and then you. [00:14:02] uh, count how many NFTs they own. Some the amount of volume that's been spent look at, you know, are they how much wash sales have they done? How many floor sweeps have they done? So you essentially have this data frame of, you know, the 10, you know, the 5,000 holders and, you know, what's all their N F T activity. [00:14:19] From there, you can just like run a kme segmentation to find these different, um, groups of, of people and then figure out like, okay, well what is their activity in the entire N F T space? So if you see that, you know, 90% of the holders, you know, on average holding N F T collection for 20 days, or are the, you know, top 10% of traders on X two, Y two, you may think twice about buying that FT collection because, you know, people are just flipping it for the sake [00:14:47] Rantum: of Flipp. [00:14:48] Right. Right. And so when you're looking at these, the activity, you're looking at activity from all [00:14:55] Elbarto Crypto: NFTs, [00:14:56] Rantum: that, that may have ha uh, been passed through this wallet. You're not looking at necessarily that specific collection. Right, [00:15:01] Elbarto Crypto: exactly. Yeah. And I think like, you, you know, you can, you can almost sort of like then do this matrix if you will. [00:15:08] Like, you could segment, um, just like current activity and then sort of like create this matrix of like Yeah. All other n ft activity to get really, like, dive into. How the users play out. But ideally, like what you're looking for and in the case of this person that they were looking for was, they wanted to see that people who are like true contributors to the N F T community or they had like strong holdings and like what would held on, would hold on a lot. [00:15:32] think like some good examples of that are like, if you look at pudgy penguins, um, shout out to the, to the poo group. Um, a lot of them have, if you, you just look at simple like distributions of how long they've been the project for. Like a lot of them, um, have held since the beginning almost. And so you kind of look at that community and say like, okay, is that's something that I wanna do for the long term. [00:15:54] and the price. And, you know, the price has done pretty well. They had on a recent mini surge this, um, past couple of weeks. But it makes sense because, you know, just the supply of, of poos to hit the market is if no one's selling, it's always going down the, the number, the amount of supply that's hitting the market. [00:16:11] So, um, it makes sense that, you know, eventually, uh, Prices price go up. So Right. Versus, uh, versus like, you see [00:16:20] Rantum: the sustained interest, I guess is, is something that it's a little hard to measure from inactivity, but if they're not, if they're not dumping, there is something there. Right. . Yeah, exactly. [00:16:30] Elbarto Crypto: And I think that's the, that's like the holy grail of NFT engagement is how do you measure engagement of, of the nons sellers. [00:16:37] Because right now I, you know, Like volume does not equal engagement. If someone is selling their N F T, they're the least engaged with the brand. [00:16:45] Rantum: Right, right. Volume is is a very poor measure of, of the quality of a [00:16:50] Elbarto Crypto: project. Exactly. I, the problem is like unfortunately, like we don't have another way to, to do this. [00:16:55] And I think like the ways so far, it's interesting looking at the, the board ape ecosystem and, and, and maybe you can see. , you know, how many people are participating in governance or how many, you know, a polls also hold ape coin. But, you know, getting outside of the blockchain, you know, engagement becomes a much more interesting thing. [00:17:18] I think the evolution of NFTs should be some sort of like, loyalty, you know, coming back to the marketing world, you know, some sort of like marketing, uh, you know, CRM engagement platform where you can engage like outside of the blockchain. . Uh, so it's, it's, it's definitely in its ancy. And I think I'll, I think, I think NFT brands that are, are data first and are looking to expand their analytics capability are definitely gonna do better. [00:17:43] Rantum: Yeah. It feels like there's just a lot not being used. I mean, you can obviously find out so much about your, your holders and what their interests are just by looking at the wallet, and I don't think that's being, being used by many, uh, project creators or uh, leaders at this point. [00:17:59] Elbarto Crypto: Oh yeah, absolutely agree. [00:18:01] Rantum: Um, so you've developed something called an nf or you're working on something called an N F D D Gen Score. Can you just tell me a little bit about that? [00:18:08] Elbarto Crypto: Yeah, so I'm looking for someone to take this over actually. So DM me because I'm going out on crypto paternity leave soon. But essentially, uh, there's something called the defi, um, D Degen score, or it's just, I think it's just called DEGEN score. [00:18:23] um, your defi activity, right? And I, I want to build, I'm in the process of building, haven't built yet. Sort of like this N F T engagement or D gen score of, okay, how many you know, how many projects do you hold? How many mints have you done? How many, how many trades have you placed? Um, the idea here is more along the lines of it's almost like a segmentation, but. [00:18:46] if you're an N F T project, that's going to whitelist some you know, whitelist or project for certain holders or addresses. It would be nice if you had an idea of, of, you know, who would you want to whitelist this for? And like, do you wanna exclude certain groups? Do you wanna include only people with above a certain score or, you know, have had certain engagement? [00:19:06] So I think it's like really meant for, you know, hopefully for projects to better understand. , Hey, let's, you know, make this let's make this white list. Like for people that really care about NFTs or like people that aren't just gonna like immediately dump, um, these NFTs. So there's a lot of ways you can do this. [00:19:24] I, I would want to almost optimize this for like, engagement with NFTs instead of just like farming to dump. But I think that's where the hard part comes in. [00:19:36] Rantum: Yeah. Yeah. I mean, I think we're seeing, I mean, we're so sort of a misalignment, so many places of, you know, where are the, you know, what is activity and, you know, what, where is the, you know, where are the royalties coming from? [00:19:46] When you're thinking about all these, uh, these different issues, it seems like we come back to the same thing. The people that are are selling are the ones that are co, I mean, you know, obviously a, a cell requires a buy, but that's when the, the income comes into a project. When. Ideally you just want nobody to, to really wanna sell. [00:20:05] Right? And [00:20:05] Elbarto Crypto: yes. Yes, [00:20:07] Rantum: exactly. How do you, I don't know. I'm not, you know, I see that one, I know that one of the projects that you wanna work on is, uh, trying to calculate the royalty income and, you know, do you see that as being a sustainable model going forward? [00:20:21] Elbarto Crypto: I, you know, I'm very mixed on this because I used to think, like, I used to want projects. [00:20:26] I used to really like the royalty initiative because I think it is fair in some sense. . I think if we were to eliminate royalties like tomorrow, I think they would force N F T brands to start thinking of like alternative business models because yeah, like as we said over and over again, like the royalty optimizes towards people selling. [00:20:47] In reality, it should be the complete flip of that. Like the most engaged people, like across every brand, the most engaged people with the brand are the ones like creating the revenue for you. I, while I, I don't know exactly where I fall in the royalty space right now. I, I, you know, it's not, luckily it's not my job to do that. [00:21:07] Yeah, right. . But I think that like, it would force n f T brands to think of new revenue streams that I only think will help them. Because right now royalties are obviously limited to people that are buying and selling NFTs. And the goal is to. a brand. Just a brand in general, right? Like when you think about Ferrari, like everyone knows Ferrari. [00:21:30] Their, their brand recognition is beyond anything. They, I, I was surprised though. I was looking at their, their income statement and they, the majority of their money, they still make off of selling cars, selling parts, et cetera. I think only like 15% of their revenue comes from like merchandising and things like that. [00:21:46] For an N F T brand, that should be the opposite. It, in my opinion, like 15% should come from royalties and then 85% should come from some other I don't know what that is. Hopefully someone smarter than they can figure that out, but yeah. Yeah, it's [00:21:58] Rantum: tough to say what that is because they're, you know, as collectors. [00:22:02] I, I think people expect to get something from that. Mm-hmm. . And as we've seen recently with the, uh, artifact uh, Nike, uh, project, you know, people are not terribly happy when the, the, the next part was the ability to buy something and, you know, maybe a discount. So, you know, I think that still has to be worked out and I'm, I'm curious where that does go. [00:22:23] Yeah, I'm not , I'm not exactly sure where I fall overall, you know, I'm very pro royalty. Artist, but I think that's a very different thing than, than these 10,000 piece collections. You know, when I think of art, you know, I'm thinking of small collections or, or even one of one. I see that as being a sustainable model going forward. [00:22:41] Very different than, you know, 5,000 or 10,000 because you've always got other people that are, I mean, it becomes a bigger issue for the liquidity of a project when you've got a five or 10%, um, royalty, uh, fee built in. [00:22:54] Elbarto Crypto: Oh yeah, absolutely. Like the art box, when you have only have 200 of them, you would hope, yeah, you would hope that some sort of wallet royalty, right? [00:23:01] Yeah. But um, it's also, yeah, like when people expect something, it's like they, they don't understand like the, the basic like lifetime value, the customer acquisition cost ratio, which is, which is completely flipped now because if I buy a doodle right now, uh, doodle will get, you know, whatever the 3% of. [00:23:24] $6,000 sale. So let's call it $180. Like great. Uh, yeah. That, that's tough for them. . Yeah. Yeah. To give me more, uh, to give me, I, yeah. I think people definitely expect $180 more of stuff. And I know, I hear, I hear doodle putt was really fun, but. . [00:23:44] Rantum: Yeah. I didn't make it over there. I was, I was down in Miami and I did not get to go to that. [00:23:48] Elbarto Crypto: Unfortunately. , I'm sure, uh, I'm sure many of the doodle holders though are, are, uh, negative. Uh, yeah. So that's like one of the issues, right? Like if you become a doodle holder, you expect the world, but like the, the economics just don't work yet. Hopefully they do in the future, but Yeah. Or like, I mean, I think this, like, I think this introduces like new. [00:24:10] New forms of business. Like if, if doodles were to, you know, create these events, but like you could lend your N F t to somebody to go, or you know, that they would pay a fee and like lend a doodle or something like that. I think like there could be interesting innovation there. It's just. [00:24:28] Rantum: Yeah, it is, it does feel like it's still to be determined how, how to really structure these for, for a longer term. [00:24:35] You know, it, it reminds me a bit of. Just as you saw advertising in, you know, online advertising, it got more and more expensive and, you know, there's more promotions and, and things to get people to come and, and click. And, you know, you're talking about the, the lifetime acquisition. You know, it went from early on when you were advertising on, on Google, you know, you could make. [00:24:54] You know, you could even maybe make a profit on, on a sale or something. And then it turned into a longer and longer lifetime and you just saw that the cost go up as it sort of got inflated. And, you know, I think Barta that was maybe, you know, there was obviously more money being spent and you saw that there was a lot of, a lot of venture capital coming in. [00:25:12] Um, I'm wondering like how that's going to start affecting. , you know, these rewards and everything. You know, if it's going to be, you know, in, I mean, I assume that it's going to get redistributed, they're like less middle men. You know, you can, it's being given directly and, you know, it's hard to recognize what is, what's, what's real versus, you know, what's sort of artificially pumped in [00:25:32] Elbarto Crypto: Yeah, absolutely. I, yeah, and it'll probably take another year to play out to see what these, yeah. What this looks like. But, [00:25:38] Rantum: um, what else are you, uh, excited about? Right. . [00:25:40] Elbarto Crypto: I think definitely like the more of the in-person experiences, um, within the NFT space. I think like I, I, I am glad everyone is mad at like Live Nation and Ticketmaster right now because I, I hope that, uh, I hope that somehow the N F T space, uh, can solve for that somehow. [00:25:59] Uh, like I'm a big fan of India Top Shot. I, the product is beautiful. It's just a, it's a great experience, but you know, like as a, as a, as a top shot, like minnow, not quite a plankton, I'll, I'll call myself a minnow. Uh, you know, I'm definitely looking for that, like further engagement of like, with, you know, with the team that I like or with the player that I like. [00:26:22] Like how, how does that like work and, and, and sort of like what's the next step for them. Certainly looking for, you know, forward to that. And then I think like on like the zuki uh, collection, you know, I. , I kind of wanna see. They have great, like website. They have a great website right now. I think they're, you know, they're branding. [00:26:40] It's really on spot. I'm, I'm really a [00:26:42] Rantum: lot with, uh, with, with wearables as well, right? Yeah, [00:26:45] Elbarto Crypto: exactly. Like I saw, I, I, I was just like walking around, uh, where I lived the other day and someone was wearing, wearing a, um, a Bathing Ape shirt and I'm like, I'm like, oh man, I, I'm kind of like, I'm hoping like bored ape supplements that, or something like that. [00:26:58] So I'm definitely looking for these like real world experiences, like the bridge between the two and then, yeah, like in like the new innovations of, of business models for these, uh, and like when, you know, when everything will just airdrop, you know, that's what I'm looking for. . [00:27:13] Rantum: Yeah. Right. Air when Airdrop [00:27:14] Yeah. When [00:27:15] Elbarto Crypto: airdrop, I'm just gonna mask token. Yeah, yeah. Exactly. Just one day, just give it all to me, you know? Right. Yeah. [00:27:21] Rantum: That would be, those were nice back, back when those were just coming out every couple weeks. Right. it, it was nice. Yeah, it was [00:27:28] Elbarto Crypto: nice. Uh, have [00:27:30] Rantum: you been to, uh, many in-person events or have you gotten to get to. [00:27:33] Elbarto Crypto: I, you know, it's funny, before the pandemic I used to go to a lot and I tell this funny story of, um, I went to a, this is like really gonna show my age. I went to a Dere meetup which is, for those who don't know, is, uh, is a, a Bitcoin esque product. And, um, , they, the where I went, they ordered, um, they ordered 10 pizzas, but only eight people went to the event. [00:27:57] So, uh, that just kind of shows you like how, uh, popular crypto was in [00:28:03] Rantum: 2018. [00:28:05] Elbarto Crypto: I wanna say. This was, um, I bet they didn't pay for the pizza with Bitcoin anyway. No, they, no, they did not. Uh, they used good old Uncle Sam dollars. So it's. It's really interesting to see like how how far things have come. And, and I was, uh, I went to an event, like a coup, a pretty small event, um, about a couple months ago. [00:28:25] And, um, it's good to see the energy back with people. Um, I like to see that. I, I, uh, I think I'm gonna make a huge, you know, I'm, I think I'm gonna make a splash next year by trying to go to some of these places, but, um, it seems like, uh, it seems like, yeah, there's good energy there. . Uh, yeah, [00:28:42] Rantum: I've enjoyed getting out to some events. [00:28:44] Just was at Basl as I, as I mentioned, and, uh, hoping to get to N ft N YC again, uh, this coming year. They seem to just, uh, make, make it difficult and move the month every , every event, . Um, lovely. So you can't really, you can never predict it, but , did you say anything? [00:29:01] Elbarto Crypto: Was there any interesting activations at, at Basil? [00:29:05] Uh, [00:29:06] Rantum: so NFT now had a big. Two blocks are so. Just for their own event. But then they had different booths within there where art blocks was there and Meta mask was there. And, uh, nine dcc, uh, which I talked about, um, recently, they had a min a shirt there, a one of 1200 was a snow fro. Oh. Uh, inspired art on this shirt. [00:29:32] So they worked with him on that. So there were some, there were some definitely cool, uh, N F T events. Um, you know, it was a little quieter in the N F T areas than, than maybe I I expected, uh, compared to, compared to something like, I mean, N F T NYC is just, it's, it's pretty big, you know? I know there's. [00:29:49] There's some complaints about, uh, the event. And, and I would also say that you don't necessarily have to go to, to the official event to, uh, to find many things to do. I mean, that's, that's, that's definitely the case at Basel, you know, where it's very unofficial. It's part of ma maybe it's part of Miami Art Week. [00:30:05] I'm not sure if it's technically even, even that. Mm-hmm. Mm-hmm. Right, right. But yeah, the in-person events I think are great. You know, I think people are still kind of getting out and I think part of, you know, even the, the fashion things, it's part of kind of bringing. Off the screen, off the, you know, off the computer and making it a little bit more real [00:30:23] Right. Exactly. [00:30:25] Elbarto Crypto: Step away from farming. It'll be, you'll be okay. You don't have to get all of them. [00:30:29] Rantum: Yeah. Right. , So you have, you have some new dashboards coming soon, but you are, you're well right now. It sounds like you are, uh, paternity leave. Huh? [00:30:38] Elbarto Crypto: trying to figure out how to make, how to do a swaddle. Yeah. So, oh man. Yeah, [00:30:42] Rantum: I remember that one. . [00:30:44] Elbarto Crypto: Yeah. Any need tips? Let me know. Yeah, I mean, I'm always looking on, you know, always looking for new data. [00:30:48] Um, always trying to build, you know, Data dashboards really to help people understand, you know, how data's being surfaced. You know, I've got a lot of like, you know, research projects on the back burner. Really trying, really, like trying to go after this question of like N F T engagement and you know, what kind of, who will be around for the long haul. [00:31:07] And then really, I am very curious about this defi N F T overlap. I, I think there's very, you know, I think the two are very separate right now, and I think people just assume by building a, a lending NFT platform that everyone will just come, you know, no questions asked. But in reality, the reason people got into NFTs is not because. [00:31:33] they're also excited about uncollateralized lending. Like they don't know what any of those words mean. And frankly, I don't think , I [00:31:40] Rantum: don't think anyone. Right. Yeah, that's good. Good [00:31:42] Elbarto Crypto: point. Yeah. So I, I, but then at the other hand, like on the other hand, I, I think, you know, if you have a board ape, you know, if you did get lucky and held on for all this time and you're sitting on, you know, $70,000, , it would be nice, you know, to realize some of these profits or, you know, maybe you have a dog who's going off to college and, and you need to pay tuition. [00:32:07] Uh, you know, it would be nice to realize maybe the little, these profits, um, yeah, [00:32:11] Rantum: right. For fractional [00:32:12] Elbarto Crypto: ownership. So I think it's definitely an un, an untapped area. I, I don't know how it'll work though in terms of UI and, and execut. [00:32:21] Rantum: what, uh, what, what, what have you been active with in what, in your wallet Recently? [00:32:27] Elbarto Crypto: What has been my wallet? I, you know, I'm like a, a real, a real defi, uh, degen. Uh, recently I've been, uh, I've been, I was like really bad farms that I've been going into. It's, uh, it's pretty sad. I, I've been trying to figure out recently, you know, some of these like NFT projects that have really gotten sold off. [00:32:47] there's a lot of work going behind the scenes. So like Rumble Kongs, for example. Um, this was like a project that really, you know, had a lot of hype. Um, so I own a couple Rumble Kongs in, in full transparency. This is a project that got a lot of hype. I think Steph Curry was wearing a Rumble Kong hat at some point. [00:33:04] Okay. ? Yeah. And like they have people working on it behind the scenes. You know, there are, there are truly people working. They are alive. Um, they are, you know, programming. I'm, I'm gonna check, I'm gonna check the floor [00:33:15] Rantum: price right now. Here's mine. And the ones that are, they're still busy and haven't left. [00:33:20] I mean, we know that a lot of these are going, they're going to zero. And, um, it is. You know, finding the ones that are still busy and gonna keep building through, you know, through the bear. That's, that's, uh, it's key if you can find them . [00:33:34] Elbarto Crypto: Right, exactly. So, yeah, you know, I'm kind of looking at that area a little bit. [00:33:38] I've been trying to like, you know, look at more like these illiquid art blocks, collections. Yeah, . I just don't know right now like what the best way. I don't know what the best way to display them or engage with like other people or like really engage with the artists is yet. So I'm still trying to like, I also just don't want to get ripped off by like buying, you know, something for like four E and then being like, what have I done, like [00:34:02] Rantum: immediately? [00:34:02] Right. But I mean, there's some, there are some very pricey, very illiquid pieces in, uh, in our blocks. You know, there's definitely some, some grape buys. Uh, can be difficult to tell the difference. And, you know, that that's the beauty of NFTs, right? , yeah, exactly. [00:34:18] Elbarto Crypto: Um, you know, I, I respect what the pudgy team is doing. [00:34:20] Uh, I don't own any poos maybe, you know? Yeah. [00:34:24] Rantum: That is impressive. They're, they're the strength of the holders. The D Gen score is high, huh? Absolutely. [00:34:29] Elbarto Crypto: Yeah. And they're really, uh, they're really executing there. So, um, you know, shout out to the team there for. Putting something good together. So yeah, that's like, I've been really trying to think about like, you know, what projects are still actively being worked on and, and sort of like, can you scoop up any, any good values? [00:34:46] Uh, yeah. And like, you have to like what it is, right? I mean, like, I don't, like if I buy something I, you know, honestly, like I don't wanna sell it. I, I, uh, You know, I'll just hold on [00:34:55] forever. [00:34:56] Rantum: That's the best way, right? Just by what you, what you actually want to own . [00:34:59] Elbarto Crypto: Yeah, exactly. Yeah. That way if it goes zero, you feel a little less bad. [00:35:04] So [00:35:04] Rantum: yeah. Hey, we all have at least a few of those, right? Oh yeah. Oh yeah. Awesome. Uh, so where can people find you and, uh, [00:35:12] Elbarto Crypto: You. Yeah. You can find a very inactive crypto account starting today at El barto underscore crypto on Twitter. Uh, I'll be back in February. Don't worry. [00:35:22] Rantum: You do have a list of, uh, research for I, you know, people will wanna get some homework. [00:35:25] There's, there's some . [00:35:27] Elbarto Crypto: Yeah. If you feel like you have nothing to do over the, over the break and, uh, you wanna. do some NFT research. I have plenty of projects for people to hand out. So Is the baby here already? No. No. It, it's coming soon. Yeah. All right. Well, [00:35:42] Rantum: so very exciting. That's awesome. Um, anything else you wanna add before we sign off here? [00:35:47] Elbarto Crypto: You know, I would add if anyone has any advice on having a bo a dog stop barking in a shadow, like please reach out to me because, uh, it's been a, a constant thorn on my side. He's a great, he's a great pal and a great farmer, so welcome to you. [00:36:01] Rantum: That shadow's throwing him. Awesome. Well, thank you so much Alberto Crypto. [00:36:04] This is awesome. First interview. Uh, very excited, and we'll have to talk again soon. [00:36:10] Elbarto Crypto: Thank you, sir. Thank you, sir. All right.
The SaaS Podcast - SaaS, Startups, Growth Hacking & Entrepreneurship
Hung Dang is the founder and CEO of Y42, a fully-managed DataOps Cloud that helps companies design production-ready data pipelines on top of their Google BigQuery or Snowflake data warehouse. Show Notes: https://saasclub.io/334 Join Our Email List Get weekly SaaS learnings, new podcast episodes, and actionable insights right in your inbox: https://saasclub.io/email Join Our Community for Free SaaS Club is the community for early-stage SaaS founders and entrepreneurs: https://saasclub.co/join
Y42 is the first fully managed Modern DataOps Cloud for production-ready data pipelines on top of Google BigQuery and Snowflake.
Datengetriebene Entscheidungen oder auch "Glaube keiner Statistik, die du nicht selbst gefälscht hast".Entscheidungen treffen und die nächsten Schritte planen ist nicht einfach. Relevante Daten können einem die Entscheidung erleichtern. Doch wie fängt man mit datengetriebenen oder daten-unterstützenden Entscheidungen eigentlich an? Woher weiß man, ob man die richtigen Daten hat? Was wird zur entsprechenden Aufbereitung benötigt und wie kann ich die Daten entsprechend visualisieren?In dieser Episode geben wir einen Einblick in das Feld der datengetriebenen Entscheidungen. Wir beantworten, warum Tortendiagramme blödsinn sind, wie die Architektur aussehen kann, ob das Bauchgefühl überhaupt noch relevant ist und warum man nicht mehr sein eigenes JavaScript Frontend mehr bauen muss.Bonus: Was warmes Bier mit Husten zu tun hat und wie das Oktoberfest unsere Podcast-Statistiken beeinflusst.Feedback (gerne auch als Voice Message)Email: stehtisch@engineeringkiosk.devTwitter: https://twitter.com/EngKioskWhatsApp +49 15678 136776Gerne behandeln wir auch euer Audio Feedback in einer der nächsten Episoden, einfach Audiodatei per Email oder WhatsApp Voice Message an +49 15678 136776LinksWartungsfenster Podcast mit "Make or Buy": https://wartungsfenster.podigee.io/20-make-or-buyEngineering Kiosk #43 Cloud vs. On-Premise: Die Entscheidungshilfe: https://engineeringkiosk.dev/podcast/episode/43-cloud-vs-on-premise-die-entscheidungshilfe/Engineering Kiosk #12 Make oder Buy: https://engineeringkiosk.dev/podcast/episode/12-make-oder-buy/ClickHouse Datenbank: https://clickhouse.com/Google BigQuery: https://cloud.google.com/bigquery?hl=deQlikView: https://www.qlik.com/de-deTableau: https://www.tableau.com/de-dePowerBI: https://powerbi.microsoft.com/de-de/Amazon QuickSight: https://aws.amazon.com/de/quicksight/GCP Looker Studio: https://cloud.google.com/looker-studioMetabase: https://github.com/metabase/metabaseApache Superset: https://github.com/apache/supersetRedash: https://github.com/getredash/redashGrafana: https://github.com/grafana/grafanaOpen Podcast: https://openpodcast.dev/Engineering Kiosk Podcasts zum Thema Datenbanken: https://engineeringkiosk.dev/tag/datenbanken/@EngKiosk Tweet mit Metabase Statistiken: https://twitter.com/EngKiosk/status/1590373145793396736Sprungmarken(00:00:00) Intro(00:01:00) Cloud vs. On-Premise im Wartungsfenster Podcast(00:04:32) Das heutige Thema: Datengetriebene und Daten unterstützende Entscheidungen(00:05:16) Was verstehen wir unter datengetriebenen Entscheidungen?(00:08:18) Gefälschte Statistiken und die richtige Daten-Visualisierung(00:10:25) Wer hat Zugang zu den Daten und wie sieht die Daten-Transparenz aus?(00:14:05) Muss jeder Mitarbeiter SQL-Abfragen erstellen können?(00:15:55) Die Architektur für datengetriebene Entscheidungen(00:18:53) Pre-Processing, OLAP, OLTP und Datenbank-Normalformen(00:21:46) Was ist Clickhouse und welche Tools gibt es auf dem Markt?(00:22:59) Sind Tortendiagramme Blödsinn?(00:23:46) Die Visualisierung: Wie finde ich heraus, welche Fragen wir eigentlich aus den Daten beantwortet haben wollen?(00:25:53) Wie verwende ich Datenvisualisierung, ohne ein eigenes Team?(00:28:30) Schnelle Dashboards und Performance von Queries(00:29:28) Was ist bei Datenbanken in Bezug auf Analytics optimiert?(00:31:03) Muss man noch sein eigenes Dashboard-Frontend mit JavaScript bauen?(00:36:21) Welche Tipps würdest du Neulingen zur Dashboards-Erstellungen geben?(00:39:17) Ist das Bauchgefühl noch relevant?(00:41:30) Ab wann sind Daten aussagekräftig (statistisch signifikant)?(00:45:51) Welche Firmen sind Vorreiter im Bereich datengetriebene Entscheidungen?(00:47:29) Kann man zu datengetrieben sein?(00:48:21) Woher weiß ich, auf welche Daten ich gucke?(00:50:10) Outro: Podcast-StatistikenHostsWolfgang Gassler (https://twitter.com/schafele)Andy Grunwald (https://twitter.com/andygrunwald)Feedback (gerne auch als Voice Message)Email: stehtisch@engineeringkiosk.devTwitter: https://twitter.com/EngKioskWhatsApp +49 15678 136776
On this episode, Bob Moore, Co-Founder and CEO at Crossbeam, a partner ecosystem platform that helps companies build more valuable partnerships, talks about what's really happening with the data that's coming out of the emerging class of software products. The presentation centers on the modern SaaS innovation cycle: Step 1: New technology is developed, which leads to Step 2: New data comes out of the new technology, which results in Step 3: New product integrations. Bob explains the latest dimensions of context and action: traditionally, your CRM silo vs. the new ecosystem CRM, shared pipeline, real-time updates, partner attribution and actionable insights. He explains how new ecosystems are born. New data for new technologies breeds a new ecosystem and the most important ecosystem of all: the meta-ecosystem. It is made up of all the tools used by partner teams and those they support. The data is as complex as the opportunity is vast. Bob explains how to solve the problem by hiring experts who can wrangle these systems and drive productive outputs. This is where partner ops come in. They ultimately make partner data useful, actionable and scalable. They put the data and the playbooks it enables in the hands of the right people, in the right places, at the right times. Resources Mentioned: Stitch - https://stitch.money/ Looker - https://www.looker.com/ Google BigQuery - https://cloud.google.com/bigquery Salesforce - https://www.salesforce.com/eu/ Sigma - https://sigmaconnected.com/za/ Fivetran - https://www.fivetran.com/ PartnerPortal.io - https://www.partnerportal.io/ Terminus - https://terminus.com/ Slack - https://slack.com/ Partnered - https://partnered.com/ Note: SaaS Connect 2023 will take place in San Francisco April 19th and 20th. If you would like to be a sponsor, please contact us at admin@cloudsoftwareassociation.com for information. #saas #software #cloud Thank you to our amazing podcast team at Content Allies. Want to launch your own B2B revenue-generating podcasts? Contact them at https://ContentAllies.com.
Livebook Desktop is a recent project that makes it much easier for people to start using Elixir and Livebook. Wojtek Mach joins us to explain what Livebook Desktop is and how it works. We learn who the project is for and the problems it helps solve. We ask if this approach makes sense for other projects and how to get started. Wojtek also shares some cool things in the works that make it possible to load our own Phoenix project into a Livebook! Show Notes online - http://podcast.thinkingelixir.com/113 (http://podcast.thinkingelixir.com/113) Elixir Community News - https://github.com/elixir-lang/elixir/releases/tag/v1.14.0-rc.1 (https://github.com/elixir-lang/elixir/releases/tag/v1.14.0-rc.1) – Elixir v1.14.0-rc.1 - the last stop before v1.14 - https://twitter.com/elixirlang/status/1559133733478977538 (https://twitter.com/elixirlang/status/1559133733478977538) – Elixir v1.14.0-rc.1 announced as the last stop before v1.14 - https://github.com/phenixdigital/phxlivestorybook (https://github.com/phenixdigital/phx_live_storybook) – Phoenix Live Storybook - A pluggable storybook for your LiveView components - https://phx-live-storybook-sample.fly.dev/storybook/colors (https://phx-live-storybook-sample.fly.dev/storybook/colors) – Public sample project of Phx Live Storybook - https://github.com/elixir-lsp/elixir-ls/releases/tag/v0.11.0 (https://github.com/elixir-lsp/elixir-ls/releases/tag/v0.11.0) – Update to ElixirLS - https://twitter.com/lukaszsamson/status/1558923305012400136 (https://twitter.com/lukaszsamson/status/1558923305012400136) – ElixirLS adds Elixir 1.14 support - https://asdf-vm.com/ (https://asdf-vm.com/) – Version manager for multiple runtimes like Erlang, Elixir, Node and many more - https://twitter.com/josevalim/status/1558156309454798848 (https://twitter.com/josevalim/status/1558156309454798848) – José shared that Livebook Enterprise will be shipping soon - https://twitter.com/michalmuskala/status/1557374130793680899 (https://twitter.com/michalmuskala/status/1557374130793680899) – Research paper describing the WhatsApp approach to static types in Erlang with the eqWAlizer project - https://research.facebook.com/publications/inferl-scalable-and-extensible-erlang-static-analysis/ (https://research.facebook.com/publications/inferl-scalable-and-extensible-erlang-static-analysis/) – The eqWAlizer static types research paper - https://twitter.com/josevalim/status/1558554226384670723 (https://twitter.com/josevalim/status/1558554226384670723) – Nx v0.3.0 was released - https://twitter.com/sean_moriarity/status/1558579500761358336 (https://twitter.com/sean_moriarity/status/1558579500761358336) – Axon/AxonOnnx v0.2.0 released - https://elixirconf.uy/ (https://elixirconf.uy/) – Nov 11-12, the first Elixir conference to be held in Montevideo, Uruguay. Do you have some Elixir news to share? Tell us at @ThinkingElixir (https://twitter.com/ThinkingElixir) or email at show@thinkingelixir.com (mailto:show@thinkingelixir.com) Discussion Resources - https://livebook.dev/ (https://livebook.dev/) - https://github.com/livebook-dev/livebook/blob/27f1e1406481edd2b38c730d75ce72df514ab4a6/mix.exs#L146:L178 (https://github.com/livebook-dev/livebook/blob/27f1e1406481edd2b38c730d75ce72df514ab4a6/mix.exs#L146:L178) – AppBundler setup - https://fly.io/phoenix-files/safe-ecto-migrations/ (https://fly.io/phoenix-files/safe-ecto-migrations/) - https://github.com/elixir-desktop/desktop (https://github.com/elixir-desktop/desktop) - https://podcast.thinkingelixir.com/98 (https://podcast.thinkingelixir.com/98) – Elixir desktop interview "Elixir in the iOS App Store with Dominic Letz" - https://en.wikipedia.org/wiki/LAMP(softwarebundle) (https://en.wikipedia.org/wiki/LAMP_(software_bundle)) - https://github.com/burrito-elixir/burrito (https://github.com/burrito-elixir/burrito) - https://news.livebook.dev/how-to-query-and-visualize-data-from-google-bigquery-using-livebook-3o2leU (https://news.livebook.dev/how-to-query-and-visualize-data-from-google-bigquery-using-livebook-3o2leU) – How to query and visualize data from Google BigQuery using Livebook - https://github.com/burrito-elixir/burrito (https://github.com/burrito-elixir/burrito) – Burrito project - https://github.com/elixir-lang/elixir/pull/12051 (https://github.com/elixir-lang/elixir/pull/12051) – Mix.install :configpath + :lockfile - https://github.com/livebook-dev/kino/issues/132#issuecomment-1207293134 (https://github.com/livebook-dev/kino/issues/132#issuecomment-1207293134) – kinobenchee to automatically render benchee results in Livebook - https://hexdocs.pm/req/changelog.html#v0-3-0-2022-06-21 (https://hexdocs.pm/req/changelog.html#v0-3-0-2022-06-21) – Req v0.3.0 changelog - https://nsis.sourceforge.io/Download (https://nsis.sourceforge.io/Download) - https://nsis.sourceforge.io/Main_Page (https://nsis.sourceforge.io/Main_Page) - https://en.wikipedia.org/wiki/NullsoftScriptableInstall_System (https://en.wikipedia.org/wiki/Nullsoft_Scriptable_Install_System) - https://www.winamp.com/ (https://www.winamp.com/) - https://github.com/livebook-dev/livebook/tree/main/app_bundler (https://github.com/livebook-dev/livebook/tree/main/app_bundler) - https://github.com/boydm/scenic (https://github.com/boydm/scenic) Guest Information - https://twitter.com/wojtekmach (https://twitter.com/wojtekmach) – on Twitter - https://github.com/wojtekmach/ (https://github.com/wojtekmach/) – on Github - http://wojtekmach.pl/ (http://wojtekmach.pl/) – Blog Find us online - Message the show - @ThinkingElixir (https://twitter.com/ThinkingElixir) - Email the show - show@thinkingelixir.com (mailto:show@thinkingelixir.com) - Mark Ericksen - @brainlid (https://twitter.com/brainlid) - David Bernheisel - @bernheisel (https://twitter.com/bernheisel) - Cade Ward - @cadebward (https://twitter.com/cadebward)
On the podcast this week, our guests Laurie White and Aaron Yeats talk with Stephanie Wong and Kelci Mensah about higher education and how Google Cloud is helping students realize their potential. As a former educator, Laurie has seen the holes in tech education and, with the help of Google, is determined to aid faculty and students in expanding learning to include cloud education as well as the standard on prem curriculum. Aaron and Laurie work together toward this goal with programs like their Speaker Series. Laurie's approach involves supporting faculty as they design courses that incorporate cloud technologies. With the busy lives of students today, she recognizes that the best way to get the information into the hands of students is through regular coursework, not just through elective activities outside the regular classroom. Aaron's work with students and student organizations rounds out their support of higher education learning. He facilitates the creation of student clubs that use Cloud Skills Boost, a program in which students navigate full pathways as they learn the skills they need to create and manage cloud builds. Soon, Aaron will offer hack-a-thons that encourage students to attend weekend events to work together on passion projects outside of regular classwork. Our guests talk more about the specifics of Google Cloud Higher Education Programs and the importance of incorporating certifications into the higher education learning process. Aaron talks about expanding the program and his hopes for reaching out to more schools and students and Laurie talks about the funding for students and how Google Cloud's system of credits for students enables them to use real cloud tools without a credit card. Laurie and Aaron tell us fun stories about past student successes, conference interactions, and hack-a-thon projects that went well. Laurie White Laurie taught CS in higher ed for over 30 years where her biggest frustration was trying to keep the curriculum up with the field. She thought she was retiring seven years ago but got the call from Google to a job where she could help faculty around the world keep their curriculum up with cloud computing, so here she is. Aaron Yeats Aaron Yeats has been working in education outreach for two decades. His work in education has included Texas government education programs including public health, non-profit advocacy, and education. Cool things of the week How Wayfair is reaching MLOps excellence with Vertex AI blog Hidden gems of Google BigQuery blog Google Cloud Innovators site Google Cloud and Apollo24|7: Building Clinical Decision Support System (CDSS) together blog Interview Google Cloud Higher Education Programs site Google Cloud Speaker Series site Google Cloud Skills Boost site CSSI site Tech Equity Collective site GDSC site What's something cool you're working on? Steph has been working on an AlphaFold video. You can learn more here. Kelci is working on developing a Neos tutorial for introductory Google Cloud developers to learn how to write HTTP functions in Python all within the Google Cloud environment and wrapping up her summer internship with Google! Hosts Stephanie Wong and Kelci Mensah
Google Chrome extension that displays a LiveView and integrates with a web page like Gmail? Steve Bussey shares how he did just that! We learn how it worked, why he did it, the benefits he sees, and how this differs from the conventional approach. He explains the small JS shim used, recommends a library to help when integrating with Gmail and he explains how the user experience is great, particularly when rolling out new versions! Steve goes further to talk about Chrome's new v3 extension API and targeting multiple browsers. Show Notes online - http://podcast.thinkingelixir.com/112 (http://podcast.thinkingelixir.com/112) Elixir Community News - https://github.com/WhatsApp/eqwalizer/blob/main/FAQ.md (https://github.com/WhatsApp/eqwalizer/blob/main/FAQ.md) – WhatsApp static type checker eqWAlizer added a FAQ. - https://twitter.com/robertoaloi/status/1555470447671754753 (https://twitter.com/robertoaloi/status/1555470447671754753) – Experimental support in erlang_ls has also been built for eqWAlizer - https://github.com/erlang-ls/erlang_ls/pull/1356 (https://github.com/erlang-ls/erlang_ls/pull/1356) – Erlang LS eqWAlizer support - https://twitter.com/michalmuskala/status/1554813818475319296 (https://twitter.com/michalmuskala/status/1554813818475319296) – Erlang/OTP's Dialyzer can now be run incrementally, which works out ~7x faster on average - https://github.com/erlang/otp/pull/5997 (https://github.com/erlang/otp/pull/5997) – Dialyzer PR with more details - https://twitter.com/chris_mccord/status/1554478915477028864 (https://twitter.com/chris_mccord/status/1554478915477028864) – Initial verified routes announcement from Chris McCord for Phoenix 1.7 - https://twitter.com/josevalim/status/1554512359485542400 (https://twitter.com/josevalim/status/1554512359485542400) – José Valim gave more clarification on what verified routes means. - https://twitter.com/hugobarauna/status/1554547730302832641 (https://twitter.com/hugobarauna/status/1554547730302832641) – Hugo Baraúna created a 5 minute Youtube video showing how to integrate Livebook with Google BigQuery. - https://twitter.com/akoutmos/status/1556046188784324616 (https://twitter.com/akoutmos/status/1556046188784324616) – Alex Koutmos teased that he's adding Benchee support to Livebook. - https://podcast.thinkingelixir.com/94 (https://podcast.thinkingelixir.com/94) – Benchee discussion with Tobias Pfeiffer in episode 94. - https://erlangforums.com/t/pgmp-postgresql-client-with-logical-replication-to-ets/1707 (https://erlangforums.com/t/pgmp-postgresql-client-with-logical-replication-to-ets/1707) – Interesting Erlang library launched called pgmp - https://github.com/shortishly/pgmp (https://github.com/shortishly/pgmp) – pgmp is a PostgreSQL client with support for simple and extended query, and logical replication to ETS. Do you have some Elixir news to share? Tell us at @ThinkingElixir (https://twitter.com/ThinkingElixir) or email at show@thinkingelixir.com (mailto:show@thinkingelixir.com) Discussion Resources - https://twitter.com/yoooodaaaa/status/1544434779327811585 (https://twitter.com/yoooodaaaa/status/1544434779327811585) – Tweet about creating a chrome extension - https://pragprog.com/titles/sbsockets/real-time-phoenix/ (https://pragprog.com/titles/sbsockets/real-time-phoenix/) – Author of "Real-Time Phoenix" book - https://salesloft.com/ (https://salesloft.com/) - https://chrome.google.com/webstore/detail/honey-automatic-coupons-r/bmnlcjabgnpnenekpadlanbbkooimhnj?hl=en-GB (https://chrome.google.com/webstore/detail/honey-automatic-coupons-r/bmnlcjabgnpnenekpadlanbbkooimhnj?hl=en-GB) - https://www.streak.com/post/announcing-inboxsdk (https://www.streak.com/post/announcing-inboxsdk) - https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe (https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe) Guest Information - https://twitter.com/YOOOODAAAA (https://twitter.com/YOOOODAAAA) – on Twitter - https://github.com/sb8244/ (https://github.com/sb8244/) – on Github - https://stephenbussey.com (https://stephenbussey.com) – Blog - https://pragprog.com/titles/sbsockets/real-time-phoenix/ (https://pragprog.com/titles/sbsockets/real-time-phoenix/) – Real-Time Phoenix book Find us online - Message the show - @ThinkingElixir (https://twitter.com/ThinkingElixir) - Email the show - show@thinkingelixir.com (mailto:show@thinkingelixir.com) - Mark Ericksen - @brainlid (https://twitter.com/brainlid) - David Bernheisel - @bernheisel (https://twitter.com/bernheisel) - Cade Ward - @cadebward (https://twitter.com/cadebward)
Trazemos nesse episódio o especialista Lucas Magalhães para falar um pouco de projetos de Big Data e Analytics dentro do Google GCP Discutimos sobre os projetos que podem ser facilmente implementados assim como melhores formas e tecnologias utilizadas para lidar com processamento massivo de dados.No YouTube possuímos um canal de Engenharia de Dados com os tópicos mais importantes dessa área e com lives todas as quartas-feiras.https://www.youtube.com/channel/UCnErAicaumKqIo4sanLo7vQ Quer ficar por dentro dessa área com posts e updates semanais, então acesse o LinkedIN para não perder nenhuma notícia.https://www.linkedin.com/in/luanmoreno/ Disponível no Spotify e na Apple Podcasthttps://open.spotify.com/show/5n9mOmAcjra9KbhKYpOMqYhttps://podcasts.apple.com/br/podcast/engenharia-de-dados-cast/ Luan Moreno = https://www.linkedin.com/in/luanmoreno/
This week Dan and Dara are joined by Mark Rittman to talk about the 'modern data stack' and how the danger with all analytics implementations is the ‘so what' factor. They also touch on how the changes in Google Analytics 4 are changing things for the modern data stack, and what type of company would truly benefit from a fully-fledged setup. Check out Rittman Analytics - agile analytics consulting for the modern data stack - https://bit.ly/3LTE9oV. Mark's Medium article on his Rittman Analytics' modern data stack setup called "How Rittman Analytics does Analytics Part 2 : Building our Modern Data Stack using dbt, Google BigQuery, Looker, Segment and Rudderstack." - https://bit.ly/3wRFOXO. Check out Mark's DJing music on SoundCloud - https://bit.ly/3PIYMap. In other news, Dan MeasureCamps, Dara swims and Mark cycles and makes music! This is the last episode in this run, 40 episodes can you believe it?! See you all in a few weeks after a short break! Check out on LinkedIn: - Mark - https://bit.ly/3PLkdI0 - Dan - https://bit.ly/3JQKHEb - Dara - https://bit.ly/3vzV0bO - Measurelab - https://bit.ly/3Ka513y Music from Confidential, check out more of their lofi beats on Spotify at https://spoti.fi/3JnEdg6 and on Instagram at https://bit.ly/3u3skWp. Please leave a rating and review in the places one leaves ratings and reviews. If you want to join Dan and Dara on the podcast and talk about something in the analytics industry you have an opinion about (or just want to suggest a topic for them to chit-chat about), email podcast@measurelab.co.uk or find them on LinkedIn and drop them a message. Full show notes and transcript over at https://bit.ly/3LXcMKt. The post #40 What is the modern data stack? (with Mark Rittman) appeared first on Measurelab.
Twelve years ago the RIPE NCC set out to build the largest Internet measurement network ever made. Today, RIPE Atlas gives users an unprecedented understanding of the state of the Internet. I caught up with Robert Kisteleki (R&D Manager, RIPE NCC) to talk about how RIPE Atlas has developed thanks to the efforts of the community. (Watch the video on Youtube.)RIPE Atlas website RIPE Atlas on RIPE Labs02:53 - RIPE Atlas probe map03:10 - HTTP, DNS, NTP measurements03:57 - RIPE Atlas architecture06:54 - RIPE Atlas data: API, Google BigQuery, daily archives08:48 - RIPE Atlas software probes14:50 - APNIC dataset on network providers15:20 - Bias in Internet Measurement Infrastructure19:05 - RIPE Atlas community21:36 - RIPE Atlas sponsorship23:09 - RIPE Atlas Anchors25:20 - Data disconnects26:20 - Outage reporting26:40 - Looking inside outages with RIPE Atlas28:34 - RIPE NCC Country Reports28:55 - RIPE IPmap30:19 - Dataplane articleMusic: bensound.com Hosted on Acast. See acast.com/privacy for more information.
Show Resources Here were the resources we covered in the episode: Data Studio dashboard that Anna Shutko and AJ created together @AnnaShutko on twitter Marketing Analytics Show LinkedIN: Send DM and posting NEW LinkedIn Learning course about LinkedIn Ads by AJ Wilcox Contact us at Podcast@B2Linked.com with ideas for what you'd like AJ to cover. Show Transcript Have you heard of Supermetrics? If you're a LinkedIn advertiser, it's your new best friend. We're covering the capabilities on this week's episode of the LinkedIn Ads Show. Welcome to the LinkedIn Ads Show. Here's your host, AJ Wilcox. AJ Wilcox Hey there LinkedIn Ads fanatics. So we're highlighting another tool today in the LinkedIn advertisers arsenal. We're discussing a tool that I've been using now for years. It's absolutely indispensable for our team, because we're managing so many different accounts. And we're dealing with so much data. That tool is called Super metrics. And it's a very simple way of getting all of your ads data into a spreadsheet, or visualization tool for better reporting and analysis. I'm excited to welcome Anna Shutko from Supermetrics to answer my questions, and give you the inside scoop on what's coming. Anna and I go way back. And we've even collaborated on a free dashboard for LinkedIn advertisers, that you'll all get here in the show notes and I think you'll enjoy it. Without further ado, let's jump into the interview. Okay, I'm really excited here to have Anna Shutko from Supermetrics. She is the Brand Strategist at Supermetrics. She's also host of the awesome podcast, The Marketing Analytics Show, make sure you go and subscribe to that right now,if you're not already. She is based in Helsinki, Finland. And she was number seven on the Supermetrics team. She has been there over five and a half years she is one of the OGs for sure. She's also an avid cyclist and skier. Anna, I'm so excited to have you on the show. We've been friends for a long time. Thanks so much for joining us. Anna Shutko 1:42 I know, right?! Thank you so so much for having me. I'm super excited to be on the show. AJ Wilcox 1:49 I'm just as excited to have you I have so many great questions for you. Well, I'm the host. And so if I say that they're great, it's a little bit biased, but didn't really have some questions I think are gonna be really good for you. And for those of us who are listening, tell us anything about yourself that you want anything I may not have covered in your intro. Anna Shutko 2:05 Yeah, sure. So I think you nailed my bio. So I did a piece of furniture with Supermetrics. As I like to joke about it. I've been at the company for quite a while now more than five years. Wow. It's crazy when you think about it. So what people usually find interesting because I sometimes go in client calls and then when my colleagues introduce me, they're like, yeah, she's been here with us for such a long time. And people who knows Supermetrics are very, very surprised, because I guess not so many people like me have been with the company for a long time. We were a tiny team back then. Now we've grown fantastically. And we've grown so fast. Now we're over 250 people. It's crazy. So yeah, it's been a wild ride. And yeah, like I said, I've been moving between different areas of marketing between different departments. So I started as a growth marketer, then I went on to become a product marketing manager. I was also product manager, I was managing the relays and different connectors. And then I moved on to brand marketing. And now I'm super excited about my new role. So I'm building the brand measurement system, and I'm pretty sure we're gonna hear more about it. But yeah, that's a little bit about me. So many different areas. It's super exciting to see the company grow. It's super exciting to change all these different roles and learn more about connectors, including or favorite LinkedIn connectors. So that's a bit of a thing. AJ Wilcox 3:39 Oh, beautiful. And I am a fan of the LinkedIn connectors for sure. So thank you for your great work on those. Let's go ahead and start in here on the first question. So for those who are not already familiar with Supermetrics, what are the challenges that Supermetrics solve. Why is Supermetrics originally in business? Anna Shutko 3:58 Yeah, definitely. So Supermetrics essentially, is a data pipeline tool, as I like to call it. So we transfer data from a variety of different data sources or connectors, how we call them interchangeably. And these are LinkedIn Ads, Facebook Ads, Google Analytics, HubSpot, e-commerce platforms like Shopify. We cover over 80 different platforms. And like I said, literally any most popular and big marketing platform, you name it. And we transfer all this data to set up different data destinations. So we have sub categorized them into spreadsheets. So these are your Google Sheets and your Excel. Then there are different data visualization tools like Google Data Studio, or Power BI, Tableau, so data analytics/data visualization tools. And the third group is our data warehousing clusters. So we transfer data to different data warehouses like Azure, Google BigQuery, etc., etc., etc., And data links so you can combine your all your data in one place and store it securely. So that's a little bit about Supermetrics. So because we cover a lot of data destinations, and we connect to many different platforms and transfer the data, we cover a variety of different scenarios here. And this is actually historically has been the challenging part within product marketing, which also makes it very, very interesting that because we have so many products, we can help you cover multiple scenarios of what you want to do with your data. So you can have your reporting and dashboarding. So these are your client facing reports for agencies. And if you're an in house marketer, you can create your so called boss pacing reports, you have to create within your team. So these can be done in tools like Google Sheets, there are lots of writers who take advantage of Google Sheets and Excel and the formulas they can create there. And also, of course, do the visualization tools like Google Data Studio, super easy to add connectors there, it's super easy to create beautiful reports there and share them with your team. Then there are some cases of ad hoc analyses. And these are usually related to the questions you need to answer right now. So for example, why is my LinkedIn Ad spend so high and I have not seen the results? Or why does target audience A perform better than target audience B? So if you have this question and you really want to quickly acquire your data to answer the specific question, Supermetrics for Google Sheets can be a really good tool because we have the sideboard technology where you can select the metrics and dimensions you want to pick, and then it will pull the data into the spreadsheet, so that you can quickly answer very pressing questions. And another use case we have is the data warehousing use case where you can tie data across multiple different sources. So instead of just looking at one or two, or three or five data sources, you can create really complex models. But in order to create these models, you need to store your data in one place in order to join all this data. So here, you can pull all the data into your Google big query and then visualize it with a helpful view. But you can also perform the necessary data transformations within this data warehouse. So for example, you can compare the performance of your ad networks such as Google versus Facebook, again, you can segment and test different audiences, you can join data in the way you want to see sequel. And also you can report on the whole user journey. So for example, your clients started by clicking on your LinkedIn ads campaign, and they went to your website. And then you capture their website behavior with Google Analytics, maybe some of this data is coming from your CRM. And then with the help of a data warehouse, you can store all these data and then inquiry to connect the pieces together and see the whole user journey. So here are our most popular use cases. AJ Wilcox 8:20 Okay, so just out of curiosity for being able to track the whole buyer journey. What sort of software or tools do you need in place in order to I guess, get that journey across all the different platforms? Is that requiring that they're in a CRM or something that's already natively tracking all of that? Anna Shutko 8:36 Yeah, so usually depends on the company. Some companies might not necessarily have the budget or the need for a CRM. Of course, having a CRM is ideal. So if you have something like HubSpot, or it's Salesforce, it's really, really good also because their API's are so robust. They allow you to create custom dimensions and custom metrics to track your unique use cases. So for example, your users have, like I said, clicked on your LinkedIn Ads, and then they land on a website. And then they have to fill in a form. And the form might be somewhere, like on a different website, maybe it's an event and you're hosting a registration on Eventbrite, say for example. So when they put their data into Eventbrite or somewhere else, you also need to somehow capture this data within your CRM. So an alternative solution here could be to build a web page there where they would fill all their details. And then you can store all this data within the server ad. And then at the same time, you can combine it with the data coming from LinkedIn Ads. So it really depends. I would recommend starting with a combination of like ad networks, and reporting on that data and then combining it with data from Google Analytics. This can be the easiest when you're using a CRM, but at the same time, you can already start seeing a better overview of your user journey, and then connect to your CRM and then build on top these custom goals and metrics to create more of a reporting system. AJ Wilcox 10:16 Very cool. This is a topic that's been really top of mind for me, as I'm thinking about, as cookie apocalypse continues, and we losing our dependence on third party cookies. How effective are we as marketers going to be being able to track someone across the whole user journey when we know that, that cookies disappearing? This is fascinating to me So thanks for helping us out with that. So those are the problems that Supermetrics solves. Why is Supermetrics in such a good position to solve this problem? Why not, you know, Google themselves, just replace you with having a simple tool that just spits out their data. Same thing with LinkedIn? Like, why is Supermetrics solving this problem and not anyone else? Anna Shutko 10:57 Yeah, that's definitely a really, really good question. I can talk about this for hours. But we only have a limited amount of time, I guess. So first of all, Google is one of our partners. So Google, of course, is a massive, massive company and they have an amazing set of tools, but they are not the whole ecosystem. So there are lots of other players like Facebook, like LinkedIn, like Twitter, like HubSpot. And they also provide a set of really, really good API's, which allow you to export the data on the campaigns that are run. So super metrics, like mentioned, helps combine all this data and then we push this data to a number of different destinations. These are not just Google Sheets, these are Excel, for example. So for example, if your company is using Microsoft and you are required to use the office space, then Supermetrics can be a real really good tool because we can help push this data to your Excel desktop, say you set up your inquiries, and then you go offline completely. And then you can analyze your data using Excel. So like I said, we don't only connect to Google platforms. We help connect different players within the ecosystem together. And another good example is our Data Studio product. So Google Data Studio is a free data visualization tool for those of you who do not know. And they have native integrations with Google Analytics, Google Ads and other Google platforms, that is true. But for example, many marketers use Supermetrics to get data from Facebook or LinkedIn to the same data visualization tool. And if you're running ads on multiple different platforms, it would be a little bit silly to just analyze your Google Ads data without analyzing what you're doing on your LinkedIn as a platform, kind of like together. So we help marketers gain a very holistic view of their performance in the tools they already know, in the tools they already can use in the tools they already know how to use. So for example, if you know how to use Google Sheets, you can just install Supermetrics add on. And then you can continue using your favorite tool without this learning curve without the need to learn a new tool, and just query the data from the platform you want, say LinkedIn Ads, and just create the reports there. Or you can use these tools in combination. So we have a native Data Studio connector for our required products, for example, you've combined the data from multiple different sources in your inquiry project, and then you want to visualize this data. So you can use our Data Studio connector to do this. And the names of metrics and dimensions will have clear descriptions to to be very easy for you to understand what kind of metric you're visualizing, and how you can create a better report using all these tools together. AJ Wilcox 13:54 Oh, I like it. Alright, thanks for sharing that. I definitely think I'm of the same opinion you are that any of the other networks or channels could easily come out with a product that allows easier access to the data they have. But the Supermetrics advantage here is just being able to aggregate from all of these different connectors, regardless of whether or not any of them come out with an easier way for us to do it ourselves. Tell us about your relationship with LinkedIn, as well as the other platforms. What do you get from that partnership? How long have you been partners, that kind of thing? Anna Shutko 14:26 Yeah, sure, definitely. So with LinkedIn, we've been partners for quite some time. And there are many, many marketers, 1000s of marketers, is what we're talking about that reported their LinkedIn Ads. And then concrete budget pacing reports with Supermetrics. So we've recently demoed our product to your team. And like I said, we have really, really good and close collaboration there. And in addition to LinkedIn, we partner with Stack Adapt, HubSpot, Google, like I mentioned, AdRoll and many, many other data source and data destination companies. So when we partner with a company, we try to create the best possible value for the end user, of course. So we love creating templates for Data Studio especially. We've created some with the HubSpot. So we sit together with their product managers and think, Okay, what kind of metrics would be great to visualize for users. And we create completely free reporting templates with all the needed metrics and dimensions that our user can use as they are, or they can just take them as a blueprint for their own reports, they can tweak the metrics, they can tweak dimensions they can do that they want. And it's really amazing to partner with these companies, because we can combine the best of both both worlds so to say, so we have the reporting and analytics and data consolidation expertise, and the platform's bring their own know how, their knowledge to the table. So AJ you and I created this really, really good LinkedIn Ads dashboard. And this is one very good example of how your know how or some platform managers know how can be combined with Supermetrics know how. AJ Wilcox 16:20 Which is beautiful, we are going to link to that dashboard down below in the show notes so everyone can get access to it. But, Anna I'm so glad you brought this up. Because this has been a couple years ago, or a few years ago now. But we worked for about six months, I think on creating this dashboard, we call it the ultimate LinkedIn Ads dashboard. It's in Data Studio, it's totally free, like you mentioned, and anyone out there can go and get really complex analysis of their LinkedIn Ads. And that was because of you and I working so hard on that. I'm a big fan, I hope everyone goes and grabs that. Just to make it clear as to what Supermetrics is doing for LinkedIn advertisers. I mean, it's it's cool that you guys aggregate all of the different data and channels together into one spot. I think marketers who are responsible for more than one channel would love that. But for LinkedIn, specifically, you advertisers who are listening, you know how hard it is to get data out of LinkedIn. If every time you want to do a report, you have to go and click the export feature inside of campaign manager, and then put it into Excel, and then you know, do whatever formatting changes, you need to do, create pivot tables, then all of a sudden, all of that data is it's a snapshot of it, you can't do anything else with it. And so the next time your boss asks for a report, you're going and doing it again. What Supermetrics does is it will take, even on a schedule, this is my favorite part about it, you can say I want this data going into Excel or into Google Sheets, and I want it every day at 2am. I want it to go pull the next day's data. And then it's always there any report any pivot table that you build, all you have to do is just refresh. And now you'd never have to build that same report ever again. So, so cool. We'll talk about more of my favorite features of Supermetrics here a little bit later. But I just wanted everyone to know like, this is why it's so valuable for you. This is why I'm doing this tool spotlight on Supermetrics. Because there just is no other way to do with LinkedIn, what Supermetrics does. So I want to hear from you. What are the capabilities of Supermetrics, especially as it pertains to LinkedIn advertisers? Anna Shutko 18:28 Yeah, definitely. So I've mentioned a couple of capabilities. And like I said, because we have a product umbrella, it allows us to help customers solve multiple reporting and reporting related issues. So we have your ad hoc reporting, where we can acquire the data on the fly. And this is your Google Sheets, Excel products. We help with the data consolidation. These are your data warehousing products, building beautiful data visualizations, or you can have little exploration in Looker or Tableau or Power BI. Another really interesting thing I'd like to highlight here is that as you know, LinkedIn ads API constantly changes will always constantly change. LinkedIn is always coming up with new features. And he was right, you mentioned that it's challenging to export the data out of LinkedIn Ads. And this is something we're really help with, but it will also help you export this data in the right format. And we can help you create reports with really, really high data granularity. So what it means in practice is that you can test and report on many different pieces of a LinkedIn Ads campaign audiences. You have your campaign types, creatives, objectives, and you can break down your campaign into different pieces, pass these pieces individually or perform an AB test, and then make really, really smart optimizations. So this is one thing that I really, really like and would like to highlight here. And typically, we help achieve this with our Google Sheets product. And one very precise example is, again, the dashboard AJ and I have built. Belt. So you can report on not one, but four different types of spend. So there are formulas that help you calculate your total spend, projected spend, goal spend, as in the amount of you have to spend without under over spending. And then the cumulative spend to something you've spent overall. And here, we've taken one metric, which is spend, and then your budget goal, and then transformed into four different kinds of spend. And this brings me to my earlier point about data granularity, you can report on really granular data. So you can break down your spend by day, you know, Google Sheet, then create these calculations to have these four different types of spends. And then think about it holistically for not one, but four different viewpoints. And then create your budget pacer that can help you allocate budgets. Because LinkedIn Ads is a very costly platform, as we all know. So having these different types of spend calculating these different types of spend is really, really helpful. And the same thing goes with audiences and can campaign types. You can break all these spend down by multiple different dimensions, like what kind of audience brings the best ROI, what kind of campaign type performs better than the other campaign types. What kind of creative helps me get more clicks? So you can get really, really nerdy with your data. And this is something I really, really love. And another beautiful thing is that you can then combine this data. So if you don't want to look at it in a very granular way, you can also combine all this data in Google Data Studio report. And again, this is something that we've tried and tested, and it worked. So after you've analyzed all these types of spend, you can push them into the to see the dashboard to see bigger trends. So for example, you've noticed that your projected spend during this month is higher than your projected spend over the last couple of months. And you can start thinking, why you can understand what might be like bigger drivers behind this change. And in addition to this, you can add all different types of other data. For example, you can add your data on AB testing to see which campaigns have performed better historically. Or you can even add your data from LinkedIn Pages. Because if you use LinkedIn Ads and LinkedIn Pages in combination with can be a very, very powerful duo. It can help you uncover many sides on your audience. So there are a lot of different ways in which Supermetrics can help you slice and dice your LinkedIn Ads data. But also create really, really good reports that can help you get a general overview. AJ Wilcox 23:02 I love this, there's no data that you can get from campaign manager that you can't get within Supermetrics. And you own the data, you get to do whatever you want with it. So just like what Anna was talking about, with the ability to break down your spend by ad type and by audience, all these things are fantastic. But then you realize you could have a Google sheet or a page in your Data Studio dashboard that allows you to see the AB tests you're running, and another page that might show you just your budget, like what Anna was talking about. And another one that could be just your metrics at a glance like, hey, how are my general click through rates, or my general conversion rates, all of this you can do, it's super easy. And just in the dashboard that Anna and I built for you here a couple years ago, all of that is like already set up for you. So very, very cool. Anything else you want to share about the capabilities that we should go over? Anna Shutko 23:58 Yeah, definitely. Also, we have real really nice use cases. There is a tab on our website where you can read more about what other clients are doing. And I know it's useful for a fact because our customer success managers have found it very useful. So you can also learn from other people and you can check what some other guys are doing with their Facebook Ads campaign and apply the same ideas to your LinkedIn Ads reporting, which I think is super super exciting, because understand and basically steal ideas in the best possible way. Understand how others are running their reports. Another really, really good feature is the automated way of reporting. For example, once you've set your LinkedIn Ads budget tracker in spreadsheet, you can say, hey, I want to update my data and if my spend increases, and if it crosses you know the threshold to XYZ amount, send me an email. There is literally no human error unless you set up the query correctly. So you can easily get the data you want, whenever you want. And you can also set up rules and get customized alerts whenever something goes wrong. So you don't need to monitor your ad campaign on a daily basis. You don't need to worry about this, you set the report once, and then you forget about it. And then you can think about creatives, audience testing, whatever you want, whatever is on your table. So that allows you to focus on more interesting problems, which has always been the case for me, for example, when I'm using Supermetrics, I noticed that every single time I'm able to automate something, I can use this time on something else, which is something more exciting. And also, you can report in your campaigns faster, which of course, is a great thing, since you save a lot of time and then can spend it on something else. And yeah, like I said, we help cover pretty much a variety of reporting use cases, we also have a template gallery. So you can check it out on Supermetrics.com. We have our Google Sheets template gallery, we have our Data Studio template gallery, and we're gonna link to the dashboard AJ and I created so you can see how you can visualize your LinkedIn Ads data. AJ Wilcox 26:17 Oh, I love it. Thanks for sharing those. So what are some of the results that your customers have seen for their LinkedIn Ads because they are using Supermetrics? Anna Shutko 26:26 Yeah, definitely. So first of all, they are seeing improved targeting. Like I mentioned, once are able to really select your data in a variety of different ways. You can dig deeper into it, and then understand what exactly is working and what exactly is not working. So imagine, if you're diagnosing a patient, and you have only one, two, or maybe tools, that's not really going to give you enough information into what's wrong. And a campaign cannot really tell you what's wrong about it. So once you have a whole tool set being maybe Supermetrics for Google Sheets, data warehousing, etc, etc. You can slice and dice your data in a variety of different ways. Now you can diagnose your patient much better. You can pinpoint exactly what's wrong, whether it's the campaign type, or the spend, or the audience, or maybe creative, or maybe something else. And then you can really, really understand how exactly we're going to go about this. So of course, all that leads to increase ROI, time saved, and improved communication. What we've seen within the teams, because instead of arguing, you know, oh, you've adjusted this spend in a wrong way, no we should have increased these bids, you have much more intelligent conversations. And hopefully your team dynamic improves, because you can just look at the numbers. And this is something we also use internally. We just pull up a dashboard, we just check the numbers and the numbers never lie, and then they tell you the direction you need to take. And we just go from there. So it's very, very cool to use data to your advantage. AJ Wilcox 28:16 Amen to that. And how much does Supermetrics cost for these advertisers who want to use it for their LinkedIn campaigns and haven't used it before? Anna Shutko 28:24 Yeah, definitely. So it really depends on the product. So I don't want to provide inaccurate information. So the best way to check it is to go to Supetmetrics.com and then check the data destinations you want to use. And then check how you want to report on your LinkedIn Ads campaigns. So the price for Google Sheets is of course different from the price you are going to have for your data warehouse. But if you need a custom solution, our sales team is of course happy to help you. So you can select not just LinkedIn Ads, but a variety of different connectors. And this is what I normally would recommend. So don't just pick LinkedIn Ads, you can pick Google Analytics, or LinkedIn Ads, and LinkedIn Pages, for example. You can combine these data with our ad data plus google analytics connector for our Google Data Studio destination. There is a massive combination, all different data sources and the different data destinations you can potentially have so the price of course depends on that. And also, the pricing is relatively simple. You know, it might not sound as simple when I'm trying to describe it. But once you pick your destination, once you pick your connectors, you just pick the number of your accounts and how often you want to refresh your data. But that's about it. Once you know all these factors, once you understand which one wants to go with, then it's pretty simple. AJ Wilcox 29:54 And it is really reasonably priced. I've been using the tool now for years. Absolutely love. That's why I'm doing a tool spotlight on Supermetrics when there are plenty of other LinkedIn tools that I'm probably not going to cover. So thanks for providing such an awesome tool at good pricing. All right, here's a quick sponsor break, and then we'll dive right back in the LinkedIn Speaker 4 30:13 The LinkedIn Ads Show is proudly brought to you by B2Linked. com, the LinkedIn Ads experts. AJ Wilcox 30:22 If the performance of your LinkedIn Ads is important to you, B2Linked is the partner agency you'll want to work with. We've spent over $150 million on LinkedIn Ads, and no one outperforms us on getting you the lowest cost per lead and the utmost scale. We're official LinkedIn partners, and you'll deal only with LinkedIn Ads experts from day one. Fill out the contact form on any page of B2Linked.com to chat about your campaigns, and we'd absolutely love the opportunity to get to work with you. AJ Wilcox 30:51 Alright, let's jump right back into the interview. Let me ask you, we've talked a lot about the capabilities of the platform and the company. What's your favorite feature of Supermetrics? Like, you're obviously a marketer yourself and a dang good one? What is the most helpful aspect of it to you? Anna Shutko 31:06 Yeah, sure. So first of all, I really, really love that we collaborate with our data destination partners very closely. And that allows us to develop product which sits within a data destination so to say, in most cases. Not all our products need data destinations, but most do. And I'm talking about all our Google Data Studio we co developed together with Google's team, working very closely with their engineers. So you can go to Google Data Studio, you can create any kind of report with Supermetrics, without ever leaving Google Data Studio. And this is amazing. You don't have to go from one page to another page to the next page. You just go to your Data Studio, you select LinkedIn Ads as a connector, where you connect it to your dashboard. And that's it. You can basically query data and create beautiful reports. So the experience is very, very intuitive. It's very smooth. And same thing applies to our Excel and Google Sheet product. So we have a sidebar, where you can take metrics and dimensions you want to pull. And then some magic happens here and your data just appears within a spreadsheet. So the adoption is very, very fast. I remember when I first saw our Google Sheets product, I fell in love it it instant, and it happened more than five years ago. But it's still remember it because the experience was so good, even back then. And another really, really useful feature is perhaps the ability to pull data from and report on multiple accounts easily. So I'm not talking about data sources here. But accounts, for example, you are an agency, and you're running campaigns on 50,60, 70, 100, different LinkedIn Ads accounts. And you have a really, really big client. And then they have 70 accounts. And imagine connecting these accounts one by one to your dashboard would be a complete nightmare. With Supermetrics, you can just select them all at once or then pull them all into the same spreadsheet all into one database, to a one to one Data Studio report. And then with the drop down selection, you can just take which accounts you want to see date the data from and this data will appear. It's very, very, very helpful for our agency friends over there. And the same thing happens with all the other data sources. So Google's accounts and if you want to combine your LinkedIn Ads with Google Analytics data, it's very, very easy to report on. AJ Wilcox 33:55 Very nice, I'll tell you, I have several things that I absolutely love about Super metrics. I've played with a lot inside of Data Studio. And what I love is number one, it's fast. When you're in Data Studio, and you're using Supermetrics as your data source, the pages just load nearly instantly. It's super, super fast. It's also really easy to use, like I use LinkedIn API. And I know what that's like to be looking at these metrics on the back end that have a name, and you're going I don't know what that name is. Supermetrics calls them things like every column, every source, every metric, every KPI, they're all named in ways that even just a very, very basic marketer, like brand new to the industry could still understand what it was they were building. If you've ever tried to take data directly from LinkedIn. So you export it to a CSV, and then you try to put that into Data Studio. What you'll notice is the columns aren't of the right data types, and you have to keep going into your spreadsheet and making changes when you use Data Studio or Google Sheets, but especially Data Studio with the Supermetrics connector, everything already comes in and exactly the right data types, you're never going to have to worry about, oh, my dates aren't showing up because Excel didn't recognize it was a date. And then Data Studio didn't recognize that it was either, you never have to worry about that. Something else that I love, let's say you go into campaign manager, you do an export to CSV. And it's a, let's say, an ADS report or campaigns report, when you look at that column of like, click through rate or cost per click, as soon as you try to combine that or do some kind of like an average, those averages don't mean anything. If you try to do an average of a whole bunch of percentages, it will make some kind of an average, but it'll be wrong. And Supermetrics fixes all those like every time we export something with Supermetrics. All of the columns are accurate all the time in a way that they wouldn't be from LinkedIn directly. I'll also mention one more thing, which is there was a metric that I wanted to see inside of super metrics that I knew LinkedIn had access to it was a new one. And I mentioned something to you, Anna. And you said, Oh, let me message the engineering team. And I want to say it was within like, a couple hours, you've messaged me back and said, Hey, check it, we should have that data available now. And so it's fast, like Supermetrics is always on top of new changes. Anna Shutko 36:23 I think we should definitely hire you, AJ, if you're ready to move to Helsinki, just know, just let me know, I have a spot for you on the team. AJ Wilcox 36:32 I am very good in cold weather. So we should talk about it. Let me ask what's coming up in the future that you're super excited about with Supermetrics? Anna Shutko 36:40 Yeah, so there are so many things that are coming up. First of all, we have multiple new data warehousing destinations for all of you data nerds out there. So you can store your LinkedIn Ads data in more places. And also, we are always developing our data sources, and then pattern paths for now. So you can then combine this one connects data with a variety of data sources. And that's been on the product side. I am always very excited about the new product developments. But also, I'm very exciting about The Marketing Analytics Show. This is the podcast that I host. We're gonna interview really, really cool guest. You're going to hear more about the first party data. So something AJ and I briefly talked about at the beginning of this podcast, and how you can tag your data correctly before you cleanse your data warehouse, and many, many other cool topics. And every single time I talk to these guests on super, super excited because the share very interesting viewpoint about this industry. AJ Wilcox 37:52 I love it. I'm a subscriber of the podcast, make sure you all go back and listen to episode four, because yours truly was on there. Just kidding. You don't have to listen to that particular episode, go listen to something that you don't already know super well. If you're listening to the show, you probably get everything that we talked about. Something that you mentioned, that I think is so helpful is that if you're making all of your decisions, from the data that you get directly from LinkedIn, you will find that you're making the wrong decisions. What I mean by that is like the data you get from LinkedIn on things like even conversions, leads, means next to nothing, until you find out whether those are qualified leads, whether those leads are actually turning into sales. And so Anna, what you mentioned that is so cool is this direction of moving into the data warehousing solutions. So now you have access to what LinkedIn has. But then with other CRMs and other data partners and data warehouses, you're able to then combine that with the data that you can find only from your CRM, or other sources that can report to you on number of qualified leads and other elements of lead quality, how many proposals sent, closed deals, what the deal closed for, and you can actually report on what really matters. So that makes me really excited. So final question. This might be something you've already answered. But what are you most excited about either personally or professionally, yourself? Anna Shutko 39:15 Yeah, sure. I am excited about a few things in general. And like mentioned, there have been really, really exciting product developments at Supermetrics. But one thing I wanted to pinpoint is that right now I'm building the brand measurement system. And this is basically a series of data transformations and dashboards that help combine all the data about our brand and how it's performing. And I'd like to say we drink our own champagne, it's a Supermetrics. So of course, we're using Supermetrics to consolidate all this data. So it's really exciting to work in this project. And it's really exciting to see how our product works from the client perspective. And of course, whenever I'm ready, I'm happy to share all the insights and all the learnings. AJ Wilcox 40:09 Wonderful. Well, as you're coming out with that stuff, how can people follow you? How can people obviously I would say, make sure you subscribe to The Marketing Analytics Show. But how else can people find out this stuff as you're releasing it? Anna Shutko 40:21 Yeah, sure. So I am on Twitter. So it's @AnnaShutko on Twitter. And you can just follow me there. I promise you, I really promise that I will post more and I will post more updates on the podcast and insights that are learned after building this system. And another way to connect with me is to follow me on LinkedIn, you can connect with me there, you can send me a DM and I will also be posting some of the updates there. AJ Wilcox 40:54 Oh, I love it. Okay, we'll put all those links here in the show notes below. So make sure you do follow Anna, reach out to her if you have questions. Anna, thanks so much for coming on the show. I think it's very obvious that I'm a huge Supermetrics fan. I really just appreciate our collaboration in the past, and everything you've shared. Is there anything else that you want to share with us before we jump off? Anna Shutko 41:15 Yeah, sure. Thank you so much, AJ, for inviting me, it was a very, very interesting conversation, great questions. And I love being interviewed by fellow podcast host. So one more thing before we leave. So I'll ask AJ to link the article we co-wrote in the show notes. So you can follow how exactly we came up with these four different types of spend you can monitor, and how you can report on AB test for your LinkedIn Ads campaign, there was a lot of good stuff there. So also, this article contains really practical instructions and how you can connect your LinkedIn Ads, and then create this superpower spreadsheet and then connect that spreadsheet to Data Studio dashboards so you can also use different charts and visualizations. So not only will you learn how to approach LinkedIn Ad spend reporting. You will also learn a bunch of different tools, hopefully. So check it out. I really hope you enjoy it. If you have any questions, don't hesitate to reach out to us. And I'm really, really happy to be part of this LinkedIn Ads community. AJ Wilcox 42:27 Wonderful. Well, thanks, Anna. I will definitely link to all of that. And just a big shout out to you everything that you're building is awesome for us marketers. So a huge thank you from the LinkedIn Ads community. Anna Shutko 42:37 Thank you so much. I'm so happy you are enjoying. AJ Wilcox 42:39 All right, I've got the episode resources for you coming right up. So stick around Speaker 4 42:49 Thank you for listening to the LinkedIn Ads Show. Hungry for more AJ Wilcox, take it away. AJ Wilcox 43:00 Okay, like we talked about during the show, I have the Data Studio dashboard that Anna Shutko and I created together. So check the link there in the show notes, you'll absolutely love that I'm sure. There's a killer template for budget tracking inside of Google Sheets, as well as the full Data Studio dashboard that we created together. You'll also see the link to Anna Shutko on Twitter, as well as her LinkedIn profile. So as she said, send her a DM, follow her, connect with her all that good stuff. She's also the host of The Marketing Analytics Show so you will see a link to that. Because all of you are podcast listeners, obviously, you'll definitely want to go check that out and get subscribed. Any of you who are looking to learn more about LinkedIn Ads, or maybe you have a colleague that you're training or something like that, check out the course that I did with LinkedIn on LinkedIn Learning. It's by far the least expensive and the best quality training out there and it's next to no dollars. It's pretty cheap compared to any other training. You'll enjoy it. Please do look down whatever podcast player you're listening on and make sure you hit that subscribe button. We'd love to have you back here next week. Also, please rate and review the podcast. Honestly, I say it way too much. But it really means a lot. It makes a difference to me. So please, please, please go leave us a review. We'd love that with any suggestions, questions, feedback, anything like that. Reach out to us at Podcast@B2Linked.com. And with that being said, we'll see you back here next week, cheering you on in your LinkedIn Ads initiatives.
Stephanie Wong and Debi Cabrera are learning all about BigLake from guests Gaurav Saxena and Justin Levandoski of the BigQuery team. BigLake offers unified data management from both data warehouses and data lakes. What exactly is the difference between a data warehouse and a data lake? Justin explains what a data lake is, how they came to be, and the benefits. Each data option has its cons too, like the limitations of data lakes for enterprise use. Enter BigLake built on BigQuery, which helps enterprise clients manage and analyze their data from both data warehouses and data lakes. The best features of BigQuery are now available for Google Cloud Storage and across multi-cloud solutions. Guarav describes BigLake behind the scenes and how the principles of BigQuery's data management can now be used for open file formats in BigLake. It's BigQuery for more data formats, Justin explains. BigLake solves many data problems quickly with a special emphasis on improving security. Our guests talk specifically about clients who gain the most from using BigLake, especially those looking to analyze distributed data and those who need easy and fast security and compliance solutions. With tightened security, BigLake offers access delegation and secure APIs that work over object storage. We hear about the user experience and how easy it is to get started, especially for customers already familiar with and using other GCP products. Google's advocacy of open source projects means many clients are coming in with workloads built with open source software. BigLake supports multi-cloud projects so that tables can be built on top of any data system. No matter the format of your data, you can run analytics with BigLake. We talk more about the security features of BigLake and how easy it is to unify data warehouses and data lakes with optimal data security. The customers have helped shape BigLake, and Gaurav describes how these clients are using this data software. We hear about integration with BigQuery Omni and Dataplex and how BigLake is different. In the future, Google will continue to make simple, effective solutions for data management and analytics, building further off of BigQuery. Gaurav Saxena Gaurav Saxena is a product management lead at Google BigQuery. He has 12+ years of experience building products at the intersection of cloud, data and AI. Before Google, Gaurav led product management at Microsoft Azure and Amazon Web Services for some of the most widely used cloud offerings in storage and data. Justin Levandoski Justin is a tech lead/manager in BigQuery leading BigLake and other projects pushing the frontier of BigQuery. Prior to Google, just worked on Amazon Aurora and was part of the Database research group at Microsoft Research. Cool things of the week Your ultimate guide to Speech on Google Cloud blog Announcing the Climate Innovation Challenge—grants to support cutting-edge earth research blog Interview BigLake site BigQuery site Cloud Storage site Spark site Apache Ranger site BigQuery Omni docs Apache Iceberg site Delta Lake site Presto site TensorFlow site Dataplex site What's something cool you're working on? Debi is working on a series about automatic DLP. Cloud Data Loss Prevention is now automatic and allows you to scan data across your whole org with the click of one button! Hosts Stephanie Wong and Debi Cabrera
In this episode, Warner is joined by Megha Bedi, one of Pythian's youngest Cloud Consultants to discuss cloud migrations, Google BigQuery, certifications, work from home and more!!
In this episode, we cover: 00:00:00 - Introduction 00:03:20 - VMWare Tanzu 00:07:50 - Gustavo's Career in Security 00:12:00 - Early Days in Chaos Engineering 00:16:30 - Catzilla 00:19:45 - Expanding on SRE 00:26:40 - Learning from Customer Trends 00:29:30 - Chaos Engineering at VMWare 00:36:00 - Outro Links: Tanzu VMware: https://tanzu.vmware.com GitHub for SREDocs: https://github.com/google/sredocs E-book on how to start your incident lifecycle program: https://tanzu.vmware.com/content/ebooks/establishing-an-sre-based-incident-lifecycle-program Twitter: https://twitter.com/stratus TranscriptJason: Welcome to Break Things on Purpose, a podcast about chaos engineering and building reliable systems. In this episode, Gustavo Franco, a senior engineering manager at VMware joins us to talk about building reliability as a product feature, and the journey of chaos engineering from its place in the early days of Google's disaster recovery practices to the modern SRE movement. Thanks, everyone, for joining us for another episode. Today with us we have Gustavo Franco, who's a senior engineering manager at VMware. Gustavo, why don't you say hi, and tell us about yourself.Gustavo: Thank you very much for having me. Gustavo Franco; as you were just mentioning, I'm a senior engineering manager now at VMware. So, recently co-founded the VMware Tanzu Reliability Engineering Organization with Megan Bigelow. It's been only a year, actually. And we've been doing quite a bit more than SRE; we can talk about like—we're kind of branching out beyond SRE, as well.Jason: Yeah, that sounds interesting. For folks who don't know, I feel like I've seen VMware Tanzu around everywhere. It just suddenly went from nothing into this huge thing of, like, every single Kubernetes-related event, I feel like there's someone from VMware Tanzu on it. So, maybe as some background, give us some information; what is VMware Tanzu?Gustavo: Kubernetes is sort of the engine, and we have a Kubernetes distribution called Tanzu Kubernetes Grid. So, one of my teams actually works on Tanzu Kubernetes Grid. So, what is VMware Tanzu? What this really is, is what we call a modern application platform, really an end-to-end solution. So, customers expect to buy not just Kubernetes, but everything around, everything that comes with giving the developers a platform to write code, to write applications, to write workloads.So, it's basically the developer at a retail company or a finance company, they don't want to run Kubernetes clusters; they would like the ability to, maybe, but they don't necessarily think in terms of Kubernetes clusters. They want to think about workloads, applications. So, VMWare Tanzu is end-to-end solution that the engine in there is Kubernetes.Jason: That definitely describes at least my perspective on Kubernetes is, I love running Kubernetes clusters, but at the end of the day, I don't want to have to evaluate every single CNCF project and all of the other tools that are required in order to actually maintain and operate a Kubernetes cluster.Gustavo: I was just going to say, and we acquired Pivotal a couple of years ago, so that brought a ton of open-source projects, such as the Spring Framework. So, for Java developers, I think it's really cool, too, just being able to worry about development and the Java layer and a little bit of reliability, chaos engineering perspective. So, kind of really gives me full tooling, the ability common libraries. It's so important for reliable engineering and chaos engineering as well, to give people this common surface that we can actually use to inject faults, potentially, or even just define standards.Jason: Excellent point of having that common framework in order to do these reliability practices. So, you've explained what VMware Tanzu is. Tell me a bit more about how that fits in with VMware Tanzu?Gustavo: Yeah, so one thing that happened the past few years, the SRE organization grew beyond SRE. We're doing quite a bit of horizontal work, so SRE being one of them. So, just an example, I got to charter a compliance engineering team and one team that we call ‘Customer Zero.' I would call them partially the representatives of growth, and then quote-unquote, “Customer problems, customer pain”, and things that we have to resolve across multiple teams. So, SRE is one function that clearly you can think of.You cannot just think of SRE on a product basis, but you think of SRE across multiple products because we're building a platform with multiple pieces. So, it's kind of like putting the building blocks together for this platform. So then, of course, we're going to have to have a team of specialists, but we need an organization of generalists, so that's where SRE and this broader organization comes in.Jason: Interesting. So, it's not just we're running a platform, we need our own SREs, but it sounds like it's more of a group that starts to think more about the product itself and maybe even works with customers to help their reliability needs?Gustavo: Yeah, a hundred percent. We do have SRE teams that invest the majority of their time running SaaS, so running Software as a Service. So, one of them is the Tanzu Mission Control. It's purely SaaS, and what teams see Tanzu Mission Control does is allow the customers to run Kubernetes anywhere. So, if people have Kubernetes on-prem or they have Kubernetes on multiple public clouds, they can use TMC to be that common management surface, both API and web UI, across Kubernetes, really anywhere they have Kubernetes. So, that's SaaS.But for TKG SRE, that's a different problem. We don't have currently a TKG SaaS offering, so customers are running TKG on-prem or on public cloud themselves. So, what does the TKG SRE team do? So, that's one team that actually [unintelligible 00:05:15] to me, and they are working directly improving the reliability of the product. So, we build reliability as a feature of the product.So, we build a reliability scanner, which is a [unintelligible 00:05:28] plugin. It's open-source. I can give you more examples, but that's the gist of it, of the idea that you would hire security engineers to improve the security of a product that you sell to customers to run themselves. Why wouldn't you hire SREs to do the same to improve the reliability of the product that customers are running themselves? So, kind of, SRE beyond SaaS, basically.Jason: I love that idea because I feel like a lot of times in organizations that I talk with, SRE really has just been a renamed ops team. And so it's purely internal; it's purely thinking about we get software shipped to us from developers and it's our responsibility to just make that run reliably. And this sounds like it is that complete embrace of the DevOps model of breaking down silos and starting to move reliability, thinking of it from a developer perspective, a product perspective.Gustavo: Yeah. A lot of my work is spent on making analogies with security, basically. One example, several of the SREs in my org, yeah, they do spend time doing PRs with product developers, but also they do spend a fair amount of time doing what we call in a separate project right now—we're just about to launch something new—a reliability risk assessment. And then you can see the parallels there. Where like security engineers would probably be doing a security risk assessment or to look into, like, what could go wrong from a security standpoint?So, I do have a couple engineers working on reliability risk assessment, which is, what could go wrong from a reliability standpoint? What are the… known pitfalls of the architecture, the system design that we have? How does the architectural work looks like of the service? And yeah, what are the outages that we know already that we could have? So, if you have a dependency on, say, file on a CDN, yeah, what if the CDN fails?It's obvious and I know most of the audience will be like, “Oh, this is obvious,” but, like, are you writing this down on a spreadsheet and trying to stack-rank those risks? And after you stack-rank them, are you then mitigating, going top-down, look for—there was an SREcon talk by [Matt Brown 00:07:32], a former colleague of mine at Google, it's basically, know your enemy tech talk in SREcon. He talks about this like how SRE needs to have a more conscious approach to reliability risk assessment. So, really embraced that, and we embraced that at VMware. The SRE work that I do comes from a little bit of my beginnings or my initial background of working security.Jason: I didn't actually realize that you worked security, but I was looking at your LinkedIn profile and you've got a long career doing some really amazing work. So, you said you were in security. I'm curious, tell us more about how your career has progressed. How did you get to where you are today?Gustavo: Very first job, I was 16. There was this group of sysadmins on the first internet service provider in Brazil. One of them knew me from BBS, Bulletin Board Systems, and they, you know, were getting hacked, left and right. So, this guy referred me, and he referred me saying, “Look, it's this kid. He's 16, but he knows his way around this security stuff.”So, I show up, they interview me. I remember one of the interview questions; it's pretty funny. They asked me, “Oh, what would you do if we asked you to go and actually physically grab the routing table from AT&T?” It's just, like, a silly question and they told them, “Uh, that's impossible.” So, I kind of told him the gist of what I knew about routing, and it was impossible to physically get a routing table.For some reason, they loved that. That was the only candidate that could be telling them, “No. I'm not going to do it because it makes no sense.” So, they hired me. And the student security was basically teaching the older sysadmins about SSH because they were all on telnet, nothing was encrypted.There was no IDS—this was a long time ago, right, so the explosion of cybersecurity security firms did not exist then, so it was new. To be, like, a security company was a new thing. So, that was the beginning. I did dabble in open-source development for a while. I had a couple other jobs on ISPs.Google found me because of my dev and open-source work in '06, '07. I interviewed, joined Google, and then at Google, all of it is IC, basically, individual contributor. And at Google, I start doing SRE-type of work, but for the corporate systems. And there was this failed attempt to migrate from one Linux distribution to another—all the corporate systems—and I tech-led the effort making that successful. I don't think I should take the credit; it was really just a fact of, like you know, trying the second time and kind of, learned—the organization learned the lessons that I had to learn from the first time. So, we did a second time and it worked.And then yeah, I kept going. I did more SRE work in corp, I did some stuff in production, like all the products. So, I did a ton of stuff. I did—let's see—technical infrastructure, disaster recovery testing, I started a chaos-engineering-focused team. I worked on Google Cloud before we had a name for it. [laugh].So, I was the first SRE on Google Compute Engine and Google Cloud Storage. I managed Google Plus SRE team, and G Suite for a while. And finally, after doing all this runs on different teams, and developing new SRE teams and organizations, and different styles, different programs in SRE. Dave Rensin, which created the CRE team at Google, recruited me with Matt Brown, which was then the tech lead, to join the CRE team, which was the team at Google focused on teaching Google Cloud customers on how to adopt SRE practices. So, because I had this very broad experience within Google, they thought, yeah, it will be cool if you can share that experience with customers.And then I acquired even more experience working with random customers trying to adopt SRE practices. So, I think I've seen a little bit of it all. VMware wanted me to start, basically, a CRE team following the same model that we had at Google, which culminated all this in TKG SRE that I'm saying, like, we work to improve the reliability of the product and not just teaching the customer how to adopt SRE practices. And my pitch to the team was, you know, we can and should teach the customers, but we should also make sure that they have reasonable defaults, that they are providing a reasonable config. That's the gist of my experience, at a high level.Jason: That's an amazing breadth of experience. And there's so many aspects that I feel like I want to dive into [laugh] that I'm not quite sure exactly where to start. But I think I'll start with the first one, and that's that you mentioned that you were on that initial team at Google that started doing chaos engineering. And so I'm wondering if you could share maybe one of your experiences from that. What sort of chaos engineering did you do? What did you learn? What were the experiments like?Gustavo: So, a little bit of the backstory. This is probably because Kripa mentioned this several times before—and Kripa Krishnan, she actually initiated disaster recovery testing, way, way before there was such a thing as chaos engineering—that was 2006, 2007. That was around the time I was joining Google. So, Kripa was the first one to lead disaster recovery testing. It was very manual; it was basically a room full of project managers with postIts, and asking teams to, like, “Hey, can you test your stuff? Can you test your processes? What if something goes wrong? What if there's an earthquake in the Bay Area type of scenario?” So, that was the predecessor.Many, many years later, I work with her from my SRE teams testing, for my SRE teams participating in disaster recovery testing, but I was never a part of the team responsible for it. And then seven years later, I was. So, she recruited me with the following pitch, she was like, “Well, the program is big. We have disaster recovery tests, we have a lot of people testing, but we are struggling to convince people to test year-round. So, people tend to test once a year, and they don't test again. Which is bad. And also,” she was like, “I wish we had a software; there's something missing.”We had the spreadsheets, we track people, we track their tasks. So, it was still very manual. The team didn't have a tool for people to test. It was more like, “Tell me what you're going to test, and I will help you with scheduling, I'll help you to not conflict with the business and really disrupt something major, disrupt production, disrupt the customers, potentially.” A command center, like a center of operations.That's what they did. I was like, “I know exactly what we need.” But then I surveyed what was out there in open-source way, and of course, like, Netflix, gets a lot of—deserves a lot of credit for it; there was nothing that could be applied to the way we're running infrastructure internally. And I also felt that if we built this centrally and we build a catalog of tasks ourselves, and that's it, people are not going to use it. We have a bunch of developers, software engineers.They've got to feel like—they want to, they want to feel—and rightfully so—that they wanted control and they are in control, and they want to customize the system. So, in two weeks, I hack a prototype where it was almost like a workflow engine for chaos engineering tests, and I wrote two or three tests, but there was an API for people to bring their own test to the system, so they could register a new test and basically send me a patch to add their own tests. And, yeah, to my surprise, like, a year later—and the absolute number of comparison is not really fair, but we had an order of magnitude more testing being done through the software than manual tests. So, on a per-unit basis, the quality of the ultimate tasks was lower, but the cool thing was that people were testing a lot more often. And it was also very surprising to see the teams that were testing.Because there were teams that refused to do the manual disaster recovery testing exercise, they were using the software now to test, and that was part of the regular integration test infrastructure. So, they're not quite starting with okay, we're going to test in production, but they were testing staging, they were testing a developer environment. And in staging, they had real data; they were finding regressions. I can mention the most popular testing, too, because I spoke about this publicly before, which was this fuzz testing. So, a lot of things are RPC or RPC services, RPC, servers.Fuzz testing is really useful in the sense that, you know, if you send a random data in RPC call, will the server crash? Will the server handling this gracefully? So, we fought a lot of people—not us—a lot of people use or shared service bringing their own test, and fuzz testing was very popular to run continuously. And they would find a ton of crashes. We had a lot of success with that program.This team that I ran that was dedicated to building this shared service as a chaos engineering tool—which ironically named Catzilla—and I'm not a cat person, so there's a story there, too—was also doing more than just Catzilla, which we can also talk about because there's a little bit more of the incident management space that's out there.Jason: Yeah. Happy to dive into that. Tell me more about Catzilla?Gustavo: Yeah. So, Catzilla was sort of the first project from scratch from the team that ended up being responsible to share a coherent vision around the incident prevention. And then we would put Catzilla there, right, so the chaos engineering shared service and prevention, detection, analysis and response. Because once I started working on this, I realized, well, you know what? People are still being paged, they have good training, we had a good incident management process, so we have good training for people to coordinate incidents, but if you don't have SREs working directly with you—and most teams didn't—you also have a struggle to communicate with executives.It was a struggle to figure out what to do with prevention, and then Catzilla sort of resolved that a little bit. So, if you think of a team, like an SRE team in charge of not running a SaaS necessarily, but a team that works in function of a company to help the company to think holistically about incident prevention, detection, analysis, and response. So, we end up building more software for those. So, part of the software was well, instead of having people writing postmortems—a pet peeve of mine is people write postmortems and them they would give to the new employees to read them. So, people never really learned the postmortems, and there was like not a lot of information recovery from those retrospectives.Some teams were very good at following up on extra items and having discussions. But that's kind of how you see the community now, people talking about how we should approach retrospectives. It happened but it wasn't consistent. So then, well, one thing that we could do consistently is extract all the information that people spend so much time writing on the retrospectives. So, my pitch was, instead of having these unstructured texts, can we have it both unstructured and structured?So, then we launch postmortem template that was also machine-readable so we could extract information and then generate reports for to business leaders to say, “Okay, here's what we see on a recurring basis, what people are talking about in the retrospectives, what they're telling each other as they go about writing the retrospectives.” So, we found some interesting issues that were resolved that were not obvious on a per retrospective basis. So, that was all the way down to the analysis of the incidents. On the management part, we built tooling. It's basically—you can think of it as a SaaS, but just for the internal employees to use that is similar to externally what would be an incident dashboard, you know, like a status page of sorts.Of course, a lot more information internally for people participating in incidents than they have externally. For me is thinking of the SRE—and I manage many SRE teams that were responsible for running production services, such as Compute Engine, Google Plus, Hangouts, but also, you know, I just think of SRE as the folks managing production system going on call. But thinking of them a reliability specialists. And there's so many—when you think of SREs as reliability specialists that can do more than respond to pages, then you can slot SREs and SRE teams in many other areas of a organization.Jason: That's an excellent point. Just that idea of an SRE as being more than just the operation's on-call unit. I want to jump back to what you mentioned about taking and analyzing those retrospectives and analyzing your incidents. That's something that we did when I was at Datadog. Alexis Lê-Quôc, who's the CTO, has a fantastic talk about that at Monitorama that I'll link to in the [show notes 00:19:49].It was very clear from taking the time to look at all of your incidents, to catalog them, to really try to derive what's the data out of those and get that information to help you improve. We never did it in an automated way, but it sounds like with an automated tool, you were able to gather so much more information.Gustavo: Yeah, exactly. And to be clear, we did this manually before, and so we understood the cost of. And our bar, company-wide, for people writing retrospectives was pretty low, so I can't give you a hard numbers, but we had a surprising amount of retrospectives, let's say on a monthly basis because a lot of things are not necessarily things that many customers would experience. So, near misses or things that impact very few customers—potentially very few customers within a country could end up in a retrospective, so we had this throughput. So, it wasn't just, like, say, the highest severity outages.Like where oh, it happens—the stuff that you see on the press that happens once, maybe, a year, twice a year. So, we had quite a bit of data to discuss. So, then when we did it manually, we're like, “Okay, yeah, there's definitely something here because there's a ton of information; we're learning so much about what happens,” but then at the same time, we were like, “Oh, it's painful to copy and paste the useful stuff from a document to a spreadsheet and then crunch the spreadsheet.” And kudos—I really need to mention her name, too, Sue [Lueder 00:21:17] and also [Yelena Ortel 00:21:19]. Both of them were amazing project program managers who've done the brunt of this work back in the days when we were doing it manually.We had a rotation with SREs participating, too, but our project managers were awesome. And also Jason: As you started to analyze some of those incidents, every infrastructure is different, every setup is different, so I'm sure that maybe the trends that you saw are perhaps unique to those Google teams. I'm curious if you could share the, say, top three themes that might be interesting and applicable to our listeners, and things that they should look into or invest in?Gustavo: Yeah, one thing that I tell people about adopting the—in the books, the SRE books, is the—and people joke about it, so I'll explain the numbers a little better. 70, 75% of the incidents are triggered by config changes. And people are like, “Oh, of course. If you don't change anything, there are no incidents, blah, blah, blah.” Well, that's not true, that number really speaks to a change in the service that is impacted by the incident.So, that is not a change in the underlying dependency. Because people were very quickly to blame their dependencies, right? So meaning, if you think of a microservice mesh, the service app is going to say, “Oh, sure. I was throwing errors, my service was throwing errors, but it was something with G or H underneath, in a layer below.” 75% of cases—and this is public information goes into books, right—of retrospectives was written, the service that was throwing the errors, it was something that changed in that service, not above or below; 75% of the time, a config change.And it was interesting when we would go and look into some teams where there was a huge deviation from that. So, for some teams, it was like, I don't know, 85% binary deploys. So, they're not really changing config that much, or the configuration issues are not trigger—or the configuration changes or not triggering incidents. For those teams, actually, a common phenomenon was that because they couldn't. So, they did—the binary deploys were spiking as contributing factors and main triggers for incidents because they couldn't do config changes that well, roll them out in production, so they're like, yeah, of course, like, [laugh] my minor deploys will break more on my own service.But that showed to a lot of people that a lot of things were quote-unquote, “Under their control.” And it also was used to justify a project and a technique that I think it's undervalued by SREs in the wild, or folks running production in the wild which is canary evaluation systems. So, all these numbers and a lot of this analysis was just fine for, like, to give extra funding for the scene that was basically systematically across the entire company, if you tried to deploy a binary to production, if you tried to deploy a config change to production, will evaluate a canary if the binary is in a crash loop, if the binary is throwing many errors, is something is changing in a clearly unpredictable way, it will pause, it will abort the deploy. Which back to—much easier said than done. It sounds obvious, right, “Oh, we should do canaries,” but, “Oh, can you automate your canaries in such a way that they're looking to monitoring time series and that it'll stop a release and roll back a release so a human operator can jump in and be like, ‘oh, okay. Was it a false positive or not?'”Jason: I think that moving to canary deployments, I've long been a proponent of that, and I think we're starting to see a lot more of that with tools such as—things like LaunchDarkly and other tools that have made it a whole lot easier for your average organization that maybe doesn't have quite the infrastructure build-out. As you started to work on all of this within Google, you then went to the CRE team and started to help Google Cloud customers. Did any of these tools start to apply to them as well, analyzing their incidents and finding particular trends for those customers?Gustavo: More than one customer, when I describe, say our incident lifecycle management program, and the chaos engineering program, especially this lifecycle stuff, in the beginning, was, “Oh, okay. How do I do that?” And I open-sourced a very crufty prototype which some customers pick up on it and they implement internally in their companies. And it's still on GitHub, so /google/sredocs.There's an ugly parser, an example, like, of template for the machine-readable stuff, and how to basically get your retrospectives, dump the data onto Google BigQuery to be able to query more structurally. So yes, customers would ask us about, “Yeah. I heard about chaos engineering. How do you do chaos engineering? How can we start?”So, like, I remember a retail one where we had a long conversation about it, and some folks in tech want to know, “Yeah, instant response; how do I go about it?” Or, “What do I do with my retrospectives?” Like, people started to realize that, “Yeah, I write all this stuff and then we work on the action items, but then I have all these insights written down and no one goes back to read it. How can I get actionable insights, actionable information out of it?”Jason: Without naming any names because I know that's probably not allowed, are there any trends from customers that you'd be willing to share? Things that maybe—insights that you learned from how they were doing things and the incidents they were seeing that was different from what you saw at Google?Gustavo: Gaming is very unique because a lot of gaming companies, when we would go into incident management, [unintelligible 00:26:59] they were like, “If I launch a game, it's ride or die.” There may be a game that in the first 24, or 48 hours if the customers don't show up, they will never show up. So, that was a little surprising and unusual. Another trend is, in finance, you would expect a little behind or to be too strict on process, et cetera, which they still are very sophisticated customers, I would say. The new teams of folks are really interested in learning how to modernize the finance infrastructure.Let's see… well, tech, we basically talk the same language, with the gaming being a little different. In retail, the uniqueness of having a ton of things at the edge was a little bit of a challenge. So, having these hubs, where they have, say, a public cloud or on-prem data center, and these of having things running at the stores, so then having this conversation with them about different tiers and how to manage different incidents. Because if a flagship store is offline, it is a big deal. And from a, again, SaaS mindset, if you're think of, like, SRE, and you always manage through a public cloud, you're like, “Oh, I just call with my cloud provider; they'll figure it out.”But then for retail company with things at the edge, at a store, they cannot just sit around and wait for the public cloud to restore their service. So again, a lot of more nuanced conversations there that you have to have of like, yeah, okay, yeah. Here, say a VMware or a Google. Yeah, we don't deal with this problem internally, so yeah, how would I address this? The answers are very long, and they always depend.They need to consider, oh, do you have an operational team that you can drive around? [laugh]. Do you have people, do you have staffing that can go to the stores? How long it will take? So, the SLO conversation there is tricky.a secret weapon of SRE that has definitely other value is the project managers, program managers that work with SREs. And I need to shout out to—if you're a project manager, program manager working with SREs, shout out to you.Do you want to have people on call 24/7? Do you have people near that store that can go physically and do anything about it? And more often than not, they rely on third-party vendors, so then it's not staffed in-house and they're not super technical, so then remote management conversations come into play. And then you talk about, “Oh, what's your network infrastructure for that remote management?” Right? [laugh].Jason: Things get really interesting when you start to essentially outsource to other companies and have them provide the technology, and you try to get that interface. So, you mentioned doing chaos engineering within Google, and now you've moved to VMware with the Tanzu team. Tell me a bit more about how do you do chaos engineering at VMware, and what does that look like?Gustavo: I've seen varying degrees of adoption. So, right now, within my team, what we are doing is we're actually going as we speak right now, doing a big reliabilities assessment for a launch. Unfortunately, we cannot talk about it yet. We're probably going to announce this on October at VMworld. As a side effect of this big launch, we started by doing a reliability risk assessment.And the way we do this is we interview the developers—so this hasn't launched yet, so we're still designing this thing together. [unintelligible 00:30:05] the developers of the architecture that they basically sketch out, like, what is it that you're going to? What are the user journeys, the user stories? Who is responsible for what? And let's put an architecture diagram, a sketch together.And then we tried to poke or holes on, “Okay. What could go wrong here?” We write this stuff down. More often than not, from this list—and I can already see, like, that's where that output, that result fits into any sort of chaos engineering plan. So, that's where, like—so I can get—one thing that I can tell you for that risk assessment because I participated in the beginning was, there is a level of risk involving a CDN, so then one thing that we're likely going to test before we get to general availability is yeah, let's simulate that the CDN is cut off from the clients.But even before we do the test, we're already asking, but we don't trust. Like, trust and verify, actually; we do trust but trust and verify. So, we do trust the client is actually another team. So, we do trust the client team that they cache, but we are asking them, “Okay. Can you confirm that you cache? And if you do cache, can you give us access to flush the cache?”We trust them, we trust the answers; we're going to verify. And how do we verify? It's through a chaos engineering test which is, let's cut the client off from the CDN and then see what happens. Which could be, for us, as simple as let's move the file away; we should expect them to not tell us anything because the client will fail to read but it's going to pick from cache, it's not reading from us anyways. So, there is, like, that level of we tell people, “Hey, we're going to test a few things.”We'll not necessarily tell them what. So, we are also not just testing the system, but testing how people react, and if anything happens. If nothing happens, it's fine. They're not going to react to it. So, that's the level of chaos engineering that our team has been performing.Of course, as we always talk about improving reliability for the product, we talked about, “Oh, how is it that chaos engineering as a tool for our customers will play out in the platform?” That conversation now is a little bit with product. So, product has to decide how and when they want to integrate, and then, of course, we're going to be part of that conversation once they're like, “Okay, we're ready to talk about it.” Other teams of VMWare, not necessarily Tanzu, then they do all sorts of chaos engineering testing. So, some of them using tools, open-source or not, and a lot of them do tabletop, basically, theoretical testing as well.Jason: That's an excellent point about getting started. You don't have a product out yet, and I'm sure everybody's anticipating hearing what it is and seeing the release at VMworld, but testing before you have a product; I feel like so many organizations, it's an afterthought, it's the, “I've built the product. It's in production. Now, we need to keep it reliable.” And I think by shifting that forward to thinking about, we've just started diagramming the architecture, let's think about where this can break. And how we can build those tests so that we can begin to do that chaos engineering testing, begin to do that reliability testing during the development of the product so that it ships reliably, rather than shipping and then figuring out how to keep it reliable.Gustavo: Yeah. The way I talked to—and I actually had a conversation with one of our VPs about this—is that you have technical support that is—for the most part, not all the teams from support—but at least one of the tiers of support, you want it to be reactive by design. You can staff quite a few people to react to issues and they can be very good about learning the basics because the customers—if you're acquiring more customers, they are going to be—you're going to have a huge set of customers early in the journey with your product. And you can never make the documentation perfect and the product onboarding perfect; they're going to run into issues. So, that very shallow set of issues, you can have a level of arterial support that is reactive by design.You don't want that tier of support to really go deep into issues forever because they can get caught up into a problem for weeks or months. You kind of going to have—and that's when you add another tier and that's when we get to more of, like, support specialists, and then they split into silos. And eventually, you do get an IC SRE being tier three or tier four, where SRE is a good in-between support organizations and product developers, in the sense that product developers also tend to specialize in certain aspects of a product. SRE wants to be generalists for reliability of a product. And nothing better than to uncover reliability for product is understanding the customer pain, the customer issues.And actually, one thing, one of the projects I can tell you about that we're doing right now is we're improving the reliability of our installation. And we're going for, like, can we accelerate the speed of installs and reduce the issues by better automation, better error handling, and also good—that's where I say day zero. So, day zero is, can we make this install faster, better, and more reliable? And after the installs in day one, can we get better default? Because I say the ergonomics for SRE should be pretty good because we're TKG SREs, so there's [unintelligible 00:35:24] and SRE should feel at home after installing TKG.Otherwise, you can just go install vanilla Kubernetes. And if vanilla Kubernetes does feel at home because it's open-source, it's what most people use and what most people know, but it's missing—because it's just Kubernetes—missing a lot of things around the ecosystem that TKG can install by default, but then when you add a lot of other things, I need to make sure that it feels at home for SREs and operators at large.Jason: It's been fantastic chatting with you. I feel like we can go [laugh] on and on.Gustavo: [laugh].Jason: I've gone longer than I had intended. Before we go, Gustavo, I wanted to ask you if you had anything that you wanted to share, anything you wanted to plug, where can people find you on the internet?Gustavo: Yeah, so I wrote an ebook on how to start your incident lifecycle program. It's not completely out yet, but I'll post on my Twitter account, so twitter.com/stratus. So @stratus, S-T-R-A-T-U-S. We'll put the link on the [notes 00:36:21], too. And so yeah, you can follow me there. I will publish the book once it's out. Kind of explains all about the how to establish an incident lifecycle. And if you want to talk about SRE stuff, or VMware Tanzu or TKG, you can also message me on Twitter.Jason: Thanks for all the information.Gustavo: Thank you, again. Thank you so much for having me. This was really fun. I really appreciate it.Jason: For links to all the information mentioned, visit our website at gremlin.com/podcast. If you liked this episode, subscribe to the Break Things on Purpose podcast on Spotify, Apple Podcasts, or your favorite podcast platform. Our theme song is called “Battle of Pogs” by Komiku, and it's available on loyaltyfreakmusic.com.
About NipunNipun Agarwal is Vice President, MySQL HeatWave and Advanced Development, Oracle. His interests include distributed data processing, machine learning, cloud technologies and security. Nipun was part of the Oracle Database team where he introduced a number of new features. He has been awarded over 170 patents.Links:HeatWave: https://oracle.com/heatwave TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: You could build you go ahead and build your own coding and mapping notification system, but it takes time, and it sucks! Alternately, consider Courier, who is sponsoring this episode. They make it easy. You can call a single send API for all of your notifications and channels. You can control the complexity around routing, retries, and deliverability and simplify your notification sequences with automation rules. Visit courier.com today and get started for free. If you wind up talking to them, tell them I sent you and watch them wince—because everyone does when you bring up my name. Thats the glorious part of being me. Once again, you could build your own notification system but why on god's flat earth would you do that?Corey: This episode is sponsored in part by our friends at VMware. Let's be honest—the past year has been far from easy. Due to, well, everything. It caused us to rush cloud migrations and digital transformation, which of course means long hours refactoring your apps, surprises on your cloud bill, misconfigurations and headache for everyone trying manage disparate and fractured cloud environments. VMware has an answer for this. With VMware multi-cloud solutions, organizations have the choice, speed, and control to migrate and optimizeapplications seamlessly without recoding, take the fastest path to modern infrastructure, and operate consistently across the data center, the edge, and any cloud. I urge to take a look at vmware.com/go/multicloud. You know my opinions on multi cloud by now, but there's a lot of stuff in here that works on any cloud. But don't take it from me thats: VMware.com/go/multicloud and my thanks to them again for sponsoring my ridiculous nonsense.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Today's promoted episode is slightly off the beaten track. Normally in tech, we tend to find folks that have somewhere between an 18 to 36-month average tenure at companies. And that's great, however, let's do the exact opposite of that today. My guest is Nipun Agarwal, who's the VP of MySQL HeatWave and Advanced Development at Oracle, where you've been an employee for 27 years, is it?Nipun: That's absolutely right. 27 years and that was my first job out of school. So, [laugh] yes.Corey: First, thank you for joining me. It is always great to talk to people who have focused on an area that I only make fun of from a distance, in this case, databases which, you know, DNS works well enough for most use cases, but occasionally customers have other constraints. You are clearly at or damn near at the top of your field. In my pre-show research, I was able to unearth that you have—what is it now, 170, 180 filed patents that have been issued?Nipun: That's right. 180 issued patents. [laugh].Corey: You clearly know what you're doing when it comes to databases.Nipun: Thank you for the opportunity. Yes, thank you.Corey: So, being a VP at Oracle, but starting off as your first job as almost a mailroom to the executive suite style story, we don't see those anymore. In most companies, it very much feels like the path to advance is to change jobs to other companies. It's still interesting seeing that that's not always the path forward, for some folks. I think that the folks who have been in companies for a long time need more examples and role models to look at in that sense, just because it is such an uncommon narrative these days. You're not bouncing around between four companies.Nipun: Yeah. I've been lucky enough to have joined Oracle, and although I had been at Oracle, I've been on multiple teams at Oracle and there has been a great opportunity of talent, colleagues, and projects, where even to this day, I feel that I have a lot more to learn. And there are opportunities within the company to learn and to grow. So no, I've had an awesome ride.Corey: Let's dive in a little bit to something that's been making the rounds recently, specifically you've released something called HeatWave, which has been boasting some, frankly, borderline unbelievable performance benchmarks, and of course, everyone loves to take a crack at Oracle for a variety of reasons, so Twitter is very angry. But I've learned at some point, through the course of my career, to disambiguate Twitter's reactions from what's actually happening out there. So, let's start at the beginning. What is HeatWave?Nipun: HeatWave is an in-memory query accelerator for MySQL. It accelerates complex, long-running, analytic queries. The interesting thing about HeatWave is, with HeatWave we now have a single MySQL database which can run all your applications, whether they're OLTP, whether they're mixed workloads, or whether they're analytics, without having to move the data out of MySQL. Because in the past, people would need to move the data from MySQL to some other database running analytics, so people would end up with two different databases. With this single database, no need for moving the data, and all existing tools and applications which worked with MySQL continue to work, except they will be much faster. That's what HeatWave is.Corey: The benchmarks that you are publishing are fairly interesting to me, specifically, the ones that I've seen are, you've classified HeatWave as six-and-a-half times faster than Amazon Redshift, seven times faster than Snowflake, nine times faster than BigQuery, and a number of other things, and fourteen hundred times faster than Amazon Aurora. And what's interesting to me about the things that you're naming is they're not all data-warehouse style stuff. Aurora, for example, is Amazon's interpretation of an in-house developed managed database service named after a Disney Princess. And it tends to be aimed at things that are not necessarily massive scale. What is the sweet spot, I guess, of HeatWaves data sizes when it comes to really being able to shine?Nipun: So, there are two aspects where our customers are going to benefit from HeatWave. One characteristics is the data size, but the other characteristics is the complexity of the queries. So, let's first do the comparison with Aurora—and that's a very good question—the 1400 times comparison we have shown, yes, if you take the TPC-H queries on a four terabyte workload and if you run them, that's what you're going to see. Now, the interesting thing is this: not only is it 1400 times faster it's also at half the price because for most of these systems, if you throw more gear, if you throw more hardware, the performance would vary. So, it's very important to go with how much of performance and at what price.So, for pure analytics—say, for four terabytes—is 1400 times faster at half the price. So, if it provides truly 800 times better price performance compared to Aurora for pure analytics. Now, let's take the other extreme. 100 gigabytes—which is a much smaller, your bread and butter database—and this is for mixed workloads. So, something like a CH-benCHmark, which has a combination of say, some TPC-C transactions, and then some added IPP-CH queries, which—the CH benCHmark.Here we have 42 times advantage price performance over Aurora because we are 42% of the cost, less than half the cost of Aurora and for the complex queries, we are about 18 times faster, and for pure OLTP, we are at par. So, the aggregate comes out to be about 42 times better. So, the mileage varies depending upon the data size and depending upon the complexity of the queries. So, in the case of Aurora, it will be anywhere from 42 times better price performance all the way to 2800.Corey: Does this have an upper bound, for example? Like, if we take a look at something like Redshift or something like Snowflake, where they're targeting petabyte-scale workloads at some point, that becomes a very different story for a lot of companies out there. Is that something that this can scale to, or is there a general reasonable upper bound of, okay, once you're above X number of terabytes, it's probably good to start looking at tiering data out or looking at a different solution?Nipun: We designed HeatWave primarily for those customers who had to move the data out of MySQL database into some other database for running analytics. The upper bound for the data in the MySQL database is 64 terabytes. Based on the demand and such we are seeing, we support 32 terabytes processing in HeatWave at any given point in time. You can still have 64 terabytes in the MySQL database, but the amount of data you can load into the HeatWave cluster at any given point in time is 32 terabytes.Corey: Which is completely reasonable. I would agree with you from not having much database exposure myself in the traditional sense, but from a cloud economics standpoint alone, anytime you have to move data to a different database for a different workload, you're instantly jacking costs through the roof. Even if it's just the raw data volumes, you now have to store it in two different places instead of one. Plus, in many cases, the vaguearities of data transfer pricing in many places wind up meaning that you're paying money to move things out, there's a replication story, there's a sync factor, and then it just becomes a management overhead problem. If there's a capacity to start using the data where it is in more intelligent ways, that alone has a massive economic wind, just from a time it takes your team to not have to focus on changing infrastructure and just going ahead to run the queries. If you want to start getting into the weeds of all the different ways something like this is an economic win, there's a lot of angles to look at it from.Nipun: That's an excellent point and I'm very glad you brought it up. So, now let's take the other set of benchmarks we were talking about: Snowflake. So, HeatWave is seven times faster and one-fifth the cost; it's about 35 times better price performance. Compared to let's say Redshift AQUA, six-and-a-half times faster at half the cost, so 13 times better price performance. And it goes on and on.Now, these numbers I was quoting is for 10 terabytes TPC-H queries. And the point which you said is very, very valid. When we are talking about the cost for these other systems, it's only the cost for analytics without including the cost of the source database or without including the cost of moving the data or managing to different databases. Whereas when you're talking about the cost of HeatWave, this is the cost which includes the cost of both transaction processing as well as the analytics. So, it's a single database; all the cost is included, whereas, for these other vendors, it's only the cost of the analytic database. So, the actual cost to a user is probably going to be much higher with these other databases. So, the price performance advantage with HeatWave will perhaps be even higher.Corey: Tell me a little bit about how it works. I mean, it's easy to sit here and say, “Oh, it's way faster and it's better in a bunch of benchmark stuff,” and we will get into that in a little bit, but it's described primarily as an in-memory query accelerator. Naively, I think, “Oh, it's just faster because instead of having data that lives on disk, it winds up having some of it live in RAM. Well, that seems simple and straightforward.” Like, oh, yeah, I'm going to go on a limb and assume that there aren't 160 patents tied to the idea that RAM is faster than disk. There's clearly a lot more going on. How does this work? What is it foundationally?Nipun: So, the thing to realize is HeatWave has been built from the ground up for the cloud and it is optimized for the Oracle Cloud. So, let's take these things one at a time. When I say designed from the ground up for the cloud, we have actually invented and implemented new algorithms for distributed query processing, which is what gives us such a good advantage in terms of operations like joint processing, window functions, aggregations. So, we have come up—invented, implemented new algorithms for distributed query processing. Secondly, we have designed it for the cloud.And by that what I mean is, A, we have a lot of emphasis on scalability, that it scales to thousands of cores with a very, very good scale factor, which is very important for the cloud. The next angle about the cloud is that not only have we optimized it for the cloud, but we have gone with commodity cloud services, meaning, for instance, when you're looking at the storage, we are looking at the least expensive price. So, for instance, we use object store; you don't use, for instance, locally attached SSDs because that will be expensive. Similarly, for compute: instead of using Intel, we use AMD chips because they are less expensive. Similarly, networking: standard networking.And all of this has been optimized for the specific Oracle Cloud infrastructure shapes we have, for the specific VMs we use, for the specific networking bandwidth we get, for the object store bandwidth and such; so that's the third piece, optimized for OCI. And the last bit is pervasive use of machine learning in the service. So, a combination of these four things: designed for the cloud, using commodity cloud services, optimized for the quality cloud infrastructure, and finally the pervasive use of machine learning is what gives us very good performance, very good scale, at a very inexpensive price.Corey: I want to dig into the idea of the pervasive use of machine learning. In many cases, machine learning is the answer to how do I wind up bilking a bunch of VCs out of money? And Oracle is not a venture-backed company at this stage of its existence, it is a very large, publicly-traded entity; you have no need to do that. And I would also further accept that this is one of those bounded problem spaces where something that looks machine-learning-like could do very well. Is that based upon what it observes and learns from data access patterns? Is it something that it learns based from a specific workload in question? What is the gathering, and is it specific to individual workloads that a given customer has, or is it holistically across all of the database workloads that you see in Oracle Cloud?Nipun: So, there are multiple parts to this question. The first thing is—and I think as you're noting—that with the cloud, we have a lot more opportunity for automation because we know exactly what is the hardware stack, we know the software stack, we know the configuration parameters.Corey: Oh yes, hell is other people's data centers, for sure.Nipun: [laugh]. And the approach we have taken for automation is machine-learning-based automation because one of the big advantages is that we can have a model which is tailored to a specific instance and as you run more queries, as you run more workloads, the system gets more intelligent. And we can talk about that maybe later about, like, specific things which make it very, very compelling. The third thing, I think, which you were alluding to, is that there are two aspects in machine learning: data, and the models or the algorithms. So, the first thing is, we have made a lot of enhancements, both to the MySQL engine as well as HeatWave, to collect new kinds of data.And by new kinds of data, I mean, that not only do we collect statistics of data, but we collect statistics of, say, the queries: what was the compilation time? What was the execution time? And then, based on this data which we're collecting, we have then come up with very advanced algorithms—machine learning algorithms—which are, again, a lot of them, there is, like, you know, patterns or [IP 00:14:13] which we have built on top of the existing state of art. So, for instance, taking these statistics and extrapolating them on larger data sizes. That's completely an innovation which we did in-house.How do we sample a very small percentage of the data and still be accurate? And finally, how do we come up with these machine learning models which are accurate without hiring an army of engineers? That's because we invented our AutoML, which is very efficient. So, that's basically the ecosystem of the machine learning which we have, which has been used to provide this.Corey: It's easy for folks to sit there and have a bunch of problems with Oracle for a variety of reasons, some of which are no longer germane, some of which are, I'm not here to judge. But I think it's undeniable—though it sometimes gets eclipsed by people's knee-jerk reactions—the reason that Oracle is in so many companies that it is in is because it works. You folks have been pioneers in the database space for a very long time and that's undeniable. If it didn't deliver performance that was untouchable for a long time, it would not have gotten to the point where you now are, where it is the database of record for an awful lot of shops. And I know it's somehow trendy, sometimes, for the startup set to think, “Oh, big companies are slow and awful. All innovation comes out of small, scrappy startups here.”But your customers are not fools. They made intelligent decisions based upon constraints that they're working within and problems that they need to solve. And you still have an awful lot of customers that are not getting off of Oracle anytime soon because it works. It's one of those things that I think is nuanced and often missed. But I do feel the need to ask about the lock-in story. Today, HeatWave is available only on the managed MySQL service in Oracle Cloud, correct?Nipun: Correct.Corey: Is there any licensing story tied to that? In other words, “Well, if I'm going to be using this, I need to wind up making a multi-year commitment. I need to get certain support things, as well,” the traditional on-premises Oracle story. Or is this an actual cloud service, in that you pay for what you use while you use it, and when you turn it off, you're done? In theory. In practice, we know in cloud economics, no one ever turns anything off until the company goes out of business.Nipun: So, it's exactly the letter what you said that this is a managed service. It's pay as you go, you pay only for what you consume, and if you decide to move on, there's absolutely no license or anything that is holding you back. The second thing—and I'm glad you brought it up—about the vendor lock-in. One of the very important things to realize about HeatWave is, A, it's just an accelerator for MySQL, but in the process of doing so, we have not introduced any proprietary syntax. So, if customers have the MySQL application running on some other cloud, they can very easily migrate to OCI and try MySQL HeatWave.But for whatever reason, if they don't like it, and they want to move out, there is absolutely nothing which is holding them back. So, the ease of which they can come in with the same ease they can walk out because we don't have any vendor lock-in. There is absolutely no proprietary extensions to HeatWave.Corey: There is the counter-argument as far as lock-in goes, and we see this sometimes with companies we talk to that were considering Google Cloud Spanner, as an example. It's great, and you can use it in a whole bunch of different places and effectively get ACID-compliance-like behavior across multiple regions, and you don't have to change any of the syntax of what it is you're using except the lock-in there is one of a strategic or software architecture lock-in because there's nothing else quite like that in the universe, which means that if you're going to migrate off of the single cloud where that's involved, you have to re-architect a lot, and that leads to a story of lock-in. I'm curious as to whether you're finding that customers are considering that as far as the performance that you're giving for MySQL querying is apparently unparalleled in the rest of the industry; that leads to a sort of lock-in itself when people get used to that kind of responsiveness and build applications that expect that kind of tolerances. At some point, if there's nothing else in the industry like it, does that means that they find themselves de-facto locked in?Nipun: If you were to talk about some functionality which we are offering which no one else is offering, perhaps you could, kind of, make that case. But that's not the case for performance because when we are so much faster—so suppose I said, okay, we are so much faster; we are six-and-a-half times faster than Redshift at half the cost. Well, if someone wanted the same performance, they can absolutely do it Redshift on a much larger cluster, and pay a lot more. So, if they want the best performance at the best price, they can come to Oracle Cloud; if they want the same performance but they will have to pay more, they can go anywhere else. So, I don't think that's a vendor lock-in at all.That's a value which we are bringing in that for the same performance, we are much cheaper. Or you can have that kind of a balance that we are faster and cheaper. So, there is no lock-in. So, it's not to say that, okay, we have made some extensions to MySQL which are only available in our cloud. That is not at all the case.Now, for some other vendors and for some other applications—you brought up Spanner; that's one. But we have had multiple customers of MySQL who, when they were trying Google BigQuery, they mentioned this aspect that, okay, Google BigQuery had these proprietary extensions and they feel locked in. That is not the case at all with HeatWave.Corey: This episode is sponsored by our friends at Oracle HeatWave is a new high-performance accelerator for the Oracle MySQL Database Service. Although I insist on calling it “my squirrel.” While MySQL has long been the worlds most popular open source database, shifting from transacting to analytics required way too much overhead and, ya know, work. With HeatWave you can run your OLTP and OLAP, don't ask me to ever say those acronyms again, workloads directly from your MySQL database and eliminate the time consuming data movement and integration work, while also performing 1100X faster than Amazon Aurora, and 2.5X faster than Amazon Redshift, at a third of the cost. My thanks again to Oracle Cloud for sponsoring this ridiculous nonsense.Corey: I do want to call out, just because it seems like there's a lies, damned lies, and database benchmarks story here where, for example, Azure for a while was doing a campaign where they were five times less expensive for database workloads than AWS until you scratched beneath the surface and realize it's because they're playing ridiculous games with licensing, making it very expensive to run a Microsoft SQL Server on anything that wasn't Azure. Customers are not necessarily as credulous as they once were when it comes to benchmarking. And Oracle for a long time hasn't really done benchmarking, and in fact, has actively discouraged it. For HeatWave, you've not only published benchmarks, which okay, vendors can say anything they want, and I'm going to wait until I see independent returns, but you put not just the benchmarks, but data sets, and your entire methodology onto GitHub as well. What led to that change? That seems like the least Oracle-like thing I could possibly imagine.Nipun: I couldn't take credit for the idea. The idea actually was from our Chief Marketing Officer, that was really his idea. But here is the reason why it makes a lot more sense for us to do it for MySQL HeatWave. MySQL is pervasive; pretty much any cloud vendor you can think about has a MySQL-based managed service. And obviously, MySQL runs on premise, like a lot of customers and applications do it.Corey: That's one of the baseline building blocks of any environment. I don't even need to be in the cloud; I can get MySQL working somewhere. Everyone has it, and if not, why don't you? And I can build it in a VM myself in 20 minutes.Nipun: That's right.Corey: It is a de-facto standard.Nipun: That's right. So, given that is the case and many other cloud vendors are innovating on top of it—which is great—how do you compare the innovation or the value proposition of Cloud Vendor A with us? So, for that, what we felt was that it is very important and very fair that we publish our scripts so that people can run those same scripts with a HeatWave, as well as with other cloud offerings, and make a determination for themselves. So, given the popularity of MySQL and given that pretty much all cloud vendors provide an offering of MySQL, and many of them have enhanced it, in order for customers to have an apples-to-apples comparison, it is imperative that we do this.Corey: I haven't run benchmarks myself just yet, just because it turns out, there's a lot of demands on my time and also, as mentioned, I'm not a deep database expert, unless it comes to DNS. And we keep waiting for people to come back with, “Aha. Here's why you're completely comprised of liars.” And I haven't heard any of that. I've heard edges and things here about, “Well, if you add an index over here, it might speed things up a bit,” but nothing that leads me to believe that it is just a marketing story.It is a great marketing story, but things like this fall apart super quickly in the event that it doesn't stand up to engineering scrutiny. And it's been out long enough that I would have fully expected to have heard about it. Lord knows if anyone is listening and has thoughts on this, I will be getting some letters after this episode, I expect. But I've come to expect those; please feel free to reach out. I'm always thrilled to do follow-up episodes and address things like this.When does it make sense from your perspective for someone to choose HeatWave on top of the Oracle Cloud MySQL service instead of using some of the other things we've talked about: Aurora, Redshift, Snowflake, et cetera? When does that become something that a customer should actively consider? Is it for net-new workloads? Should they consider it for migration stories? Should they run their database workloads in Oracle Cloud and keep other stuff elsewhere? What is the adoption path that you see that tends to lead to success?Nipun: All customers of MySQL, or all customers of any open-source database, those would be absolutely people who should consider MySQL HeatWave. For the very simple reason: first, regardless of the workload, whether it is OLTP only, or mixed workloads, or analytics, the cost is going to be significantly lower. I'll say at least it's going to be half the cost. In most of the cases, it's probably going to be less than half the cost. So, right off the bat, customers save half the cost by moving to MySQL HeatWave.And then depending upon the workload you have, as you have more complex queries, the performance advantage starts increasing. So, if you were just running only OLTP, if you only had transactions and you didn't have any complex queries—which is very unlikely for real-world applications, but even if that was the case, you're going to save 60% by going to MySQL HeatWave. But as you have more complex queries you will start finding that the net advantage you're going to get with performance is going to keep increasing and will go anywhere from 10 times aggregate to as much as 1400 times. So, all open-source, MySQL-based applications, they should consider moving. Then you mentioned about Snowflake, Redshift, and such; for all of them, it depends on what the source database is and what is it that they're trying to do.If they are moving data from, say, some open-source databases, if they are ETL-ing from MySQL, not only will MySQL HeatWave be much faster and much cheaper, but there's going to be a tremendous value proposition to the application because they don't need to have two different applications for two different databases. They can come back to MySQL, they can have a single database on which they can run all their applications. And then again, you have many of these cloud-native applications are born in the cloud where people may be looking for a simple database which does the job, and this is a great story—both in terms of cost as well as in terms of performance—and it's a single database for all your applications, significantly reduces the complexity for users.Corey: To turn the question around a little bit, what sort of workloads is MySQL HeatWave not a fit for? What sort of workloads are going to lead to a poor customer experience? Where, yeah, this is not a fit for that workload?Nipun: None, except in terms of the data size. So, if you have data sizes which are more than 64 terabytes, then yes, MySQL HeatWave is not a good fit. But if your data size is under 64 terabytes, you're going to win in all the cases by moving to MySQL HeatWave, given the functionality and capabilities of MySQL.Corey: I'd also like to point out that recently, HeatWave gained the MySQL Autopilot capability, which I believe is a lot of the machine learning technologies that you were speaking about a few minutes ago. Are there plans to continue to expand what HeatWave does and offer additional functionality? And—if you can talk about any of that. I know that roadmap is always something that is difficult to ask about, but it's clear that you're investing in this. Is your area of investment looking more like it's adding additional features? Is it continuing to improve existing performance? Something else entirely? And of course, we also accept you can't tell me any of [laugh] that has a valid answer.Nipun: Well, we just got started, so we just had our first [GF 00:27:03] HeatWave in December, and you saw that earlier this week we had our second major release of HeatWave. We are just getting started, so absolutely we are investing a lot in this area. But we are pretty much going to attempt all the things that you said. We have feedback from existing customers which is very high up on the priority list. And some of these are just one, say, class of enhancements which [unintelligible 00:27:25], can HeatWave handle larger sizes of data? Absolutely, we have done that; we will continue doing that.Second is, can HeatWave accelerate more constructs or more queries? Absolutely, we will do that. And then you have other kinds of capabilities which customers are asking which you can think of are, like you know, bigger features, which for instance, we announced the support for scale-out data storage which improves recovery time. Well, you're going to improve the recovery time or you're going to improve the time it takes to restart the database. And when I say improve, we are talking about not an improvement of 2X or 3X, but it's 100 times improvement for, let's say, a 10 terabyte data size.And then we have a very good roadmap which, I mean, it's a little far out that I can't say too much about it, but we will be adding a lot of very good new capabilities which will differentiate HeatWave even more, compared to the competitive services.Corey: You have very clearly forgotten more about databases than most of us are ever going to know. As you've been talking to folks about HeatWave, what do you find is the most common misunderstanding that folks like me tend to come away with when we're discussing the technology? What is it that is, I guess, a nuance that is often being missed in the industry's perspective as they evaluate the new technology?Nipun: One aspect is that many times, people just think about a service to be here some open-source code or some on-premise code which is being hosted as a managed service. Sure, there's a lot of value to having a managed service, don't get me wrong, but when you have innovations, particularly when you have spent years in years or decades of innovation for something which is optimized for the cloud, you have an architectural advantage which is going to pay dividends to customers for years and years to come. So, there is no substitute for that; if you have designed something for the cloud, it is going to do much better whether it's in terms of performance, whether it's in terms of scalability, whether it's in terms of cost. So, that's what people have to realize that it takes time, it takes investment, but when we start getting the payoff, it's going to be fairly big. And people have to think that okay, how many technologies or services are out there which have made this kind of investment?So, what I'm really excited about is, MySQL is the most popular database amongst developers in the world; we spend a lot of time, a lot of person-years investing over the last, you know, decade, and now we are starting to see the dividends. And from what we have seen so far, the response has been terrific. I mean, it's been really, really good response, and we are very excited about it.Corey: I want to thank you for taking so much time to speak with me today. If people want to learn more, where can they go?Nipun: Thank you very much for the opportunity. If they would like to know more, they can go to oracle.com/heatwavewhere we have a lot of details, including a technical brief, including all the details of the performance numbers we talked about, including a link to the GitHub where they can download the scripts. And we encourage them to download the scripts, see that they're able to reproduce the results we said, and then try their workloads. And they can find information as to how they can get free credits to try the service for free on their own and make up their mind themselves.Corey: [laugh]. Kicking the tires on something is a good way to form an opinion about it, very often. Thank you so much for being so generous with your time. I appreciate it.Nipun: Thank you.Corey: Nipun Agarwal, Vice President of MySQL HeatWave and Advanced Development at Oracle. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an insulting comment formatted as a valid SQL query.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
La transformación digital sucede de manera más fluida cuando las empresas trabajan juntas. Y la integración entre SAP y Google Cloud es un claro ejemplo de ello. Es posible aprovechar las mejores funcionalidades que ofrece la nube, sin pérdidas ni interrupciones, y brindar mayor resiliencia a tu empresa. A fin de cuentas, para que los clientes de SAP logren triunfar en la digitalización, la velocidad de la innovación en sus operaciones se torna crítica, y la migración hacia la nube es el camino más recomendado para la modernización de los sistemas. En un nuevo episodio de Voces de la Nube, Manuel Guadarrama, Customer Engineer de Google Cloud, analiza la integración entre SAP y Google Cloud para infraestructuras inteligentes, para que tu empresa pueda aumentar la velocidad de innovación, la eficiencia operativa y la optimización de costos, con solo modernizar tus sistemas. Voces de la nube es el podcast oficial de Google Cloud para América Latina, donde debatiremos sobre la transformación digital y la transición a la nube cada quince días, con ejecutivos y expertos de nuestro equipo e invitados especiales. Consulta los enlaces mencionados en este episodio: SAP en Google Cloud: https://cloud.google.com/solutions/sap Guía de planificación de alta disponibilidad de SAP HANA: https://cloud.google.com/solutions/sap/docs/sap-hana-ha-planning-guide Guía de operaciones de SAP HANA: https://cloud.google.com/solutions/sap/docs/sap-hana-operations-guide Visión general de SAP en Google Cloud: https://cloud.google.com/solutions/sap/docs/overview-of-sap-on-google-cloud Más información sobre la administración de API de Apigee: https://cloud.google.com/apigee Más información sobre BigQuery: https://cloud.google.com/bigquery Exporta datos de sistemas SAP a Google BigQuery: https://cloud.google.com/solutions/sap/docs/bigquery-sap-export-using-sds?hl=es_419 ¿Tu empresa obtiene más valor cuando las aplicaciones de SAP se ejecutan en Google Cloud?: https://inthecloud.withgoogle.com/solving-for-innovation-21/sap_on_google_cloud_business_value_infographic_oct_twentytwenty_SPLA.pdf El impacto económico total del SAP: https://inthecloud.withgoogle.com/solving-for-innovation-21/SP_The_Total_Economic_Impact_Of_SAP_On_Google_Cloud.pdr Guía para CIO sobre la modernización de aplicaciones: https://inthecloud.withgoogle.com/solving-for-innovation-21/SPA_wp_cios_guide_to_application_migraton.pdf ¿Te gustó el episodio o tienes alguna sugerencia? Comunícate con nosotros por correo electrónico escribiendo a vocesdelanube@google.com
The Hashmap RTE team sits down with Hashmap On Tap host Kelly Kohlleffel to provide a sneak-peek into their recent 2021 Cloud Data Platform Benchmarking and Analysis where they spent 400+ hours comparing Snowflake, AWS Redshift, Azure Synapse, Google BigQuery, & Databricks using the industry-standard TPC-DC dataset measuring query performance, load rates, transformation speed, and overall cost across 33 dimensions to provide an unbiased perspective and see how they shape up. Listen in to find out how they achieved their in-depth technical analysis. Show Notes: 2021 Cloud Data Platform Benchmark Analysis Workshop: https://www.hashmapinc.com/snowflake-benchmarking On tap for today's episode: Matcha Green Tea, Yogi Blueberry Slim Life, Lipton Black Tea, & Chai Tea Contact Us: https://www.hashmapinc.com/reach-out
This week we try to make sense of Snowflake’s stance on open source and review the State of Serverless. Plus, some advice on parking cars in Amsterdam. Rundown Striking a balance with ‘open’ at Snowflake (https://www.infoworld.com/article/3617938/striking-a-balance-with-open-at-snowflake.html) Matt Asay’s Take on Snowflake (https://twitter.com/mjasay/status/1395809597806366720?s=21) Data Warehouse Wars: Snowflake Vs. Google BigQuery (https://seekingalpha.com/article/4429909-data-warehouse-wars-snowflake-vs-google-bigquery) The State of Serverless (https://www.datadoghq.com/state-of-serverless/) Relevant to your interests Tracking the San Francisco Tech Exodus (https://sfciti.org/sf-tech-exodus/) SolarWinds CEO reveals much earlier hack timeline, regrets company blaming intern (https://www.cyberscoop.com/solarwinds-ceo-reveals-much-earlier-hack-timeline-regrets-company-blaming-intern/) U.S. Treasury calls for stricter cryptocurrency compliance with IRS, says they pose tax evasion risk (https://www.cnbc.com/2021/05/20/us-treasury-calls-for-stricter-cryptocurrency-compliance-with-irs.html) The Full Story of the Stunning RSA Hack Can Finally Be Told (https://www.wired.com/story/the-full-story-of-the-stunning-rsa-hack-can-finally-be-told/) Coinbase suffers outages as cryptos plummet in massive sell-off (https://nypost.com/2021/05/19/coinbase-suffers-outages-as-cryptos-plummet-in-massive-sell-off/) Snapchat's partner summit this year was done entirely in augmented reality (https://www.axios.com/snapchat-ar-spectacles-partner-summit-8bb68470-aa48-402f-a531-aa6c5a78b17b.html) Snap says it now has 500 million monthly active users (https://www.axios.com/snapchat-500-users-developer-tools-92932ae6-a26c-445f-87b6-92775a454f71.html?utm_source=newsletter&utm_medium=email&utm_campaign=newsletter_axioslogin&stream=top) Snap buys WaveOptics, a company that makes parts for augmented reality glasses, in $500 million deal (https://www.cnbc.com/2021/05/21/snap-buys-augmented-reality-company-waveoptics-in-500-million-deal.html) Azure services fall over in Europe, Microsoft works on fix (https://www.theregister.com/2021/05/20/microsoft_azure_outage/) Twitter previews Ticketed Spaces, says it’ll take a 20 percent cut of sales (https://www.theverge.com/2021/5/21/22447328/twitter-ticketed-spaces-monetization-stripe-approval) Oracle insiders say there is a ‘culture of fear’ under the leadership of its key cloud unit (https://nullednow.in/trending/oracle-insiders-say-there-is-a-culture-of-fear-under-the-leadership-of-its-key-cloud-unit/2021/) It took 'over 80 different developers' to review and fix 'mess' made by students who sneaked bad code into Linux (https://www.theregister.com/2021/05/21/linux_5_13_patches/) The Unstoppable Battery Onslaught (https://caseyhandmer.wordpress.com/2021/05/20/the-unstoppable-battery-cavalcade/) Netflix Reportedly Wants To Get Into The Video Games Industry (https://www.forbes.com/sites/carlypage/2021/05/23/netflix-reportedly-wants-to-get-into-the-video-games-industry/) Auto Makers Retreat From 50 Years of ‘Just in Time’ Manufacturing (https://www.wsj.com/articles/auto-makers-retreat-from-50-years-of-just-in-time-manufacturing-11620051251?mod=searchresults_pos1&page=1) Overwork Killed More Than 745,000 People In A Year, WHO Study Finds (https://www.npr.org/2021/05/17/997462169/thousands-of-people-are-dying-from-working-long-hours-a-new-who-study-finds) Oracle intros Arm-powered cloud, includes on-prem option for big spenders (https://www.theregister.com/2021/05/25/oracle_ampere_cloud/) Twilio invests in adaptive communications platform Hyro (https://techcrunch.com/2021/05/25/twilio-invests-in-adaptive-communications-platform-hyro/) Microsoft Build 2021 Book of News (https://news.microsoft.com/build-2021-book-of-news/) Announcing General Availability of Microsoft Build of OpenJDK (https://devblogs.microsoft.com/java/announcing-general-availability-of-microsoft-build-of-openjdk/) Microsoft has built an AI-powered autocomplete for code using GPT-3 (https://www.theverge.com/2021/5/25/22451144/microsoft-gpt-3-openai-coding-autocomplete-powerapps-power-fx) Green Software Foundation (https://greensoftware.foundation/) The 17 Ways to Run Containers on AWS - Last Week in AWS (https://www.lastweekinaws.com/blog/the-17-ways-to-run-containers-on-aws/) Please fix the AWS Free Tier before somebody gets hurt (https://cloudirregular.substack.com/p/please-fix-the-aws-free-tier-before?ck_subscriber_id=512830314) Colonial Pipeline Ransomware Attack: CISOs React (https://www.bankinfosecurity.com/colonial-pipeline-ransomware-attack-cisos-react-a-16698) Privacy on iPhone | Tracked | Apple (https://www.youtube.com/watch?v=8w4qPUSG17Y) Happy Blurpthday to Discord, a Place for Everything You Can Imagine (https://blog.discord.com/happy-blurpthday-to-discord-a-place-for-everything-you-can-imagine-fc99ee0a77c0) Nonsense The Ford F-150 Lightning Is the Electric Vehicle of Dystopia (https://www.wired.com/story/ford-lightning-f150-electric-vehicle-dystopia/) Apple cofounder Steve Wozniak has an unusual approach to his finance (https://twitter.com/robaeprice/status/1396483218954543113?s=21) Twitch launches a dedicated "hot tubs" category after advertiser pushback (https://www.theverge.com/2021/5/21/22447898/twitch-hot-tub-category-launches-amouranth-advertising) Nonstop flights to Amsterdam back on Austin's horizon (https://www.bizjournals.com/austin/news/2021/05/24/nonstop-flights-to-amsterdam-rescheduled-for-2022.html) Sponsors CBT Nuggets — Training available for IT Pros anytime, anywhere. Start your 7-day Free Trial today at cbtnuggets.com/sdt (https://cbtnuggets.com/sdt) strongDM — Manage and audit remote access to infrastructure. Start your free 14-day trial today at: strongdm.com/SDT (http://strongdm.com/SDT) Jobs Work with Coté (https://twitter.com/tiffanyfayj/status/1397339215021547521) Conferences RabbitMQ Summit (https://rabbitmqsummit.com), July 13-14, 2021. SpringOne (https://springone.io), Sep 1st to 2nd. June 3rd modernization webinar for EMEA (https://twitter.com/cote/status/1394655403468804105) THAT July 26,- July 29, in Wisconsin, Submissions Open Through (https://that.us/activities/call-for-counselors/wi/2021) June 14 (https://that.us/activities/call-for-counselors/wi/2021) SDT news & hype Join us in Slack (http://www.softwaredefinedtalk.com/slack). Send your postal address to stickers@softwaredefinedtalk.com (mailto:stickers@softwaredefinedtalk.com) and we will send you free laptop stickers! Follow us on Twitch (https://www.twitch.tv/sdtpodcast), Twitter (https://twitter.com/softwaredeftalk), Instagram (https://www.instagram.com/softwaredefinedtalk/) and LinkedIn (https://www.linkedin.com/company/software-defined-talk/). Brandon built the Quick Concall iPhone App (https://itunes.apple.com/us/app/quick-concall/id1399948033?mt=8) and he wants you to buy it for $0.99. Use the code SDT to get $20 off Coté’s book, (https://leanpub.com/digitalwtf/c/sdt) Digital WTF (https://leanpub.com/digitalwtf/c/sdt), so $5 total. Become a sponsor of Software Defined Talk (https://www.softwaredefinedtalk.com/ads)! Recommendations Brandon: Last Breath | Netflix (https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&ved=2ahUKEwj5q_Tm9ufwAhVFG80KHUbWCZUQFjAKegQIDBAD&url=https%3A%2F%2Fwww.netflix.com%2Ftitle%2F80215139&usg=AOvVaw2apUD-Dg8ldP5UFF1a7D5H) Coté: American Gods (https://en.wikipedia.org/wiki/American_Gods_(season_3)), season 3 (https://en.wikipedia.org/wiki/American_Gods_(season_3)) laylacodesit (https://www.twitch.tv/laylacodesit)’s Twitch Channel The Leprechauns of Software Engineering (https://leanpub.com/leprechauns) Photo Credit (https://unsplash.com/photos/vMneecAwo34) Photo Credit (https://unsplash.com/photos/44EOhICreKo)
How to Power Enterprises with Intelligent Applications with Jordan Tigani of SingleStore * Jordan Tigani is the Chief Product Officer at SingleStore. He was the co-founding engineer on Google BigQuery. He also led engineering teams then product teams at BQ. * SingleStore powers Comcast with their streaming analytics to drive proactive care and real-time recommendations for their 300K events per second. Since switching to SingleStore, Nucleus Security converted its first beta account to a paying customer, increased the number of scans Nucleus can process in one hour by 60X, and saw speed improvement of 20X for the slowest queries. * To be more competitive in our new normal, organizations must make real-time data-driven decisions. And to create a better customer experience and better business outcomes, data needs to tell customers and users what is happening right now. * With the pandemic accelerating digitization, and new database companies going public (Snowflake) and filing IPOs (Couchbase), the database industry will continue to grow exponentially, with new advanced computing technologies emerging over the next decade. Companies will begin looking for infrastructure that can give real-time analytics -- they can no longer afford to use technology that cannot handle the onslaught of data brought by the pandemic. * True Digital in Thailand utilizes SingleStore’s in-the-moment analytics to develop heat maps around geographies with large COVID-19 infection rates to see where people are congregating, pointing out areas to be avoided, and ultimately, flattening the curve of COVID-19. In two weeks’ time, SingleStore built a solution that could perform event stream processing on 500K anonymized location events every second for 30M+ mobile phones. * Businesses need to prioritize in-app analytics: This will allow you to influence customer's behaviors within your application or outside of it based on data. Additionally, businesses must utilize a unified database that supports transactions and analytics to deliver greater value to customers and business. * Enterprises must access technology that can handle different types of workloads, datasets and modernize infrastructure, and use real-time analytics. Shownotes Links: - https://www.linkedin.com/in/jordantigani - https://twitter.com/jrdntgn - www.SingleStore.com ( http://www.singlestore.com ) - https://www.linkedin.com/company/singlestore/ - https://www.singlestore.com/media-hub/releases/research-highlights-spike-in-data-demands-amid-pandemic/ - https://www.singlestore.com/media-hub/releases/businesses-reconsidering-existing-data-platforms/ *About HumAIn Podcast* The HumAIn Podcast is a leading artificial intelligence podcast that explores the topics of AI, data science, future of work, and developer education for technologists. Whether you are an Executive, data scientist, software engineer, product manager, or student-in-training, HumAIn connects you with industry thought leaders on the technology trends that are relevant and practical. HumAIn is a leading data science podcast where frequently discussed topics include ai trends, ai for all, computer vision, natural language processing, machine learning, data science, and reskilling and upskilling for developers. Episodes focus on new technology, startups, and Human Centered AI in the Fourth Industrial Revolution. HumAIn is the channel to release new AI products, discuss technology trends, and augment human performance. Advertising Inquiries: https://redcircle.com/brands Privacy & Opt-Out: https://redcircle.com/privacy
On The Cloud Pod this week, the team is feeling nostalgic and a little nerdy, as you can see from the show title — a throwback to Serial Console and its ability to add a ton of characters when you didn't want it to. A big thanks to this week's sponsors: Foghorn Consulting, which provides full-stack cloud solutions with a focus on strategy, planning and execution for enterprises seeking to take advantage of the transformative capabilities of AWS, Google Cloud, and Azure. This week's highlights Amazon should be singing a different tune. Google has astonished us all by actually sharing something interesting. Azure is the strict school principal that just canceled lunch. General News: Justin Said It First VentureBeat predicts industry clouds could be the next big thing. Justin will take the royalties check anytime, VentureBeat. Amazon Web Services: Please Don't Keep It To Yourself Red Hat OpenShift Service on AWS is now generally available. Surprising because we don't remember it going into beta. AWS Distro for OpenTelemetry adds StatsD and Java support. We're glad to see the continued investment in OpenTelemetry. AWS DevOps Monitoring Dashboard solution is now generally available. The solutions library is a Rube Goldberg machine. Amazon Lookout for Metrics is now generally available — perfect for Ryan, who has no machine learning experience. Amazon Elasticsearch Service announces a new Auto-Tune feature for improved performance and application availability. We wish Amazon would open source this. AWS SSO credential profile support is now available in the AWS Toolkit for VS Code. Thank you, Jesus. Amazon is developing a chip to power the hardware switches that shuttle data around networks. Apparently Google and Apple are also doing this. Troubleshoot boot and networking issues with new EC2 Serial Console. Must be useful for someone, maybe the people using enclaves? Google Cloud Platform: Blame Active Directory Google wants customers to rethink their cloud migration strategy. Actually quite an interesting blog post — no, this is not sarcasm! Google BigQuery was named a leader in the 2021 Forrester Wave: Cloud Data Warehouse. We actually agree with this; it really is a great product. Google announces Cloud SQL for SQL Server now comes with Active Directory authentication. Helpful only if you are on GCP Active Directory. Azure: Pay To Play Azure has released several new compliance management capabilities to the Azure Security Center. We think this is really, really cool. Microsoft named a leader in the 2021 Forrester Wave: Function-as-a-Service Platforms. Congratulations to Microsoft. TCP Lightning Round Justin cuts through the awkward silence and takes this week's point, leaving scores at Justin (4), Ryan (3), Jonathan (5). Other headlines mentioned: Backup for Azure Managed Disk is now generally available Amazon EKS now supports Elastic Fabric Adapter Amazon WorkDocs offers additional sharing controls throughout its Android app Amazon SageMaker now supports private Docker registry authentication Amazon API Gateway now provides IAM condition keys for governing endpoint, authorization and logging configurations Amazon Timestream now supports Amazon VPC endpoints Create forecasting systems faster with automated workflows and notifications in Amazon Forecast AWS Config adds pagination support for advanced queries that contain aggregate functions AWS WAF adds support for Request Header Insertion Amazon DocumentDB (with MongoDB compatibility) now supports Event Subscriptions Announcing AWS Step Functions' integration with Amazon EMR on EKS Amazon EMR now supports Amazon EC2 Instance Metadata Service v2 AWS Security Hub integrates with Amazon Macie to automatically ingest sensitive data findings for improved centralized security posture management Amazon SageMaker Autopilot adds Model Explainability Things Coming Up Public Sector Summit Online — April 15–16 Discover cloud storage solutions at Azure Storage Day — April 29 AWS Regional Summits — May 10–19 AWS Summit Online Americas — May 12–13 Microsoft Build — May 19–21 (Digital) Google Financial Services Summit — May 27th Harness Unscripted Conference — June 16–17 Google Cloud Next — Not announced yet (one site says Moscone is reserved June 28–30) Google Cloud Next 2021 — October 12–14, 2021 AWS re:Invent — November 29–December 3 — Las Vegas Oracle Open World (no details yet)
Willem Pienaar is the co-creator of Feast, the leading open source feature store, which he leads the development of as a tech lead at Tecton. Previously, he led the ML platform team at Gojek, a super-app in Southeast Asia. Learn more: https://twitter.com/willpienaar (https://twitter.com/willpienaar) https://feast.dev/ (https://feast.dev/) Every Thursday I send out the most useful things I've learned, curated specifically for the busy machine learning engineer. Sign up here: https://www.cyou.ai/newsletter (https://www.cyou.ai/newsletter) Follow Charlie on Twitter: https://twitter.com/CharlieYouAI (https://twitter.com/CharlieYouAI) Subscribe to ML Engineered: https://mlengineered.com/listen (https://mlengineered.com/listen) Comments? Questions? Submit them here: http://bit.ly/mle-survey (http://bit.ly/mle-survey) Take the Giving What We Can Pledge: https://www.givingwhatwecan.org/ (https://www.givingwhatwecan.org/) Timestamps: 02:15 How Willem got started in computer science 03:40 Paying for college by starting an ISP 05:25 Willem's experience creating Gojek's ML platform 21:45 Issues faced that led to the creation of Feast 26:45 Lessons learned building Feast 33:45 Integrating Feast with data quality monitoring tools 40:10 What it looks like for a team to adopt Feast 44:20 Feast's current integrations and future roadmap 46:05 How a data scientist would use Feast when creating a model 49:40 How the feature store pattern handles DAGs of models 52:00 Priorities for a startup's data infrastructure 55:00 Integrating with Amundsen, Lyft's data catalog 57:15 The evolution of data and MLOps tool standards for interoperability 01:01:35 Other tools in the modern data stack 01:04:30 The interplay between open and closed source offerings Links: https://github.com/feast-dev/feast (Feast's Github) https://blog.gojekengineering.com/data-science/home (Gojek Data Science Blog) https://www.getdbt.com/ (Data Build Tool (DBT)) https://www.tensorflow.org/tfx/data_validation/get_started (Tensorflow Data Validation (TFDV)) https://feast.dev/post/a-state-of-feast/ (A State of Feast) https://cloud.google.com/bigquery (Google BigQuery) https://www.amundsen.io/ (Lyft Amundsen) https://www.cortex.dev/ (Cortex) https://www.kubeflow.org/ (Kubeflow) https://mlflow.org/ (MLFlow)
On this week's episode of Location Matters we dive into the world of location intelligence in marketing with Marketing Manager (and usual podcast host), Sarah Butler and CARTO team lead at Liveli, Dion Fleming. We discuss how marketers are using location intelligence platforms and spatial data to help glean insights into buying behaviour of their customers and, in particular, look into a webinar Liveli recently hosted, alongside Google Cloud partner, Data Runs Deep and experiential gift company, Red Balloon. In this episode we go a bit deeper into the sorts of data marketers collect, how this can be supplemented with third party data streams, how we can ingest the data into Google BigQuery and then visualise this information using CARTO. We also look at how this location data can dictate where we allocate advertising spend and how, in a world of Covid-19, we can use this information to make business decisions. On the Location Matters podcast, we cover all topics geospatial, whether that's new technologies, partnerships or inspiring people in the industry. If you want to be updated on when the latest Location Matters podcast episodes come out - please hit 'subscribe' on Apple Podcasts, 'follow' on Spotify or Stitcher. Liveli - https://liveli.com.au/ Webinar: Understand your product demand: Your Google Analytics data visualised with CARTO and BigQuery - https://content.liveli.com.au/drdcartowebinar CARTO 5 Newsletter - https://go.carto.com/subscribe-to-the-carto-5 Blog: Google Analytics spatial data visualised with BigQuery: https://newsroom.liveli.com.au/google-analytics-spatial-data-visualized-with-bigquery You can follow Liveli on Facebook, LinkedIn and Twitter.
The MapScaping Podcast - GIS, Geospatial, Remote Sensing, earth observation and digital geography
What BigQuery GIS is and why Google does not talk about big data anymore Get More Involved Leave A Review on iTunes: https://podcasts.apple.com/us/podcast/the-mapscaping-podcast/id1452297085 Join the email list and I will send you the show notes from this episodes https://mapscaping.com/podcast Happy to connect with you on LinkedIn https://www.linkedin.com/in/
En estas semanas se está llevando a cabo el evento de presentación de nuevos servicios de Google, en esta ocasión de manera completamente virtual. Uno de los productos más interesantes hasta lo presentado ahora es BigQuery Omni. Esta solución trabaja sobre Google Anthos, que revisamos en un episodio anterior. La idea central de BigQuery Omni es procesar la información en la fuente para evitar el traslado de datos de un proveedor de nube a otro. Ya que uno de los mayores costos de procesar información en ambientes multicloud es la salida de datos. Es importante destacar que en los últimos tiempos parece que la estrategia de Google es convertirse en un integrador de servicios Multi-cloud, proponiendo soluciones que no necesariamente deben estar hospedadas en su propia nube, al contrario de AWS y Azure que promueven sus propios servicios. Si gustan revisar este servicio, les dejó el enlace en la notas del episodio: https://cloud.withgoogle.com/next/sf/onair Mi correo: — jonatan@simplementenube.com Para más información puedes visitar: — https://academiadenube.com
Mark Rittman is joined by special guests Drew Banin, co-founder of Fishtown Analytics and maintainer of dbt (data build tool) and Stewart Bryson, long-time friend of the show and CEO/Co-founder of Red Pill Analytics to talk about scaling modern data stack projects from startups to the enterprise; how do you deal with data quality issues when there's no central record of customers, how do we introduce data governance, enterprise requirements and meet the needs of enterprise architects and how do we scale concepts such as agile and analytics as engineering beyond our initial champion and data team?The dbt ViewpointFishtown Analytics and Drew BaninMulti-Channel Marketing Attribution using Segment, Google BigQuery, dbt and LookerGetting Started with dbt.Red Pill Analytics
Mark Rittman is joined by special guests Drew Banin, co-founder of Fishtown Analytics and maintainer of dbt (data build tool) and Stewart Bryson, long-time friend of the show and CEO/Co-founder of Red Pill Analytics to talk about scaling modern data stack projects from startups to the enterprise; how do you deal with data quality issues when there's no central record of customers, how do we introduce data governance, enterprise requirements and meet the needs of enterprise architects and how do we scale concepts such as agile and analytics as engineering beyond our initial champion and data team?The dbt ViewpointFishtown Analytics and Drew BaninMulti-Channel Marketing Attribution using Segment, Google BigQuery, dbt and LookerGetting Started with dbt.Red Pill Analytics
Big data sounds great, but how can marketers extract insights and put together reports without spending all of their time crunching numbers? This week on The Inbound Success Podcast, Anna Shutko of Supermetrics talks about how marketers today are dealing with data. From juggling data from 5+ sources, to wrangling spreadsheets and figuring out how to continuously monitor your data pipeline, Anna shares how Supermetrics clients are taking on these challenges while saving themselves considerable time - and how you can, too. Highlights from my conversation with Anna include: Supermetrics is a marketing automation tool that transfers data from a variety of sources to the marketer's destination of choice. In addition, Supermetrics offers data warehousing through Supermetrics for BigQuery. Supermetrics' goal is to make marketers' lives better and easier so they can focus on what actually matters. Anna says that marketers today need to be technologists who know their business, know their platforms, know at which stage of the funnel they want to use the platforms, and know how to use data from all those platforms together to create a comprehensive narrative from their data. According to Anna, the best KPI for any marketer is revenue. If revenue is growing, then marketing is doing its job. One of Supermetrics' customers was able to cut the time they spend on reporting down from three to four days a week to a few hours. With a platform like Supermetrics, which allows you to continuously keep your data updated in real time, you can simply check the data once a day, knowing that its up to date, and then go about your business. You can also simply provide your stakeholders (ex. board) with a link to view your data at their convenience. Anna says that the biggest mistake marketers make is to focus on vanity metrics like impressions. Resources from this episode: Marketing Technology Landscape Supergraphic Supermetrics Reporting Template Gallery Supermetrics Customer Success Stories Sleeping Giant Media Success Story Supermetrics HubSpot connector Supermetrics for BigQuery Inbound Success Podcast episode 111 with Jake Neill This Won't Scale playbook by Drift SaaStr Podcast for all things SaaS The Growth Hub Podcast for marketing topics Julian Shapiro's guides Listen to the podcast to learn more about how marketers are cutting their time spent on reporting using Supermetrics. Transcript Kathleen Booth (Host): Welcome back to the Inbound Success Podcast. I'm your host, Kathleen Booth. And Today my guest is Anna Shutko, who is a product marketing manager with Supermetrics. Welcome, Anna. Anna: Hey, Kathleen, and thank you so much for having me on the show. It's such a pleasure to be here. Kathleen: Yeah. And I think you might actually qualify as my guest, one of the guests who is coming from the furthest away because you are in Finland right now. Correct? Anna: Correct. Yes, we are based in Helsinki, Finland. And yeah, so originally from Russia, and I moved to Finland and I've been living here for about seven years now. Kathleen: All right, and how -- just because the weather is changing here, so I'm currently kind of obsessed with weather -- how cold is it where you are? Anna: Basically, it's plus seven degrees Celsius. I'm sorry, I don't know what it's like in Fahrenheit. Kathleen: Cold, cold. I know that's cold. Anna: Kind of cold yeah. It usually drops to minus 20. So it's- Kathleen: Oh my gosh. I don't know how you do it, I would not survive in that climate. Well, it is getting colder here and the seasons are changing. But I'm so excited to have you on and to pick your brain because we're going to talk a little bit about analytics, which is something that's very near and dear to my heart. But it's one of those topics I think people talk a lot about, but they don't get very specific on and so I am actually really excited to get specific with you. Anna: Yes please. About Anna and Supermetrics Kathleen: So before we dive into this, though, can you just talk a little bit about, first of all, yourself and what you do and also what Supermetrics does? Anna: Yeah, sure. So I'm Anna Shutko and I've been working in Supermetrics for three years now. So I am one of the first employees of the company, I joined as employee number seven in 2016. And since then we've had a really, really rapid growth. So it's indeed an exciting journey. And I'm still continuing as you can imagine, the company is not the same as it was, not the same at all. Now we're hitting 70 like headcount. So it's been quite a wild ride. And I started as a marketing generalist, because as you can imagine, we're a team of seven, and everybody was doing everything, I was the second employee on the marketing team. And as the company grew I realized that product that's Supermetrics does is my passion and I want to devote more and more time to it. Now as we are hiring more people, I'm actually able to concentrate in product more and more as we go so I'm very excited about it. And in the future, I will be leading integrations marketing, which means, and I will explain everything how Supermetrics works and what integrations are in a minute, like integrations as their own stream as their own branch of marketing, so to say, so yeah, pretty excited about it. And like I mentioned, I fell in love with the product from day one. I remember how I was applying to Supermetrics, and I opened the website, and I saw this amazing product in the website was look really, really bad, but the idea was there. And yeah, since then, we changed the website and we added many more new and far more amazing products but I'm continuously in love with the company and products that we do so this is where my passion as a product marketing comes from. Kathleen: I have to just say, as a marketer, I have to laugh when I hear you say that you came in and you had a bad website because this -- I have experienced that in my career. And I never know whether to be excited or sad, because sad that you're coming in and the website stinks but excited that you get to come in and like change it and immediately show such big results of your marketing efforts. Like a website redesign is an awesome opportunity to just make a huge impact on a company's marketing so there's great opportunity there as a marketer, but it's also like "aargh." Anna: Yeah, I totally feel you on that we had a huge redesign project, but actually now the website really matches the company's identity of the company's products and shows how amazing they are. So I would prefer to see it as an opportunity. Kathleen: Yeah, you guys have a great website. So if you're listening and you have not checked out the Supermetrics site, definitely take a look at it. It's really well done and very cohesive from a visual branding standpoint. I've always liked your site. Anna: Thank you so much. Yeah, so a couple of words on what is Supermetrics and what do we actually do in this little red box. So, Supermetrics is a marketing automation tool and we started by developing a tool, which transfers data from different data sources, or as we call them "Integrations", those things, which transfer data from different APIs to different data destinations. So we transfer data from platforms like Google Ads, Google Analytics, Facebook ads, Twitter Ads, and now new ones for example Quora Ads, name it to spreadsheet tools and we started from transferring all this data to Excel then we move into G Suite. So next product was Supermetrics for Google Sheets aka transferring data from now it's 50 plus sources to Google Sheets. Then as Data Studio got rolled out, we partnered up with Google and we're actually the first ones to develop Connectors, which work entirely in Google Data Studios UI. So transferring all these different data to Google Data Studio. And now we enter the data warehousing space with our newest product Supermetrics for BigQuery and this is a completely new product game changer. So marketers can take advantage of BigQuery and store a lot of historical data there without necessarily learning how to code, really like hardcore, so everything is pretty intuitive. You can set transfers, and then visualize the data in big powerhouses that we're calling Tableau, Power BI for example. So that's the evolution of Supermetrics. In short, I love to describe it as a data pipeline, just easy to imagine, right, pipeline, we transfer data as if it's like water, for example, to all those different data destinations, and keep the work flowing. So previously, without Supermetrics marketers had to copy, paste, or download CSVs. So imagine, if you need a report for your client tomorrow, you have to go to every single platform like Facebook ads, Google Ads like I mentioned, ecetera, copy, paste, or then download all those different CSVs and compile them into one file. Edit every single data type and make sense out of the data and it was nightmare. I cannot even imagine how people did it without Supermetrics before. So we basically automate the whole thing so there is a really smooth sidebar or engage Google Data Studio there is this selection tool where you can very easily connect to all the sources you need. And you can select, which data do I want. For example, I want clicks from yesterday's clicked by campaigning for example, I want Facebook ads campaigns. And boom, this data just appears in your spreadsheet. It's really easy. I think it's the easiest if you watch the video, and I will add all the links to the video. So then people can pause the podcast, follow along or check our site out if they want to. So yeah, you will just really see how easy it is to create a marketing report and our motto, so our idea is to make marketers' life better and easier so they can focus on what actually matters like talking to the client, analyzing this data, spotting trends, sharing this report with their colleagues. If it's a collaborative tool, like Data Studio, it's super easy to do. And because we're a data pipeline, it gives us this flexibility. So we don't really have a fixed data destination where we transfer everything. People already know how to use Excel, so they can just transfer their data there and just go ahead and continue their work. So that's who we are. How marketers are taking on big data Kathleen: I love that. This whole topic is so interesting to me, and I was just having this conversation with somebody the other day, because my company is also in the data space, but we just happened to be in cyber security but there's a similar problem with marketing and with cyber security, namely that, there's all this sort of excitement around the availability of big data. And data is wonderful but what winds up happening I think, a lot of the time is there's a lot of noise and not a lot of signal. And meaning there's a ton of data, but you don't necessarily need to look at all data, right? You need to get to the data that matters the most. And the most important thing isn't the data itself it's the insights you source from it. And so, I would love to just kind of get your thoughts on especially for marketers. Do you see marketers successfully dealing with that challenge right now and how do they do that? It is such a big, hairy kind of area of I could be measuring all the things and tracking all the things. I guess this is like 10 questions in one I want to ask you so many things, like what are the most important metrics? How are marketers winnowing it down to what matters the most? Like, you guys work with a lot of companies, how many exactly is it? Anna: So yeah, indeed we do and I think I already previously mentioned to you, so it's 400k, 400,000 people who've tried or are using Supermetrics across all the different products, so huge numbers. Kathleen: That's interesting, it must give you some pretty fascinating insight into what information marketers are tracking and what they're looking at and what sources they're drawing data from. So let's start out actually by a lot of the people who are listening to this podcast, a lot of them tend to be practicing marketers and they're senior enough that they deal with strategy, but they're also kind of deep in the weeds with some tactical execution. And if somebody is listening and thinking I need to set up a reporting framework and I need to decide what are the most important KPIs to track? Can you share a little bit of, through what you see in the platform, like, what are those top KPIs that you tend to see marketers looking at? Anna: Yeah, so of course every single marketing reporting framework is unique and it depends on the company, there is no right or wrong, there is no one framework or one approach I could share and then everybody would apply it and then I would be in a very happy place. I wish that would be possible. But it's an art, it's science and everybody has to use their own judgment. Of course, I can pinpoint some things for example, nowadays you're completely right -- marketing is becoming more and more and more data driven. And marketing is actually becoming more and more technical. So there was this one chart I love referring to which is called the MarTech 5000. Not sure if you've heard of it. And it just shows on a larger scale, how the MarTech space has transformed over the years. So in 2011, there were something about, if I remember correctly, 150 solutions. And right now there are over 7,000 solutions. So imagine all those platforms and every single marketer is using maybe in their own platform, or some unique custom setups in the same HubSpot or Salesforce in the same platform everybody's using. So like I mentioned, is becoming more data driven, it's becoming more unique and is becoming increasingly complex. And what I see is that the profession is changing so we're not just more curious anymore, we have to be marketing technologists to successfully implement all those strategies. So knowing the platform and knowing at which stages of your funnel, you should use a particular platform, maybe it's a new platform, like Quora Ads for example. And it's an entirely new set of metrics because the nature of platform is different. You also have to take that into consideration. So basically to sum it up, knowing your business, knowing the platform, knowing at which stage of the funnel you want to use this platform, and knowing -- and this is where Supermetrics comes into play very nicely -- how we can use data from all those platforms together to create a comprehensive narrative from your data. Say you want to use, for example, Search Ads as top of the funnel, this is what we see commonly happening, people using Search Ads, maybe display ads to attract attention so they will be metrics like impressions, to impact your further questions like impressions clicks, in a way micro conversions or conversions as in their positioning to the website or going into down the funnel. Then in the bottom of the funnel, people are already more familiar with the company. So there can be many different other platforms coming into play that continue handling data so they can go on the website track. So then there is Google Analytics. They continue with another platform. Quora Ads again is a very good example because there you can have different targeting levels and you can target different questions now that people have already got their food for thought about your company. And in the end, you can, again, hammer them with more maybe brand-related content now that they're already familiar with your brand and then lead them gradually to closure. And again, this is where understanding of the product comes in handy. I will give our own Supermetrics example. So we have Supermetrics templates, basically, those are free to use files, which people can use and they work with our Connectors. So it works like this, you get this file, you click three buttons, and it all happens in Data Studio UI or, for example, Google Sheets UI and this is gets populated with your data as you use Supermetrics Connectors. But the trick is that you have to use Supermetrics Connectors to automate this dashboard. Of course, you can put your own numbers and the formulas would work, there is no problem with that you can also use it manually. But the beauty of those templates is to use them in an automated manner. So by knowing that those templates, activate trials, again, if we talking about SaaS, you know that in the bottom of your funnel, you can put this specific lead magnet, like in our case, this is the Landscape, there can be some our tool and then usually tracking through Custom Code or through Google Analytics, how those things convert and then afterwards I think that at this point, people start using more and more complicated platforms to track this post-purchase journey to accurately predict what kind of people convert? How do those people behave? And are there any like rookie purchases? So this is, again, where HubSpot comes in very handy. The platform has expanded a lot. Or Salesforce, then you can connect this data from Salesforce to top of the funnel, or middle of the funnel content data and then see how people who click on your ads and search literally through the whole journey have converted and what kind of people are there and based on that data, then you craft an improved marketing journey. Now that was a really long explanation but yeah, just hope to get the general idea out there so that you should know the business you're in. You should know the tool, you should know how to use those tools together, how to use this data together. And yeah, just focus on metrics like ROI that's my personal belief because marketing cannot function separately or completely separately from overall business, it has to bring results, it has to bring insights. So I think revenue is a very solid indicator of whether something working or not working, and in our case, this will be ROIs. Marketing tool sprawl Kathleen: Yeah, that makes sense and you touched upon something I wanted to ask further about, which is you have to know your platforms and I think you said you need to be a technologist these days, which I think is really so accurate. There are so many different platforms and you can't just be a strategist anymore you have to know how to get in and make these software tools sing for you, because that's where a lot of the value gets unlocked. Do you have a sense? Well, let me back up how many different data sources or platforms does Supermetrics integrate with right now? Anna: It depends on the data destination. So for example, for BigQuery, it's far more complex to add a data source, so we have less of them there. But I would say that more than 50 if we don't count those in detail, or like early access, fully integrated, fully developed platforms, there are around 50 and I have to say that our engineers did a great job because not only do we provide the basic of I call them the basic metrics for some platforms like HubSpot, for example, or Adobe analytics, we also provide the Custom metrics. So if people have created their own metrics, they are also able to fish them out with our tool and like visualize them. Kathleen: So there's about 50 different fully integrated platforms and plenty more kind of in development. Do you have any sense from the way that you all have seen customers using Supermetrics of, on average, how many different sources the typical marketer is pulling in? I'm just curious. Anna: Yeah, of course, I will give you a very, very rough number because there is no generalization to be made. Some people prefer to use one platform very heavily others prefer to use a bundle. But I would say that around maybe like five would be something like an accurate number. Kathleen: Yeah, it's so interesting, because just from my own experience even in small organizations, like, my company is small and in early stage, hopefully will be very big in a year. But, we still, I feel like we have a lot of different platforms. We have marketing automation, we have our website, we have Google Analytics, we have our CRM, like our video marketing platform, our SEO add-ons, there's just so much and pulling it all together is a little bit of a nightmare. And I imagine without a tool like this is super time consuming, and I think that that's probably one of the biggest pain points marketers have, is the amount of time they spend on reporting. Like you said, you work with a lot of different companies I know you and I talked and you have some examples of companies that have used the platform and some stories about how it's helped them save time. Can you maybe share some of that with us? Supermetrics customer stories Anna: Yeah, definitely, and I love sharing those stories because the clients are amazing and some of them have been with us through like absolutely everything. So they started using Grabber, which is now our legacy product so the tool pulls data into Excel. And now they want to try or are already trying Supermetrics for BigQuery you can imagine some of them have used all five of our products, so definitely an evolution there. But coming back to your question one of my favorite client success stories is Sleeping Giant Media. These guys- Kathleen: It's a great name, side note, I just like the company name. Anna: Yeah, they're great and the people they're amazing. So the team is based in Britain, and they've been using Supermetrics like I mentioned for a while. They started with Supermetrics for Google Sheets and now they're looking into Supermetrics for BigQuery. So Sam, big shout out to Sam is our one big Supermetrics fan and he even talked about us at Brighton SEO, which was just amazing we never asked him to but he just went out there and spoke about us. It was really heartwarming. So he told a story that they used to spend around three to four days just on marketing reporting, aka copy, pasting numbers, collecting- Kathleen: Three to four days a month, right? Anna: Three to four days a week. Kathleen: Ah, oh my goodness. Anna: Imagine well, I guess they were not doing it exactly like every week, but maybe like every other week let's say. They are a fully functioning marketing agency providing a wide range of services. So he would get in Monday morning and start collecting data and then they're emailing all the cc's. By Wednesday evening, he would finish all reporting for one maybe two clients, depending on the scope of the project, of course. And then he had Thursday and Friday. So Thursday the client meeting to discuss how campaigns are going, whether there is some adjustments have to be made, et cetera, et cetera. And then it would just leave basically Friday and well, if he's not doing reporting next week, then the next week to implement all the changes. Which to me sounds crazy, because this is something you should not be spending that much time on. This is not a very highly intellectual job like copy, pasting numbers feels so basic - people doing this and he's started using Supermetrics so he's time basically time he spent on reporting cut down to something like an hour or maybe like an hour and a half and if he needed to do a reporting for absolutely all the clients in the agency that would be in one day. Kathleen: So what does he do with all his newfound free time? Anna: Great question. So he's already talking, well, obviously you started sharing those results with the clients. So he started talking to the clients more and this I think even further reinforces the idea that we help inbound marketers because then we encourage with this free time you can have more human connection. You can ask more relevant questions, you have more time to even think or like process the client's needs. And, in addition to this, he was able to make more relevant analysis now that he had more time. So he could actually process the numbers in his head and think, "Aha, what would our next steps be?" And then react accordingly? So we usually have two types of reports people are doing with Supermetrics. So one type of reporting is this for example, monthly reports where people pull together numbers from all those different sources to assess their monthly progress to see what kind of plans do they have to make for the next month, and then so on and so forth. And the second type of reporting that we commonly see is the ad hoc reporting. So say, okay, this campaign, this bid is acting wild I did not know what happened. Some numbers are going down they're not normal compared to the benchmark or this is someone unusual behavior. Let me just quickly pull out a few numbers and compare them and figure out what's the root cause? Is it something seasonal or is some competitor in the picture, like to understand what's happening. And I really loved one comment, this is from a different client the agency is also based in the UK, they said that it's much, much faster and much easier to pull those numbers with Supermetrics rather than going through the whole Facebook ads UI trying to dig into campaigns and figure out what exactly went wrong. So there you go. So you can also do this ad hoc kind of very quick analysis to see whether some immediate action has to take it and I think this makes you very, very proactive versus being a reactive reporter. You look at the numbers, it's like, "Oh, my God." The moment is gone, things have already happened. But this way, you can very quickly act upon those changes and as a result make your clients happy and avoid some potential setbacks. If you for example, have Black Friday and say something's going wrong then you don't have much time to react. You're losing money basically. So yeah, it really is- Kathleen: Do you have any sense for how often, because Supermetrics really gives a continuous flow of data, correct? Anna: Yeah. How often are marketers reviewing data? Kathleen: And so you could theoretically be checking it all the time. But do you have a sense for how often at least in best practice cases, marketers are looking at that data? Anna: Yeah, so they can set triggers that would refresh data automatically. So I would say that people do so that they set up a reporting dashboard, then they set it up to refresh, so that the data is there for the next day, usually. Of course, they can do like hourly refresh again, if it's a fast pacing, budget campaign, but usually they you do this, I come in to the office, I see fresh data in my dashboard. So every morning, we can do a quick catch up with my colleagues, look at this internal report and see how all of our different clients are doing. If it's an agency, if it's an internal team, then just see how campaigns are performing and then see what we're doing during the day. So that's the usual, I would say, very typical scenario, or according to my experience. Kathleen: And then it seems like, for reporting, like if you're somebody like me, who has to put together a report once a month for your board of directors, you could just really kind of screenshot and paste the graphs into a PowerPoint or something along those lines if you wanted to, or you could distill the data in some other way for like a monthly report. Anna: Yeah, definitely, you can do this. What I would do personally, if I was the one doing this, I would use Google Data Studio because this way you don't have to copy paste anything and you can share this file with really nice dashboards they've updated their design and they're rolling out as far as I know, more comprehensive and even better looking design soon. So you can just connect all the sources put all the numbers and like I mentioned also provide those templates so you can get some inspiration from there. Our designers also do a very nice job creating those lovely designs. For example, we have some Supermetrics for HubSpot templates there in our gallery and I will also give the link to all the materials and the gallery so people can check them out or if they listen to the episodes and try everything themselves. Check out the Supermetrics reporting template gallery But yeah, I would do something like this. And then at the same time, you would not need to refresh the data because the data will be refreshed automatically there. And the board of directors can see new numbers and in addition, you can also connect your custom data source, aka if you have revenue numbers in a database, many companies do have those. So especially if it's a board of directors, they would be very interested in the impact marketing has made on their revenue and other business metrics. So you can pull this data from the database and you can show it side by side with the marketing spend, for example, to give them an even bigger picture. The biggest mistakes marketers are making with data Kathleen: That's great. So any thoughts on, you know, what you see the marketers doing as far as the biggest mistakes they're making with tracking data reporting on it, et cetera? Anna: That's an interesting one. I actually have never thought about this. Mistakes. Well, maybe one thing that comes to my mind is maybe like focusing too much on the vanity metrics as I call them, aka like a lot of clicks or like impressions or worse like it's a impressions. Metrics that give you ... I would say these are maybe like unrelated metrics in a way that they're not very directly related to the business metrics, because for example, in some cases, sales cycle can be quite long. So you cannot accurately assess how much the campaign will generate in the future just simply because people have to go through multiple steps and multiple touch points to even get to the discussion about purchasing your product or tool or license. And so yeah, focusing too much on impressions, focusing too much on metrics then, like I said, not maybe necessarily related. This comes back to the product. You should know your sales cycle and I would suggest breaking it down into different steps and basically monitoring and benchmarking each step and see the conversion rates. I don't exactly remember, a gentleman did an episode with you and he suggested a very good framework for this. There was even Excel spreadsheet. So this is maybe something we could also pulling back to this episode in the comments. Kathleen: I'll have to figure out which one that was. Anna: Yeah, unfortunately, I don't remember. Kathleen: We'll figure it out. Anna: We'll figure it out. Check out the episode Anna references here Kathleen: I know we can do it and we'll put the link in the show notes. Yeah, I know that I've had so many great guests it's interesting who've contributed so many great ideas that oftentimes I was thinking and in fact as I listened to you talk, that I need to go back and listen to some of my earlier episodes, because now I'm on I think I just published Episode Number 117 when we're talking about this, and there's so many earlier ones that are still great in terms of the information they deliver. Who is Supermetrics right for? Kathleen: I imagine that this type of reporting isn't right for everybody because some marketers might have much simpler platforms or maybe not. Maybe it is for everyone can you talk through who do you generally see using a solution like this? Anna: So our most common user personas, so to say, are marketing agencies, so somebody who is doing marketing reporting consecutively and then they have to do it almost every day or at least monthly to put together those good looking reports for their clients. But of course, those marketing agencies can be of different size. There can be a five person as we are now seeing with required there can be a five person very tech savvy small team, which focus on marketing technology and purely some maybe hardcore analytics with the elements of normal distribution and some predictive analytics even or they can be a very big marketing agency like TBWA who want to work client success stories. So yeah, agencies are very typical for us. Then we have internal teams so basically marketing departments, which want to monitor their own campaign, how they're progressing. Then even if they don't have a client, like you just mentioned, reporting to their board of directors and showing what impact marketing has made on their sales et cetera. And also, we've added HubSpot Connector, which is not only marketing, but it's also CRM. So then they connect their marketing data together with the CRM data to give more background information and make a 360 degree analysis. So these are very, very diverse I have to say. Kathleen: Great. So really it sounds like anybody, regardless of size, who has a strong focus on data, tracking data, analyzing data and reporting on data? Anna: Yeah, I would say so. Well maybe there is some categorization, I would say that smaller teams tend to use Google Sheets and Excel aka Spreadsheet tools. If the team is very tech savvy, or they have a lot of historical data, then I would straightaway advise them to use Google BigQuery because they would immediately otherwise hit that cell limit and the reports will be bulky, the reports will be slow. This is just not the right data destination, if you want to store terabytes and even more like 2, 3, 4 years of historical data to see different trends. So to summarize, bigger marketing agencies who have many clients, many big clients like big brands want to own their data because imagine those big brands spend a lot of dollars collecting this data, cleaning this data up. And they want a place where they can successfully store the older data so they can store data in BigQuery as their database and then they can instantly connect data from their Facebook ads, et cetera, to BigQuery through Supermetrics, and then visualize it, for example, in Tableau or Power BI to get the full picture of their marketing reporting. And yeah, smaller teams tend to use Data Studio, Google Sheets, which are completely free tools, so they are not paying per usage for them. So for them that would be cheaper and therefore more suitable option. How to learn more about marketing analytics Kathleen: Okay. Now I'm going to spring a question on you that I didn't tell you I was going to ask you and you may not have the answer because this is totally off the cuff. But as you spoke about this, you talked about, like, when you start to do more, you should move over to BigQuery. And I imagine for some marketers that could seem kind of intimidating, especially if they don't come from a highly analytical background. So are there certain places that you know of, or can recommend if somebody's listening, and they're thinking, "Oh, my God, that sounds really complicated." I need to get up to speed and learn more about analytics and how to use something like BigQuery. Is there somewhere online they can go to learn and become better at analytics? Anna: Yes, and I actually do have to say that we're working on this. We're very well aware of this worry that people have that, oh, I've been using maybe more simple UIs for my whole life. And now there was this whole like jobs and transfers and the whole different environment, which is coming with this BigQuery. So first of all, I do have to say that we're working on creating a bunch of materials for BigQuery specifically that will show how can use Supermetrics products if you're a marketer like videos, where do you click? How do you create different kind of transfers? How to use different kind of joins? So this is something that we're really hoping to provide and also we do have natively build Data Studio Connectors so after a marketer has gathered all the data in BigQuery, they can use our connector to visualize their data in just a few clicks. And, again, as we publish a video you'll see it's very, very simple and what I really love about BigQuery, although it does sound intimidating, but Google does provide learning resources for that as well. And if you look at the UI, you will notice that it's very, very intuitive. So to say, well at first it's maybe a little bit challenging, but once you get a hang of it, it's actually pretty nice, it's quite clear. From our side, we also provide this monitoring suite where you can see how your transfers are performing. Is your data flowing all in nicely? Is there something to worry about or not? Usually all our transfers are fine. So people have mentioned and you can also see from the client success stories that data flowing in nicely and we haven't experienced that much challenges with Supermetrics for BigQuery. But yeah, more resources coming up. Google does provide their own resources and I think it's important for marketers to at least look into this if it's relevant for them, because this is the general trend. This is where the world is going and you want to be ahead. You definitely want to at least understand what kind of technologies are there. I really liked the quote one of our clients have mentioned. So they said, "It feels like Google BigQuery compared with other providers is built with agencies and with marketers in mind." So that sounds reassuring to me at least that people do say that it's actually feels like it's built for marketers. So I would say, yeah, wait for ours resources and then go and explore on your own and try not to be intimidated by this very techie sounding word. Kathleen's two questions Kathleen: Yeah it can be a lot to think about. But that's great that you guys are working on creating some resources. All right, we can talk about data forever but I have two questions I always ask all of my guests at the end of my interviews, and I would love to get your answers. The first one is, we talked about how the focus on this podcast is about inbound marketing. Who can you think of that whether it's a company or an individual who's just doing inbound marketing really well right now? Anna: Yeah, I will say quite a common answer and I'm pretty sure other guests have already mentioned this company. I think Drift is doing a fantastic job when it comes to inbound marketing, so they have not only created their own category, but when they interact with the people, with their clients, it feels very, very human, which I think they got this trend. This is something many of us need as marketing is becoming more and more techie. We need this kind of catalyst, we need this human connection to feel welcomed. And like I mentioned, they're doing a fantastic job there and one very good example is this one scale playbook, those 41 or 42 plays. As you read through this playbook, you can literally see that the company's trying show their best and make people feel welcomed and warm if they're using their product. Kathleen: Now, that's great. A lot of people have mentioned them, but that's because they're doing great things. Anna: Exactly. Kathleen: Second question is where do you personally go to learn and keep up so that you are able to stay abreast of the cutting edge developments in marketing? Anna: Yeah, so I prefer not to have a one stop shop. So depending on the topic I want to learn more about I go to a variety of different resources. So if I want to learn something more general about what's going on in the world of SaaS marketing, I listen to the SaaStr Podcast. Another amazing podcast I can recommend is the Growth Hub Podcast, and my colleague Edward is a proud host of this podcast. I really love his interviewing style and the guests, which have been on this podcast are simply amazing. So go check it out the Growth Hub Podcast, by Advanced B2B. A couple of other things. So of course I go to MarTech Today and SEJ if I want to learn about news and recent updates, and for us it's especially relevant, because we need to keep up what's going on with all the data source companies. Julian Shapiro, I'm not exactly sure if I'm pronouncing his name correctly, has a couple of fantastic guides on how to write a great copy, how to build a really nice landing page, how to A/B test. So one really good resource there as well and yeah. How to connect with Anna Kathleen: There's a couple new ones there that I haven't heard about. So we'll definitely check those out and put the links in the show notes. If someone wants to reach out to you, if they have a question about what you've talked about, or they want to learn more about Supermetrics, what is the best way for them to connect with you? Anna: Yeah, so definitely the best way is to reach out to me directly, maybe not through the company Twitter, but I'm @superpoweranna on Twitter. Kathleen: That's such a great handle. Anna: I love it as well. It's like Supermetrics plus me. So yeah, @superpoweranna on Twitter, and yeah, just hit me up with anything. And I also am very actively checking LinkedIn messages so Anna Shutko on LinkedIn, please don't hesitate to connect and I'm very happy to have discussions, answer the questions about anything there. So yeah, LinkedIn and Twitter, I would say, are the two go places. You know what to do next... Kathleen: Great. Well, I will put links to all of your various social accounts in the show notes so people can reach out to you and thank you so much for joining me. This was really fun just to talk about analytics and to geek out for a little bit. If you are listening and you liked what you heard or you learned something new as always, I would love it if you would leave the podcast a five star review on Apple Podcasts. That is how people find us and hear about us. And of course if you know someone else who's doing kick ass in non-marketing work, tweet me @WorkMommyWork and I would love to make them my next interview. Thanks, Anna. Anna: Thank you so much Kathleen. Kathleen: So fun.
Happy Independence Day to our American listeners! Mark Mandel is back today as he and Gabi Ferrara interview Bill Creekbaum of Informatica to learn how they work with Google Cloud for a better big data user experience. Mark Mirchandani is hanging around the studio as well, bringing some cool things of the week and helping with the question of the week! Informatica provides data managing products that offer complete solutions focusing on metadata management, integration, governance, security, data quality, and discoverability. Bill’s job at Informatica is to ensure these products really take advantage of the strengths of Google Cloud Platform. One such example is a product that allows customers to design in Informatica and push their projects to Cloud Dataproc. Informatica also offers similar capabilities in BigQuery. When moving data from on-prem to the cloud, customers can use Informatica and Google Cloud together for a seamless transition, cost savings, and easier data control. Together, Informatica and Google Cloud can also facilitate the acquisition of high quality data. To have better, more trustworthy output, data inputed needs to be safe to access, have few or no duplicates and null values, and be complete. To achieve this, developers usually use a combination of the Informatica tools Intelligent Cloud Services, Enterprise Data Catalog, and Big Data Management, and the Google tools BigQuery, Cloud Storage, Analytics, Dataproc, and Pub/Sub. Bill’s closing advice for companies comes in three parts: take stock of the data you’ve got, set goals, and develop a well-rounded team. Bill Creekbaum Bill Creekbaum is Sr. Director of Product Management for Cloud, Big Data, and Analytic Ecosystems at Informatica. He is focused on delivering market leading unified data management platforms and services that help customers take advantage of their greatest assets, data. Bill has been in product management and product marketing for more than 20 years and for the past 10 has been focused on successfully delivering SaaS and Cloud Applications to the market. Prior to joining Informatica, Bill has worked at SnapLogic, GoodData, Oracle, Microsoft, Mindjet, and more. See more of Bill’s experience on LinkedIn. Cool things of the week Google Cloud + Chronicle: The security moonshot joins Google Cloud blog GCP Podcast Episode 135: VirusTotal with Emi Martinez podcast Introducing Equiano, a subsea cable from Portugal to South Africa blog Kubernetes 1.15: Extensibility and Continuous Improvement blog Future of CRDs: Structural Schemas blog See how your code actually executes with Stackdriver Profiler, now GA blog Interview Informatica site Informatica for GCP site BigQuery site Cloud Storage site Cloud Dataproc site Intelligent Cloud Services site Enterprise Data Catalog site Big Data Management site Google Analytics site Pub/Sub site Google Cloud & Informatica: Accelerate your Data-Driven Digital Transformation webinar Informatica for Google BigQuery data sheet Informatica Intelligent Cloud Services for Google BigQuery site Question of the week If I want to have my App Engine Application serve any subdomain on my custom domain, how do I do that? Where can you find us next? Gabi is done traveling. Mark Mirch’ is working on Stack Chat. Mark Mandel is going to Tokyo Next, Open Source in Gaming Day , and the North American Open Source Summit. Sound Effect Attribution “small group laugh 6.flac” by tim.kahn of Freesound.org “Chewing, Carrot, A” by Inspector J of Freesound.org “Testtone1000hz” by Jobro of Freesound.org
Avi explains Super query as an optimization and tool for extracting data for thousands of business analyst at the same time. Data is the new gold
Mark Rittman is joined by CEO and Founder of Supermetrics, Mikael Thuneberg, to tell the story of how a mention in the official Google Analytics Blog and a prize of a t-shirt led to him founding and bootstrapping a €2M ARR marketing analytics business that's probably the most important software vendor the Drill to Detail audience has never heard of, and who recently moved into the data pipelines-as-a-service market in collaboration with Google Cloud Platform and the Google BigQuery team.Announcing Supermetrics for BigQuery: Get a marketing data warehouse up and running in minutes (Supermetrics blog)Supermetrics, Google BigQuery and Data Pipelines for Digital Marketers (Rittman Analytics blog)
Mark Rittman is joined by CEO and Founder of Supermetrics, Mikael Thuneberg, to tell the story of how a mention in the official Google Analytics Blog and a prize of a t-shirt led to him founding and bootstrapping a €2M ARR marketing analytics business that's probably the most important software vendor the Drill to Detail audience has never heard of, and who recently moved into the data pipelines-as-a-service market in collaboration with Google Cloud Platform and the Google BigQuery team.Announcing Supermetrics for BigQuery: Get a marketing data warehouse up and running in minutes (Supermetrics blog)Supermetrics, Google BigQuery and Data Pipelines for Digital Marketers (Rittman Analytics blog)
Mark Rittman is joined in this episode by Jordan Tigani, Director of Product Management at Google for Google BigQuery, to talk about the history of BigQuery and its technology origins in Dremel; how BigQuery has evolved since its original release to handle a wide range of data warehouse workloads; and new features announced for BigQuery at Google Next'19 including BI Engine, Storage API, Connected Sheets and a range of new data connectors from Supermetrics and Fivetran.Modern Data Warehousing with BigQuery (Cloud Next '19)Modern data warehousing with BigQuery: a Q&A with Engineering Director Jordan TiganiIntroduction to BigQuery BI EngineBigQuery Storage API OverviewSupermetrics and Fivetran BigQuery connectorsDrill to Detail Ep.2. 'Future of SQL on Hadoop', With Special Guest Dan McClaryDrill to Detail Ep.31 'Dremel, Druid and Data Modeling on Google BigQuery' With Special Guest Dan McClary
Mark Rittman is joined in this episode by Jordan Tigani, Director of Product Management at Google for Google BigQuery, to talk about the history of BigQuery and its technology origins in Dremel; how BigQuery has evolved since its original release to handle a wide range of data warehouse workloads; and new features announced for BigQuery at Google Next'19 including BI Engine, Storage API, Connected Sheets and a range of new data connectors from Supermetrics and Fivetran.Modern Data Warehousing with BigQuery (Cloud Next '19)Modern data warehousing with BigQuery: a Q&A with Engineering Director Jordan TiganiIntroduction to BigQuery BI EngineBigQuery Storage API OverviewSupermetrics and Fivetran BigQuery connectorsDrill to Detail Ep.2. 'Future of SQL on Hadoop', With Special Guest Dan McClaryDrill to Detail Ep.31 'Dremel, Druid and Data Modeling on Google BigQuery' With Special Guest Dan McClary
Engineers can build applications faster by using tools that abstract away infrastructure. Major cloud providers offer this tooling in the form of functions-as-a-service, as well as managed services such as Google BigQuery or Azure Container Instances. The term “serverless” refers to these functions-as-a-service and the managed services–because when you use these tools, you are not The post JAM Stack with Phil Hawksworth appeared first on Software Engineering Daily.
Mark Rittman is joined in this episode by Jonathan Palmer from King Games to talk about the role of analytics in the development of Candy Crush Saga and other King games, their use of Looker along with Google BigQuery and Exasol to provide analytics capabilities to their game designers and product owners and his approach to doing all of this in a fast-moving, technology-driven internet business.- Candy Crush Saga article on Wikipedia- King Games company website- “How King Games is Crushing Games Data” DataIQ article- Looker and King Games case study- Jonathan Palmer on LinkedIn
Mark Rittman is joined in this episode by Jonathan Palmer from King Games to talk about the role of analytics in the development of Candy Crush Saga and other King games, their use of Looker along with Google BigQuery and Exasol to provide analytics capabilities to their game designers and product owners and his approach to doing all of this in a fast-moving, technology-driven internet business.- Candy Crush Saga article on Wikipedia- King Games company website- “How King Games is Crushing Games Data” DataIQ article- Looker and King Games case study- Jonathan Palmer on LinkedIn
Deep Kapadia and JP Robinson from New York Times join Mark and Francesc to discuss how they use Google Cloud Platform to serve the New York Times to its readers. About JP Robinson JP Robinson maintains NYT's internal and open source tools and frameworks that are related to the Go programming language. He also lead backend development of NYT's games platform. Recently his team completely rewrote our backend with Go and GCP tools. In doing so they've managed to significantly lower request latencies and cut costs in half. About Deep Kapadia Deep Kapadia manages the Infrastructure and Delivery Engineering, Site Reliability and Test Automation teams at The New York Times. His teams are responsible for providing other engineering teams with tools and processes needed to get their jobs done on a day to day basis. His teams recently have been working on building the GKE deployment pipeline and enabling other teams to migrate to the Cloud from our physical datacenters and also moving their entire edge and routing caching architecture from internally hosted varnish to Fastly. They also helped move most of their site behind HTTPS. Cool things of the week Cutting cluster management fees on Google Kubernetes Engine blog Coming in 2018: GCP's Hong Kong region blog Introducing an easy way to deploy containers on Google Compute Engine virtual machines blog Interview New York Times Crossword site Moving The New York Times Games Platform to Google App Engine blog New York Times in 1996 webarchive Google App Engine site docs Cloud Datastore site docs Kubernetes Engine site docs Cloud Pub/Sub site docs Google BigQuery site docs Cloud Endpoints site docs Drone GAE github Drone GKE github Marvin github openapi2proto github gRPC site New York Time Open site Question of the week What best practices are there for securing a Kubernetes Engine Cluster? Precious cargo: Securing containers with Kubernetes Engine 1.8 blog Where can you find us next? Mark will be Montreal in December to speak at Montreal International Games Summit. Melanie will be at NIPS (Neural Information Processing Systems) in Long Beach and will also be attending Black in AI on December 8th.
Mark is joined in this episode by Avi Zloof from Evaluex to talk about the new world of elastically-provisioned cloud-hosted analytic databases such as Google BigQuery and Amazon Athena, how their pricing model and vendor strategy differs from the traditional database vendors, and how machine learning can be used to automate performance tuning and optimize workloads in this new world of large-scale distributed query and storage.
Mark is joined in this episode by Avi Zloof from Evaluex to talk about the new world of elastically-provisioned cloud-hosted analytic databases such as Google BigQuery and Amazon Athena, how their pricing model and vendor strategy differs from the traditional database vendors, and how machine learning can be used to automate performance tuning and optimize workloads in this new world of large-scale distributed query and storage.
Drill to Detail Ep.41 'Developing with Google BigQuery and Google Cloud Platform' With Special Guest Felipe Hoffa Mark is joined in this episode by Google Cloud Platform Developer Advocate Felipe Hoffa, talking about getting started as a developer using Google BigQuery along with Google Cloud Dataflow, Google Cloud Dataprep and Google Cloud Platform's machine learning APIs.
Drill to Detail Ep.41 'Developing with Google BigQuery and Google Cloud Platform' With Special Guest Felipe Hoffa Mark is joined in this episode by Google Cloud Platform Developer Advocate Felipe Hoffa, talking about getting started as a developer using Google BigQuery along with Google Cloud Dataflow, Google Cloud Dataprep and Google Cloud Platform's machine learning APIs.
This summer (aka Australian winter) a new Cloud Region was announced in Australia and today Francesc and Mark talk to two Australian engineers, Andrew Walker founder of 3wks and Graham Polley, about how this new region has changed the way they think about the cloud down under. About Andrew Walker Andrew is the founder of 3wks who have delivered 190 projects on Google Cloud platform for enterprise customers in Australia. He loves everything serverless, from App Engine through to BigQuery. About Graham Polley Graham is a senior software engineer based out of Melbourne Australia, and works for Shine Solutions. Shine are a enterprise digital consultancy with offices in Melbourne & Sydney. Being an official Google Developer Expert, he's passionate about promoting the adoption of cloud technologies into software development, and regularly blogs and gives presentations. He has extensive experience in building big data solutions for clients using the Google technology stack, and in particular with BigQuery & Dataflow. Graham works very closely with the Google cloud engineering teams in the US, where he is a member of their cloud platform trusted tester program, and the solutions he helps build are used as internal exemplars of developer use cases. Cool things of the week How we built a brand new bank on GCP and Cloud Spanner: Shine blog post Now shipping: Compute Engine machine types with up to 96 vCPUs and 624GB of memory announcement Google Cloud Dataprep - Data Handling Made Easier Medium Interview Sydney Cloud Region docs Google Cloud Platform expands to Australia with new Sydney region - open now announcement Google Cloud Platform Geography and Regions docs Google Cloud Dataflow docs Google BigQuery docs Question of the week Is Tensorflow good for general math computation? Yes! It's great for any linear algebra programs. Linear Algebra Shootout: NumPy vs. Theano vs. TensorFlow blog post Where can you find us next? Francesc just released the second part of this #justforfunc code review. Next week he will be presenting at Go Meetup London, Velocity London, and Google Cloud Summit Paris. Mark is heading to Australia for GDG Devfest Melbourne and Game Connect Asia Pacific and will be hanging out at Unite Melbourne and PAX Australia.
Have you ever wanted to know what powers BigQuery under the hood? Tino Tereshko and Jordan Tigani sit in front of the microphone with co-hosts Mark and Francesc to talk all about it! About Tino Tereshko Tino is the Big Data Lead for Office of the CTO at Google Cloud, focusing on building strategic relationships with the world's top Enterprises in the interest of sharing and accelerating technological innovation. Tino hails from the BigQuery team, where he solved difficult cloud-native product problems, enabled Googlers and customers, and built programs like BigQuery Pacific. In earlier years Tino held various positions of leadership in several Silicon Valley startups, and could be found working as a quant developer on the floor of the Chicago Board of Equities at a boutique market making firm. Tino holds a Bachelor's degree in Applied Mathematics and Economics from University of California - Davis. When not at work, you can usually find him playing beach volleyball, cycling, skiing, paddle boarding, or enjoying a nice glass of wine. About Jordan Tigani Jordan was one of the founding engineers on Google BigQuery, wrote the first book on the subject, and is now the engineering lead of the product. Before Google, he worked at a number of star-crossed startups, and also spent time at Microsoft in the Windows kernel team and MSR. Cool things of the week This week in Google Cloud Platform medium This week in Google Cloud — “Premium and Standard networking tiers, NYT Games on App Engine, Puppet for GCP, and a firewall for App Engine” blog Creating a GCP type provider in 6 (well 7) easy steps blog Aja Hammerly's Battleship blog series Interview BigQuery site docs BigQuery under the hood blog Dremel paper Borg and Kubernetes with John Wilkes podcast Question of the week I want to talk to to my phone like it's J.A.R.V.I.S. and make it do things. How can I build a bot to do this? Cloud Speech API site docs Cloud Natural Language Processing API site docs API.AI site docs Intents docs Go library github BigQuery Where can you find us next? Francesc will be presenting at Google Cloud Summit in Sydney and Google Cloud Summit in Chicago in September. In October, he'll be presenting at Velocity London, Google Cloud Summit Paris and Devfest Nantes Mark is speaking at Austin Game Conference and attending Strangeloop in September. He is also heading to Australia in October for GDG Devfest Melbourne and Game Connect Asia Pacific and will be hanging out at Unite Melbourne and PAX Australia.
Mark and Francesc welcome the incredible Greg DeMichillie into their studio this week, to talk all about Google Cloud's Office of the CTO, and how it works with enterprise companies. About Greg DeMichillie Greg has 20 years experience in creating great computing platforms for developers and IT alike. He has been at Google since before the inception of Google Cloud Platform and as Director of Product he lead the product teams for App Engine, Compute Engine, Kubernetes & Container Engine, as well as the Developer Console, SDKs, and Billing system. He has delivered keynote presentations and product demos at events such as Google I/O and Google Cloud NEXT as well as interviews with the New York Times, Wall St Journal, and other publications. Prior to joining Google, he had leadership roles at variety of companies including Adobe and Amazon, as well as a decade at Microsoft where he was a developer on the first version of Visual C++, the development manager for Microsoft's Java tools, and lead the product team for the creation of C#. Cool things of the week Cloud SQL for PostgreSQL updated with new extensions blog docs issue tracker discussion group Celebrating Six Months of Open Access, plus The Met on Google BigQuery blog Deploying Clojure applications to Google Cloud blog Announcing price cuts on Local SSDs for on-demand and preemptible instances blog Interview How the queen of Silicon Valley is helping Google go after Amazon's most profitable business article Lush migrating to Google Cloud in 22 days blog Evernote migrating to Google Cloud blog Google Cloud Summit Sydney site Google Cloud Summit Paris site Google Cloud Summit Seattle site Google Cloud Summit Chicago site Google Cloud Summit Stockholm site Look out for more Summits in: Singapore, Kuala Lumpur, Bangalore, Munich, and Sau Paulo Question of the week Is there a way to access the Kubernetes dashboard without running kubectl proxy? Such as, if I wanted to view or control my Kubernetes cluster from my phone? Kubernetes UI docs kubectl proxy docs Creating Authorized Networks for Master Access docs Google Cloud Shell site docs Where can you find us next? Francesc is going on holidays!!! But he just released a justforfunc episode on Contributing to the Go project, and will be presenting at Google Cloud Summit in Sydney in September. Mark is entering crazy season, and will be presenting at Play NYC, then speaking at Pax Dev and then attending Pax West right after. He'll then be speaking at Gameacon and Austin Game Conference and attending Strangeloop once he's done with all that.
Mark is joined by returning special guest Dan McClary to talk about data modeling and database design on distributed query engines such as Google BigQuery, the underlying Dremel technology and Capacitor storage format that enables this cloud distributed data warehouse-as-a-service platform to scale to petabyte-size tables spanning tens of thousands of servers, and techniques to optimize BigQuery table joins using nested fields, table partitioning and denormalization .
Mark is joined by returning special guest Dan McClary to talk about data modeling and database design on distributed query engines such as Google BigQuery, the underlying Dremel technology and Capacitor storage format that enables this cloud distributed data warehouse-as-a-service platform to scale to petabyte-size tables spanning tens of thousands of servers, and techniques to optimize BigQuery table joins using nested fields, table partitioning and denormalization .
Stewart Bryson returns to the show to join Mark Rittman to discuss new-world BI and data warehousing development using Google BigQuery and Amazon Athena, Apache Kafka and StreamSets, and talks about his experiences with Looker, the cloud-native BI tool that brings semantic modeling and modern development practices to the world of business intelligence.
Stewart Bryson returns to the show to join Mark Rittman to discuss new-world BI and data warehousing development using Google BigQuery and Amazon Athena, Apache Kafka and StreamSets, and talks about his experiences with Looker, the cloud-native BI tool that brings semantic modeling and modern development practices to the world of business intelligence.
Mark Rittman is joined by Daniel Mintz from Looker to talk about BI and analytics on Google BigQuery, data modelling on the new generation of cloud-based distributed-data warehousing platforms, and Looker's re-introduction of semantic models to big data analytics developers
Mark Rittman is joined by Daniel Mintz from Looker to talk about BI and analytics on Google BigQuery, data modelling on the new generation of cloud-based distributed-data warehousing platforms, and Looker's re-introduction of semantic models to big data analytics developers
This week brings your hosts Francesc and Mark doing DAILY EPISODES from Cloud Next! Today's episode brings interviews straight from the Cloud Next Community Summit! Interviews Kalev Leetaru Kalev Leetaru is the creator of the GDELT project, a global database of society, powered by Google Bigquery, Machine Learning APIs, and many other Google Cloud products. Tim Kelton Tim Kelton works at Descartes Labs and is here to show off his demo. Learn more about it on the blog post. Verónica López Verónica López joins us to talk about all the cool things she saw during the Community Summit and all the sessions she's excited about day 1. More about Cloud Next You can watch the live stream! More daily episodes to come - stay tuned! Come find us on the ground floor at Moscone!
Mark Rittman is joined by Alex Olivier from Qubit to talk about their platform journey from on-premise Hadoop to petabytes of data running in Google Cloud Platform, using Google Cloud Dataflow (aka Apache Beam), Google PubSub and Google BigQuery along with machine learning and analytics to deliver personalisation at-scale for digital retailers around the world.
Mark Rittman is joined by Alex Olivier from Qubit to talk about their platform journey from on-premise Hadoop to petabytes of data running in Google Cloud Platform, using Google Cloud Dataflow (aka Apache Beam), Google PubSub and Google BigQuery along with machine learning and analytics to deliver personalisation at-scale for digital retailers around the world.
Today Amruta Gulanikar and Chris Sells, experts from the Windows and .NET community and part of the Google Cloud team, join your cohosts Francesc and Mark to discuss why you should run your Windows and .NET work loads on Google Cloud. About Amruta Gulanikar Prior to joining Google Amruta spent 5+ years as a PM in the Office division at Microsoft working on many different products. Just before she left, she worked on launching a new service and supporting apps - “O365 Planner” which offers people a simple and visual way to organize teamwork. At Google, Amruta owns Windows on GCE which includes support for premium OS & Microsoft Server product images, platform improvements to support Windows workloads on GCE. About Chris Sells Before joining Google, Chris was a contributing member of the Windows developer community for more than 20 years, including 8 years at Microsoft. He's written a number of books in this area and still maintains a blog that he started in 1995 about his various technical adventures, although he's more active on Twitter these days. At Google, Chris is the Lead PM for Cloud Developer Tools, which includes driving our tooling and libraries efforts around Windows and .NET. Cool thing of the week 15 Awesome things you probably didn't know about Google BigQuery blog post A Google SRE explores GitHub reliability with BigQuery blog post Interview Windows on Google Cloud Platform docs Windows on Google Compute Engine docs .NET on Google Cloud Platfom docs SQL Server on Google Cloud Platform docs Windows RDP: Remoted Desktop Protocol wikipedia Running .NET applications on Linux with Mono blog .NET core runtime docs Announcing Docker Container Platform for Windows Server 2016 Docker Update on Kubernetes for Windows Server Containers blog post Powershell docs PowerShell is open sourced and is available on Linux announcement Google Cloud Platform is a first class Windows cloud Question of the week How can I diagnose and understand a problem that only occurs in production? - Stackdriver Debugger is now GA announcement Were will we be? Francesc finally released one more episode of justforfunc, and now everything is ready before he goes to Brazil next month for GopherCon Brasil and GCPNext Brazil. Mark will be at GAMEACON in Atlantic City on October 28th. He will then attend Unite, the Unity conference in Los Angeles, CA on November 1st.
In this episode, we dive deep on a 1988 classic: Tom Hanks, under the direction of Penny Marshall, was a 12-year-old in a 30-year-old's body... Actually, that's a different "Big" from what we actually cover in this episode. In this instant classic, the star is BigQuery, the director is Google, and Michael Healy, a data scientist from Search Discovery, delivers an Oscar-worthy performance as Zoltar. In under 48 minutes, Michael (Helbling) and Tim drastically increased their understanding of what Google BigQuery is and where it fits in the analytics landscape. If you'd like to do the same, give it a listen! Technologies, books, and sites referenced in this episode were many, including: Google BigQuery and the BigQuery API Libraries, Google Cloud Services, Google Dremel, Apache Drill, Amazon Redshift (AWS), Rambo III (another 1988 movie!), Hadoop, Cloudera, the Observepoint Tag Debugger, Our Mathematical Universe by Max Tegmark, A Brief History of Time by Stephen Hawking, and a video of math savant Scott Flansburg.
Arfon Smith from GitHub, and Felipe Hoffa & Will Curran from Google joined the show to talk about BigQuery — the big picture behind Google Cloud’s push to host public datasets, the collaboration between the two companies to expand GitHub’s public dataset, adding query capabilities that have never been possible before, example queries, and more!
Arfon Smith from GitHub, and Felipe Hoffa & Will Curran from Google joined the show to talk about BigQuery — the big picture behind Google Cloud’s push to host public datasets, the collaboration between the two companies to expand GitHub’s public dataset, adding query capabilities that have never been possible before, example queries, and more!
Analytics is an essential part of many platforms, but it is specially important for gaming. Today we discuss how Google Cloud makes analytics simpler and super powerful. Kir Titievsky from Google Cloud Pub/Sub, Eric Anderson from Google Cloud Dataflow, and Tino Tereshko from Google BigQuery will tell your co-hosts Francesc and Mark how those three products get together to power an amazing analytics solution for gaming. About Kir Kir Titievsky is product manager on Google Cloud Pub/Sub which helps users build analytics pipelines & integrate services, massive and small. He has come to GCP after building mobile enterprise apps for Googlers as well as products for advertising & media agencies at DoubleClick. Before Google, Kir designed advertising recommendation engines as a data scientist. Kir once took a detour to get a PhD in Chemical Engineering from MIT. About Eric Eric Anderson is a product manager on Dataflow, a stream and batch data processing service. Before Dataflow, he started a growth analytics team in Google Cloud. Previously he worked at AWS and GE. Eric holds an engineering degree from the University of Utah and MBA from Harvard. About Tino Tino Tereshko is a Technical Program Manager for BigQuery, the world's only serverless analytics Data Warehouse. Before joining the BigQuery team, Tino worked with strategic cloud customers as a Solutions Architect. Prior to Google, Tino worked in stream processing, financial analytics, and even spent some time as a quant developer. Cool thing of the week BigQuery 1.11, now with Standard SQL, IAM, and partitioned tables! post Interview Building a Mobile Gaming Analytics Platform - a Reference Architecture docs Google Cloud Pub/Sub docs Google Cloud Dataflow docs Google BigQuery docs ETL: Extract, transform, load Wikipedia Apache Beam Project docs MapReduce: Simplified Data Processing on Large Clusters research paper The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing research paper Pushing the Limits of BigQuery video Question of the week Google BigQuery Public Datasets docs The first TB is free! pricing
Join us today for a conversation with Andrew Gerrand and Chris Broadfoot from the Go team. They will discuss with your hosts Francesc Campoy and Mark Mandel why Go is so successful for all the things cloud and how you can use it with Google Cloud Platform. About Andrew Andrew has worked on Go at Google since almost the beginning, and has written tons of blog posts and talks on Go. He spends most of his time making Go easier to use. About Chris Chris joined the Go and Cloud teams last year to improve the experience of writing Go applications for Google Cloud Platform. Before that, he worked at Google on the Maps APIs for around five years. Cool thing of the week EVE Fanfest 2016 - Kubernetes and Google Cloud video Interviews The Go programming language web Go on Google App Engine docs Google Santa Tracker web Tweak The Turkey with a Go powered Doodle doodle gofmt command docs goimports command docs Rails Conf 2012 Keynote: Simplicity Matters by Rich Hickey YouTube Bookshelf tutorial for Go on App Engine tutorial The Go Tour Go Samples repo Question of the week Google Cloud Dataflow docs Google BigQuery docs MapReduce wikipedia Where can you find us next? We'll both be speaking at Google I/O next week! Mark will then be at Change the Game SF Francesc will be riding the AIDS/Lifecycle and if you want you can donate. The Go gopher, by Renee French
In the fourteenth episode of this podcast, your hosts Francesc and Mark interview Paul Newson. Paul is now a Developer Advocate for Google Cloud Platform but was a Software Engineer in the Cloud Storage team. Together they discuss the multiple options available for data storage on the cloud and the trade offs to be taken into account while choosing one. About Paul Paul currently focuses on helping developers harness the power of Google Cloud Platform to solve their big data problems. Previously, he was an engineer on Google Cloud Storage. Before joining Google, Paul founded a startup which was acquired by Microsoft, where he worked on DirectX, Xbox, Xbox Live, and Forza Motorsport, before spending time working on machine learning problems at Microsoft Research. Follow Paul on Twitter @newsons_nybbles. Cool thing of the week Opinionated Deployment Tools & Kubernetes blog post and Github repository Interview When to Pick Google Bigtable vs Other Cloud Platform Databases blog post. Where Should I Store My Stuff? - slightly outdated video. Choosing a Storage Option docs. Google Drive docs. Google Cloud Storage docs. Google Cloud Datastore docs. Google Cloud SQL docs. Google Cloud Bigtable docs. Google BigQuery docs. Google Cloud Dataflow docs Question of the week Question from Jeff Schnitzer: Can you use Java 8 features in Standard App Engine? Google App Engine Standard Environment docs. Google App Engine Managed VMs docs. Retrolambda Github repo.
Ilya Grigorik joined the show to talk about GitHub Archive, logging and archiving GitHub’s public event data, and how he uses Google BigQuery to make querying that data accessible to everyone.
Ilya Grigorik joined the show to talk about GitHub Archive, logging and archiving GitHub’s public event data, and how he uses Google BigQuery to make querying that data accessible to everyone.
Naoya Itoさんをゲストに迎えて、開発合宿、ノマド、Terraform、README、ポエム、MPP, BigQuery などについて話しました。 Show Notes ハッカーズチャンプルー2014 リゾートワーク HikaruStar Venture Camp airbnb NomadList Terraform AWS CloudFormation kumogata わかりやすいREADME.mdを書く Readme Driven Development Design Documents Working Backwards PDD poem-driven development Google BigQuery MPP on Hadoop, RedShift, BigQuery Hadoop Conference Japan 2014 Apache Hive Google BigQueryでDWH構築 Rebuild One-Year Anniversary