Podcast appearances and mentions of Daniel Gross

  • 126PODCASTS
  • 177EPISODES
  • 48mAVG DURATION
  • ?INFREQUENT EPISODES
  • Jan 8, 2026LATEST

POPULARITY

20192020202120222023202420252026


Best podcasts about Daniel Gross

Latest podcast episodes about Daniel Gross

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
Artificial Analysis: Independent LLM Evals as a Service — with George Cameron and Micah-Hill Smith

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Jan 8, 2026 78:24


Happy New Year! You may have noticed that in 2025 we had moved toward YouTube as our primary podcasting platform. As we'll explain in the next State of Latent Space post, we'll be doubling down on Substack again and improving the experience for the over 100,000 of you who look out for our emails and website updates!We first mentioned Artificial Analysis in 2024, when it was still a side project in a Sydney basement. They then were one of the few Nat Friedman and Daniel Gross' AIGrant companies to raise a full seed round from them and have now become the independent gold standard for AI benchmarking—trusted by developers, enterprises, and every major lab to navigate the exploding landscape of models, providers, and capabilities.We have chatted with both Clementine Fourrier of HuggingFace's OpenLLM Leaderboard and (the freshly valued at $1.7B) Anastasios Angelopoulos of LMArena on their approaches to LLM evals and trendspotting, but Artificial Analysis have staked out an enduring and important place in the toolkit of the modern AI Engineer by doing the best job of independently running the most comprehensive set of evals across the widest range of open and closed models, and charting their progress for broad industry analyst use.George Cameron and Micah-Hill Smith have spent two years building Artificial Analysis into the platform that answers the questions no one else will: Which model is actually best for your use case? What are the real speed-cost trade-offs? And how open is “open” really?We discuss:* The origin story: built as a side project in 2023 while Micah was building a legal AI assistant, launched publicly in January 2024, and went viral after Swyx's retweet* Why they run evals themselves: labs prompt models differently, cherry-pick chain-of-thought examples (Google Gemini 1.0 Ultra used 32-shot prompts to beat GPT-4 on MMLU), and self-report inflated numbers* The mystery shopper policy: they register accounts not on their own domain and run intelligence + performance benchmarks incognito to prevent labs from serving different models on private endpoints* How they make money: enterprise benchmarking insights subscription (standardized reports on model deployment, serverless vs. managed vs. leasing chips) and private custom benchmarking for AI companies (no one pays to be on the public leaderboard)* The Intelligence Index (V3): synthesizes 10 eval datasets (MMLU, GPQA, agentic benchmarks, long-context reasoning) into a single score, with 95% confidence intervals via repeated runs* Omissions Index (hallucination rate): scores models from -100 to +100 (penalizing incorrect answers, rewarding ”I don't know”), and Claude models lead with the lowest hallucination rates despite not always being the smartest* GDP Val AA: their version of OpenAI's GDP-bench (44 white-collar tasks with spreadsheets, PDFs, PowerPoints), run through their Stirrup agent harness (up to 100 turns, code execution, web search, file system), graded by Gemini 3 Pro as an LLM judge (tested extensively, no self-preference bias)* The Openness Index: scores models 0-18 on transparency of pre-training data, post-training data, methodology, training code, and licensing (AI2 OLMo 2 leads, followed by Nous Hermes and NVIDIA Nemotron)* The smiling curve of AI costs: GPT-4-level intelligence is 100-1000x cheaper than at launch (thanks to smaller models like Amazon Nova), but frontier reasoning models in agentic workflows cost more than ever (sparsity, long context, multi-turn agents)* Why sparsity might go way lower than 5%: GPT-4.5 is ~5% active, Gemini models might be ~3%, and Omissions Index accuracy correlates with total parameters (not active), suggesting massive sparse models are the future* Token efficiency vs. turn efficiency: GPT-5 costs more per token but solves Tau-bench in fewer turns (cheaper overall), and models are getting better at using more tokens only when needed (5.1 Codex has tighter token distributions)* V4 of the Intelligence Index coming soon: adding GDP Val AA, Critical Point, hallucination rate, and dropping some saturated benchmarks (human-eval-style coding is now trivial for small models)Links to Artificial Analysis* Website: https://artificialanalysis.ai* George Cameron on X: https://x.com/georgecameron* Micah-Hill Smith on X: https://x.com/micahhsmithFull Episode on YouTubeTimestamps* 00:00 Introduction: Full Circle Moment and Artificial Analysis Origins* 01:19 Business Model: Independence and Revenue Streams* 04:33 Origin Story: From Legal AI to Benchmarking Need* 16:22 AI Grant and Moving to San Francisco* 19:21 Intelligence Index Evolution: From V1 to V3* 11:47 Benchmarking Challenges: Variance, Contamination, and Methodology* 13:52 Mystery Shopper Policy and Maintaining Independence* 28:01 New Benchmarks: Omissions Index for Hallucination Detection* 33:36 Critical Point: Hard Physics Problems and Research-Level Reasoning* 23:01 GDP Val AA: Agentic Benchmark for Real Work Tasks* 50:19 Stirrup Agent Harness: Open Source Agentic Framework* 52:43 Openness Index: Measuring Model Transparency Beyond Licenses* 58:25 The Smiling Curve: Cost Falling While Spend Rising* 1:02:32 Hardware Efficiency: Blackwell Gains and Sparsity Limits* 1:06:23 Reasoning Models and Token Efficiency: The Spectrum Emerges* 1:11:00 Multimodal Benchmarking: Image, Video, and Speech Arenas* 1:15:05 Looking Ahead: Intelligence Index V4 and Future Directions* 1:16:50 Closing: The Insatiable Demand for IntelligenceTranscriptMicah [00:00:06]: This is kind of a full circle moment for us in a way, because the first time artificial analysis got mentioned on a podcast was you and Alessio on Latent Space. Amazing.swyx [00:00:17]: Which was January 2024. I don't even remember doing that, but yeah, it was very influential to me. Yeah, I'm looking at AI News for Jan 17, or Jan 16, 2024. I said, this gem of a models and host comparison site was just launched. And then I put in a few screenshots, and I said, it's an independent third party. It clearly outlines the quality versus throughput trade-off, and it breaks out by model and hosting provider. I did give you s**t for missing fireworks, and how do you have a model benchmarking thing without fireworks? But you had together, you had perplexity, and I think we just started chatting there. Welcome, George and Micah, to Latent Space. I've been following your progress. Congrats on... It's been an amazing year. You guys have really come together to be the presumptive new gardener of AI, right? Which is something that...George [00:01:09]: Yeah, but you can't pay us for better results.swyx [00:01:12]: Yes, exactly.George [00:01:13]: Very important.Micah [00:01:14]: Start off with a spicy take.swyx [00:01:18]: Okay, how do I pay you?Micah [00:01:20]: Let's get right into that.swyx [00:01:21]: How do you make money?Micah [00:01:24]: Well, very happy to talk about that. So it's been a big journey the last couple of years. Artificial analysis is going to be two years old in January 2026. Which is pretty soon now. We first run the website for free, obviously, and give away a ton of data to help developers and companies navigate AI and make decisions about models, providers, technologies across the AI stack for building stuff. We're very committed to doing that and tend to keep doing that. We have, along the way, built a business that is working out pretty sustainably. We've got just over 20 people now and two main customer groups. So we want to be... We want to be who enterprise look to for data and insights on AI, so we want to help them with their decisions about models and technologies for building stuff. And then on the other side, we do private benchmarking for companies throughout the AI stack who build AI stuff. So no one pays to be on the website. We've been very clear about that from the very start because there's no use doing what we do unless it's independent AI benchmarking. Yeah. But turns out a bunch of our stuff can be pretty useful to companies building AI stuff.swyx [00:02:38]: And is it like, I am a Fortune 500, I need advisors on objective analysis, and I call you guys and you pull up a custom report for me, you come into my office and give me a workshop? What kind of engagement is that?George [00:02:53]: So we have a benchmarking and insight subscription, which looks like standardized reports that cover key topics or key challenges enterprises face when looking to understand AI and choose between all the technologies. And so, for instance, one of the report is a model deployment report, how to think about choosing between serverless inference, managed deployment solutions, or leasing chips. And running inference yourself is an example kind of decision that big enterprises face, and it's hard to reason through, like this AI stuff is really new to everybody. And so we try and help with our reports and insight subscription. Companies navigate that. We also do custom private benchmarking. And so that's very different from the public benchmarking that we publicize, and there's no commercial model around that. For private benchmarking, we'll at times create benchmarks, run benchmarks to specs that enterprises want. And we'll also do that sometimes for AI companies who have built things, and we help them understand what they've built with private benchmarking. Yeah. So that's a piece mainly that we've developed through trying to support everybody publicly with our public benchmarks. Yeah.swyx [00:04:09]: Let's talk about TechStack behind that. But okay, I'm going to rewind all the way to when you guys started this project. You were all the way in Sydney? Yeah. Well, Sydney, Australia for me.Micah [00:04:19]: George was an SF, but he's Australian, but he moved here already. Yeah.swyx [00:04:22]: And I remember I had the Zoom call with you. What was the impetus for starting artificial analysis in the first place? You know, you started with public benchmarks. And so let's start there. We'll go to the private benchmark. Yeah.George [00:04:33]: Why don't we even go back a little bit to like why we, you know, thought that it was needed? Yeah.Micah [00:04:40]: The story kind of begins like in 2022, 2023, like both George and I have been into AI stuff for quite a while. In 2023 specifically, I was trying to build a legal AI research assistant. So it actually worked pretty well for its era, I would say. Yeah. Yeah. So I was finding that the more you go into building something using LLMs, the more each bit of what you're doing ends up being a benchmarking problem. So had like this multistage algorithm thing, trying to figure out what the minimum viable model for each bit was, trying to optimize every bit of it as you build that out, right? Like you're trying to think about accuracy, a bunch of other metrics and performance and cost. And mostly just no one was doing anything to independently evaluate all the models. And certainly not to look at the trade-offs for speed and cost. So we basically set out just to build a thing that developers could look at to see the trade-offs between all of those things measured independently across all the models and providers. Honestly, it was probably meant to be a side project when we first started doing it.swyx [00:05:49]: Like we didn't like get together and say like, Hey, like we're going to stop working on all this stuff. I'm like, this is going to be our main thing. When I first called you, I think you hadn't decided on starting a company yet.Micah [00:05:58]: That's actually true. I don't even think we'd pause like, like George had an acquittance job. I didn't quit working on my legal AI thing. Like it was genuinely a side project.George [00:06:05]: We built it because we needed it as people building in the space and thought, Oh, other people might find it useful too. So we'll buy domain and link it to the Vercel deployment that we had and tweet about it. And, but very quickly it started getting attention. Thank you, Swyx for, I think doing an initial retweet and spotlighting it there. This project that we released. And then very quickly though, it was useful to others, but very quickly it became more useful as the number of models released accelerated. We had Mixtrel 8x7B and it was a key. That's a fun one. Yeah. Like a open source model that really changed the landscape and opened up people's eyes to other serverless inference providers and thinking about speed, thinking about cost. And so that was a key. And so it became more useful quite quickly. Yeah.swyx [00:07:02]: What I love talking to people like you who sit across the ecosystem is, well, I have theories about what people want, but you have data and that's obviously more relevant. But I want to stay on the origin story a little bit more. When you started out, I would say, I think the status quo at the time was every paper would come out and they would report their numbers versus competitor numbers. And that's basically it. And I remember I did the legwork. I think everyone has some knowledge. I think there's some version of Excel sheet or a Google sheet where you just like copy and paste the numbers from every paper and just post it up there. And then sometimes they don't line up because they're independently run. And so your numbers are going to look better than... Your reproductions of other people's numbers are going to look worse because you don't hold their models correctly or whatever the excuse is. I think then Stanford Helm, Percy Liang's project would also have some of these numbers. And I don't know if there's any other source that you can cite. The way that if I were to start artificial analysis at the same time you guys started, I would have used the Luther AI's eval framework harness. Yup.Micah [00:08:06]: Yup. That was some cool stuff. At the end of the day, running these evals, it's like if it's a simple Q&A eval, all you're doing is asking a list of questions and checking if the answers are right, which shouldn't be that crazy. But it turns out there are an enormous number of things that you've got control for. And I mean, back when we started the website. Yeah. Yeah. Like one of the reasons why we realized that we had to run the evals ourselves and couldn't just take rules from the labs was just that they would all prompt the models differently. And when you're competing over a few points, then you can pretty easily get- You can put the answer into the model. Yeah. That in the extreme. And like you get crazy cases like back when I'm Googled a Gemini 1.0 Ultra and needed a number that would say it was better than GPT-4 and like constructed, I think never published like chain of thought examples. 32 of them in every topic in MLU to run it, to get the score, like there are so many things that you- They never shipped Ultra, right? That's the one that never made it up. Not widely. Yeah. Yeah. Yeah. I mean, I'm sure it existed, but yeah. So we were pretty sure that we needed to run them ourselves and just run them in the same way across all the models. Yeah. And we were, we also did certain from the start that you couldn't look at those in isolation. You needed to look at them alongside the cost and performance stuff. Yeah.swyx [00:09:24]: Okay. A couple of technical questions. I mean, so obviously I also thought about this and I didn't do it because of cost. Yep. Did you not worry about costs? Were you funded already? Clearly not, but you know. No. Well, we definitely weren't at the start.Micah [00:09:36]: So like, I mean, we're paying for it personally at the start. There's a lot of money. Well, the numbers weren't nearly as bad a couple of years ago. So we certainly incurred some costs, but we were probably in the order of like hundreds of dollars of spend across all the benchmarking that we were doing. Yeah. So nothing. Yeah. It was like kind of fine. Yeah. Yeah. These days that's gone up an enormous amount for a bunch of reasons that we can talk about. But yeah, it wasn't that bad because you can also remember that like the number of models we were dealing with was hardly any and the complexity of the stuff that we wanted to do to evaluate them was a lot less. Like we were just asking some Q&A type questions and then one specific thing was for a lot of evals initially, we were just like sampling an answer. You know, like, what's the answer for this? Like, we didn't want to go into the answer directly without letting the models think. We weren't even doing chain of thought stuff initially. And that was the most useful way to get some results initially. Yeah.swyx [00:10:33]: And so for people who haven't done this work, literally parsing the responses is a whole thing, right? Like because sometimes the models, the models can answer any way they feel fit and sometimes they actually do have the right answer, but they just returned the wrong format and they will get a zero for that unless you work it into your parser. And that involves more work. And so, I mean, but there's an open question whether you should give it points for not following your instructions on the format.Micah [00:11:00]: It depends what you're looking at, right? Because you can, if you're trying to see whether or not it can solve a particular type of reasoning problem, and you don't want to test it on its ability to do answer formatting at the same time, then you might want to use an LLM as answer extractor approach to make sure that you get the answer out no matter how unanswered. But these days, it's mostly less of a problem. Like, if you instruct a model and give it examples of what the answers should look like, it can get the answers in your format, and then you can do, like, a simple regex.swyx [00:11:28]: Yeah, yeah. And then there's other questions around, I guess, sometimes if you have a multiple choice question, sometimes there's a bias towards the first answer, so you have to randomize the responses. All these nuances, like, once you dig into benchmarks, you're like, I don't know how anyone believes the numbers on all these things. It's so dark magic.Micah [00:11:47]: You've also got, like… You've got, like, the different degrees of variance in different benchmarks, right? Yeah. So, if you run four-question multi-choice on a modern reasoning model at the temperatures suggested by the labs for their own models, the variance that you can see on a four-question multi-choice eval is pretty enormous if you only do a single run of it and it has a small number of questions, especially. So, like, one of the things that we do is run an enormous number of all of our evals when we're developing new ones and doing upgrades to our intelligence index to bring in new things. Yeah. So, that we can dial in the right number of repeats so that we can get to the 95% confidence intervals that we're comfortable with so that when we pull that together, we can be confident in intelligence index to at least as tight as, like, a plus or minus one at a 95% confidence. Yeah.swyx [00:12:32]: And, again, that just adds a straight multiple to the cost. Oh, yeah. Yeah, yeah.George [00:12:37]: So, that's one of many reasons that cost has gone up a lot more than linearly over the last couple of years. We report a cost to run the artificial analysis. We report a cost to run the artificial analysis intelligence index on our website, and currently that's assuming one repeat in terms of how we report it because we want to reflect a bit about the weighting of the index. But our cost is actually a lot higher than what we report there because of the repeats.swyx [00:13:03]: Yeah, yeah, yeah. And probably this is true, but just checking, you don't have any special deals with the labs. They don't discount it. You just pay out of pocket or out of your sort of customer funds. Oh, there is a mix. So, the issue is that sometimes they may give you a special end point, which is… Ah, 100%.Micah [00:13:21]: Yeah, yeah, yeah. Exactly. So, we laser focus, like, on everything we do on having the best independent metrics and making sure that no one can manipulate them in any way. There are quite a lot of processes we've developed over the last couple of years to make that true for, like, the one you bring up, like, right here of the fact that if we're working with a lab, if they're giving us a private endpoint to evaluate a model, that it is totally possible. That what's sitting behind that black box is not the same as they serve on a public endpoint. We're very aware of that. We have what we call a mystery shopper policy. And so, and we're totally transparent with all the labs we work with about this, that we will register accounts not on our own domain and run both intelligence evals and performance benchmarks… Yeah, that's the job. …without them being able to identify it. And no one's ever had a problem with that. Because, like, a thing that turns out to actually be quite a good… …good factor in the industry is that they all want to believe that none of their competitors could manipulate what we're doing either.swyx [00:14:23]: That's true. I never thought about that. I've been in the database data industry prior, and there's a lot of shenanigans around benchmarking, right? So I'm just kind of going through the mental laundry list. Did I miss anything else in this category of shenanigans? Oh, potential shenanigans.Micah [00:14:36]: I mean, okay, the biggest one, like, that I'll bring up, like, is more of a conceptual one, actually, than, like, direct shenanigans. It's that the things that get measured become things that get targeted by labs that they're trying to build, right? Exactly. So that doesn't mean anything that we should really call shenanigans. Like, I'm not talking about training on test set. But if you know that you're going to be great at another particular thing, if you're a researcher, there are a whole bunch of things that you can do to try to get better at that thing that preferably are going to be helpful for a wide range of how actual users want to use the thing that you're building. But will not necessarily work. Will not necessarily do that. So, for instance, the models are exceptional now at answering competition maths problems. There is some relevance of that type of reasoning, that type of work, to, like, how we might use modern coding agents and stuff. But it's clearly not one for one. So the thing that we have to be aware of is that once an eval becomes the thing that everyone's looking at, scores can get better on it without there being a reflection of overall generalized intelligence of these models. Getting better. That has been true for the last couple of years. It'll be true for the next couple of years. There's no silver bullet to defeat that other than building new stuff to stay relevant and measure the capabilities that matter most to real users. Yeah.swyx [00:15:58]: And we'll cover some of the new stuff that you guys are building as well, which is cool. Like, you used to just run other people's evals, but now you're coming up with your own. And I think, obviously, that is a necessary path once you're at the frontier. You've exhausted all the existing evals. I think the next point in history that I have for you is AI Grant that you guys decided to join and move here. What was it like? I think you were in, like, batch two? Batch four. Batch four. Okay.Micah [00:16:26]: I mean, it was great. Nat and Daniel are obviously great. And it's a really cool group of companies that we were in AI Grant alongside. It was really great to get Nat and Daniel on board. Obviously, they've done a whole lot of great work in the space with a lot of leading companies and were extremely aligned. With the mission of what we were trying to do. Like, we're not quite typical of, like, a lot of the other AI startups that they've invested in.swyx [00:16:53]: And they were very much here for the mission of what we want to do. Did they say any advice that really affected you in some way or, like, were one of the events very impactful? That's an interesting question.Micah [00:17:03]: I mean, I remember fondly a bunch of the speakers who came and did fireside chats at AI Grant.swyx [00:17:09]: Which is also, like, a crazy list. Yeah.George [00:17:11]: Oh, totally. Yeah, yeah, yeah. There was something about, you know, speaking to Nat and Daniel about the challenges of working through a startup and just working through the questions that don't have, like, clear answers and how to work through those kind of methodically and just, like, work through the hard decisions. And they've been great mentors to us as we've built artificial analysis. Another benefit for us was that other companies in the batch and other companies in AI Grant are pushing the capabilities. Yeah. And I think that's a big part of what AI can do at this time. And so being in contact with them, making sure that artificial analysis is useful to them has been fantastic for supporting us in working out how should we build out artificial analysis to continue to being useful to those, like, you know, building on AI.swyx [00:17:59]: I think to some extent, I'm mixed opinion on that one because to some extent, your target audience is not people in AI Grants who are obviously at the frontier. Yeah. Do you disagree?Micah [00:18:09]: To some extent. To some extent. But then, so a lot of what the AI Grant companies are doing is taking capabilities coming out of the labs and trying to push the limits of what they can do across the entire stack for building great applications, which actually makes some of them pretty archetypical power users of artificial analysis. Some of the people with the strongest opinions about what we're doing well and what we're not doing well and what they want to see next from us. Yeah. Yeah. Because when you're building any kind of AI application now, chances are you're using a whole bunch of different models. You're maybe switching reasonably frequently for different models and different parts of your application to optimize what you're able to do with them at an accuracy level and to get better speed and cost characteristics. So for many of them, no, they're like not commercial customers of ours, like we don't charge for all our data on the website. Yeah. They are absolutely some of our power users.swyx [00:19:07]: So let's talk about just the evals as well. So you start out from the general like MMU and GPQA stuff. What's next? How do you sort of build up to the overall index? What was in V1 and how did you evolve it? Okay.Micah [00:19:22]: So first, just like background, like we're talking about the artificial analysis intelligence index, which is our synthesis metric that we pulled together currently from 10 different eval data sets to give what? We're pretty much the same as that. Pretty confident is the best single number to look at for how smart the models are. Obviously, it doesn't tell the whole story. That's why we published the whole website of all the charts to dive into every part of it and look at the trade-offs. But best single number. So right now, it's got a bunch of Q&A type data sets that have been very important to the industry, like a couple that you just mentioned. It's also got a couple of agentic data sets. It's got our own long context reasoning data set and some other use case focused stuff. As time goes on. The things that we're most interested in that are going to be important to the capabilities that are becoming more important for AI, what developers are caring about, are going to be first around agentic capabilities. So surprise, surprise. We're all loving our coding agents and how the model is going to perform like that and then do similar things for different types of work are really important to us. The linking to use cases to economically valuable use cases are extremely important to us. And then we've got some of the. Yeah. These things that the models still struggle with, like working really well over long contexts that are not going to go away as specific capabilities and use cases that we need to keep evaluating.swyx [00:20:46]: But I guess one thing I was driving was like the V1 versus the V2 and how bad it was over time.Micah [00:20:53]: Like how we've changed the index to where we are.swyx [00:20:55]: And I think that reflects on the change in the industry. Right. So that's a nice way to tell that story.Micah [00:21:00]: Well, V1 would be completely saturated right now. Almost every model coming out because doing things like writing the Python functions and human evil is now pretty trivial. It's easy to forget, actually, I think how much progress has been made in the last two years. Like we obviously play the game constantly of like the today's version versus last week's version and the week before and all of the small changes in the horse race between the current frontier and who has the best like smaller than 10B model like right now this week. Right. And that's very important to a lot of developers and people and especially in this particular city of San Francisco. But when you zoom out a couple of years ago, literally most of what we were doing to evaluate the models then would all be 100% solved by even pretty small models today. And that's been one of the key things, by the way, that's driven down the cost of intelligence at every tier of intelligence. We can talk about more in a bit. So V1, V2, V3, we made things harder. We covered a wider range of use cases. And we tried to get closer to things developers care about as opposed to like just the Q&A type stuff that MMLU and GPQA represented. Yeah.swyx [00:22:12]: I don't know if you have anything to add there. Or we could just go right into showing people the benchmark and like looking around and asking questions about it. Yeah.Micah [00:22:21]: Let's do it. Okay. This would be a pretty good way to chat about a few of the new things we've launched recently. Yeah.George [00:22:26]: And I think a little bit about the direction that we want to take it. And we want to push benchmarks. Currently, the intelligence index and evals focus a lot on kind of raw intelligence. But we kind of want to diversify how we think about intelligence. And we can talk about it. But kind of new evals that we've kind of built and partnered on focus on topics like hallucination. And we've got a lot of topics that I think are not covered by the current eval set that should be. And so we want to bring that forth. But before we get into that.swyx [00:23:01]: And so for listeners, just as a timestamp, right now, number one is Gemini 3 Pro High. Then followed by Cloud Opus at 70. Just 5.1 high. You don't have 5.2 yet. And Kimi K2 Thinking. Wow. Still hanging in there. So those are the top four. That will date this podcast quickly. Yeah. Yeah. I mean, I love it. I love it. No, no. 100%. Look back this time next year and go, how cute. Yep.George [00:23:25]: Totally. A quick view of that is, okay, there's a lot. I love it. I love this chart. Yeah.Micah [00:23:30]: This is such a favorite, right? Yeah. And almost every talk that George or I give at conferences and stuff, we always put this one up first to just talk about situating where we are in this moment in history. This, I think, is the visual version of what I was saying before about the zooming out and remembering how much progress there's been. If we go back to just over a year ago, before 01, before Cloud Sonnet 3.5, we didn't have reasoning models or coding agents as a thing. And the game was very, very different. If we go back even a little bit before then, we're in the era where, when you look at this chart, open AI was untouchable for well over a year. And, I mean, you would remember that time period well of there being very open questions about whether or not AI was going to be competitive, like full stop, whether or not open AI would just run away with it, whether we would have a few frontier labs and no one else would really be able to do anything other than consume their APIs. I am quite happy overall that the world that we have ended up in is one where... Multi-model. Absolutely. And strictly more competitive every quarter over the last few years. Yeah. This year has been insane. Yeah.George [00:24:42]: You can see it. This chart with everything added is hard to read currently. There's so many dots on it, but I think it reflects a little bit what we felt, like how crazy it's been.swyx [00:24:54]: Why 14 as the default? Is that a manual choice? Because you've got service now in there that are less traditional names. Yeah.George [00:25:01]: It's models that we're kind of highlighting by default in our charts, in our intelligence index. Okay.swyx [00:25:07]: You just have a manually curated list of stuff.George [00:25:10]: Yeah, that's right. But something that I actually don't think every artificial analysis user knows is that you can customize our charts and choose what models are highlighted. Yeah. And so if we take off a few names, it gets a little easier to read.swyx [00:25:25]: Yeah, yeah. A little easier to read. Totally. Yeah. But I love that you can see the all one jump. Look at that. September 2024. And the DeepSeek jump. Yeah.George [00:25:34]: Which got close to OpenAI's leadership. They were so close. I think, yeah, we remember that moment. Around this time last year, actually.Micah [00:25:44]: Yeah, yeah, yeah. I agree. Yeah, well, a couple of weeks. It was Boxing Day in New Zealand when DeepSeek v3 came out. And we'd been tracking DeepSeek and a bunch of the other global players that were less known over the second half of 2024 and had run evals on the earlier ones and stuff. I very distinctly remember Boxing Day in New Zealand, because I was with family for Christmas and stuff, running the evals and getting back result by result on DeepSeek v3. So this was the first of their v3 architecture, the 671b MOE.Micah [00:26:19]: And we were very, very impressed. That was the moment where we were sure that DeepSeek was no longer just one of many players, but had jumped up to be a thing. The world really noticed when they followed that up with the RL working on top of v3 and R1 succeeding a few weeks later. But the groundwork for that absolutely was laid with just extremely strong base model, completely open weights that we had as the best open weights model. So, yeah, that's the thing that you really see in the game. But I think that we got a lot of good feedback on Boxing Day. us on Boxing Day last year.George [00:26:48]: Boxing Day is the day after Christmas for those not familiar.George [00:26:54]: I'm from Singapore.swyx [00:26:55]: A lot of us remember Boxing Day for a different reason, for the tsunami that happened. Oh, of course. Yeah, but that was a long time ago. So yeah. So this is the rough pitch of AAQI. Is it A-A-Q-I or A-A-I-I? I-I. Okay. Good memory, though.Micah [00:27:11]: I don't know. I'm not used to it. Once upon a time, we did call it Quality Index, and we would talk about quality, performance, and price, but we changed it to intelligence.George [00:27:20]: There's been a few naming changes. We added hardware benchmarking to the site, and so benchmarks at a kind of system level. And so then we changed our throughput metric to, we now call it output speed, and thenswyx [00:27:32]: throughput makes sense at a system level, so we took that name. Take me through more charts. What should people know? Obviously, the way you look at the site is probably different than how a beginner might look at it.Micah [00:27:42]: Yeah, that's fair. There's a lot of fun stuff to dive into. Maybe so we can hit past all the, like, we have lots and lots of emails and stuff. The interesting ones to talk about today that would be great to bring up are a few of our recent things, I think, that probably not many people will be familiar with yet. So first one of those is our omniscience index. So this one is a little bit different to most of the intelligence evils that we've run. We built it specifically to look at the embedded knowledge in the models and to test hallucination by looking at when the model doesn't know the answer, so not able to get it correct, what's its probability of saying, I don't know, or giving an incorrect answer. So the metric that we use for omniscience goes from negative 100 to positive 100. Because we're simply taking off a point if you give an incorrect answer to the question. We're pretty convinced that this is an example of where it makes most sense to do that, because it's strictly more helpful to say, I don't know, instead of giving a wrong answer to factual knowledge question. And one of our goals is to shift the incentive that evils create for models and the labs creating them to get higher scores. And almost every evil across all of AI up until this point, it's been graded by simple percentage correct as the main metric, the main thing that gets hyped. And so you should take a shot at everything. There's no incentive to say, I don't know. So we did that for this one here.swyx [00:29:22]: I think there's a general field of calibration as well, like the confidence in your answer versus the rightness of the answer. Yeah, we completely agree. Yeah. Yeah.George [00:29:31]: On that. And one reason that we didn't do that is because. Or put that into this index is that we think that the, the way to do that is not to ask the models how confident they are.swyx [00:29:43]: I don't know. Maybe it might be though. You put it like a JSON field, say, say confidence and maybe it spits out something. Yeah. You know, we have done a few evils podcasts over the, over the years. And when we did one with Clementine of hugging face, who maintains the open source leaderboard, and this was one of her top requests, which is some kind of hallucination slash lack of confidence calibration thing. And so, Hey, this is one of them.Micah [00:30:05]: And I mean, like anything that we do, it's not a perfect metric or the whole story of everything that you think about as hallucination. But yeah, it's pretty useful and has some interesting results. Like one of the things that we saw in the hallucination rate is that anthropics Claude models at the, the, the very left-hand side here with the lowest hallucination rates out of the models that we've evaluated amnesty is on. That is an interesting fact. I think it probably correlates with a lot of the previously, not really measured vibes stuff that people like about some of the Claude models. Is the dataset public or what's is it, is there a held out set? There's a hell of a set for this one. So we, we have published a public test set, but we we've only published 10% of it. The reason is that for this one here specifically, it would be very, very easy to like have data contamination because it is just factual knowledge questions. We would. We'll update it at a time to also prevent that, but with yeah, kept most of it held out so that we can keep it reliable for a long time. It leads us to a bunch of really cool things, including breakdown quite granularly by topic. And so we've got some of that disclosed on the website publicly right now, and there's lots more coming in terms of our ability to break out very specific topics. Yeah.swyx [00:31:23]: I would be interested. Let's, let's dwell a little bit on this hallucination one. I noticed that Haiku hallucinates less than Sonnet hallucinates less than Opus. And yeah. Would that be the other way around in a normal capability environments? I don't know. What's, what do you make of that?George [00:31:37]: One interesting aspect is that we've found that there's not really a, not a strong correlation between intelligence and hallucination, right? That's to say that the smarter the models are in a general sense, isn't correlated with their ability to, when they don't know something, say that they don't know. It's interesting that Gemini three pro preview was a big leap over here. Gemini 2.5. Flash and, and, and 2.5 pro, but, and if I add pro quickly here.swyx [00:32:07]: I bet pro's really good. Uh, actually no, I meant, I meant, uh, the GPT pros.George [00:32:12]: Oh yeah.swyx [00:32:13]: Cause GPT pros are rumored. We don't know for a fact that it's like eight runs and then with the LM judge on top. Yeah.George [00:32:20]: So we saw a big jump in, this is accuracy. So this is just percent that they get, uh, correct and Gemini three pro knew a lot more than the other models. And so big jump in accuracy. But relatively no change between the Google Gemini models, between releases. And the hallucination rate. Exactly. And so it's likely due to just kind of different post-training recipe, between the, the Claude models. Yeah.Micah [00:32:45]: Um, there's, there's driven this. Yeah. You can, uh, you can partially blame us and how we define intelligence having until now not defined hallucination as a negative in the way that we think about intelligence.swyx [00:32:56]: And so that's what we're changing. Uh, I know many smart people who are confidently incorrect.George [00:33:02]: Uh, look, look at that. That, that, that is very humans. Very true. And there's times and a place for that. I think our view is that hallucination rate makes sense in this context where it's around knowledge, but in many cases, people want the models to hallucinate, to have a go. Often that's the case in coding or when you're trying to generate newer ideas. One eval that we added to artificial analysis is, is, is critical point and it's really hard, uh, physics problems. Okay.swyx [00:33:32]: And is it sort of like a human eval type or something different or like a frontier math type?George [00:33:37]: It's not dissimilar to frontier frontier math. So these are kind of research questions that kind of academics in the physics physics world would be able to answer, but models really struggled to answer. So the top score here is not 9%.swyx [00:33:51]: And when the people that, that created this like Minway and, and, and actually off via who was kind of behind sweep and what organization is this? Oh, is this, it's Princeton.George [00:34:01]: Kind of range of academics from, from, uh, different academic institutions, really smart people. They talked about how they turn the models up in terms of the temperature as high temperature as they can, where they're trying to explore kind of new ideas in physics as a, as a thought partner, just because they, they want the models to hallucinate. Um, yeah, sometimes it's something new. Yeah, exactly.swyx [00:34:21]: Um, so not right in every situation, but, um, I think it makes sense, you know, to test hallucination in scenarios where it makes sense. Also, the obvious question is, uh, this is one of. Many that there is there, every lab has a system card that shows some kind of hallucination number, and you've chosen to not, uh, endorse that and you've made your own. And I think that's a, that's a choice. Um, totally in some sense, the rest of artificial analysis is public benchmarks that other people can independently rerun. You provide it as a service here. You have to fight the, well, who are we to, to like do this? And your, your answer is that we have a lot of customers and, you know, but like, I guess, how do you converge the individual?Micah [00:35:08]: I mean, I think, I think for hallucinations specifically, there are a bunch of different things that you might care about reasonably, and that you'd measure quite differently, like we've called this a amnesty and solutionation rate, not trying to declare the, like, it's humanity's last hallucination. You could, uh, you could have some interesting naming conventions and all this stuff. Um, the biggest picture answer to that. It's something that I actually wanted to mention. Just as George was explaining, critical point as well is, so as we go forward, we are building evals internally. We're partnering with academia and partnering with AI companies to build great evals. We have pretty strong views on, in various ways for different parts of the AI stack, where there are things that are not being measured well, or things that developers care about that should be measured more and better. And we intend to be doing that. We're not obsessed necessarily with that. Everything we do, we have to do entirely within our own team. Critical point. As a cool example of where we were a launch partner for it, working with academia, we've got some partnerships coming up with a couple of leading companies. Those ones, obviously we have to be careful with on some of the independent stuff, but with the right disclosure, like we're completely comfortable with that. A lot of the labs have released great data sets in the past that we've used to great success independently. And so it's between all of those techniques, we're going to be releasing more stuff in the future. Cool.swyx [00:36:26]: Let's cover the last couple. And then we'll, I want to talk about your trends analysis stuff, you know? Totally.Micah [00:36:31]: So that actually, I have one like little factoid on omniscience. If you go back up to accuracy on omniscience, an interesting thing about this accuracy metric is that it tracks more closely than anything else that we measure. The total parameter count of models makes a lot of sense intuitively, right? Because this is a knowledge eval. This is the pure knowledge metric. We're not looking at the index and the hallucination rate stuff that we think is much more about how the models are trained. This is just what facts did they recall? And yeah, it tracks parameter count extremely closely. Okay.swyx [00:37:05]: What's the rumored size of GPT-3 Pro? And to be clear, not confirmed for any official source, just rumors. But rumors do fly around. Rumors. I get, I hear all sorts of numbers. I don't know what to trust.Micah [00:37:17]: So if you, if you draw the line on omniscience accuracy versus total parameters, we've got all the open ways models, you can squint and see that likely the leading frontier models right now are quite a lot bigger than the ones that we're seeing right now. And the one trillion parameters that the open weights models cap out at, and the ones that we're looking at here, there's an interesting extra data point that Elon Musk revealed recently about XAI that for three trillion parameters for GROK 3 and 4, 6 trillion for GROK 5, but that's not out yet. Take those together, have a look. You might reasonably form a view that there's a pretty good chance that Gemini 3 Pro is bigger than that, that it could be in the 5 to 10 trillion parameters. To be clear, I have absolutely no idea, but just based on this chart, like that's where you would, you would land if you have a look at it. Yeah.swyx [00:38:07]: And to some extent, I actually kind of discourage people from guessing too much because what does it really matter? Like as long as they can serve it as a sustainable cost, that's about it. Like, yeah, totally.George [00:38:17]: They've also got different incentives in play compared to like open weights models who are thinking to supporting others in self-deployment for the labs who are doing inference at scale. It's I think less about total parameters in many cases. When thinking about inference costs and more around number of active parameters. And so there's a bit of an incentive towards larger sparser models. Agreed.Micah [00:38:38]: Understood. Yeah. Great. I mean, obviously if you're a developer or company using these things, not exactly as you say, it doesn't matter. You should be looking at all the different ways that we measure intelligence. You should be looking at cost to run index number and the different ways of thinking about token efficiency and cost efficiency based on the list prices, because that's all it matters.swyx [00:38:56]: It's not as good for the content creator rumor mill where I can say. Oh, GPT-4 is this small circle. Look at GPT-5 is this big circle. And then there used to be a thing for a while. Yeah.Micah [00:39:07]: But that is like on its own, actually a very interesting one, right? That is it just purely that chances are the last couple of years haven't seen a dramatic scaling up in the total size of these models. And so there's a lot of room to go up properly in total size of the models, especially with the upcoming hardware generations. Yes.swyx [00:39:29]: So, you know. Taking off my shitposting face for a minute. Yes. Yes. At the same time, I do feel like, you know, especially coming back from Europe, people do feel like Ilya is probably right that the paradigm is doesn't have many more orders of magnitude to scale out more. And therefore we need to start exploring at least a different path. GDPVal, I think it's like only like a month or so old. I was also very positive when it first came out. I actually talked to Tejo, who was the lead researcher on that. Oh, cool. And you have your own version.George [00:39:59]: It's a fantastic. It's a fantastic data set. Yeah.swyx [00:40:01]: And maybe it will recap for people who are still out of it. It's like 44 tasks based on some kind of GDP cutoff that's like meant to represent broad white collar work that is not just coding. Yeah.Micah [00:40:12]: Each of the tasks have a whole bunch of detailed instructions, some input files for a lot of them. It's within the 44 is divided into like two hundred and twenty two to five, maybe subtasks that are the level of that we run through the agenda. And yeah, they're really interesting. I will say that it doesn't. It doesn't necessarily capture like all the stuff that people do at work. No avail is perfect is always going to be more things to look at, largely because in order to make the tasks well enough to find that you can run them, they need to only have a handful of input files and very specific instructions for that task. And so I think the easiest way to think about them are that they're like quite hard take home exam tasks that you might do in an interview process.swyx [00:40:56]: Yeah, for listeners, it is not no longer like a long prompt. It is like, well, here's a zip file with like a spreadsheet or a PowerPoint deck or a PDF and go nuts and answer this question.George [00:41:06]: OpenAI released a great data set and they released a good paper which looks at performance across the different web chat bots on the data set. It's a great paper, encourage people to read it. What we've done is taken that data set and turned it into an eval that can be run on any model. So we created a reference agentic harness that can run. Run the models on the data set, and then we developed evaluator approach to compare outputs. That's kind of AI enabled, so it uses Gemini 3 Pro Preview to compare results, which we tested pretty comprehensively to ensure that it's aligned to human preferences. One data point there is that even as an evaluator, Gemini 3 Pro, interestingly, doesn't do actually that well. So that's kind of a good example of what we've done in GDPVal AA.swyx [00:42:01]: Yeah, the thing that you have to watch out for with LLM judge is self-preference that models usually prefer their own output, and in this case, it was not. Totally.Micah [00:42:08]: I think the way that we're thinking about the places where it makes sense to use an LLM as judge approach now, like quite different to some of the early LLM as judge stuff a couple of years ago, because some of that and MTV was a great project that was a good example of some of this a while ago was about judging conversations and like a lot of style type stuff. Here, we've got the task that the grader and grading model is doing is quite different to the task of taking the test. When you're taking the test, you've got all of the agentic tools you're working with, the code interpreter and web search, the file system to go through many, many turns to try to create the documents. Then on the other side, when we're grading it, we're running it through a pipeline to extract visual and text versions of the files and be able to provide that to Gemini, and we're providing the criteria for the task and getting it to pick which one more effectively meets the criteria of the task. Yeah. So we've got the task out of two potential outcomes. It turns out that we proved that it's just very, very good at getting that right, matched with human preference a lot of the time, because I think it's got the raw intelligence, but it's combined with the correct representation of the outputs, the fact that the outputs were created with an agentic task that is quite different to the way the grading model works, and we're comparing it against criteria, not just kind of zero shot trying to ask the model to pick which one is better.swyx [00:43:26]: Got it. Why is this an ELO? And not a percentage, like GDP-VAL?George [00:43:31]: So the outputs look like documents, and there's video outputs or audio outputs from some of the tasks. It has to make a video? Yeah, for some of the tasks. Some of the tasks.swyx [00:43:43]: What task is that?George [00:43:45]: I mean, it's in the data set. Like be a YouTuber? It's a marketing video.Micah [00:43:49]: Oh, wow. What? Like model has to go find clips on the internet and try to put it together. The models are not that good at doing that one, for now, to be clear. It's pretty hard to do that with a code editor. I mean, the computer stuff doesn't work quite well enough and so on and so on, but yeah.George [00:44:02]: And so there's no kind of ground truth, necessarily, to compare against, to work out percentage correct. It's hard to come up with correct or incorrect there. And so it's on a relative basis. And so we use an ELO approach to compare outputs from each of the models between the task.swyx [00:44:23]: You know what you should do? You should pay a contractor, a human, to do the same task. And then give it an ELO and then so you have, you have human there. It's just, I think what's helpful about GDPVal, the OpenAI one, is that 50% is meant to be normal human and maybe Domain Expert is higher than that, but 50% was the bar for like, well, if you've crossed 50, you are superhuman. Yeah.Micah [00:44:47]: So we like, haven't grounded this score in that exactly. I agree that it can be helpful, but we wanted to generalize this to a very large number. It's one of the reasons that presenting it as ELO is quite helpful and allows us to add models and it'll stay relevant for quite a long time. I also think it, it can be tricky looking at these exact tasks compared to the human performance, because the way that you would go about it as a human is quite different to how the models would go about it. Yeah.swyx [00:45:15]: I also liked that you included Lama 4 Maverick in there. Is that like just one last, like...Micah [00:45:20]: Well, no, no, no, no, no, no, it is the, it is the best model released by Meta. And... So it makes it into the homepage default set, still for now.George [00:45:31]: Other inclusion that's quite interesting is we also ran it across the latest versions of the web chatbots. And so we have...swyx [00:45:39]: Oh, that's right.George [00:45:40]: Oh, sorry.swyx [00:45:41]: I, yeah, I completely missed that. Okay.George [00:45:43]: No, not at all. So that, which has a checkered pattern. So that is their harness, not yours, is what you're saying. Exactly. And what's really interesting is that if you compare, for instance, Claude 4.5 Opus using the Claude web chatbot, it performs worse than the model in our agentic harness. And so in every case, the model performs better in our agentic harness than its web chatbot counterpart, the harness that they created.swyx [00:46:13]: Oh, my backwards explanation for that would be that, well, it's meant for consumer use cases and here you're pushing it for something.Micah [00:46:19]: The constraints are different and the amount of freedom that you can give the model is different. Also, you like have a cost goal. We let the models work as long as they want, basically. Yeah. Do you copy paste manually into the chatbot? Yeah. Yeah. That's, that was how we got the chatbot reference. We're not going to be keeping those updated at like quite the same scale as hundreds of models.swyx [00:46:38]: Well, so I don't know, talk to a browser base. They'll, they'll automate it for you. You know, like I have thought about like, well, we should turn these chatbot versions into an API because they are legitimately different agents in themselves. Yes. Right. Yeah.Micah [00:46:53]: And that's grown a huge amount of the last year, right? Like the tools. The tools that are available have actually diverged in my opinion, a fair bit across the major chatbot apps and the amount of data sources that you can connect them to have gone up a lot, meaning that your experience and the way you're using the model is more different than ever.swyx [00:47:10]: What tools and what data connections come to mind when you say what's interesting, what's notable work that people have done?Micah [00:47:15]: Oh, okay. So my favorite example on this is that until very recently, I would argue that it was basically impossible to get an LLM to draft an email for me in any useful way. Because most times that you're sending an email, you're not just writing something for the sake of writing it. Chances are context required is a whole bunch of historical emails. Maybe it's notes that you've made, maybe it's meeting notes, maybe it's, um, pulling something from your, um, any of like wherever you at work store stuff. So for me, like Google drive, one drive, um, in our super base databases, if we need to do some analysis or some data or something, preferably model can be plugged into all of those things and can go do some useful work based on it. The things that like I find most impressive currently that I am somewhat surprised work really well in late 2025, uh, that I can have models use super base MCP to query read only, of course, run a whole bunch of SQL queries to do pretty significant data analysis. And. And make charts and stuff and can read my Gmail and my notion. And okay. You actually use that. That's good. That's, that's, that's good. Is that a cloud thing? To various degrees of order, but chat GPD and Claude right now, I would say that this stuff like barely works in fairness right now. Like.George [00:48:33]: Because people are actually going to try this after they hear it. If you get an email from Micah, odds are it wasn't written by a chatbot.Micah [00:48:38]: So, yeah, I think it is true that I have never actually sent anyone an email drafted by a chatbot. Yet.swyx [00:48:46]: Um, and so you can, you can feel it right. And yeah, this time, this time next year, we'll come back and see where it's going. Totally. Um, super base shout out another famous Kiwi. Uh, I don't know if you've, you've any conversations with him about anything in particular on AI building and AI infra.George [00:49:03]: We have had, uh, Twitter DMS, um, with, with him because we're quite big, uh, super base users and power users. And we probably do some things more manually than we should in. In, in super base support line because you're, you're a little bit being super friendly. One extra, um, point regarding, um, GDP Val AA is that on the basis of the overperformance of the models compared to the chatbots turns out, we realized that, oh, like our reference harness that we built actually white works quite well on like gen generalist agentic tasks. This proves it in a sense. And so the agent harness is very. Minimalist. I think it follows some of the ideas that are in Claude code and we, all that we give it is context management capabilities, a web search, web browsing, uh, tool, uh, code execution, uh, environment. Anything else?Micah [00:50:02]: I mean, we can equip it with more tools, but like by default, yeah, that's it. We, we, we give it for GDP, a tool to, uh, view an image specifically, um, because the models, you know, can just use a terminal to pull stuff in text form into context. But to pull visual stuff into context, we had to give them a custom tool, but yeah, exactly. Um, you, you can explain an expert. No.George [00:50:21]: So it's, it, we turned out that we created a good generalist agentic harness. And so we, um, released that on, on GitHub yesterday. It's called stirrup. So if people want to check it out and, and it's a great, um, you know, base for, you know, generalist, uh, building a generalist agent for more specific tasks.Micah [00:50:39]: I'd say the best way to use it is get clone and then have your favorite coding. Agent make changes to it, to do whatever you want, because it's not that many lines of code and the coding agents can work with it. Super well.swyx [00:50:51]: Well, that's nice for the community to explore and share and hack on it. I think maybe in, in, in other similar environments, the terminal bench guys have done, uh, sort of the Harbor. Uh, and so it's, it's a, it's a bundle of, well, we need our minimal harness, which for them is terminus and we also need the RL environments or Docker deployment thing to, to run independently. So I don't know if you've looked at it. I don't know if you've looked at the harbor at all, is that, is that like a, a standard that people want to adopt?George [00:51:19]: Yeah, we've looked at it from a evals perspective and we love terminal bench and, and host benchmarks of, of, of terminal mention on artificial analysis. Um, we've looked at it from a, from a coding agent perspective, but could see it being a great, um, basis for any kind of agents. I think where we're getting to is that these models have gotten smart enough. They've gotten better, better tools that they can perform better when just given a minimalist. Set of tools and, and let them run, let the model control the, the agentic workflow rather than using another framework that's a bit more built out that tries to dictate the, dictate the flow. Awesome.swyx [00:51:56]: Let's cover the openness index and then let's go into the report stuff. Uh, so that's the, that's the last of the proprietary art numbers, I guess. I don't know how you sort of classify all these. Yeah.Micah [00:52:07]: Or call it, call it, let's call it the last of like the, the three new things that we're talking about from like the last few weeks. Um, cause I mean, there's a, we do a mix of stuff that. Where we're using open source, where we open source and what we do and, um, proprietary stuff that we don't always open source, like long context reasoning data set last year, we did open source. Um, and then all of the work on performance benchmarks across the site, some of them, we looking to open source, but some of them, like we're constantly iterating on and so on and so on and so on. So there's a huge mix, I would say, just of like stuff that is open source and not across the side. So that's a LCR for people. Yeah, yeah, yeah, yeah.swyx [00:52:41]: Uh, but let's, let's, let's talk about open.Micah [00:52:42]: Let's talk about openness index. This. Here is call it like a new way to think about how open models are. We, for a long time, have tracked where the models are open weights and what the licenses on them are. And that's like pretty useful. That tells you what you're allowed to do with the weights of a model, but there is this whole other dimension to how open models are. That is pretty important that we haven't tracked until now. And that's how much is disclosed about how it was made. So transparency about data, pre-training data and post-training data. And whether you're allowed to use that data and transparency about methodology and training code. So basically, those are the components. We bring them together to score an openness index for models so that you can in one place get this full picture of how open models are.swyx [00:53:32]: I feel like I've seen a couple other people try to do this, but they're not maintained. I do think this does matter. I don't know what the numbers mean apart from is there a max number? Is this out of 20?George [00:53:44]: It's out of 18 currently, and so we've got an openness index page, but essentially these are points, you get points for being more open across these different categories and the maximum you can achieve is 18. So AI2 with their extremely open OMO3 32B think model is the leader in a sense.swyx [00:54:04]: It's hooking face.George [00:54:05]: Oh, with their smaller model. It's coming soon. I think we need to run, we need to get the intelligence benchmarks right to get it on the site.swyx [00:54:12]: You can't have it open in the next. We can not include hooking face. We love hooking face. We'll have that, we'll have that up very soon. I mean, you know, the refined web and all that stuff. It's, it's amazing. Or is it called fine web? Fine web. Fine web.Micah [00:54:23]: Yeah, yeah, no, totally. Yep. One of the reasons this is cool, right, is that if you're trying to understand the holistic picture of the models and what you can do with all the stuff the company's contributing, this gives you that picture. And so we are going to keep it up to date alongside all the models that we do intelligence index on, on the site. And it's just an extra view to understand.swyx [00:54:43]: Can you scroll down to this? The, the, the, the trade-offs chart. Yeah, yeah. That one. Yeah. This, this really matters, right? Obviously, because you can b

The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch
20VC: ElevenLabs Hits $200M ARR: The Untold Story of Europe's Fastest Growing AI Startup | The Real Cost of AI from Talent to Data Centres | How US VCs are in a Different League to Europeans | The Future of Foundation Models with Mati Staniszewski

The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch

Play Episode Listen Later Sep 8, 2025 74:06


Mati Staniszewski is the Co-Founder and CEO of ElevenLabs, the world's leading AI voice platform. Since launching in 2022, ElevenLabs has raised over $350M, most recently at a $3.3BN valuation, making it one of Europe's fastest AI unicorns. The company counts Andreessen Horowitz, Nat Friedman, Daniel Gross, and Sequoia Capital among its backers. Today, Mati announces that the company has hit a staggering $200M ARR. ElevenLabs took 20 months to hit $100M ARR. 10 months to hit $200M ARR. Can they do $300M in 5 months… AGENDA:  [00:00] $100M in 20 Months?! ElevenLabs Untold Growth Story [12:20] Are AI Models Already Plateauing—or Just Getting Started? [14:00] Why OpenAI Can't Beat ElevenLabs  [17:30] The Talent Wars: How Do You Retain World-Class AI Researchers? [23:10] PR vs Product: Why Most Startups Botch Their Launch [36:00] Are U.S. VCs Playing a Different Game Than Europe? [44:00] The Real Cost of AI: Why ElevenLabs Built Its Own Data Centers [59:00] Voice Agents = Multi-Billion Dollar Business of the Future? [01:05:00] Buy OpenAI or Anthropic? Which Foundation Model Wins? [01:09:30] Europe: Strengths, Weaknesses and What Needs to be Done  

Ten Thousand Feet, the OST Podcast
Future-Proofing IoT: Migration, Modernization & AWS Insights

Ten Thousand Feet, the OST Podcast

Play Episode Listen Later Aug 5, 2025 37:47


In this episode of 10,000 Feet, host Richelle Lentz is joined by Rick Krause from Vervint and Daniel Gross from AWS to explore the evolving landscape of connected products and IoT platforms. Together, they unpack the journey from early, DIY-style IoT implementations to today's scalable, secure, and cloud-native solutions powered by AWS. The conversation dives deep into the triggers that signal it's time to migrate—like cost inefficiencies, security vulnerabilities, and limited access to data—and outlines practical strategies for replatforming, including phased rollouts, OTA updates, chaos testing, and blue/green deployments.Listeners will also hear about modernization as a stepping stone for organizations not yet ready for full migration, with tips on optimizing device messaging, leveraging edge processing, and enhancing user experience. Real-world anecdotes, like hacked crosswalks and connected coffee makers, bring the discussion to life while emphasizing the importance of security, interoperability, and customer value. Whether you're managing a growing IoT fleet or just beginning to rethink your platform strategy, this episode offers actionable insights to help future-proof your connected product ecosystem.

Decoder with Nilay Patel
How AI researchers are getting paid like NBA All-Stars

Decoder with Nilay Patel

Play Episode Listen Later Jul 31, 2025 43:44


This is Alex Heath, your Thursday episode guest host and deputy editor at The Verge. Today I'm joined by Hayden Field, The Verge's new senior AI reporter to talk about the AI talent wars and why some researchers are suddenly getting traded like their NBA superstars. Both Hayden and I have been reporting on this for the past several weeks to get a sense of much these companies are paying for top talent, why Big Tech firms like Google are opting to hire instead of acquire, and what it means that some of the most sought-after AI experts in the world are no longer motivated by money alone.  Links:  OpenAI's Windsurf deal is off — and Windsurf's CEO is going to Google | Verge Mark Zuckerberg promises you can trust him with superintelligent AI | Verge Meta is trying to win the AI race with money — but not everyone can be bought | Verge Meta says it's winning the talent war with OpenAI | Command Line Google gets its swag back | Command Line The AI talent wars are just getting started | Command Line Meta tried to buy Safe Superintelligence, hired CEO Daniel Gross instead | CNBC Apple loses top AI models executive to Meta's hiring spree | Bloomberg Meta's AI recruiting campaign finds a new target | Wired Anthropic hires back two coding AI leaders From Anysphere | The Information Credits: Decoder is a production of The Verge and part of the Vox Media Podcast Network. Our producers are Kate Cox and Nick Statt. Our editor is Ursa Wright.  The Decoder music is by Breakmaster Cylinder. Learn more about your ad choices. Visit podcastchoices.com/adchoices

Let's Talk AI
#216 - Grok 4, Project Rainier, Kimi K2

Let's Talk AI

Play Episode Listen Later Jul 14, 2025 102:10 Transcription Available


Our 216th episode with a summary and discussion of last week's big AI news! Recorded on 07/11/2025 Hosted by Andrey Kurenkov and Jeremie Harris. Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai Read out our text newsletter and comment on the podcast at https://lastweekin.ai/. In this episode: xAI launches Grok 4 with breakthrough performance across benchmarks, becoming the first true frontier model outside established labs, alongside a $300/month subscription tier Grok's alignment challenges emerge with antisemitic responses, highlighting the difficulty of steering models toward "truth-seeking" without harmful biases Perplexity and OpenAI launch AI-powered browsers to compete with Google Chrome, signaling a major shift in how users interact with AI systems Meta study reveals AI tools actually slow down experienced developers by 20% on complex tasks, contradicting expectations and anecdotal reports of productivity gains Timestamps + Links: (00:00:10) Intro / Banter (00:01:02) News Preview       Tools & Apps (00:01:59) Elon Musk's xAI launches Grok 4 alongside a $300 monthly subscription | TechCrunch (00:15:28) Elon Musk's AI chatbot is suddenly posting antisemitic tropes (00:29:52) Perplexity launches Comet, an AI-powered web browser | TechCrunch (00:32:54) OpenAI is reportedly releasing an AI browser in the coming weeks | TechCrunch (00:33:27) Replit Launches New Feature for its Agent, CEO Calls it ‘Deep Research for Coding' (00:34:40) Cursor launches a web app to manage AI coding agents (00:36:07) Cursor apologizes for unclear pricing changes that upset users | TechCrunch       Applications & Business (00:39:10) Lovable on track to raise $150M at $2B valuation (00:41:11) Amazon built a massive AI supercluster for Anthropic called Project Rainier – here's what we know so far (00:46:35) Elon Musk confirms xAI is buying an overseas power plant and shipping the whole thing to the U.S. to power its new data center — 1 million AI GPUs and up to 2 Gigawatts of power under one roof, equivalent to powering 1.9 million homes (00:48:16) Microsoft's own AI chip delayed six months in major setback — in-house chip now reportedly expected in 2026, but won't hold a candle to Nvidia Blackwell (00:49:54) Ilya Sutskever becomes CEO of Safe Superintelligence after Meta poached Daniel Gross (00:52:46) OpenAI's Stock Compensation Reflect Steep Costs of Talent Wars       Projects & Open Source (00:58:04) Hugging Face Releases SmolLM3: A 3B Long-Context, Multilingual Reasoning Model - MarkTechPost (00:58:33) Kimi K2: Open Agentic Intelligence (00:58:59) Kyutai Releases 2B Parameter Streaming Text-to-Speech TTS with 220ms Latency and 2.5M Hours of Training       Research & Advancements (01:02:14) Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning (01:07:58) Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity (01:13:03) Mitigating Goal Misgeneralization with Minimax Regret (01:17:01) Correlated Errors in Large Language Models (01:20:31) What skills does SWE-bench Verified evaluate?       Policy & Safety (01:22:53) Evaluating Frontier Models for Stealth and Situational Awareness (01:25:49) When Chain of Thought is Necessary, Language Models Struggle to Evade Monitors (01:30:09) Why Do Some Language Models Fake Alignment While Others Don't? (01:34:35) Positive review only': Researchers hide AI prompts in papers (01:35:40) Google faces EU antitrust complaint over AI Overviews (01:36:41) The transfer of user data by DeepSeek to China is unlawful': Germany calls for Google and Apple to remove the AI app from their stores (01:37:30) Virology Capabilities Test (VCT): A Multimodal Virology Q&A Benchmark

The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch
20VC: Daniel Gross and Nat Friedman: Acquired by Meta | OpenAI's SBC Bombshell: More Stock Comp Than Revenue | Privat Equity is Back: Olo Bought for $2BN | Microsoft Lays Off 9,000 People: Is This Just the Start | Will Sequoia Part with Shaun Maguire

The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch

Play Episode Listen Later Jul 10, 2025 67:29


Agenda: [00:00] The AI Talent Crisis No One's Ready For [03:00] Daniel Gross and Nat Friedman: Why Two Legendary VCs Walked Away From $1B to Join Meta [12:00] Meta's AI Talent Magnet: Will It Actually Work? [15:00] Cursor Is Breaking the Market: Can Anyone Compete? [18:30] OpenAI's SBC Bombshell: More Stock Comp Than Revenue [22:00] CoreWeave's Power Play: Buying Their Landlords [26:00] Is Circle Next to Go Shopping with Meme Equity? [28:00] PE Is Back: The Olo Take-Private Explained [35:00] Why Triple, Triple, Double, Double Is No Longer Sexy [41:00] QSBS Hack: The Billionaire's Tax Loophole You're Missing [48:00] Microsoft's AI Layoffs: Salespeople Are Dead, Long Live Engineers [50:00] “If You Need a Week to Learn AI, You Should Be Fired” [53:00] Will Sequoia's Sean Maguire Be Pushed Out? Place Your Bets [57:00] Will There Be a Recession in 2025? Jason Bets $75K It's a No [1:00:00] Is Linda Yaccarino Still CEO of X by Year-End? [1:03:00] Circle and CoreWeave's Meme Rally: Real or Mirage?  

Recomendados de la semana en iVoox.com Semana del 5 al 11 de julio del 2021
PC Superinteligencia: Los mercenarios del modelo

Recomendados de la semana en iVoox.com Semana del 5 al 11 de julio del 2021

Play Episode Listen Later Jul 6, 2025 17:05


¿Quién controla la inteligencia artificial? ¿Y cuánto cuesta fichar al futuro? En este episodio desnudamos la guerra secreta por el talento más valioso del planeta: el que entrena modelos. Te aviso: hay millones, CEOs despechados, startups sin producto... y una lluvia de colonia con aroma a ego tecnológico. PUNTOS CLAVE DEL CAPÍTULO Meta va a la caza y captura de cerebros premium: ofertas, sueldos obscenos y fichajes que parecen del PC Fútbol. OpenAI se siente saqueada y responde con drama, recalibraciones y perfumes éticos. Thinking Machines y otras startups sin producto, pero con valoraciones de 10.000 millones, nos recuerdan que aquí manda la narrativa. Mira Murati, Daniel Gross, Ilya Sutskever… todos tienen precio o propuesta. Musk y Trump estrenan nueva telenovela: entre partidos cerdito, amenazas de deportación y guerras de egos. Ranking sorpresa: ¿qué modelo respeta más tu privacidad? (Spoiler: no es Meta, ni Gemini, ni Copilot). Y sí, ya nadie habla de AGI. Ahora lo que mola es la Superinteligencia. Piensa Poco, Scrollea Mucho: El Capitalismo Límbico Nos Tiene https://go.ivoox.com/rf/140187412 Ilya Sutskever y la Superinteligencia Segura: ¿Está el Ex-Jefe de OpenAI un Paso Adelante? https://go.ivoox.com/rf/134801029 HUMANIA: WIN-WIN Corporativo. La Era Trump-Musk https://go.ivoox.com/rf/135752500 Artículos de Referencia https://www.wired.com/story/mark-zuckerberg-welcomes-superintelligence-team https://www.wired.com/story/mark-zuckerberg-meta-offer-top-ai-talent-300-million https://www.entrepreneur.com/business-news/ai-startup-tml-from-ex-openai-exec-mira-murati-pays-500000/494108 https://www.elconfidencial.com/tecnologia/novaceno/2025-07-02/zuckerberg-inteligencia-artificial-openia-futuro-tencologia_4164371 https://www.xataka.com/robotica-e-ia/industria-ia-se-ha-convertido-juego-tronos-eso-revela-verdad-inquietante-ia-casi-todo-humo https://www.wired.com/story/sam-altman-meta-ai-talent-poaching-spree-leaked-messages https://www.businessinsider.es/economia/elon-musk-arremete-nuevo-partido-republicano-ley-presupuestaria-trump-ha-sido-batalla-1470327 https://www.businessinsider.es/economia/ultima-disputa-musk-trump-clavo-ataud-tesla-inversor-ross-gerber-1470868 https://es-us.noticias.yahoo.com/chatbot-inteligencia-artificial-protege-datos-183103697.html

Tech Update | BNR
'Apple wilde concurrentie aangaan met Amazon Web Services en Microsoft Azure'

Tech Update | BNR

Play Episode Listen Later Jul 4, 2025 5:31


Apple zou plannen hebben gehad om de eigen datacenters open te stellen voor externe partijen. Daarmee zou het bedrijf direct concurreren met de clouddiensten van Amazon, Microsoft en Google. Of het plan nog steeds uitgevoerd wordt is niet bekend. The Information schrijft over de plannen van Apple, die samenhangen met project ACDC. Dat project staat voor Apple Chips in Data Centers. Dat plan werd opgetuigd om eigen chips te ontwerpen voor AI-datacenters van het bedrijf. Het zou in eerste instantie enkel voor Apple's eigen diensten en AI-plannen gebruikt moeten worden, zoals nu bijvoorbeeld Private Cloud Compute en de AI-berekeningen van Siri draaien op die datacenters. Maar Michael Abott had binnen Apple plannen opgegooid om ook een businessmodel te maken voor de datacenters, door de rekenkracht van de datacenters te verhuren aan externe ontwikkelaars. Of dat plan nog altijd uitgewerkt wordt is onbekend en onzeker, omdat initiatiefnemer Abott in 2023 het bedrijf verliet. Volgens The Information zouden er wel in 2024 nog gesprekken over gevoerd zijn, maar inmiddels lijkt Apple vooral ook de handen vol te hebben aan het verbeteren van de eigen AI-capaciteiten. Verder in deze Tech Update: Meta trekt Daniel Gross aan voor Super AI-team, maar Ilya Sutskever weet verleiding te weerstaan en neemt plek als CEO in bij start-up Safe Superintelligence Zometeen in De Schaal van Hebben: Aircooling fan-jacket van FERNIDA See omnystudio.com/listener for privacy information.

The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch
20VC: Nat Friedman and Daniel Gross Bought with Zuck's $100BN AI Budget | Navan Files to Go Public and Canva Pulls the Brakes: Why and What Happens | Why Larry Ellison is the Smartest Man in Tech | Substance or Sizzle: What is Real and What is BS in AI

The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch

Play Episode Listen Later Jun 26, 2025 73:57


Agenda: 04:21 - The Meta Acquisition Bombshell: Nat Friedman & Daniel Gross Join Facebook?! 06:00 - Facebook's $100 Billion Gamble: Can Zuck Buy the Future? 09:27 - The “Magic Room” Theory: Why Only Insiders Get Billion-Dollar Paydays 11:27 - Is Loyalty Dead in Silicon Valley? The Great Talent Exodus 16:00 - Harvey's $5 Billion Valuation: Genius or Bubble? 19:00 - The AI Gold Rush: Can Software Really Eat Human Labor? 22:00 - The B2B Unicorn Dilemma: Are There Enough $100B Companies? 25:00 - IPO Mania: Why Navan, Canva, and Circle Are Shaking Up the Markets 29:00 - Meme Stocks & Market Madness: The Circle Rollercoaster 32:00 - Canva's Billion-Dollar Question: Why Stay Private? 36:00 - Larry Ellison's Power Play: How to Buy Back Your Own Empire 39:00 - The Sales Tech Revolution: Why “Cheating” Tools Are the Next Big Thing 42:00 - Slack Lockdown: Is B2B Software About to Get Ugly? 45:00 - The Ultimate Quickfire: Will Trump Launch a Smartphone? Will the US Seize AI?    

Everyday AI Podcast – An AI and ChatGPT Podcast
EP 552: $100 million salaries, Meta fails to acquire Perplexity, Microsoft's AI job cuts and more AI News That Matters

Everyday AI Podcast – An AI and ChatGPT Podcast

Play Episode Listen Later Jun 23, 2025 46:53


Imagine turning down $100 million salaries. That's apparently what's happening at OpenAI. And that's just the tip of the newsworthy AI iceberg for the week. ↳ Meta reportedly failed to acquire Perplexity. Could Apple try next? ↳ Why is Microsoft cutting so many jobs? ↳ Why are AI systems blackmailing at will? ↳ Will too much AI use lead to brain rot?Let's talk AI news shorties. Newsletter: Sign up for our free daily newsletterMore on this Episode: Episode PageJoin the discussion: Thoughts on this? Join the convo.Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineupWebsite: YourEverydayAI.comEmail The Show: info@youreverydayai.comConnect with Jordan on LinkedInTopics Covered in This Episode:$100M AI Salaries Being DeclinedMeta's AI Talent War EffortsMeta's Unsuccessful Acquisitions OverviewBrain Rot Concerns with AI UseOpenAI's $200M DoD ContractGoogle's Voice AI Search RolloutGoogle Gemini 2.5 in ProductionSoftBank's $1T Robotics InvestmentAnthropic's AI Model Risks ExposedMicrosoft and Amazon AI Job CutsTimestamps:00:00 Weekly AI News and Insights04:17 Meta's Major AI Acquisitions08:50 AI Impact on Student Writing Skills12:53 OpenAI Expands Government AI Program15:31 Google Launches Voice AI Search19:32 Google AI Models' Stability Feature22:55 "Project Crystal Land Initiative"27:17 AI Acquisition Talks Intensify29:43 "Apple Eyes Perplexity Acquisition"31:54 Apple's Potential Market Decline36:57 AI Ethics and Safety Concerns40:44 Amazon Warns of AI-Driven Layoffs42:44 AI's Impact on Job Market45:24 "Canvas Tips for Business Intelligence"Keywords:$100 million salaries, AI talent war, Meta, OpenAI, AI signing bonuses, Andrew Bosworth, Scale AI acquisition, Alexander Wang, Safe Superintelligence, Daniel Gross, Nat Friedman, Perplexity AI, Brain rot from AI, chat GBT and brain, MIT study on AI, SAT style essays using AI, AI neural activity, AI and cognitive effort, AI in government, $200 million contract with Department of Defense, OpenAI in security, ChatGPTgov, Federal AI initiatives, Google Gemini 2.5, AI mission-critical business, Gemini 2.5 flashlight, AI model stability, SoftBank $1 trillion investment, Project Crystal Land, Arizona robotics hub, Taiwan Semiconductor Manufacturing Company, Embodied AI, AI job cuts, Microsoft layoffs, Amazon AI workforce, Anthropic study on AI ethics, AI blackmail, Google voice-based AI search, AI search live, New AI apps, Apple acquisition interest in Perplexity, AI-powered search engine, Siri integration, AI-driven efficiencies, GenSend Everyday AI and Jordan a text message. (We can't reply back unless you leave contact info) Try Google Veo 3 today! Sign up at gemini.google to get started. Try Google Veo 3 today! Sign up at gemini.google to get started.

Doppelgänger Tech Talk
Zuckerbergs $100m Angebote für OpenAI Talente, Big Tech Entlassungen und Eutelsat #468

Doppelgänger Tech Talk

Play Episode Listen Later Jun 20, 2025 47:34


Andy Jassy kündigt KI-bedingte Entlassungen an. Microsoft zieht nach und plant einen massiven Stellenabbau im Vertrieb. Ein gescheiterter Abwerbeversuch von Meta bei OpenAI sorgt für Schlagzeilen, während Mark Zuckerberg weiterhin aggressiv in KI-Talente investiert. MITs bahnbrechendes SEAL-Framework könnte die Selbstverbesserung von KI auf ein neues Niveau heben. Große Technologieunternehmen fordern ein zehnjähriges Verbot von KI-Regulierungen durch US-Bundesstaaten. Goldman Sachs hebt das Verbot für SPAC-Deals auf. Meta erweitert seine Smart Glasses-Kollektion mit EssilorLuxottica. Australien plant ein Social-Media-Verbot für Teenager unter 16 Jahren. Teslas Robotaxi-Ambitionen stehen auf dem Prüfstand. Eutelsat sichert sich Milliarden zur Schaffung eines europäischen Starlink-Konkurrenten. Elon Musk plant, X zur ultimativen Plattform für Investitionen und Handel zu machen.  Unterstütze unseren Podcast und entdecke die Angebote unserer Werbepartner auf ⁠⁠⁠⁠⁠doppelgaenger.io/werbung⁠⁠⁠⁠⁠. Vielen Dank!  Philipp Glöckler und Philipp Klöckner sprechen heute über: (00:00:00) Amazon & Microsoft Stellenabbau  (00:08:45) OpenAI Meta Abwerbeversuch (00:17:00) MIT Forschung SEAL Selbstverbesserung (00:22:00) Big Tech KI Regulierung Lobbyarbeit (00:25:20) Goldman Sachs SPACs Verbot (00:30:30) Smart Glasses Oakley (00:33:30) Australien Soziale Medien Teenager Verbot (00:35:30) Tesla Robotaxi (00:36:40) Eutelsat Finanzierung Europa Starlink (00:39:30) Elon Musk X Investment Shownotes Amazon-CEO: KI führt zu kleinerer Belegschaft – wsj.com Microsoft plant weitere Stellenstreichungen im Vertrieb – bloomberg.com Sam Altman: Meta scheiterte beim Abwerben von OpenAI-Talenten – techcrunch.com Meta versuchte, Safe Superintelligence zu kaufen, stellte CEO Daniel Gross ein – cnbc.com MIT-Forscher enthüllen "SEAL": Neuer Schritt zur selbstverbessernden KI – syncedreview.com Big Tech fordert 10-jähriges Verbot für US-Bundesstaaten zur Regulierung von KI – ft.com Goldman hebt Verbot von SPACs auf – bloomberg.com Meta-Smart-Brillen mit Oakley und Prada, Erweiterung des Luxottica-Deals – cnbc.com Verbot von sozialen Medien für Teenager in Australien rückt näher – bloomberg.com Teslas Robotaxi-Ambitionen stehen vor einer Bewährungsprobe nach dem Start – ft.com Eutelsat sammelt 1,35 Milliarden Euro für Europas Starlink-Rivalen – bloomberg.com Elon Musks X: Investitionen und Handel in 'Super-App' – ft.com X droht mit Klage, um Werbegeschäft zu sichern – wsj.com Gloeckler Peak Big Tech Employment  – linkedin.com

Sharp Tech with Ben Thompson
(Preview) Meta Continues Its AI Spending Spree, More Fun with OpenAI and Microsoft, ‘Apple in China' and Related Matters

Sharp Tech with Ben Thompson

Play Episode Listen Later Jun 19, 2025 14:22


Ben and Andrew react to reports that in addition to adding Scale AI CEO Alexandr Wang, Meta's now in advanced talks to hire prominent AI investors and frequent Stratechery guests, Nat Friedman and Daniel Gross, for an offer that could exceed $1 billion. Then: Follow-ups on Perplexity and Apple, the calculus for both sides amid reports of between OpenAI and Microsoft, a question about ‘Apple in China' and culpability for the last 20 years of decision-making, and thoughts on the competition between the US and China, in general.

New World Podcast
Bonus Episode: Interview with Editor Daniel Gross (UNDER THE BOARDWALK,

New World Podcast

Play Episode Listen Later Jun 5, 2025 73:13


One of the biggest inflection points for New World Pictures was when Roger Corman bought a former lumberyard and turned it into a studio to shoot BATTLE BEYOND THE STARS, which jumpstarted many careers from James Cameron to this week's guest, editor Daniel Gross! Daniel may have started with New World on BATTLE, but he went on to work on New World films like TUFF TURF, SORCERESS and HIGHPOINT while also editing trailers and promotional materials for the company after Corman sold it. He also edited the New World films THE ANNIHILATORS, PRETTY SMART, and UNDER THE BOARDWALK! We discuss Daniel's long career, which also dipped into Cannon Films, worked with legendary genre directors Greydon Clark and Larry Cohen, and edited SPACED INVADERS! This interview discusses a wide range of New World films from two different eras, and Daniel is a hilarious guest with great stories so you don't want to miss this! To watch Daniel's "music video" for Tuff Turf, head here: https://www.youtube.com/watch?v=M8aZNZSV41U. For more about the New World Pictures Podcast, including previous episodes, t-shirts, mugs, sweatshirts, other merch and more, head here: https://newworldpicturespodcast.com/ For all the shows in Someone's Favorite Productions Podcast Network, head here:  https://www.someonesfavoriteproductions.com/  

PULS BIZNESU do słuchania
Sztuczna inteligencja, która odpisuje na maile za Ciebie

PULS BIZNESU do słuchania

Play Episode Listen Later May 19, 2025 29:19


Czy AI może wyręczyć Cię w codziennej pracy biurowej? Zeta Labs, start-up założony przez Fryderyka Wiatrowskiego i Petera Alberta, właśnie nad tym pracuje. Ich rozwiązanie – agent AI o imieniu Jace – ma ułatwić zarządzanie skrzynką mailową. Jace analizuje wiadomości, proponuje odpowiedzi w stylu użytkownika i wysyła je po akceptacji. Technologia wciąż się rozwija, ale już dziś testują ją pierwsi klienci.W rozmowie przyglądamy się drodze dwóch młodych założycieli – od spotkania w Miecie, przez doświadczenia zawodowe w Londynie i Dubaju, aż po rozwój firmy w Warszawie i Monachium. Dowiesz się też, dlaczego ich pomysł przyciągnął takich inwestorów jak Daniel Gross, Nat Friedman, Bartek Pucek czy Mati Staniszewski z ElevenLabs, i co oznacza runda finansowania na poziomie 4,4 mln dolarów.To opowieść o AI w praktyce, o globalnych ambicjach i o tym, jak zbudować start-up z Polski, który przyciąga uwagę Doliny Krzemowej.

The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch
20VC: The Future of Foundation Models | The Future of AI Consumer Apps and Why OpenAI Did a Disservice to Them | The Future of Music: Spotify vs YouTube & Spotify vs TikTok: What Happens with Mikey Shulman @ Suno

The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch

Play Episode Listen Later Jan 10, 2025 57:50


Mikey Shulman is the Co-Founder and CEO of Suno, the leading music AI company. Suno lets everyone make and share music. Mikey has raised over $125M for the company from the likes of Lightspeed, Founder Collective and Nat Friedman and Daniel Gross. Prior to founding Suno, Mikey was the first machine learning engineer and head of machine learning at Kensho technologies, which was acquired by S&P Global for over $500 million.  In Today's Episode with Mikey Shulman: 1. The Future of Models:  Who wins the future of models? Anthropic, OpenAI or X? Will we live in a world of many smaller models? When does it make sense for specialised vs generalised models? Does Mikey believe we will continue to see the benefits of scaling laws? 2. The Future of UI and Consumer Apps:  Why does Mikey believe that OpenAI did AI consumer companies a massive disservice? Why does Mikey believe consumers will not choose their model or pay for a superior model in the future?  Why does Mikey believe that good taste is more important than good skills? Why does Mikey argue physicists and economists make the best ML engineers? 3. The Future of Music:  What is going on with Suno's lawsuit against some of the biggest labels in music? How does Mikey see the future of music discovery? How does Mikey see the battle between Spotify and YouTube playing out? How does Mikey see the battle between TikTok and Spotify playing out?  

Infinite Machine Learning
Voice-to-Voice Foundation Models

Infinite Machine Learning

Play Episode Listen Later Oct 30, 2024 39:08


Alan Cowen is the cofounder and CEO of Hume, a company building voice-to-voice foundation models. They recently raised their $50M Series B from Union Square Ventures, Nat Friedman, Daniel Gross, and others. Alan's favorite book: 1984 (Author: George Orwell)(00:01) Introduction(00:06) Defining Voice-to-Voice Foundation Models(01:26) Historical Context: Handling Voice and Speech Understanding(03:54) Emotion Detection in Voice AI Models(04:33) Training Models to Recognize Human Emotion in Speech(07:19) Cultural Variations in Emotional Expressions(09:00) Semantic Space Theory in Emotion Recognition(12:11) Limitations of Basic Emotion Categories(15:50) Recognizing Blended Emotional States(20:15) Objectivity in Emotion Science(24:37) Practical Aspects of Deploying Voice AI Systems(28:17) Real-Time System Constraints and Latency(31:30) Advancements in Voice AI Models(32:54) Rapid-Fire Round--------Where to find Prateek Joshi: Newsletter: https://prateekjoshi.substack.com Website: https://prateekj.com LinkedIn: https://www.linkedin.com/in/prateek-joshi-91047b19 Twitter: https://twitter.com/prateekvjoshi 

Vorbitorincii. Cu Radu Paraschivescu și Cătălin Striblea
Daniel Gross. Românește nu înseamnă rău

Vorbitorincii. Cu Radu Paraschivescu și Cătălin Striblea

Play Episode Listen Later Sep 26, 2024 136:42


  Prieteni, invitatul acestei ediții de Leaders este Daniel Gross, CEO-ul rețelei de magazine Penny România, o companie cu peste 7.000 de angajați.  A preluat conducerea companiei într-o perioadă în care retailul alimentar se confrunta cu provocări semnificative, inclusiv concurența crescută și schimbările în comportamentul consumatorilor. Sub conducerea sa, Penny a continuat să se extindă pe piața românească, concentrându-se pe oferirea de produse accesibile și pe dezvoltarea unor strategii de sustenabilitate. A fost angajatul cu nr.19 în companie, a lucrat în foarte multe compartimente... știe cum funcționează tot, am putea spune. Se zbate să aducă pe rafturi cât mai multe produse românești, 100%, și aflăm că nu este ușor deloc să facă asta, și spune convingător că noi, românii, suntem și exigenți, și muncitori.  Vizionare plăcută!    02:28 Penny, sponsor principal al Naționalei de fotbal 10:43 Zidul galben 13:22 În inima Naționalei 20:00 Noi, ca români, suntem foarte exigenți și muncitori 28:43 Angajatul nr. 19 35:16 Măsurători pentru vizibilitatea produselor 40:00 Colaborări cu Banca pentru Alimente și Bonapp (aplicație) 42:00 Carnea proaspătă este 100% românească 47:50 Produsele TripluRO 1:00:18 Sortiment românesc - obiectiv strategic 1:01:20 De ce merele poloneze sunt mai ieftine ca cele românești 1:04:00 Prețul contează mai mult decât proveniența produsului 1:14:04 Impactul unui TVA mai mare din 2025 1:24:00 Sunt mai sănătos și mai vesel de când alerg  1:37:07 În România se trăiește (mai) bine 1:49:23 Administrația din România e…. înceată 1:57:43 Cum funcționează un magazin autonom 2:05:57 Concurență  

AI For Everyone
I'M BACK! Wildfire Drones / Ilya Raises a BILLION $$$$$

AI For Everyone

Play Episode Listen Later Sep 8, 2024 9:51


In this episode, we dive into three groundbreaking developments at the intersection of technology, AI, and healthcare that are set to shape the future:### 1. **AI-Driven Drone Swarms Combatting Wildfires**   - Researchers from the University of Sheffield and the University of Bristol in collaboration with Windracers have developed **AI-driven, self-coordinating drone swarms** to fight wildfires more efficiently. These drones use advanced thermal and optical imaging to autonomously detect, assess, and monitor fires.   - Tested by Lancashire Fire and Rescue, these drones can carry up to **100 kg of fire retardant**, making them a formidable tool for early wildfire mitigation. The Windracer ULTRA drones can monitor vast areas—potentially the size of Greece—helping to address challenges in remote wildfire detection and response.   - With climate change causing more frequent and severe wildfires in the UK, this innovation represents a significant step forward in cost-effective, rapid-response firefighting.### 2. **Transforming Healthcare with Smartphone-Based Disease Detection**   - Google has trained a powerful AI model named **HeAR (Health Acoustic Representations)** on 300 million audio samples to detect diseases using just a smartphone. This AI listens to sounds like coughs and breathing patterns to detect early signs of respiratory illnesses, including tuberculosis.   - Partnering with Salcit Technologies in India, Google aims to deploy this technology in high-risk, underserved communities where access to traditional diagnostic tools is limited. The technology could potentially expand to identify other respiratory and cardiovascular conditions, revolutionizing early disease detection and healthcare accessibility worldwide.   - This breakthrough showcases the power of **bioacoustics**—the combination of biology and acoustics—in extracting crucial health information from everyday sounds, making healthcare more accessible and effective in remote areas.### 3. **Safe Superintelligence (SSI) Raises $1 Billion to Build the Future of AI**   - In a monumental move, **Safe Superintelligence (SSI)**, a new AI startup co-founded by former OpenAI chief scientist Ilya Sutskever, raised a staggering **$1 billion in funding** just three months after its inception.   - SSI's mission is to develop superintelligent AI systems that are safe and beneficial for humanity. Co-founded by Sutskever, Daniel Gross, and Daniel Levy, the startup is already valued at **$5 billion** and has attracted funding from major venture capital firms such as Andreessen Horowitz and Sequoia Capital.   - With only ten employees, SSI plans to use the funds to acquire computing power and hire top-tier talent, focusing on AI safety to ensure that superintelligent AI systems surpass human intelligence without posing risks.   - This massive seed round underscores the importance and urgency of AI safety as we venture further into an era dominated by artificial intelligence.### **Key Takeaways:**- **Innovation in Disaster Management:** AI-driven drone swarms represent a significant leap in wildfire mitigation, potentially saving lives and property.- **Revolutionizing Healthcare:** Smartphone-based AI for disease detection could democratize healthcare, providing critical diagnostic capabilities to underserved regions.- **Future of Safe AI:** SSI's unprecedented funding round reflects a growing recognition of the need for safe, superintelligent AI systems that benefit humanity.**Don't miss out on these discussions and more as we explore the future of technology and its potential to reshape our world!**Get intouch with Myles at mylesdhillon@gmail.com

This Week in Pre-IPO Stocks
E146: Safe Superintelligence raises $1B for new AI LLM; OpenAI hits 200M weekly active users for ChatGPT; Salesforce acquires Own company for $1.9B; ByteDance raises $600M for Dongchedi, valued at $3B; Anthropic's Claude AI powers new Amazon Alexa; xAI'

This Week in Pre-IPO Stocks

Play Episode Listen Later Sep 6, 2024 35:37


Send us a textPRE-IPO STOCK FUNDS CLOSING TO NEW INVESTORS ON SEP 13 (NEXT FRIDAY)AG Dillon has seven (7) pre-IPO stock funds closing on Friday, Sep 13. Next Friday. See fund list at www.agdillon.com/product (page 3). Available for purchase at Schwab, Fidelity, or directly at AG Dillon. Email aaron.dillon@agdillon.com to investSubscribe to AG Dillon Pre-IPO Stock Research at agdillon.com/subscribe;- Wednesday = secondary market valuations, revenue multiples, performance, index fact sheets- Saturdays = pre-IPO news and insights, webinar replays00:07 | Safe Superintelligence Raises $1B for new AI LLM- AI venture focused on creating safe AI models- Co-founded by Ilya Sutskever, Daniel Gross, and Daniel Levy- Raised $1B in May 2024 from investors like Andreessen Horowitz and Sequoia Capital- Offices in Palo Alto and Tel Aviv- For-profit entity addressing AI safety00:43 | OpenAI Hits 200M Weekly Active Users for ChatGPT- AI large language model business- ChatGPT now has 200M weekly active users, doubling since Nov 2023- 1M paid corporate users, up from 600K in April 2024- Expected to generate $2B annually from $20/month premium subscriptions- 50% of corporate users are in the U.S.; strong presence in Germany, Japan, U.K.- Secondary market valuation: $103.8B (+20.7% vs Apr 2024 round)01:33 | Salesforce Acquires Own Company for $1.9B- Data management firm specializing in data backup and recovery- Acquired by Salesforce for $1.9B in cash- Previously valued at $3.35B in Aug 2021- 7,000 customers; raised $507.3M from Tiger Global and Salesforce Ventures- Global data backup market valued at $12.9B in 2023, growing at a 10.9% CAGR02:14 | ByteDance Raises $600M for Dongchedi, Valued at $3B- Chinese parent company of TikTok- Raising $600M for car trading platform Dongchedi- Dongchedi boasts 35.7M monthly active users- Competing with platforms like Autohome and Bitauto- Secondary market valuation: $300B (+11.8% vs Dec 2023 round)03:00 | Anthropic's Claude AI Powers New Amazon Alexa- Amazon to release new Alexa powered by Anthropic's Claude AI in October- Paid version to cost $5-$10/month; current version remains free- Estimated $600M in annual sales if 10% of Alexa's 100M users opt for the paid version- Anthropic has a $23.6B secondary market valuation (+31.4% vs Jan 2024 round)03:49 | xAI's Colossus System Becomes Most Powerful AI Trainer- AI large language model business by Elon Musk- Colossus built with 100,000 Nvidia H100 GPUs in 122 days, doubling to 200,000 GPUs- Phase 1 cost estimated at $2B, located in Memphis- Colossus will consume 150 megawatts of power and 1M gallons of water daily for cooling- Secondary market valuation: $26.1B (+8.9% vs May 2024 round)04:45 | X Launches Beta Version of TV App for Fire TV and Google TV- Formerly Twitter, now focusing on becoming a "video-first" platform- Launched beta version of TV app for Amazon Fire TV and Google TV- Initial feedback suggests bugs, but fixes are anticipated soon- Aimed at reviving ad revenue and attracting video creators05:26 | Fidelity Cuts X Holdings Valuation by Another 4%- Fidelity reduced X Holdings (formerly Twitter) valuation by 4% in July- Total decrease of 72% since Elon Musk's acquisition in Oct 2022- New valuation implies X shares are worth $15, down from Musk's original $54.20/share- X's total value now approximately $21B06:03 | Pre-IPO Stock Market Weekly Performance06:48 | Pre-IPO Stock Vintage Index Weekly Performance

Diaspora.nz
S2 | E7 — Min-Kyu Jung (Co-founder & CEO at Ivo) on creating AI-powered legal assistants; the journey from NZ corporate law to leading a hot Silicon Valley startup & why others should move there too.

Diaspora.nz

Play Episode Listen Later Aug 16, 2024 31:01


Min-Kyu Jung is the CEO and co-founder at Ivo, an AI contract law assistant for legal teams, which has raised $6.2 million in funding total from investors including Uncork Capital, Fika Ventures, GD1, Phase One, and Daniel Gross. Min-Kyu got the idea for Ivo (previously Latch) while working as a corporate lawyer in New Zealand, when he saw how much time, effort and money were spent drawing up agreements. His entrepreneurial streak got the better of him — drawn to what he saw as “low-hanging fruit”, under-optimised processes around him in the legal profession, he taught himself how to code in two months and took the leap to start a startup.Ivo works in Microsoft Word to explain legal terms, determine if clauses are market standard and instantly create a summary of an agreement to help speed up the process. After a cold outbound DM landed him an angel investment from Daniel Gross in San Francisco, he moved his whole team over for an initial three months — and never looked back. He thinks other kiwi founders - at least those who aspire to be at the frontiers of AI - should do the same, and issues a challenge to other founders to reflect on where they need to locate to maximise their chances of success.He's not afraid to roll up his sleeves and do the work to sell, get connected with people... even if that means lots of cold outbound: “Kiwis tend to be modest and avoid making impositions on others. You will need to overcome this cultural quirk and simply cold email / DM people you find interesting.” We talk about how social capital flows in the Bay Area, and how it helped him build a local network, recruit his team, land hundreds of customer conversations, and more: “The SF Bay Area has a strong culture of paying it forward. Successful people here are often willing to spend time and social capital helping founders with no network if they seem to be working on something interesting.” We talk about his thesis for AI product development, how founders should think about designing user experiences, how Ivo handles issues with Large Language Model (“LLM”) reliability and hallucinations, and how he's preparing to leverage ever more powerful AI models to his advantage in coming years. This was a fun episode to record — we look forward to your feedback!!  Where to find Min-Kyu online:* LinkedIn: https://www.linkedin.com/in/min-kyu-jung/* Twitter/X: https://twitter.com/mkjungKnow an expat we should feature on diaspora.nz? * reach out via david@diaspora.nz This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit www.diaspora.nz

E68: Tyler Cowen on Talent, the Importance of Stamina, and Predicting Success

Play Episode Listen Later Jul 19, 2024 56:37


This week on Upstream, we're releasing a fascinating discussion with economist, professor, and bestselling author Tyler Cowen about how to find talented people. This was recorded in 2022 around the launch of his book 'Talent: How to Identify Energizers, Creatives, and Winners Around the World' co-authored with Daniel Gross. Tyler and Erik discuss strategies for assessing raw talent, recognizing late bloomers, and fostering an environment conducive to high achievers. They also cover the importance of understanding founder compatibility, building strong peer groups, and the role of mentorship in talent development.

Let's Talk AI
#171 - - Apple Intelligence, Dream Machine, SSI Inc

Let's Talk AI

Play Episode Listen Later Jun 24, 2024 124:01 Transcription Available


Our 171st episode with a summary and discussion of last week's big AI news! With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris) Feel free to leave us feedback here. Read out our text newsletter and comment on the podcast at https://lastweekin.ai/ Email us your questions and feedback at contact@lastweekin.ai and/or hello@gladstone.ai Timestamps + Links: (00:00:00) Intro / Banter Tools & Apps(00:03:13) Apple Intelligence: every new AI feature coming to the iPhone and Mac (00:10:03) ‘We don't need Sora anymore': Luma's new AI video generator Dream Machine slammed with traffic after debut (00:14:48) Runway unveils new hyper realistic AI video model Gen-3 Alpha, capable of 10-second-long clips (00:18:21) Leonardo AI image generator adds new video mode — here's how it works (00:22:31) Anthropic just dropped Claude 3.5 Sonnet with better vision and a sense of humor Applications & Business(00:28:23 ) Sam Altman might reportedly turn OpenAI into a regular for-profit company (00:31:19) Ilya Sutskever, Daniel Gross, Daniel Levy launch Safe Superintelligence Inc. (00:38:53) OpenAI welcomes Sarah Friar (CFO) and Kevin Weil (CPO) (00:41:44) Report: OpenAI Doubled Annualized Revenue in 6 Months (00:44:30) AI startup Adept is in deal talks with Microsoft (00:48:55) Mistral closes €600m at €5.8bn valuation with new lead investor (00:53:12) Huawei Claims Ascend 910B AI Chip Manages To Surpass NVIDIA's A100, A Crucial Alternative For China (00:56:58) Astrocade raises $12M for AI-based social gaming platform Projects & Open Source(01:01:03) Announcing the Open Release of Stable Diffusion 3 Medium, Our Most Sophisticated Image Generation Model to Date (01:05:53) Meta releases flurry of new AI models for audio, text and watermarking (01:09:39) ElevenLabs unveils open-source creator tool for adding sound effects to videos Research & Advancements(01:12:02) Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling (01:22:07) Improve Mathematical Reasoning in Language Models by Automated Process Supervision (01:28:01) Introducing Lamini Memory Tuning: 95% LLM Accuracy, 10x Fewer Hallucinations (01:30:32) An Empirical Study of Mamba-based Language Models (01:31:57) BERTs are Generative In-Context Learners (01:33:33) SELFGOAL: Your Language Agents Already Know How to Achieve High-level Goals Policy & Safety(01:35:16) Sycophancy to subterfuge: Investigating reward tampering in language models (01:42:26) Waymo issues software and mapping recall after robotaxi crashes into a telephone pole (01:45:53) Meta pauses AI models launch in Europe (01:46:44) Refusal in Language Models Is Mediated by a Single Direction Sycophancy to subterfuge: Investigating reward tampering in language models (01:51:38) Huawei exec concerned over China's inability to obtain 3.5nm chips, bemoans lack of advanced chipmaking tools Synthetic Media & Art(01:55:07) It Looked Like a Reliable News Site. It Was an A.I. Chop Shop. (01:57:39) Adobe overhauls terms of service to say it won't train AI on customers' work (01:59:31) Buzzy AI Search Engine Perplexity Is Directly Ripping Off Content From News Outlets (02:02:23) Outro + AI Song 

GREY Journal Daily News Podcast
Why Are Investors Betting Big on Accounting Startups

GREY Journal Daily News Podcast

Play Episode Listen Later Jun 24, 2024 1:24


Klarity, an accounting startup based in San Francisco, raised $70 million in a Series B funding round led by Nat Friedman and Daniel Gross, with additional support from Scale Venture Partners, Tola Capital, Picus Capital, Invus Capital, and Y Combinator. The raised funds will be used to expand Klarity's workforce, tripling it to 390 employees within the year. Klarity employs AI to process data in contracts and internal records, eliminating the need for manual work. This trend of significant funding is also observed in other accounting tech firms like Ageras, FloQast, and DataSnipper, which have also secured substantial investment to automate accounting tasks using AI. AI-driven startups in other sectors, such as legal tech, are also attracting significant investment.Learn more on this news visit us at: https://greyjournal.net/news/ Hosted on Acast. See acast.com/privacy for more information.

This Week in Google (MP3)
TWiG 773: Mouse Jigglers and Keyboard Strokers - Social Media Warning Lables, US Sues Adobe

This Week in Google (MP3)

Play Episode Listen Later Jun 20, 2024 134:42


The Surgeon General Is Wrong. Social Media Doesn't Need Warning Labels The Stanford Internet Observatory is being dismantled Pop Culture Has Become an Oligopoly Tesla takes fight for Elon Musk's pay package back to court US sues Adobe for 'deceiving' subscriptions that are too hard to cancel Apple, Meta set to face EU charges under landmark tech rules, sources say Mozilla buys Anonym, betting privacy is compatible with ads Jeff on Perplexity Discover Luma extends memes Gutenberg animated by Dream Machine Ilya Sutskever, Daniel Gross, Daniel Levy announce Safe Superintelligence Inc. Paper: "ChatGPT is bullshit" How A.I. Is Revolutionizing Drug Development McDonald's is ending its drive-thru AI test US bank Wells Fargo fires employees for 'simulating' being at their keyboards BeReal acquired by mobile apps and games company Voodoo Netflix to Open Massive Entertainment, Dining and Shopping Complexes in Two Cities in 2025 Sounds of the Forest - Soundmap Hamburger Dad Reuters annual news report ChatGPT of the Reuters report Original YouTube deal memo & pitch deck Hosts: Leo Laporte, Jeff Jarvis, and Paris Martineau Download or subscribe to this show at https://twit.tv/shows/this-week-in-google. Get episodes ad-free with Club TWiT at https://twit.tv/clubtwit Sponsors: eufy.com 1password.com/twig

The AI Breakdown: Daily Artificial Intelligence News and Discussions
Ilya Sutskever is Back Building Safe Superintelligence

The AI Breakdown: Daily Artificial Intelligence News and Discussions

Play Episode Listen Later Jun 20, 2024 14:35


After months of speculation, Ilya Sutskever, co-founder of OpenAI, has launched Safe Superintelligence Inc. (SSI) to build safe superintelligence. With a singular focus on creating revolutionary breakthroughs, SSI aims to advance AI capabilities while ensuring safety. Joined by notable figures like Daniel Levy and Daniel Gross, this new venture marks a significant development in the AI landscape. After months of speculation, Ilya Sutskever, co-founder of OpenAI, has launched Safe Superintelligence Inc. (SSI) to build safe superintelligence. With a singular focus on creating revolutionary breakthroughs, SSI aims to advance AI capabilities while ensuring safety. Joined by notable figures like Daniel Levy and Daniel Gross, this new venture marks a significant development in the AI landscape. Learn about their mission, the challenges they face, and the broader implications for the future of AI. Learn how to use AI with the world's biggest library of fun and useful tutorials: https://besuper.ai/ Use code 'youtube' for 50% off your first month. The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614 Subscribe to the newsletter: https://aidailybrief.beehiiv.com/ Join our Discord: https://bit.ly/aibreakdown

All TWiT.tv Shows (MP3)
This Week in Google 773: Mouse Jigglers and Keyboard Strokers

All TWiT.tv Shows (MP3)

Play Episode Listen Later Jun 20, 2024 134:42


The Surgeon General Is Wrong. Social Media Doesn't Need Warning Labels The Stanford Internet Observatory is being dismantled Pop Culture Has Become an Oligopoly Tesla takes fight for Elon Musk's pay package back to court US sues Adobe for 'deceiving' subscriptions that are too hard to cancel Apple, Meta set to face EU charges under landmark tech rules, sources say Mozilla buys Anonym, betting privacy is compatible with ads Jeff on Perplexity Discover Luma extends memes Gutenberg animated by Dream Machine Ilya Sutskever, Daniel Gross, Daniel Levy announce Safe Superintelligence Inc. Paper: "ChatGPT is bullshit" How A.I. Is Revolutionizing Drug Development McDonald's is ending its drive-thru AI test US bank Wells Fargo fires employees for 'simulating' being at their keyboards BeReal acquired by mobile apps and games company Voodoo Netflix to Open Massive Entertainment, Dining and Shopping Complexes in Two Cities in 2025 Sounds of the Forest - Soundmap Hamburger Dad Reuters annual news report ChatGPT of the Reuters report Original YouTube deal memo & pitch deck Hosts: Leo Laporte, Jeff Jarvis, and Paris Martineau Download or subscribe to this show at https://twit.tv/shows/this-week-in-google. Get episodes ad-free with Club TWiT at https://twit.tv/clubtwit Sponsors: eufy.com 1password.com/twig

Radio Leo (Audio)
This Week in Google 773: Mouse Jigglers and Keyboard Strokers

Radio Leo (Audio)

Play Episode Listen Later Jun 20, 2024 134:42


The Surgeon General Is Wrong. Social Media Doesn't Need Warning Labels The Stanford Internet Observatory is being dismantled Pop Culture Has Become an Oligopoly Tesla takes fight for Elon Musk's pay package back to court US sues Adobe for 'deceiving' subscriptions that are too hard to cancel Apple, Meta set to face EU charges under landmark tech rules, sources say Mozilla buys Anonym, betting privacy is compatible with ads Jeff on Perplexity Discover Luma extends memes Gutenberg animated by Dream Machine Ilya Sutskever, Daniel Gross, Daniel Levy announce Safe Superintelligence Inc. Paper: "ChatGPT is bullshit" How A.I. Is Revolutionizing Drug Development McDonald's is ending its drive-thru AI test US bank Wells Fargo fires employees for 'simulating' being at their keyboards BeReal acquired by mobile apps and games company Voodoo Netflix to Open Massive Entertainment, Dining and Shopping Complexes in Two Cities in 2025 Sounds of the Forest - Soundmap Hamburger Dad Reuters annual news report ChatGPT of the Reuters report Original YouTube deal memo & pitch deck Hosts: Leo Laporte, Jeff Jarvis, and Paris Martineau Download or subscribe to this show at https://twit.tv/shows/this-week-in-google. Get episodes ad-free with Club TWiT at https://twit.tv/clubtwit Sponsors: eufy.com 1password.com/twig

This Week in Google (Video HI)
TWiG 773: Mouse Jigglers and Keyboard Strokers - Social Media Warning Lables, US Sues Adobe

This Week in Google (Video HI)

Play Episode Listen Later Jun 20, 2024 134:42


The Surgeon General Is Wrong. Social Media Doesn't Need Warning Labels The Stanford Internet Observatory is being dismantled Pop Culture Has Become an Oligopoly Tesla takes fight for Elon Musk's pay package back to court US sues Adobe for 'deceiving' subscriptions that are too hard to cancel Apple, Meta set to face EU charges under landmark tech rules, sources say Mozilla buys Anonym, betting privacy is compatible with ads Jeff on Perplexity Discover Luma extends memes Gutenberg animated by Dream Machine Ilya Sutskever, Daniel Gross, Daniel Levy announce Safe Superintelligence Inc. Paper: "ChatGPT is bullshit" How A.I. Is Revolutionizing Drug Development McDonald's is ending its drive-thru AI test US bank Wells Fargo fires employees for 'simulating' being at their keyboards BeReal acquired by mobile apps and games company Voodoo Netflix to Open Massive Entertainment, Dining and Shopping Complexes in Two Cities in 2025 Sounds of the Forest - Soundmap Hamburger Dad Reuters annual news report ChatGPT of the Reuters report Original YouTube deal memo & pitch deck Hosts: Leo Laporte, Jeff Jarvis, and Paris Martineau Download or subscribe to this show at https://twit.tv/shows/this-week-in-google. Get episodes ad-free with Club TWiT at https://twit.tv/clubtwit Sponsors: eufy.com 1password.com/twig

All TWiT.tv Shows (Video LO)
This Week in Google 773: Mouse Jigglers and Keyboard Strokers

All TWiT.tv Shows (Video LO)

Play Episode Listen Later Jun 20, 2024 134:42


The Surgeon General Is Wrong. Social Media Doesn't Need Warning Labels The Stanford Internet Observatory is being dismantled Pop Culture Has Become an Oligopoly Tesla takes fight for Elon Musk's pay package back to court US sues Adobe for 'deceiving' subscriptions that are too hard to cancel Apple, Meta set to face EU charges under landmark tech rules, sources say Mozilla buys Anonym, betting privacy is compatible with ads Jeff on Perplexity Discover Luma extends memes Gutenberg animated by Dream Machine Ilya Sutskever, Daniel Gross, Daniel Levy announce Safe Superintelligence Inc. Paper: "ChatGPT is bullshit" How A.I. Is Revolutionizing Drug Development McDonald's is ending its drive-thru AI test US bank Wells Fargo fires employees for 'simulating' being at their keyboards BeReal acquired by mobile apps and games company Voodoo Netflix to Open Massive Entertainment, Dining and Shopping Complexes in Two Cities in 2025 Sounds of the Forest - Soundmap Hamburger Dad Reuters annual news report ChatGPT of the Reuters report Original YouTube deal memo & pitch deck Hosts: Leo Laporte, Jeff Jarvis, and Paris Martineau Download or subscribe to this show at https://twit.tv/shows/this-week-in-google. Get episodes ad-free with Club TWiT at https://twit.tv/clubtwit Sponsors: eufy.com 1password.com/twig

Radio Leo (Video HD)
This Week in Google 773: Mouse Jigglers and Keyboard Strokers

Radio Leo (Video HD)

Play Episode Listen Later Jun 20, 2024 134:42


The Surgeon General Is Wrong. Social Media Doesn't Need Warning Labels The Stanford Internet Observatory is being dismantled Pop Culture Has Become an Oligopoly Tesla takes fight for Elon Musk's pay package back to court US sues Adobe for 'deceiving' subscriptions that are too hard to cancel Apple, Meta set to face EU charges under landmark tech rules, sources say Mozilla buys Anonym, betting privacy is compatible with ads Jeff on Perplexity Discover Luma extends memes Gutenberg animated by Dream Machine Ilya Sutskever, Daniel Gross, Daniel Levy announce Safe Superintelligence Inc. Paper: "ChatGPT is bullshit" How A.I. Is Revolutionizing Drug Development McDonald's is ending its drive-thru AI test US bank Wells Fargo fires employees for 'simulating' being at their keyboards BeReal acquired by mobile apps and games company Voodoo Netflix to Open Massive Entertainment, Dining and Shopping Complexes in Two Cities in 2025 Sounds of the Forest - Soundmap Hamburger Dad Reuters annual news report ChatGPT of the Reuters report Original YouTube deal memo & pitch deck Hosts: Leo Laporte, Jeff Jarvis, and Paris Martineau Download or subscribe to this show at https://twit.tv/shows/this-week-in-google. Get episodes ad-free with Club TWiT at https://twit.tv/clubtwit Sponsors: eufy.com 1password.com/twig

AI DAILY: Breaking News in AI
AI DETECTS DISEASES EARLY

AI DAILY: Breaking News in AI

Play Episode Listen Later Jun 20, 2024 3:49


Plus Europe Struggles For AI Relevance (subscribe below) Like this? Get AIDAILY, delivered to your inbox, every weekday. Subscribe to our newsletter at https://aidaily.us AI-Driven Blood Test Predicts Parkinson's Disease Years Before Symptoms Researchers from UCL and University Medical Center Goettingen have developed an AI-driven blood test capable of predicting Parkinson's disease up to seven years before symptoms appear. This breakthrough, utilizing machine learning to analyze blood biomarkers, offers a promising method for early diagnosis and potential treatment to protect dopamine-producing brain cells. AI System Predicts Heart Attacks Up to 10 Years in Advance Oxford University scientists have developed an AI heart attack scan that can predict heart attacks up to a decade in advance. This AI technology, expected to be assessed by NICE and the NHS, analyzes artery inflammation not visible on standard CT scans, potentially saving thousands of lives annually by providing more accurate diagnoses.  Europe Struggles for AI Relevance Amid US Dominance European AI firms face challenges due to American dominance in AI development, with chatbots often reflecting US cultural nuances, says Peter Sarlin of Silo AI. This "AI sovereignty" issue drives Europe to invest in AI infrastructure. However, without significant tech giants, Europe's efforts may fall short. Recent deals, like Mistral AI's partnership with Microsoft, highlight the continent's dependency on US platforms, complicating Europe's bid for AI independence. AI Poised to Transform Banking Industry, Says Citigroup Citigroup Inc. predicts AI will displace more banking jobs than in any other sector, potentially automating 54% of roles. The technology, which could add $170 billion to the industry by 2028, is already being used to enhance productivity and cut costs. Citigroup's CEO Jane Fraser emphasized moving AI from experimentation to practical application, with uses in custom investment recommendations and cybersecurity. However, AI adoption might not reduce headcount due to the need for AI managers and compliance officers. Despite AI's potential, challenges like chatbot comprehension and risks of misinformation remain. Former OpenAI Chief Scientist Launches Safety-Focused AI Startup Ilya Sutskever, co-founder and former chief scientist at OpenAI, announced the launch of Safe Superintelligence Inc. (SSI), a new AI startup focused on safety. SSI aims to develop a powerful AI system while avoiding commercial pressures. Co-founded by former Apple AI lead Daniel Gross and ex-OpenAI staff Daniel Levy, SSI prioritizes safety and progress without distractions from product cycles --- Send in a voice message: https://podcasters.spotify.com/pod/show/aidaily/message

Tech Update | BNR
Ex-OpenAI-kopstuk begint voor zichzelf met Safe Superintelligence Inc.

Tech Update | BNR

Play Episode Listen Later Jun 20, 2024 6:24


De bij OpenAI vertrokken medeoprichter Ilya Sutskever heeft een eigen AI-bedrijf opgezet: Safe Superintelligence Inc. Joe van Burik vertelt je in deze Tech Update wat je moet weten. Ilya Sutskever is een naam die je bekend kan voorkomen, want deze man koos er mede voor om topman Sam Altman afgelopen najaar bij OpenAI te ontslaan - om daar kort daarna alweer spijt van te betuigen. Even later zorgde Microsoft dat Altman toch weer kon terugkeren en Sutskever vertrok vorige maand bij het bedrijf achter ChatGPT. Nu begint hij voor zichzelf met Safe Superintelligence Inc., oftewel SSI. Die naam verwijst naar dé ultieme kunstmatige intelligentie die ze willen bereiken, iets wat een bepaalde groep AI-onderzoekers en -ondernemers al enige tijd nastreeft. SSI van Sutskèver claimt dat met met focus op veiligheid te doen en zegt daarbij dat anders te doen dan OpenAI, Microsoft maar ook Google. Ook interessant: Sutskèver zet deze club op met een voormalig collega bij OpenAI, Daniel Levy, én met de voormalig AI-topman van Apple, Daniel Gross. Sutskever en Gross zijn allebei opgegroeid in Israël en SSI krijgt kantoren in zowel Palo Alto als Tel Aviv. Geld ophalen is in elk geval geen probleem, zegt Gross tegen Bloomberg, maar in hoeverre hun AI technologie dan in de praktijk veiliger en verantwoorden moet zijn, zal moeten blijken. Verder in deze Tech Update: Citigroup, de grootste Amerikaanse bank na JP Morgan Chase en Bank of America, verwacht dat meer dan de helft van de banen bij banken door AI vervangen kunnen worden Snapchat heeft gepresenteerd hoe met generatieve AI filters voor het delen van beelden gemaakt kan worden See omnystudio.com/listener for privacy information.

The Nonlinear Library
LW - Ilya Sutskever created a new AGI startup by harfe

The Nonlinear Library

Play Episode Listen Later Jun 19, 2024 1:49


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Ilya Sutskever created a new AGI startup, published by harfe on June 19, 2024 on LessWrong. [copy of the whole text of the announcement on ssi.inc, not an endorsement] Safe Superintelligence Inc. Superintelligence is within reach. Building safe superintelligence (SSI) is the most important technical problem of our time. We have started the world's first straight-shot SSI lab, with one goal and one product: a safe superintelligence. It's called Safe Superintelligence Inc. SSI is our mission, our name, and our entire product roadmap, because it is our sole focus. Our team, investors, and business model are all aligned to achieve SSI. We approach safety and capabilities in tandem, as technical problems to be solved through revolutionary engineering and scientific breakthroughs. We plan to advance capabilities as fast as possible while making sure our safety always remains ahead. This way, we can scale in peace. Our singular focus means no distraction by management overhead or product cycles, and our business model means safety, security, and progress are all insulated from short-term commercial pressures. We are an American company with offices in Palo Alto and Tel Aviv, where we have deep roots and the ability to recruit top technical talent. We are assembling a lean, cracked team of the world's best engineers and researchers dedicated to focusing on SSI and nothing else. If that's you, we offer an opportunity to do your life's work and help solve the most important technical challenge of our age. Now is the time. Join us. Ilya Sutskever, Daniel Gross, Daniel Levy June 19, 2024 Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

CFO Bookshelf
The Greatest But Most Obscure Banker of All Time

CFO Bookshelf

Play Episode Listen Later Jun 1, 2024 52:51


Like the show? Send us a text message on what you liked.I had never heard of Edmond Safra until I read Daniel Gross's informative and inspiring biography, A Banker's Journey.For those who knew him and did business with him, he was everyone's favorite banker. His banks never had to write off loans, and many of his early deals were on a handshake. He never needed a government bailout, nor did he ever head to DC complaining about regulations. While his professional and personal story is uplifting, the Shakespearian periods of his life include the American Express saga and how he died.During this conversation, Dan Gross gives us a dozen compelling reasons to revisit this banker's remarkable life.Make More with Matt HeslinExplore strategies to thrive financially, build legacy, and enhance life experiences.Listen on: Apple Podcasts Spotify

The Cost of Glory
88 - Mysteries of the Scrolls — with Nat Friedman

The Cost of Glory

Play Episode Listen Later May 30, 2024 57:14


An interview with Nat Friedman, former CEO of GitHub and creator of the Vesuvius Challenge, which aims to crack the riddles of the Herculaneum Papyri.In this episode:The Genesis of the Vesuvius ChallengeEarly Attempts to Open the ScrollsUsing a Particle Accelerator to Scan the Scrolls!Partnering with Daniel Gross and Brent SealesNat's Childhood experience with Open-source CommunitiesHow to Design Prize Incentives for a Complex ContestDoing Crazy, Strange and Risky ProjectsA Possible Resurgence of Epicureanism? This episode is sponsored by Ancient Language Institute. If you're interested in actually reading the newly unlocked scrolls, you will need to know the languages of the ancient world. The Ancient Language Institute will help you do just that. Registration is now open (till August 10th) for their Fall term where you can take advanced classes in Latin, Ancient Greek, Biblical Hebrew, and Old English.

Big Think
Economist explains the two futures of CRYPTO

Big Think

Play Episode Listen Later May 22, 2024 7:54


Economist Tyler Cowen confirms there are good reasons to be crypto-skeptical. Cryptocurrency is truly a new idea, and it's rare for society to encounter fundamentally new ideas. Cryptocurrency is well positioned to serve a crucial financial and transactional role as a globalized internet grows to include more of our lives. Crypto enthusiasts espouse grand plans that do not sound realistic, while crypto skeptics fail to appreciate the revolutionary nature of the technology. ------------------------------------------------------------------------------------------------------ About Tyler Cowen: Tyler is the Holbert L. Harris Chair of Economics at George Mason University and serves as chairman and general director of the Mercatus Center at George Mason University. He is co-author of the popular economics blog Marginal Revolution and co-founder of the online educational platform Marginal Revolution University. Tyler also writes a column for Bloomberg View, and he has contributed to The Wall Street Journal and Money. In 2011, Bloomberg Businessweek profiled Tyler as “America's Hottest Economist” after his e-book, The Great Stagnation, appeared twice on The New York Times e-book bestseller list. He graduated from George Mason University with a bachelor's degree in economics and earned a Ph.D. in economics from Harvard University. He also runs a podcast series called Conversations with Tyler. His latest book Talent: How to Identify Energizers, Creatives and Winners Around the World is co-authored with venture capitalist Daniel Gross. ---------------------------------------------------------------------------------------------------- About Big Think | Smarter Faster™ ► Big Think The leading source of expert-driven, educational content. With thousands of videos, featuring experts ranging from Bill Clinton to Bill Nye, Big Think helps you get smarter, faster by exploring the big ideas and core skills that define knowledge in the 21st century. Go Deeper with Big Think: ►Become a Big Think Member Get exclusive access to full interviews, early access to new releases, Big Think merch and more ►Get Big Think+ for Business Guide, inspire and accelerate leaders at all levels of your company with the biggest minds in business Learn more about your ad choices. Visit megaphone.fm/adchoices

Big Think
Can AMERICA make a COMEBACK? | Tyler Cowen - BIGTHINK

Big Think

Play Episode Listen Later May 20, 2024 11:02


An interview with economist Tyler Cowen on why American progress has seemed to stall and how we can get it back on track. The rate of progress in American society has been uneven throughout history, argues economist Tyler Cowen. Tremendous periods of growth are followed by periods of stagnation. Periods of growth occur when there is a breakthrough, and other advances quickly follow. For example, the Industrial Revolution and electrification of homes allowed the standard of living to grow at a fast rate, particularly in the early to mid-20th century. But starting in the 70s, progress slowed. One reason is that the easier tasks, like electrification, had already been accomplished. Also, government regulation and a general aversion to risk have made Americans less entrepreneurial. As a result, progress has slowed, and we have not matched our earlier performance. Today, we are at a pivotal crossroads between stagnation and growth. To get back to a growth mindset, he argues, we need to stop taking our prosperity for granted. -------------------------------------------------------------------------------------------- Chapters For Easier Navigation:- 0:00 intro 0:05 whats wrong with america 1:53 can america make a comeback 3:27 when are we going to get vaccines This video is part of The Progress Issue, a Big Think and Freethink special collaboration. ------------------------------------------------------------------------------------------ About Tyler Cowen Tyler is the Holbert L. Harris Chair of Economics at George Mason University and serves as chairman and general director of the Mercatus Center at George Mason University. He is co-author of the popular economics blog Marginal Revolution and co-founder of the online educational platform Marginal Revolution University. Tyler also writes a column for Bloomberg View, and he has contributed to The Wall Street Journal and Money. In 2011, Bloomberg Businessweek profiled Tyler as “America's Hottest Economist” after his e-book, The Great Stagnation, appeared twice on The New York Times e-book bestseller list. He graduated from George Mason University with a bachelor's degree in economics and earned a Ph.D. in economics from Harvard University. He also runs a podcast series called Conversations with Tyler. His latest book Talent: How to Identify Energizers, Creatives and Winners Around the World is co-authored with venture capitalist Daniel Gross. ------------------------------------------------------------------------------------------ About Big Think | Smarter Faster™ ► Big Think The leading source of expert-driven, educational content. With thousands of videos, featuring experts ranging from Bill Clinton to Bill Nye, Big Think helps you get smarter, faster by exploring the big ideas and core skills that define knowledge in the 21st century. Go Deeper with Big Think: ►Become a Big Think Member Get exclusive access to full interviews, early access to new releases, Big Think merch and more ►Get Big Think+ for Business Guide, inspire and accelerate leaders at all levels of your company with the biggest minds in business Learn more about your ad choices. Visit megaphone.fm/adchoices

The Nonlinear Library
EA - Metascience of the Vesuvius Challenge by Maxwell Tabarrok

The Nonlinear Library

Play Episode Listen Later Mar 30, 2024 9:29


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Metascience of the Vesuvius Challenge, published by Maxwell Tabarrok on March 30, 2024 on The Effective Altruism Forum. The Vesuvius Challenge is a million+ dollar contest to read 2,000 year old text from charcoal-papyri using particle accelerators and machine learning. The scrolls come from the ancient villa town of Herculaneum, nearby Pompeii, which was similarly buried and preserved by the eruption of Mt. Vesuvius. The prize fund comes from tech entrepreneurs and investors Nat Friedman, Daniel Gross, and several other donors. In the 9 months after the prize was announced, thousands of researchers and students worked on the problem, decades-long technical challenges were solved, and the amount of recovered text increased from one or two splotchy characters to 15 columns of clear text with more than 2000 characters. The success of the Vesuvius Challenge validates the motivating insight of metascience: It's not about how much we spend, it's about how we spend it. Most debate over science funding concerns a topline dollar amount. Should we double the budget of the NIH? Do we spend too much on Alzheimer's and too little on mRNA? Are we winning the R&D spending race with China? All of these questions implicitly assume a constant exchange rate between spending on science and scientific progress. The Vesuvius Challenge is an illustration of exactly the opposite. The prize pool for this challenge was a little more than a million dollars. Nat Friedman and friends probably spent more on top of that hiring organizers, building the website etc. But still this is pretty small in the context academic grants. A million dollars donated to the NSF or NIH would have been forgotten if it was noticed at all. Even a direct grant to Brent Seales, the computer science professor whose research laid the ground work for reading the scrolls, probably wouldn't have induced a tenth as much progress as the prize pool did, at least not within 9 months. It would have been easy to spend ten times as much on this problem and get ten times less progress out the other end. The money invested in this research was of course necessary but the spending was not sufficient, it needed to be paired with the right mechanism to work. The success of the challenge hinged on design choices at a level of detail beyond just a grants vs prizes dichotomy. Collaboration between contestants was essential for the development of the prize-winning software. The discord server for the challenge was (and is) full of open-sourced tools and discoveries that helped everyone get closer to reading the scrolls. A single, large grand prize is enticing but it's also exclusive. Only one submission can win so the competition becomes more zero-sum and keeping secrets is more rewarding. Even if this larger prize had the same expected value to each contestant, it would not have created as much progress because more research would be duplicated as less is shared. Nat Friedman and friends addressed this problem by creating several smaller progress prizes to reward open-source solutions to specific problems along the path to reading the scrolls or just open ended prize pools for useful community contributions. They also added second-place and runner-up prizes. These prizes funded the creation of data labeling tools that everyone used to train their models and visualizations that helped everyone understand the structure of the scrolls. They also helped fund the contestant's time and money investments in their submissions. Luke Farritor, one of the grand prize winners, used winnings from the First Letters prize to buy the computers that trained his prize winning model. A larger grand prize can theoretically provide the same incentive, but it's a lot harder to buy computers with expected value! Nat and his team also decided to completely swit...

The Nonlinear Library
LW - On Devin by Zvi

The Nonlinear Library

Play Episode Listen Later Mar 18, 2024 16:57


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: On Devin, published by Zvi on March 18, 2024 on LessWrong. Introducing Devin Is the era of AI agents writing complex code systems without humans in the loop upon us? Cognition is calling Devin 'the first AI software engineer.' Here is a two minute demo of Devin benchmarking LLM performance. Devin has its own web browser, which it uses to pull up documentation. Devin has its own code editor. Devin has its own command line. Devin uses debugging print statements and uses the log to fix bugs. Devin builds and deploys entire stylized websites without even being directly asked. What could possibly go wrong? Install this on your computer today. Padme. The Real Deal I would by default assume all demos were supremely cherry-picked. My only disagreement with Austen Allred's statement here is that this rule is not new: Austen Allred: New rule: If someone only shows their AI model in tightly controlled demo environments we all assume it's fake and doesn't work well yet But in this case Patrick Collison is a credible source and he says otherwise. Patrick Collison: These aren't just cherrypicked demos. Devin is, in my experience, very impressive in practice. Here we have Mckay Wrigley using it for half an hour. This does not feel like a cherry-picked example, although of course some amount of select is there if only via the publication effect. He is very much a maximum acceleration guy, for whom everything is always great and the future is always bright, so calibrate for that, but still yes this seems like evidence Devin is for real. This article in Bloomberg from Ashlee Vance has further evidence. It is clear that Devin is a quantum leap over known past efforts in terms of its ability to execute complex multi-step tasks, to adapt on the fly, and to fix its mistakes or be adjusted and keep going. For once, when we wonder 'how did they do that, what was the big breakthrough that made this work' the Cognition AI people are doing not only the safe but also the smart thing and they are not talking. They do have at least one series rival, as Magic.ai has raised $100 million from the venture team of Daniel Gross and Nat Friedman to build 'a superhuman software engineer,' including training their own model. The article seems strange interested in where AI is 'a bubble' as opposed to this amazing new technology. This is one of those 'helps until it doesn't situations' in terms of jobs: vanosh: Seeing this is kinda scary. Like there is no way companies won't go for this instead of humans. Should I really have studied HR? Mckay Wrigley: Learn to code! It makes using Devin even more useful. Devin makes coding more valuable, until we hit so many coders that we are coding everything we need to be coding, or the AI no longer needs a coder in order to code. That is going to be a ways off. And once it happens, if you are not a coder, it is reasonable to ask yourself: What are you even doing? Plumbing while hoping for the best will probably not be a great strategy in that world. The Metric Devin can sometimes (13.8% of the time?!) do actual real jobs on Upwork with nothing but a prompt to 'figure it out.' Aravind Srinivas (CEO Perplexity): This is the first demo of any agent, leave alone coding, that seems to cross the threshold of what is human level and works reliably. It also tells us what is possible by combining LLMs and tree search algorithms: you want systems that can try plans, look at results, replan, and iterate till success. Congrats to Cognition Labs! Andres Gomez Sarmiento: Their results are even more impressive you read the fine print. All the other models were guided whereas devin was not. Amazing. Deedy: I know everyone's taking about it, but Devin's 13% on SWE Bench is actually incredible. Just take a look at a sample SWE Bench problem: this is a task for a human! Shout out to Car...

Business Breakdowns
Match Group: Swipe Right - [Business Breakdowns, EP.133]

Business Breakdowns

Play Episode Listen Later Oct 25, 2023 69:10


This is Matt Reustle and today we are breaking down the giant of online dating. Even if you found love the old-fashioned way, you're likely familiar with the Match brands like Tinder and Hinge, amongst many others. To break down Match, I'm joined by George Hadjia, founder of Bristlemoon Capital. George goes through a background on this industry, what made Match who it is today, and all of the key debates that are driving this stock and all the commentary around it. Please enjoy this breakdown of Match Group. Interested in hiring from the Colossus Community? Click here. For the full show notes, transcript, and links to the best content to learn more, check out the episode page here. ----- This episode is brought to you by Tegus Converge — the first virtual event centered on the world of investor research. When twin brothers Tom and Mike Elnick realized that the research process for investors was broken, they founded Tegus to fix it. Now the people behind the most trusted research platform are bringing institutional investors together to investigate the state — and the future — of fundamental research. On November 8th, join industry luminaries like IGSB Founder Reece Duca and Daniel Gross, AI Expert, Entrepreneur and Investor, to dig into the latest research trends and breakthrough technologies shaping the investment landscape. Register today at tegus.com/register. ----- Business Breakdowns is a property of Colossus, LLC. For more episodes of Business Breakdowns, visit joincolossus.com/episodes. Stay up to date on all our podcasts by signing up to Colossus Weekly, our quick dive every Sunday highlighting the top business and investing concepts from our podcasts and the best of what we read that week. Sign up here. Follow us on Twitter: @JoinColossus | @patrick_oshag | @jspujji | @zbfuss | @ReustleMatt | @domcooke Show Notes (00:03:10) - (First question) - George's response since releasing his recent report on Match (00:04:55) - A general overview of the online dating market (00:10:55) - Comparing the different brands within the dating app industry   (00:14:10) - The reason for the existence of so many niche brands in the market (00:18:55) - The different avenues for these brands when it comes to monetization  (00:21:25) - The breakdown of revenue per customer and the different tiers dating apps offer  (00:24:10) - Customer turnover due to the nature of dating and how the retention rate differs between the different apps (00:28:40) - A snapshot of how the industry has been growing over recent years (00:29:50) - Determining normalized earning profiles and margins when taking into account the lack of marketing spend historically (00:32:40) - The historical percentage of revenue that goes into marketing expenses (00:35:10) - How Bumble's advertising expenditure differs from Match Group brands (00:36:40) - Price competition between different brands and a look at Tinder's introduction of premium monetization tiers (00:39:20) - Dissecting top-line growth and the percentage due to recent price increases  (00:40:10) - An overview of the business' capital allocation and how they intend to invest in the growth of the business (00:42:50) - The new management team's strategy and how it differs from the previous regimes (00:46:25) - Potential changes to Apple app store fees and how it could affect the business  (00:51:10) - A forward outlook at where George expects the business to go in the coming years (00:54:40) - The key risks to the business moving forward (00:57:20) - The threat that Facebook poses in terms of its entry into the market  (01:02:20) - The lessons learned from researching Match Learn more about your ad choices. Visit megaphone.fm/adchoices

Invest Like the Best with Patrick O'Shaughnessy
Aswath Damodaran - Making Sense of the Market Pt. 2 - [Invest Like the Best, EP.349]

Invest Like the Best with Patrick O'Shaughnessy

Play Episode Listen Later Oct 24, 2023 81:26


Today's guest is Aswath Damodaran, who is joining us for a second time on Invest Like the Best. Aswath is a Professor of Finance at NYU's Stern School of Business and is often referred to as the Dean of Valuation for his clarity of thought on the subject. This conversation picks up where we left off 18 months ago and covers a wide range of topics from macro risks to Nvidia and the process of crafting a personal investment philosophy. Please enjoy this great discussion with Aswath Damodaran. Listen to Founders Podcast For the full show notes, transcript, and links to mentioned content, check out the episode page here. ----- This episode is brought to you by Tegus Converge — the first virtual event centered on the world of investor research. When twin brothers Tom and Mike Elnick realized that the research process for investors was broken, they founded Tegus to fix it. Now the people behind the most trusted research platform are bringing institutional investors together to investigate the state — and the future — of fundamental research. On November 8th, join industry luminaries like IGSB Founder Reece Duca and Daniel Gross, AI Expert, Entrepreneur and Investor, to dig into the latest research trends and breakthrough technologies shaping the investment landscape. Register today at tegus.com/register. ----- Invest Like the Best is a property of Colossus, LLC. For more episodes of Invest Like the Best, visit joincolossus.com/episodes.  Past guests include Tobi Lutke, Kevin Systrom, Mike Krieger, John Collison, Kat Cole, Marc Andreessen, Matthew Ball, Bill Gurley, Anu Hariharan, Ben Thompson, and many more. Stay up to date on all our podcasts by signing up to Colossus Weekly, our quick dive every Sunday highlighting the top business and investing concepts from our podcasts and the best of what we read that week. Sign up here. Follow us on Twitter: @patrick_oshag | @JoinColossus Show Notes (00:01:30) - (First question) - The general prevailing narrative in markets today (00:03:45) - The biggest business implications given the current market landscape   (00:05:30) - Why it's bad to have risky founders with cheap capital trying experiments (00:07:38) - The natural rate of interest and how it's priced  (00:08:40) - His updated view and thoughts on what's currently driving inflation   (00:12:20) - Macro variables that most have his attention today (00:13:30) - The nature of the trouble that we're all in  (00:17:30) - Whether or not international equities will become a place of interest (00:20:38) - The unique absolute basis of NVIDIA's growth  (00:22:10) - His take on the new wave of AI in a broad sense  (00:28:00) - Trying to value AI companies without tangible business models (00:31:30) - The parts of his own valuation process that are beyond automation  (00:34:40) - Commonalities between investors who beat the benchmark  (00:37:20) - Episodes on his own path that lead him towards his investment philosophy  (00:40:50) - How he goes about valuing non-traditional companies like sports franchises  (00:45:30) - The world of entertainment and how he sees it as a business today  (00:52:30) - The best business models he's ever seen  (00:54:25) - What valuing Instacart taught him about online grocery shopping (00:57:40) - The most interesting company he valued over the last year (00:59:30) - A well known company he wouldn't bother valuing using his typical model (01:02:10) - How bank failures changed his thinking on our financial systems and banks as businesses writ large  (01:05:00) - The changing attitude towards ESG investing  (01:09:56) - Why there are still so many pools of capital that pursue an active strategy (01:10:51) - Being sick and tired of the conversation always revolving around central banks (01:14:38) - What he's most excited to look into over the coming year (01:16:38) - Major differences between a financial and an accounting balance sheet 

Business Breakdowns
WEX: Fleet Cards - [Business Breakdowns, EP.131]

Business Breakdowns

Play Episode Listen Later Oct 13, 2023 41:56


This is Matt Reustle and today we are breaking down WEX, a big fish in a less known pond. WEX is a leader in the fleet card market - they offer trucking businesses special credit cards which help secure advantaged rates on fuel among many other things. This is a business with a long history as WEX is headquartered in Maine, and really came to life in the 1980s.  To break down WEX, I'm joined by Mark Tomasovic from Energize Capital, a multiple-time guest on Business Breakdowns. We get into the history of this industry and how WEX found a very creative way to accelerate adoption within this market. Please enjoy this breakdown of WEX. Subscribe to Colossus's New Show: Art of Investing Buy a ticket to Patrick and David Senra's live show. Interested in hiring from the Colossus Community? Click here. For the full show notes, transcript, and links to the best content to learn more, check out the episode page here.  ----- This episode is brought to you by Tegus Converge — the first virtual event centered on the world of investor research. When twin brothers Tom and Mike Elnick realized that the research process for investors was broken, they founded Tegus to fix it. Now the people behind the most trusted research platform are bringing institutional investors together to investigate the state — and the future — of fundamental research. On November 8th, join industry luminaries like IGSB Founder Reece Duca and Daniel Gross, AI Expert, Entrepreneur and Investor, to dig into the latest research trends and breakthrough technologies shaping the investment landscape. Register today at tegus.com/register. ----- Business Breakdowns is a property of Colossus, LLC. For more episodes of Business Breakdowns, visit joincolossus.com/episodes. Stay up to date on all our podcasts by signing up to Colossus Weekly, our quick dive every Sunday highlighting the top business and investing concepts from our podcasts and the best of what we read that week. Sign up here. Follow us on Twitter: @JoinColossus | @patrick_oshag | @jspujji | @zbfuss | @ReustleMatt | @domcooke Show Notes (00:02:57) - (First question) - An overview of what WEX is and what they do (00:03:50) - A summary of the market that WEX operates in (00:05:59) - The history of the company's creation (00:08:53) - The importance of signing up large gas companies rather than retail locations (00:11:33) - Value propositions behind providing fleet cards  (00:12:53) - How the economic model works for the cards   (00:13:48) - The percentage of spend equivalent to Visa or Mastercard (00:14:31) - The difficulty behind switching from one fleet card provider to another  (00:17:05) - The role fuel prices play in the total revenue of the business (00:20:09) - Threats to consider on the supply end of the business   (00:21:57) - Recharging at home and the process of receiving a credit (00:23:06) - Other businesses WEX is involved in (00:24:27) - A comparison between all of WEX's businesses and where they direct focus (00:25:13) - A look into their health and employee benefits line  (00:26:39) - The overall financial profile from a revenue and margins standpoint (00:28:43) - How big players like Amazon or Walmart play a part in potential business (00:30:02) - The threat of Visa or Mastercard entering the same space (00:31:17) - Total amount of revenue generated from electric vehicle fleets   (00:33:27) - Electric charging locations and the process of building these facilities (00:35:13) - Technology invested into creating faster charging stations (00:35:50) - An overall look at risks for the business (00:37:54) - Other parts of WEX that stand out (00:40:10) - Lessons learned from studying WEX Learn more about your ad choices. Visit megaphone.fm/adchoices

Invest Like the Best with Patrick O'Shaughnessy
Strauss Zelnick - Playing to Your Strengths - [Invest Like the Best, EP.347]

Invest Like the Best with Patrick O'Shaughnessy

Play Episode Listen Later Oct 10, 2023 74:30


My guest this week is Strauss Zelnick, the CEO of leading game publisher Take-Two Interactive. Maybe most well-known for its hugely successful Grand Theft Auto game, Take-Two is a sophisticated, top-tier developer, publisher, and marketer of interactive entertainment that owns Rockstar Games and 2K. Strauss's passion for entertainment led him strong and fast into the industry as he worked his way from sales to CEO and transitioned from motion picture to gaming. Today we cover his approach to staying on the cutting edge of media development, unlocking talent and potential in those around you, and becoming the leader you were meant to be. His intensity and his standard for excellence come through clearly. Please enjoy my conversation with Strauss Zelnick. Subscribe to Colossus's New Show: Art of Investing Buy a ticket to Patrick and David Senra's live show. Listen to Founders Podcast For the full show notes, transcript, and links to mentioned content, check out the episode page here. ----- This episode is brought to you by Tegus Converge — the first virtual event centered on the world of investor research. When twin brothers Tom and Mike Elnick realized that the research process for investors was broken, they founded Tegus to fix it. Now the people behind the most trusted research platform are bringing institutional investors together to investigate the state — and the future — of fundamental research. On November 8th, join industry luminaries like IGSB Founder Reece Duca and Daniel Gross, AI Expert, Entrepreneur and Investor, to dig into the latest research trends and breakthrough technologies shaping the investment landscape. Register today at tegus.com/register. ----- Invest Like the Best is a property of Colossus, LLC. For more episodes of Invest Like the Best, visit joincolossus.com/episodes.  Past guests include Tobi Lutke, Kevin Systrom, Mike Krieger, John Collison, Kat Cole, Marc Andreessen, Matthew Ball, Bill Gurley, Anu Hariharan, Ben Thompson, and many more. Stay up to date on all our podcasts by signing up to Colossus Weekly, our quick dive every Sunday highlighting the top business and investing concepts from our podcasts and the best of what we read that week. Sign up here. Follow us on Twitter: @patrick_oshag | @JoinColossus Show Notes (00:03:39) - (First question) - Why the entertainment media sector is so interesting (00:05:08) - Key inflection points in the history of media (00:09:17) - The role of pure content in businesses today (00:10:32) - Requirements for being a successful media business operator (00:12:03) - Strategies for working effectively with creatives (00:16:13) - How to cultivate a conducive environment for creatives (00:25:54) - The allure of collaborating with Take-Two (00:30:09) - Strauss' journey to becoming the chairman and CEO of Take-Two (00:37:42) - Strategies for reducing costs in business (00:41:16) - Embracing diversity in the video game industry (00:43:41) - Identifying high-quality intellectual property (IP) (00:46:04) - The inspiration behind Strauss' book Becoming Ageless: The Four Secrets To Looking and Feeling Younger Than Ever (00:51:12) - Influential leaders for learning and growth (00:55:45) - The impact of technology and the rise of new platforms (00:57:00) - Common misconceptions about Take-Two (00:59:42) - Unique attributes of Take-Two projects (01:00:36) - Defining moments in the history of the business (01:04:19) - Anticipating the future direction of Take-Two (01:13:41) - Sources of motivation and inspiration (01:15:29) - The concept and value of a masterpiece (01:11:08) - Paramount values as a parent (01:11:38) - The kindest thing anyone has ever done for Strauss

Murder Sheet
The Return of Ted Maher: Billionaire Edmond Safra's Killer Accused of Plotting Yet Another Murder

Murder Sheet

Play Episode Listen Later Oct 6, 2023 72:39


The Murder Sheet has an exclusive report touching upon an infamous international case.In 1999, an American nurse named Ted Maher was accused of setting fire to a Monte Carlo penthouse and murdering billionaire Edmond Safra and a colleague named Vivian Torrente.In 2023, under the new name Jon Green, the same man was charged with criminal solicitation to commit murder.-----------------------------------------------------------------------------------------------------------------The Murder Sheet participates in the Amazon Associate program and earns money from qualifying purchases.Reporting on Edmond Safra:The Los Angeles Times's reporting on American Express:https://www.latimes.com/archives/la-xpm-1992-04-28-fi-1108-story.htmlCoverage from Forbes on the Russia-related scandal: https://www.forbes.com/2007/05/17/bony-russia-lawsuit-biz-services-cx_lm_0517suit.html?sh=4dcae2bd21c1The Jewish Week's feature on Safra: https://www.hsje.org/Whoswho/Edmund_Safra/we_have_lost_our_crown.htmlThe New York Post's coverage of Safra's reputation: https://nypost.com/1999/12/14/safras-sleuth-pi-joe-mullen-saved-the-reputation-of-the-late-edmond-safra-and-has-cracked-many-a-case-for-this-decades-famous-and-infamous/The Washington Post's coverage of the American Express incident involving Safra: https://www.washingtonpost.com/archive/business/1989/07/29/american-express-offers-4-million-and-apology/aafa682c-f909-420a-8cba-64c1171b8754/Coverage from The Times of Israel on Edmond Safra: https://www.timesofisrael.com/new-biography-probes-into-mysterious-backstory-of-billionaire-banker-edmond-j-safra/“A Banker's Journey: How Edmond J. Safra Built a Global Financial Empire” by Daniel Gross: https://www.amazon.com/Bankers-Journey-Edmond-Global-Financial/dp/1635767857?&_encoding=UTF8&tag=murdersheet-20&linkCode=ur2&linkId=e421f9aad81731bd8c533450c2d33219&camp=1789&creative=9325"Vendetta: American Express and the Smearing of Edmond Safra" by Bryan Burrough: https://www.amazon.com/Vendetta-American-Express-Smearing-Edmond/dp/0060167599?&_encoding=UTF8&tag=murdersheet-20&linkCode=ur2&linkId=daa163a4a68a59e1be75830f6856bef4&camp=1789&creative=9325“Gilded Lily: Lily Safra: The Making of One of the World's Wealthiest Widows” by Isabel Vincent: https://www.amazon.com/GILDED-LILY-Isabel-Vincent/dp/0061133949?&_encoding=UTF8&tag=murdersheet-20&linkCode=ur2&linkId=703a51336f36524e9ccc1178740241e4&camp=1789&creative=9325Reporting on Ted Maher:The New York Times's story on the 1999 nursing strike: https://www.nytimes.com/1999/08/05/nyregion/nurses-plan-strike-monday-at-columbia-presbyterian.htmlThe New York Times's story on how the 1999 nursing strike was called off: https://www.nytimes.com/1999/08/10/nyregion/tentative-deal-averts-strike-by-nurses.htmlTime's reporting on Ted Maher: https://content.time.com/time/subscriber/article/0,33009,992877,00.htmlSeacoastonline's report on Heidi Maher: https://www.seacoastonline.com/story/news/2002/11/21/praying-for-murder-acquittal/51281826007/Coverage from the New York Post on Ted Maher's release:https://nypost.com/2007/08/17/back-from-dead/The New York Post on Ted Maher's former wife's lawsuit against the Safra estate:https://nypost.com/2003/05/27/60m-safra-suit-killers-wife-hits-widow-over-police-grilling/The New York Post on Lily Safra's reaction to Ted Maher's release: https://nypost.com/2007/08/18/widows-pique-at-killers-release/The New York Post's coverage of Ted Maher's innocence claims: https://nypost.com/2007/10/14/tycoons-killer-my-frame-up/"Framed in Monte Carlo: How I Was Wrongfully Convicted for a Billionaire's Fiery Death” by Ted Maher, Bill Hayes, and Jennifer Thomas: https://www.amazon.com/Framed-Monte-Carlo-Prison-Murder/dp/1510755861?&_encoding=UTF8&tag=murdersheet-20&linkCode=ur2&linkId=fddac60f9dea02c78ede9cf2a644bf01&camp=1789&creative=9325Coverage of the fire and homicides in Monaco:The Washington Post's coverage of the 1999 murders: https://www.washingtonpost.com/wp-srv/pmextra/dec99/6/safra.htmThe NBC special on the case, with quotes from Torrente's daughter: https://www.nbcnews.com/id/wbna23767683The Guardian's report on the 1999 murders: https://www.theguardian.com/theobserver/2000/oct/29/features.magazine47Another Guardian report on the 1999 murders:https://www.theguardian.com/world/1999/dec/07/jonhenleyYet another Guardian report on the 1999 murders:https://www.theguardian.com/world/1999/dec/05/paulwebster.theobserverThe New York Post article on the 1999 murders:https://nypost.com/2002/11/18/safra-choke-twist/Dominick Dunne for Vanity Fair on the killings: https://www.vanityfair.com/culture/2000/12/dunne200012MSNBC on the 1999 murders: https://archive.org/details/MSNBCW_20151213_000000_Mystery_of_the_Billionaire_BankerCNN on the 1999 murders:http://www.cnn.com/2002/LAW/08/12/ctv.monaco.trial/index.htmlThe Wall Street Journal on the 1999 murders: https://www.wsj.com/articles/SB94441779970529365Court TV's timeline of the 1999 murders: https://web.archive.org/web/20080204074511/http://www.courttv.com/trials/monaco/chronology.htmlNewsweek's coverage of the 1999 murders: https://www.newsweek.com/bad-bet-monte-carlo-151519Coverage from CBS of Ted Maher's trial: https://www.cbsnews.com/news/part-ii-an-american-on-trial/Additional coverage from CBS of Ted Maher's trial: https://www.cbsnews.com/news/murder-in-monaco-an-american-on-trial/A report from the Times on the trial of Ted Maher: https://www.thetimes.co.uk/article/monaco-police-in-dock-for-billionaire-s-death-mk7v5nrb8crA report from The Telegraph on the trial of Ted Maher: https://www.telegraph.co.uk/news/worldnews/europe/monaco/1414023/Gilded-Lily-faces-her-husbands-killer.htmlCoverage of the dognapping incident involving Jon Green:KRQE's coverage of the dognapping: https://www.krqe.com/news/new-mexico/carlsbad-dognapping-man-with-bizarre-past-accused-of-taking-ex-wifes-dogs/KRQE on the return of the missing dogs: https://www.krqe.com/news/stolen-search-and-rescue-dogs-reunited-with-carlsbad-woman/Fox San Antonio's story on the rescue of the rescue dogs: https://foxsanantonio.com/news/local/dognapping-suspect-wanted-on-multiple-charges-arrested-after-extensive-manhuntNBC's coverage of Jon Green's legal issues: https://www.nbcnews.com/dateline/man-mysterious-past-facing-multiple-charges-run-after-dognapping-carlsbad-n1295803The Carlsbad Current Argus on the missing dogs: https://www.currentargus.com/story/news/crime/2022/06/15/missing-carlsbad-search-and-rescue-dogs-found-safe-in-texas/65361052007/A feature from the American Veterinary Medical Association mentioning Dr. Kim Lark:https://www.avma.org/javma-news/2011-09-15/honoring-dogs-911Send tips to murdersheet@gmail.com.The Murder Sheet is a production of Mystery Sheet LLC .See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

The Cloudcast
Considerations for Enterprise AI

The Cloudcast

Play Episode Listen Later Aug 27, 2023 34:46


Let's talk through some of the challenges that Enterprises will have with AI - from data location to GPU location, to model biases, to data privacy to training vs. execution.SHOW: 748CLOUD NEWS OF THE WEEK - http://bit.ly/cloudcast-cnotwCHECK OUT OUR NEW PODCAST - "CLOUDCAST BASICS"SHOW SPONSORS:Datadog Security Solution: Modern Monitoring and SecurityStart investigating security threats before it affects your customers with a free 14 day Datadog trial. Listeners of The Cloudcast will also receive a free Datadog T-shirt.Find "Breaking Analysis Podcast with Dave Vellante" on Apple, Google and SpotifyKeep up to data with Enterprise Tech with theCUBEAWS Insiders is an edgy, entertaining podcast about the services and future of cloud computing at AWS. Listen to AWS Insiders in your favorite podcast player. Cloudfix HomepageSHOW NOTES:An Interview with Daniel Gross and Nat Friedman on the AI Hype Cycle (Stratechery)ARE THERE EXPECTATIONS OF “OLD AI” vs. “NEW AI”?Are business leaders thinking about unique AI applications and use-cases, or just “ChatGPT-everything”?Formal data scientists vs. citizen data scientists?Will this just be an application, or have an impact on every aspect of a business and the IT industry?WILL ENTERPRISE AI BE DIFFERENT THAN CONSUMER AI? The industry is actively working on a broad set of models that can be used for different use-cases. It's commonly accepted that AI models need to be trained near the sources of data. Many businesses are concerned about including their company data into these public modelsMany businesses will want to deploy tuned models and applications in data center, public cloud and edge environments. New AI applications will be required to meet security, regulatory and compliance standards, like other business applications. FEEDBACK?Email: show at the cloudcast dot netTwitter: @thecloudcastnet

This Week in Startups
Using GPUs as leverage, MSFT beats the case, FTC fails under Khan | E1775

This Week in Startups

Play Episode Listen Later Jul 11, 2023 52:02


Lemon.io - Hire pre-vetted remote developers, get 15% off your first 4 weeks of developer time at https://Lemon.io/twist OpenPhone. Create business phone numbers for you and your team that work through an app on your smartphone or desktop. TWiST listeners can get an extra 20% off any plan for your first 6 months at openphone.com/twist VEED makes it super easy for anyone (yes, you) to create great video. Filled with amazing features like templates, auto subtitles, text formatting, auto-resizing, a full suite of AI tools, and much more, VEED gives you the tools to engage your audience on any platform. Head to VEED.io to start creating incredible video content in minutes. * Today's show: Jason breaks down investors and companies using the GPU shortage as leverage to invest in AI startups (1:33) before discussing Microsoft's run-in with EU regulators over bundling Teams into its Office suite (28:53). They wrap on the FTC's loss in its quest to stop Microsoft's Activision Blizzard acquisition, and Lina Khan's track record as FTC chair (37:58). * Time stamps: (0:00) Nick joins Jason (1:33) Nvidia's GPU leverage (9:48) Lemon.io - Get 15% off your first 4 weeks of developer time at https://Lemon.io/twist (11:07) CoreWeave's pivot & the pros and cons of WFH (19:28) OpenPhone - Get 20% off your first six months at https://openphone.com/twist (20:54) Daniel Gross and Nat Friedman's GPU play (27:24) Veed - Sign up and engage your audience on any platform at https://www.veed.io/avatars?utm_campaign=TWIS&utm_medium=YT&utm_source=MKT (28:53) Microsoft's run-in with EU regulators over bundling (37:58) FTC loses cases against Microsoft and Activision Blizzard merger (46:37) Lina Khan's track record as the head of the FTC * Follow Nick: https://twitter.com/nickcalacanis * Read LAUNCH Fund 4 Deal Memo: https://www.launch.co/four Apply for Funding: https://www.launch.co/apply Buy ANGEL: https://www.angelthebook.com Great recent interviews: Steve Huffman, Brian Chesky, Aaron Levie, Sophia Amoruso, Reid Hoffman, Frank Slootman, Billy McFarland, PrayingForExits, Jenny Lefcourt Check out Jason's suite of newsletters: https://substack.com/@calacanis * Follow Jason: Twitter: https://twitter.com/jason Instagram: https://www.instagram.com/jason LinkedIn: https://www.linkedin.com/in/jasoncalacanis * Follow TWiST: Substack: https://twistartups.substack.com Twitter: https://twitter.com/TWiStartups YouTube: https://www.youtube.com/thisweekin * Subscribe to the Founder University Podcast: https://www.founder.university/podcast

The New Yorker Radio Hour
An Audiobook Master on the Secrets of Her Craft

The New Yorker Radio Hour

Play Episode Listen Later Dec 20, 2022 23:09 Very Popular


You've probably never heard of Robin Miles, but you may well have heard her—possibly at some length. Miles is an actor who's cultivated a particular specialty in recording audiobooks, a booming segment of the publishing industry. She has lent her voice to more than 400 titles in all sorts of genres—from the classic “Charlotte's Web” to Isabel Wilkerson's “Caste,” a deep analysis of race in America. “Telling a story, fully, all of it—from all the aspects of it—and creating the kind of intimacy between you and your listener is so satisfying,” she tells the New Yorker editor Daniel Gross. “Being in a great play means you have to have the money and the other actors and a script and a director. This is just me and my book, and I love that.”

Conversations with Tyler
Shruti Rajagopalan talks to Daniel Gross and Tyler about Identifying and Predicting Talent

Conversations with Tyler

Play Episode Listen Later Sep 1, 2022 67:45 Very Popular


How can one identify and predict talent? On a search to answer this question and others like it, Tyler Cowen joined venture capitalist and entrepreneur Daniel Gross to explore the art and science of finding talent in their new book Talent: How to Identify Energizers, Creatives, and Winners Around the World. In a panel discussion hosted by Shruti Rajagopalan, Cowen and Gross discuss the applications of their new book, particularly how lifestyle characteristics can indicate an individual is capable of great creativity and talent. Daniel and Tyler also discuss undervalued talents and skills, what talents they look for in the start-up and investment world, why there is no good chocolate ice cream to be found in San Francisco, what their exercise preferences indicate about their personalities, how they approach identifying talent in different countries and industries, how immigration impacts entrepreneurialism, the short-comings to Zoom interviews, what a messy desk reveals about a person, and more. Read a full transcript enhanced with helpful links, or watch the full video. Recorded June 29th, 2022 Other ways to connect Follow us on Twitter and Instagram Follow Tyler on Twitter  Follow Daniel on Twitter Follow Shruti on Twitter Email us: cowenconvos@mercatus.gmu.edu Subscribe at our newsletter page to have the latest Conversations with Tyler news sent straight to your inbox. Photo credit: Drew Bird Photo

Invest Like the Best with Patrick O'Shaughnessy
Tyler Cowen & Daniel Gross - Identifying Talent - [Invest Like the Best, EP. 277]

Invest Like the Best with Patrick O'Shaughnessy

Play Episode Listen Later May 17, 2022 79:07 Very Popular


My guests today are Tyler Cowen and Daniel Gross. Tyler is an economics professor and creator of one of the most popular economics blogs on the internet. Daniel is the founder of start-up accelerator Pioneer, having previously been a director at Apple and a partner at Y Combinator. Both Daniel and Tyler are prolific talent spotters and that is the focus of our discussion and their new book, which is called Talent. Please enjoy this conversation with Tyler Cowen and Daniel Gross.   For the full show notes, transcript, and links to mentioned content, check out the episode page here.   -----   This episode is brought to you by Canalyst. Canalyst is the leading destination for public company data and analysis. If you're a professional equity investor and haven't talked to Canalyst recently, you should give them a shout. Learn more and try Canalyst for yourself at canalyst.com/Patrick.    -----   This episode is brought to you by Brex. Brex is the integrated financial platform trusted by the world's most innovative entrepreneurs and fastest-growing companies. With Brex, you can move money fast for instant impact with high-limit corporate cards, payments, venture debt, and spend management software all in one place. Ready to accelerate your business? Learn more at brex.com/best.   -----   Invest Like the Best is a property of Colossus, LLC. For more episodes of Invest Like the Best, visit joincolossus.com/episodes.    Past guests include Tobi Lutke, Kevin Systrom, Mike Krieger, John Collison, Kat Cole, Marc Andreessen, Matthew Ball, Bill Gurley, Anu Hariharan, Ben Thompson, and many more.   Stay up to date on all our podcasts by signing up to Colossus Weekly, our quick dive every Sunday highlighting the top business and investing concepts from our podcasts and the best of what we read that week. Sign up here.   Follow us on Twitter: @patrick_oshag | @JoinColossus   Show Notes [00:02:38] - [First question] - Defining what talent is to them writ large [00:03:34] - The differences between means and ends in regards to talent [00:04:14] - What the Diet Coke idea is and why it's relevant [00:06:32] - Types of energy that are valuable and the subtle differences between them  [00:07:40] - Thoughts on using a moneyball-like approach to acquiring and evaluating talent  [00:11:49] - The talent market and thinking about pricing talent specifically [00:13:14] - What is seemingly overpriced in today's talent landscape [00:15:50] - Relationship between experience and/or age when it comes to talent [00:20:34] - Lessons about the utility of intelligence and where they've lead them wrong [00:23:35] - What's beneath being an outsider and why it's important [00:24:46] - Why what people do in their downtime is worth considering   [00:28:27] - Whether or not references should be held in higher regard than interviews [00:31:41] - Things to try and get out of a reference call as an objective [00:32:40] - Disabilities and what lead them write that chapter specifically [00:35:01] - Whether or not talented people are happier   [00:38:40] - Lack of contentment and it's dynamic influence over individuals [00:41:01] - Where they think the other is most talented [00:43:33] - Thinking about the physical side of mental performance [00:45:49] - What was frustrating about writing the book [00:48:25] - How they evaluate talent most differently now after having finished the book [00:50:41] - What makes for a good bat signal and how to cast one well  [00:53:27] - Personality inventories and what they would and wouldn't recommend   [00:54:15] - Geographical frictions and their role in high success rates [00:56:08] - Antonio Gracias; Existing supply constraints on talent development [01:00:01] - How they would redesign the current attractors of talent that we rely on today [01:01:18] - Assembly line development and how we can improve and scale talent filters [01:02:29] - The biggest open questions for talent today writ large [01:05:16] - The kindest thing anyone has ever done for Tyler