In this episode, FY26 SWE President Inaas Darrat sits down with two early-career SWE leaders to talk honestly about life after engineering school and the lessons they wish they had learned sooner. Abigail Fennell, biomedical engineering Ph.D. candidate at Johns Hopkins University, shares how her mentors and SWE connections helped her realize she wanted to pursue a Ph.D., along with the differences between undergraduate courses and graduate research. Abby Culloton, hydraulic engineer with the U.S. Army Corps of Engineers, reflects on learning how to make friends after college and transitioning into her first engineering role. Hear practical advice on setting new goals after college, finding support systems as an adult, and letting go of the pressure to figure everything out at once. — The Society of Women Engineers is a powerful, global force uniting nearly 45,000 members of all genders spanning 90+ countries. We are the world's largest advocate and catalyst for change for women in engineering and technology. To join and access all the exclusive benefits to elevate your professional journey, visit membership.swe.org.
How do you evaluate an AI model for a war you can only fight once? Ike Harris, a Naval officer turned Hill staffer turned AI policy operator, joins the show to discuss his effort to bridge the gap between the labs that build frontier models and the operators who'll deploy them. Ike Harris is the executive director of the newly launched Frontier Security Institute, and was most recently the Republican tech lead on the House Select Committee on the CCP, with prior stints in OSD and as a surface warfare officer.
We discuss…
The GAIN AI and Overwatch acts: Congress's most aggressive attempt to wrest export-control authority from the executive branch since the Cold War
Why you can't just "buy AI": why national security evals look nothing like the SWE benchmarks the labs optimize for
Strategic-level evals: for problems you can't run ten times, from Iran negotiations to targeting at the COCOM level
China's robot-army advantage: open-weight models at the edge, Ukraine-style drone iteration soaked up via Russia, and a casualty tolerance the US can't match
The "no more NASA" problem: how risk tolerance, mission command, and law-of-armed-conflict constraints shape who wins the deployment race
Breaking into tech policy: Ike's case for why every aspiring policy person should spend a year on the Hill
Learn more about your ad choices. Visit megaphone.fm/adchoices
To see the archival photos and documents referenced in the episode, watch the video podcast here: https://youtu.be/ItBlWLPcAyU In this special video episode for SWE's Founders Day, host Troy Eller English, chief archivist for the Society of Women Engineers (SWE), is joined by two of the editors of the book, “Women Engineering Legends 1952-1976: Society of Women Engineers Achievement Award Recipients,” Jill Tietjen and Holly Teig. Along with four other members of SWE's Late Career and Retiree Affinity Group, this literary team explored the stories of the first 25 recipients of SWE's Achievement Award. They discuss the technical legacy of early women engineers, from Edith Clarke's work in electrical power systems to Alice Stoll's research on g-forces and fire-resistant materials, along with the barriers they faced during a time when women made up less than 1% of the engineering workforce. Hear how members of the SWE Late Career and Retiree Affinity Group came together to research these stories, how the SWE archives made this work possible, and why it's important for engineers today to understand this history. — The Society of Women Engineers is a powerful, global force uniting nearly 45,000 members of all genders spanning 90+ countries. We are the world's largest advocate and catalyst for change for women in engineering and technology. To join and access all the exclusive benefits to elevate your professional journey, visit membership.swe.org.
In honor of Asian Pacific American Heritage Month, Gigi Elbert, CEO of SASE, sits down with Karen Horting, executive director and CEO of SWE, to explore the experiences of Asian American and Pacific Islander engineers in STEM and what it will take to build stronger pathways into leadership. Gigi and Karen unpack why Asian Americans are represented in the workforce but remain underrepresented at the highest levels — with Asian women making up less than 1% of promotions from senior vice president to the C-suite, according to research from McKinsey & Company. They also discuss the growing gap between being “career ready” and navigating the workplace, including understanding unspoken professional norms. Plus, hear how SASE and SWE are helping students move from the classroom to the boardroom through mentorship, leadership opportunities, and community building. — The Society of Women Engineers is a powerful, global force uniting nearly 45,000 members of all genders spanning 90+ countries. We are the world's largest advocate and catalyst for change for women in engineering and technology. To join and access all the exclusive benefits to elevate your professional journey, visit membership.swe.org.
The Congress of South African Trade Unions (COSATU) says it fully supports its affiliate, the South African Commercial Catering and Allied Workers Union (SACCAWU), in opposing planned job cuts at Pick n Pay. SACCAWU says the retailer intends to retrench 22,000 workers and has accused the company of trying to bypass collective bargaining by serving notices directly to staff. The union also alleges Pick n Pay wants to cut benefits including transport allowances, subsidised meals and Sunday premiums. SACCAWU will hold a press conference in Johannesburg today to set out its response. Pick n Pay has not commented on the allegations. Swe spoke to SACCAWU National Spokesperson, Sithembele Tshwete.
Beth Barnes and David Rein on the one graph that ate the AI timelines discourse, and why the two people who built it are the most careful about how you read it.

**SPONSOR**
Prolific - Quality data. From real people. For faster breakthroughs.
https://www.prolific.com/?utm_source=mlst

Interview: https://youtu.be/cnxZZTl1tkk

---

Beth Barnes and David Rein from METR on the one graph that ate the AI timelines discourse, and why the people who built it are the most careful about how it gets read.

Beth founded METR after leaving OpenAI alignment. David is first author on GPQA and co-author on HCAST and the METR Time Horizons paper. Together they built the measurement Daniel Kokotajlo called the single most important piece of evidence on AI timelines: the log-linear line of "how long a task a frontier model can complete at 50% reliability" vs release date.

The conversation opens on reward hacking. Current models can articulate in chat why a behaviour is undesired and then execute it anyway as agents. From there: construct validity, Melanie Mitchell's four-problem taxonomy, and the ARC-AGI 1-to-2 collapse as a worked example of adversarially-selected benchmarks regressing once labs target them. Beth's counter: METR deliberately does not adversarially select. David's: models do not have to do the right thing for the right reasons.

Methodology, then specification: David's compiler analogy, Beth on four-month tasks as expensive to evaluate rather than unspecifiable.

Then the SWE-bench reality check, the METR finding that half of passing PRs would not be merged, and Beth's horses-versus-bank-tellers analogy for the labour market.

The close: monitorability, the coin-spinning boat, two-year recursive self-improvement, and Beth's line that "overhyped now" and "big deal later" are not correlated claims.

---

TIMESTAMPS:
00:00:00 Intro
00:02:06 Sponsor break: Prolific human-feedback infrastructure
00:02:33 Welcome and the scalable oversight motivation
00:06:02 Construct validity, benchmark pathologies and the Chollet worry
00:15:45 Time Horizons: human time, HCAST tasks and the 50% logistic
00:24:50 Is human difficulty really one variable?
00:33:05 Agent harness evolution and the inference-compute dividend
00:40:00 Scaffolding bells, token budgets and the credit-assignment problem
00:44:15 Look at the damn graph: regularisation bug and reliability nuance
00:50:00 Why 50%? Reliability, reward hacking and pizza-party transcripts
00:55:20 Extrapolation risk and straight lines on graphs
00:59:25 Software engineering as a specification acquisition problem
01:07:40 Compilers also made ugly code: vibe-coding quality and Claude on METR Slack
01:15:15 Strongest defensible claim, Carlini's compiler swarm and AI 2027
01:23:45 SWE-bench merge rates, the bank-teller analogy and horses
01:31:45 Scheming, alignment faking and the mentalistic vocabulary problem
01:40:45 Reward hacking, monitorability and chain-of-thought faithfulness
01:45:25 Recursive self-improvement, knowledge vs intelligence and closing

ReScript: https://app.rescript.info/public/share/de3bb40cc02ee39fdf36e2c60366eb4d (PDF, refs, transcript etc)
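For readers unfamiliar with the "50% time horizon" idea in the description above, here is a minimal sketch of how such a number could be computed: fit a logistic curve of success against the log of how long a task takes a human, then read off the task length at which predicted success is 50%. Everything here is an assumption for illustration: the `RESULTS` data is synthetic, the names `RESULTS` and `fit_time_horizon` are hypothetical, and the plain gradient-ascent fit stands in for whatever METR actually does (the timestamps mention regularisation, which this sketch omits).

```python
import math

# Hypothetical, synthetic results: (human_minutes, successes, attempts).
# These numbers are made up for illustration; they are not METR data.
RESULTS = [
    (1, 4, 4), (2, 4, 4), (4, 4, 4), (8, 3, 4), (16, 2, 4),
    (32, 1, 4), (64, 0, 4), (128, 0, 4), (256, 0, 4),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_time_horizon(results, lr=0.5, steps=2000):
    """Fit P(success) = sigmoid(a + b * (log2(minutes) - mean)) by plain
    gradient ascent on the mean log-likelihood.

    Returns (horizon_minutes, b), where horizon_minutes is the task length
    at which the fitted success probability is exactly 0.5.
    """
    xs, ys = [], []
    for minutes, successes, attempts in results:
        x = math.log2(minutes)
        xs += [x] * attempts
        ys += [1.0] * successes + [0.0] * (attempts - successes)

    mean_x = sum(xs) / len(xs)  # centring keeps the fit well conditioned
    a, b = 0.0, 0.0
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(a + b * (x - mean_x))
            grad_a += err
            grad_b += err * (x - mean_x)
        a += lr * grad_a / len(xs)
        b += lr * grad_b / len(xs)

    # P(success) = 0.5 where a + b * (x - mean_x) = 0.
    horizon_log2 = mean_x - a / b
    return 2.0 ** horizon_log2, b


horizon_minutes, slope = fit_time_horizon(RESULTS)
print(f"50% time horizon: {horizon_minutes:.1f} human-minutes (slope {slope:.2f})")
```

Repeating a fit like this for each model and plotting the logarithm of the horizon against release date would give the kind of straight-line trend the episode discusses; that second step is left out here to keep the sketch short.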
Based on an interview with Mitchell Hashimoto, we talked about open source, Git, and AI development.
"A passion-driven career path, as told by a SWE who went from founder and CEO back to a rank-and-file employee" https://open.spotify.com/episode/0ROQTmAq7wTHDNbQPNHRyD?si=sRzMIp-5R9WHt52GrP92pg
Please share your thoughts with the hashtag #tilfm!
Feedback form: https://forms.gle/J2ioXHS98dYNoMbq5
Your co-hosts:
Tomoaki Imai, Noxx CTO https://x.com/tomoaki_imai bsky: https://bsky.app/profile/tomoaki-imai.bsky.social
Ryoichi Kato, Software Engineer https://x.com/ryo1kato bsky: https://bsky.app/profile/ryo1kato.bsky.social
The AI model that was too dangerous to release just got breached. Anthropic entered the design software market. And OpenAI dropped its biggest model yet, just six weeks after the last one. This week, the NSA is using Anthropic's Mythos despite the Pentagon blacklisting the company, Claude Design takes on Figma and sends its stock down 7%, Yelp transforms into an agentic consumer app, Mythos gets accessed by an unauthorized Discord group, and OpenAI fires back with GPT-5.5. If you are a founder, operator, or executive trying to keep up with AI, this is your weekly five-minute briefing every Tuesday. Stories Covered This Week: NSA uses Anthropic's Mythos Preview despite the Pentagon declaring the company a supply chain risk Anthropic launches Claude Design, a prompt-to-prototype design tool that sent Figma stock down 7% Yelp's upgraded AI assistant can now book restaurants, doctors, and more in one conversation Anthropic investigates unauthorized access to Mythos through a third-party vendor environment OpenAI releases GPT-5.5, scoring 88.7% on SWE-bench with a 60% drop in hallucinations vs GPT-5.4 Episode Timestamps: 00:00 Intro 00:20 NSA uses Anthropic's Mythos despite Pentagon blacklist 01:10 Anthropic launches Claude Design 02:00 Yelp's AI assistant goes full service 02:50 Anthropic investigates Mythos breach 03:40 OpenAI drops GPT-5.5 04:30 Outro Partner Links Subscribe to our free newsletter: https://newsletter.theaireport.ai/subscribe Join the community: www.theaireport.ai/leaders-launch-guide Learn more about your ad choices. Visit megaphone.fm/adchoices
Hey, Alex here. I'll try to catch you up, but it's one of the more intense weeks in AI in recent memory. Here's the TL;DR - OpenAI dominates across the board this week! They finally launched “spud”, calling it GPT 5.5 (and 5.5 Pro), and it's SOTA on most things, nearly matching the mysterious Claude Mythos, except it's released and we can actually use it (we tested it extensively). OpenAI also took the crown in image generation with the incredible GPT-image-v2 release, beating Nano Banana 2 and Pro by a significant margin. The images are incredible; this model can generate working QR codes and 360 images, it's quite bonkers. Codex was updated with Computer Use (which I told you about last week), an in-app browser and a bunch of other tools that match GPT 5.5 intelligence.

Meanwhile, Anthropic launched an incredible research preview of Claude Design, finally admitted that Claude was dumb and reset quotas across the board, while breaking the trust of the community by removing Claude Code from the Pro plan. We've also got great open source updates: Kimi K2.6 and Qwen 3.6 27B are both great performers!

We were live on the stream for almost 4 hours today waiting for GPT 5.5, and finally got it and tested it live on the show, plus we had Peter Gostev on from Arena, who had early access and shared his insights with us. Let's get into it!

ThursdAI - Highest signal weekly AI news show is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

OpenAI's GPT 5.5 is here - SOTA AI intelligence you can actually use (Release Blog)
OpenAI finally gave us all access to their latest intelligence boost, GPT 5.5 thinking (and GPT 5.5 Pro). These models take the crown across many benchmarks, including TerminalBench (82.7%), GDPval (84%) and more. You can see the highlighted versions on the image above.
Though, it's not uncommon for OpenAI to do some chart crimes, so @d4m1n created a chart that also shows the full benchmarks, including the ones where GPT 5.5 is not beating Opus. As you can see below, it underperforms on Humanity's Last Exam and scaled tool use.

But benchmarks don't tell the full story. GPT 5.5 uses significantly fewer tokens than 5.4, about 40% fewer. It's also more expensive, but given the lower token usage, it nets out at about a ~20% price increase, while being more intelligent and faster.

Tons of folks who had early access are reporting the same things: this model excels at long-running tasks. Peter Gostev from Arena, who joined our live stream, showed us an incredible demo that ran overnight for over 8h! This model can work until the task is done, no longer just pausing in the middle to ask for your input.

The real highlight is that, paired with the recent GPT-image-2 (which I'll expand on later in this newsletter), GPT 5.5 becomes an excellent UI designer. This is a big area in which Claude still has a moat and OpenAI is trying to catch up, and the real alpha now is to use both the image gen and 5.5 in tandem to create beautiful visuals and UIs. The main thing is, after testing it quite a few times, this only works if you generate the image outside of the session that builds the actual UI. We tried a couple of times to do it in one session, and the resulting UI wasn't remotely close to the generated image. Only after sending the image to a completely fresh session and asking for a “pixel perfect” implementation did GPT 5.5 start to resemble the input image and rebuild the whole UI in pixel-perfect fidelity!

GPT Image v2 - SOTA thinking image model, finally beating Nano Banana (Blog, Live)
Like we said, OpenAI is dominating this week, and in both instances those are great models. Though, in an apples-to-apples comparison, GPT-image-v2 is a much bigger jump over previous models than GPT 5.5!
According to Artificial Analysis, the jump in how many people prefer GPT-image-2 in blind tests compared to other models is the highest we've ever seen, over 250 points. And you can clearly see it in the generations as well. Earlier this week, we did a live streaming session with Peter Gostev (from Arena) and did a deep dive comparing this new model to GPT Image 1.5, Nano Banana and Grok Imagine, and it's a clear winner across most categories.

Character consistency is immaculate, and high-resolution imagery and instruction following are all so so good it's a bit hard to explain in text.

Reasoning visual intelligence
Like with Nano Banana, this model is likely based on a big GPT image, it's no longer just diffusion, as you can see, it reasons! And apparently the more reasoning you give it (if you choose GPT pro) the better it'll be. The examples are indeed wild: the model can generate images of code that works, and generate functional QR codes and bar codes! The craziest thing people figured out it can do is functional 360 imagery (equirectangular format): you can just ask the model to create a 360 image of “scene” and then drop it into a 360 viewer! Peter showed us on the show how he combined GPT 5.5 and Image v2 to create a sort of “street view” from a bunch of 360 images, and it blew our minds. He literally spun up an overnight GPT 5.5 task in Codex that planned out the Hanging Gardens of Babylon, generated hundreds of equirectangular images, stitched them into a walkable interface, and had it running 8+ hours without babysitting. A street view of a place whose actual appearance we don't know, hallucinated from latent space. What a time.

Day one availability is wide: Figma, Canva, Adobe Firefly, fal.ai, and Microsoft Foundry all have it. Nano Banana dominated for what felt like an eternity in AI time (it was really only a few months
$852 billion. That's what OpenAI is now worth, and its own investors are starting to question if that math adds up. This week, Anthropic's new model takes the coding crown from GPT-5.4, OpenAI's backers get cold feet, Snap cuts 1,000 jobs and points the finger at AI, twelve tech giants team up to secure the internet, and Nvidia writes a $5 billion check to its oldest rival. If you're a founder, operator, or executive trying to keep up with AI, this is your weekly five-minute briefing every Tuesday. Stories Covered This Week: Claude Opus 4.7 hits 87.6% on SWE-bench Verified, beating GPT-5.4 and Gemini 3.1 Pro on coding OpenAI's $852B valuation faces scrutiny as Anthropic's revenue triples to $30B in one quarter Snap lays off 1,000 people (16% of staff), citing AI writing 65% of its code Anthropic launches Project Glasswing with Amazon, Apple, Microsoft, Google, Nvidia and 7 others Nvidia invests $5B in Intel, co-developing x86 chips built for its AI stack Timestamps: 00:00 Intro 00:31 Claude Opus 4.7 takes the coding crown 01:26 OpenAI investors get cold feet 02:18 Snap cuts 1,000 jobs, blames AI 02:56 Project Glasswing: Securing the world's critical software 03:49 NVIDIA invests 5 billion into Intel 04:41 Outro Partner Links Book Enterprise Training: https://www.upscaile.com/ Subscribe to our free newsletter: https://www.theaireport.ai/subscribe Free AI Tool Stack: https://community.theaireport.ai/checkout/the-ai-report-welcome-gift?coupon_code=WRTH Learn more about your ad choices. Visit megaphone.fm/adchoices
AI Unraveled: Latest AI News & Trends, Master GPT, Gemini, Generative AI, LLMs, Prompting, GPT Store
Hey y'all, Alex here with your weekly AI news catch-up. It's one of those Thursdays where no matter how well I prep, the big AI labs are hell-bent on showing up before each other. Alibaba dropped Qwen 3.6 under Apache 2, confirming their commitment to open source, then Anthropic released Claude Opus 4.7 (not quite Mythos), and OpenAI followed with a huge Codex update that includes Computer Use among other things. The highlight of Computer Use is the background usage, more on that below. This is all just from today!

Earlier in the week we had 2 incredible 3D world generators, Lyra 2.0 from Nvidia and HYWorld 2 from Tencent, Windsurf dropping its 2.0 version with Devin integration, Google releasing a Gemini TTS with support for 90+ languages and an incredible range of emotions, and Baidu open-sourcing Ernie Image, rivaling Nano Banana.

Today on the show we had 3 awesome guests: Theodor from Cognition joined to cover the new Windsurf, Kwindla is back on the show to talk about “the side project that escaped containment”, Gradient-Bang, a multi-agent, voice-based space game, and Trevor from Marimo joined to talk about pairing your agents with a Marimo notebook. Let's dive in!
Many people (especially AI company employees [1]) believe current AI systems are well-aligned in the sense of genuinely trying to do what they're supposed to do (e.g., following their spec or constitution, obeying a reasonable interpretation of instructions). [2] I disagree. Current AI systems seem pretty misaligned to me in a mundane behavioral sense: they oversell their work, downplay or fail to mention problems, stop working early and claim to have finished when they clearly haven't, and often seem to "try" to make their outputs look good while actually doing something sloppy or incomplete. These issues mostly occur on more difficult/larger tasks, tasks that aren't straightforward SWE tasks, and tasks that aren't easy to programmatically check. Also, when I apply AIs to very difficult tasks in long-running agentic scaffolds, it's quite common for them to reward-hack / cheat (depending on the exact task distribution), and they don't make the cheating clear in their outputs. AIs typically don't flag these cheats when doing further work on the same project and often don't flag these cheats even when interacting with a user who would obviously want to know, probably both because the AI doing further work is itself misaligned and because it [...]

---
Outline:
(09:20) Why is this misalignment problematic?
(13:50) How much should we expect this to improve by default?
(14:51) Some predictions
(16:44) What misalignment have I seen?
(40:04) Are these issues less bad in Opus 4.6 relative to Opus 4.5?
(42:16) Are these issues less bad in Mythos Preview? (Speculation)
(45:54) Misalignment reported by others
(46:45) The relationship of these issues with AI psychosis and things like AI psychosis
(48:19) Appendix: This misalignment would differentially slow safety research and make a handoff to AIs unsafe
(51:22) Appendix: Heading towards Slopolis
(55:30) Appendix: Apparent-success-seeking (or similar types of misalignment) could lead to takeover
(59:16) Appendix: More on what will happen by default and implications of commercial incentives to fix these issues
(01:03:20) Appendix: Can we get out useful work despite these issues with inference-time measures (e.g., critiques by a reviewer)?

The original text contained 14 footnotes which were omitted from this narration.
---
First published: April 15th, 2026
Source: https://www.lesswrong.com/posts/WewsByywWNhX9rtwi/current-ais-seem-pretty-misaligned-to-me
---
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podc
Audrey McCormack, opening keynote for WE Local Dublin and senior director, research & commercialisation facility lead at MSD Ireland, discusses how engineers can turn failure into a powerful career advantage in this episode. In conversation with host Sam East, Audrey reflects on her unconventional path from electrician to biotech leader and shares how the moments that didn't go to plan shaped her resilience, confidence, and leadership style. Hear how to pursue STEM roles when you don't meet every requirement, rebuild confidence after a failure, and create space for your team to take risks — even in high-stakes technical environments. Audrey will expand on these insights as the opening keynote at WE Local Dublin, taking place April 23-24, 2026. WE Local conferences bring together engineers and technologists for networking opportunities, professional development, and inspiring conversations like this one. Register at welocal.swe.org to join SWE at WE Local Dublin or an upcoming WE Local near you. — The Society of Women Engineers is a powerful, global force uniting nearly 45,000 members of all genders spanning 90+ countries. We are the world's largest advocate and catalyst for change for women in engineering and technology. To join and access all the exclusive benefits to elevate your professional journey, visit membership.swe.org.
Hey y'all, Alex here, writing this from sunny London, at the first ever AI Engineer conference in Europe!

What a show we have for you today! First, let me catch you up on what's important: Anthropic this week announced a whopping $30B ARR, up from $19B in Feb, while also telling us about Claude Mythos Preview, their next-gen HUGE model that they won't release to the public (yet?) and that finds crazy vulnerabilities in existing code bases. Apparently OpenAI will follow up with a similar non-public model soon.

The Meta Superintelligence Lab led by Alex Wang finally showed what they were working on: Muse Spark, the smaller of their upcoming models, built on completely new infrastructure (MSL announcement, Simon Willison's deep dive on the 16 hidden tools).

In other news: Z.AI finally released GLM 5.1 as OSS (HF weights), Seedance 2.0 is finally available in the US on Replicate, OpenAI is testing out GPT-image-2 on LM Arena under codenames, HappyHorse from Alibaba takes the video crown, and Milla Jovovich (5th Element, Resident Evil) released an agentic memory plugin called MemPalace (Ben Sigman's transparent correction thread is worth reading).

We had 5 guests on the show today. We kicked off with @swyx, the founder of AI Engineer and host of Latent Space. We then chatted with @petergostev from Arena (formerly LMArena) about Mythos and the compute wars, then Vincent Koc, the second most prolific contributor to OpenClaw, then our friends VB from OpenAI and Omar from DeepMind, both previously at HuggingFace. This is a busy busy show, and given the time zones, I unfortunately don't have time for a full weekly writeup, but as always, I will share the raw notes and post the video (lightly edited).

AI Engineer - London
ThursdAI has come a long way since the first AI Engineer conference, but many who read this don't know that it was my big break.
Swyx invited me to cover the first AIE in San Francisco in 2023, and I remember, I was in an Uber to the airport, the driver asked me what I do, and I, for the first time, said “I host a podcast”. I (and ThursdAI) owe a lot to Swyx and the AIE team, and it's been incredible to see how big they've grown and how many great speakers this event hosts!

The term AI Engineer has drifted in those 3 years, but so has the term Software Engineer. Swyx predicted this nearly 3 years ago; what I don't think he predicted is that all engineers are now AI Engineers, and this includes domains like Agents (OpenClaw), Context and Harness Engineering, Evals and Observability, and Voice & Vision, all of which are tracks at this conference.

I was really surprised to see how many of the talks/speakers here are native to London (after all, DeepMind is from here, and OAI, Anthropic and Meta have offices here), and the latest booms in agents, OpenClaw and Pi, were both Europe-based as well, and they joined the AI Engineer stage.

Oh, and there's also a Giant Inflatable Claw at the entrance, yup, for pictures and vibes, and to show off how quickly OpenClaw took over the mind-share.

Anthropic announces $30B ARR and Mythos, their next model, will not be released to the public.
The thing that everyone will tell you is that Anthropic is on a roll, and this is obviously connected to their upcoming IPO this year. We've been covering many issues on their part, but this week we saw them posting about a HUGE increase in ARR, from $19B in February to $30B in April, passing OpenAI at $25B. That last fact, though, is kind of disproven because they report ARR differently: OpenAI apparently only counts their cloud revenue from Microsoft, per The Information. The growth is undeniable though, and so is the most unprecedented release announcement, Claude Mythos Preview, which was rumored for a bit and has now been announced properly.
With Project GlassWing, Anthropic has announced that this model is SO good at cyber security and finding bugs in code that they cannot share it with the public, and through GlassWing they will share it with companies like Microsoft, Linux, CrowdStrike and a bunch of others, to harden their security. This is it folks, this is the first time a model was “announced” but deemed too risky to release.

Now, is it truly “too risky”? Previously, folks thought that DALL-E was too risky, or that voice cloning tech was too risky, and now it's everywhere. The capabilities catch up even in open source. But the facts are, Anthropic says they've found a 27-year-old bug in OpenBSD (famously very secure), and that this model is very very good at connecting the dots between several seemingly innocuous bugs to string them together into one coherent exploit. This is, indeed, scary.

Just last week, one of the top security researchers in the world, Nicholas Carlini, now at Anthropic, gave a talk at Black Hat showing off these results, and saying that these models, since December and definitely recently, have passed him as a security engineer. If you haven't seen this talk, watch it, then try to estimate if Anthropic did the right thing by only releasing this model to enterprises first.

But on the show, Peter Gostev from Arena gave me a take on this that I haven't been able to shake. Peter pulled up his Compute Wars chart live on the show, and the picture is that OpenAI is way ahead of Anthropic on compute, with Anthropic only recently getting a noticeable bump (which lines up suspiciously well with Mythos being trainable in the first place). His read: “it sounds cooler to say it's too risky to release than ‘we can't serve it.'” The official partner pricing is $25 / $125 per million tokens (5x Opus 4.6), but if you don't have the GPUs to serve it broadly, the price doesn't matter.
In the year of the IPO, the company that cannot serve a model says the model is too dangerous to serve. Make of that what you will.

This also reframes the whole rate-limit drama with OpenClaw. Anthropic didn't ban OpenClaw; I want to be very clear about this because the discourse went sideways. What they did is make it significantly more expensive for Max-tier subscribers to use Opus through OpenClaw, which pushed a lot of people over to GPT-5.4 via Codex. Same root cause: they're out of compute. The freshly announced Anthropic + Google TPU deal (Google already owns ~10% of Anthropic) is them trying to fix this, though as Peter noted, it's pretty wild that Google is propping up a direct competitor to their own DeepMind team. Same pattern as their original $2B Anthropic investment ending up propping AWS Bedrock against Google Cloud. Big Google contains multitudes.

Meta Superintelligence Labs ships Muse Spark: Llama is dead, long live Muse
Llama is dead, long live Muse. This week Meta finally showed what the very expensive Meta Superintelligence Labs under Alexandr Wang has been cooking, and the answer is Muse Spark, the smaller of their new model family, built on an AI stack rebuilt from scratch in just 9 months. Nine months is wild for that kind of overhaul, and the headline number people are quoting is that they reach Llama 4 Maverick capability with over 10x less compute.

Spark is intentionally small and latency-optimized. It's not trying to be the biggest, it's trying to be the first step on Meta's new scaling ladder. But the benchmarks in certain areas are nuts: 86.4 on CharXiv Reasoning (beats Opus, Gemini, GPT-5.4), and the one that really got me: 42.8 on HealthBench Hard vs Opus at 14.8 and Gemini at 20.6. They trained it with data curated by over 1,000 physicians and it shows. They also shipped a Contemplating mode, which is parallel multi-agent reasoning, hitting 58.4% on Humanity's Last Exam with tools.
Coding is the acknowledged weak point (77.4 on SWE-Bench Verified vs Opus 80.8) but for v1 from a brand new stack, this is extremely respectable.
Meta is Back!
The real story isn't any single benchmark though, it's distribution. Spark is rolling out across meta.ai, WhatsApp, Instagram, Threads, Messenger, and Ray-Ban Meta glasses — billions of users. Meta went from open Llama to a closed consumer model and they're clearly playing a different game now (though Wang says future Muse versions might be open-sourced).
The deep-dive that's really worth your time is Simon Willison's post where he poked at the meta.ai chat UI and got the model to spit out descriptions of 16 hidden tools behind the scenes — full Code Interpreter with persistent Python 3.9, a visual grounding tool that does pixel-precise object detection (bounding boxes, point coordinates, counting — it located 8 objects including individual whiskers and claws on a generated raccoon), sub-agent spawning, file editing, and semantic search across Instagram/Threads/Facebook posts. It's basically an entire agentic harness baked into the chat UI. Jack Wu from MSL confirmed the tools are part of a new harness built specifically for Spark's launch. Meta stock went up 7% on this. They are very much back in the frontier game.
Guest highlights
We had an unprecedentedly packed show with 5 guests (also, this is the shortest show we've ever [...]).
Swyx kicked us off with vibes from the AI Engineer floor — harness engineering as the dominant theme (gains are coming from the harness, not the weights), the rise of skills (English-as-programming-language) absorbing more of that harness work, and his thesis that supply-chain attacks like the recent LiteLLM and Axios incidents mean you should basically vendor everything — pip fork instead of pip install. 
We also chatted about how MCP has gone from “the most exciting protocol” to “settled and stable, therefore less interesting,” which is a great problem to have.
Peter Gostev from Arena (you saw a lot of him in the Mythos section above) also dropped a bonus on us: Arena just released 3 years of historical leaderboard data and actual prompt datasets on Hugging Face. He used to literally scrape the Arena website by hand into Google Sheets to make those over-time leaderboards we all loved — now it's all public. Also: he confirmed that Seedance 2.0 jumped ~80 Elo points above the next video model on Arena, which is unprecedented — video models normally cluster within 10 points of each other.
Vincent Koc — the #2 OpenClaw maintainer after Peter Steinberger — joined us fresh off the OpenClaw track stage. The OpenClaw codebase is now ~1.5 million lines of code including unreleased iOS and Android native apps. GitHub literally caps the issue/PR counter at “5K+” and they hit the ceiling. We talked about OpenClaw 2026.4.5, which ships /dreaming GA (Light/Deep/REM phases that defrag agent memory and write a human-readable Dream Diary to DREAMS.md), built-in video and music generation across 4 backends, GPT-5.4 as the new default, prompt-cache reuse improvements, and Control UI + docs in 12 new languages. Vincent's framing of dreaming was beautiful — “how do you explain agent memory to a mom? You call it dreaming.” He also gave my favorite line of the show on the GPT-5.4 personality problem: incredible at coding, but soulless. (For what it's worth, I came home after watching Project Hail Mary, cloned the Rocky voice, dropped it into my OpenClaw, and it was magical. That's the kind of thing you can only do when the harness and the model are decoupled.)
VB from OpenAI told us Codex just hit 3 million weekly active users — up from 2 million last month. 
We talked plugins (the Stripe / Supabase / shadcn ones that ship as packages), sub-agents (yes, one is named Jason), and Guardian Approvals — an experimental mode that classifies each tool call by risk and only escalates the dangerous ones to you, so you don't have to YOLO-mode everything. The story that stuck with me though is his 9 AM Codex automation: every morning it reads his Slack mentions, cross-references Gmail and Calendar, and creates 5-minute pre-brief calendar events for upcoming meetings. None of that is “coding.” That's the super-app future hiding inside a “developer tool.” I'm stealing this workflow.
Omar Sanseviero from Google DeepMind came on to celebrate Gemma 4 crossing 10M+ downloads with 1,000+ Gemma-4-based fine-tunes already on HF (and the Gemma family total is now over 500M downloads). Gemma 4 is also the foundation for the next generation of Gemini Nano on Pixel/Samsung devices. llama.cpp vision capability fixes are landing. Gemma 4 is also live on W&B Inference if you want to play. Wolfram (whose entire household runs on Pixel + Google AI Studio, including his 70-year-old mother on voice unlock) was in heaven.
This Week's Buzz
A short but spicy week from Weights & Biases:
* W&B Automations are LIVE. You can now wire event triggers from your training runs (completion, eval thresholds, drift) into notifications, GitHub Actions, deployments, infra shutdowns — closing the loop from experiment to production. Pairs really well with the iOS app we recently shipped, so you can get a ping on your phone the moment something interesting happens on a run.
* GLM 5.1 is live on W&B Inference (alongside Gemma 4 from last week) — the team is moving fast to host the best open models the moment they drop.
* Wolfram published a deep dive on “more reasoning is not always better” on the W&B blog — the research behind his finding that giving models more thinking tokens can actually make them dumber on certain tasks. 
It's the in-depth version of what we discussed on the show last week, with all the data. Go read it on wandb.com.
Also: shout out to everyone who came up to me at AI Engineer and said hi. The Wolf Bench mentions in particular made my day. If you're listening to this and you're at AIE — come find us, we'll be around tomorrow too.
That's it for this week — newsletter is short because the show was long and London is calling. As always, thanks for reading and listening.
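To put the roughly 80-point Arena gap for Seedance 2.0 mentioned above in perspective, here is a minimal sketch using the standard Elo expected-score formula. It assumes Arena scores behave like classic Elo ratings on a 400-point logistic scale, which is an illustrative assumption and not a description of Arena's exact method; the function name `elo_expected_score` is hypothetical.

```python
def elo_expected_score(rating_gap: float) -> float:
    """Expected head-to-head win rate for the higher-rated model.

    Uses the classic Elo logistic curve with a 400-point scale.
    Assumption: the leaderboard scores can be read as Elo ratings.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


# Compare the usual ~10-point clustering with the ~80-point jump.
for gap in (10, 80):
    win_rate = elo_expected_score(gap)
    print(f"{gap:>3}-point gap -> expected win rate {win_rate:.3f}")
```

Under that assumption, a 10-point gap is close to a coin flip (about 51%), while an 80-point gap means the leading model would be expected to win roughly 61% of head-to-head votes.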
I've recently updated towards substantially shorter AI timelines and much faster progress in some areas. [1] The largest updates I've made are (1) an almost 2x higher probability of full AI R&D automation by EOY 2028 (I'm now a bit below 30% [2] while I was previously expecting around 15%; my guesses are pretty reflectively unstable) and (2) I expect much stronger short-term performance on massive and pretty difficult but easy-and-cheap-to-verify software engineering (SWE) tasks that don't require that much novel ideation [3]. For instance, I expect that by EOY 2026, AIs will have a 50%-reliability [4] time horizon of years to decades on reasonably difficult easy-and-cheap-to-verify SWE tasks that don't require much ideation (while the high reliability—for instance, 90%—time horizon will be much lower, more like hours or days than months, though this will be very sensitive to the task distribution). In this post, I'll explain why I've made these updates, what I now expect, and implications of this update. I'll refer to "Easy-and-cheap-to-verify SWE tasks" as ES tasks and to "ES tasks that don't require much ideation (as in, don't require 'new' ideas)" as ESNI tasks for brevity. Here are the main drivers of [...]
---
Outline:
(04:58) What's going on with these easy-and-cheap-to-verify tasks?
(08:17) Some evidence against shorter timelines I've gotten in the same period
(10:46) Why does high performance on ESNI tasks shorten my timelines?
(13:15) How much does extremely high performance on ESNI tasks help with AI R&D?
(18:22) My experience trying to automate safety research with current models
(19:58) My experience seeing if my setup can automate massive ES tasks
(21:08) SWE tasks
(23:29) AI R&D task
(24:20) Cyber
[... 1 more section]
---
First published: April 6th, 2026
Source: https://www.lesswrong.com/posts/dKpC6wHFqDrGZwnah/ais-can-now-often-do-massive-easy-to-verify-swe-tasks-and-i
---
Narrated by TYPE III AUDIO.
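I talked through the behind-the-scenes story of this article with suke, a SWE. Announcements: we share updates on ゼロトピック (Zero Topic) and the blog through our newsletter. Signing up is easy, so please make use of it. Send your letters to ゼロトピック here. Thoughts on the show, questions, anything is welcome. Hearing from listeners gives us the motivation to keep going. We may feature the letters we receive on the show. X account: https://x.com/0topic_podcast 10X, Inc. is actively hiring. If you're interested, please check this link!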
When women and allies in engineering connect globally, they strengthen the entire community. In this episode, host Abosede Adewole, collegiate engagement lead for the SWE Global Women Engineers Affinity Group, is joined by Banisha Prinja, lead-elect, and Eshika Mahajan, professional development lead, to discuss how they have built relationships with engineers around the world and the benefits they have gained from these global connections. They open up about their experiences as women in STEM, from navigating industries where they were the only women in the room to finding commonalities across their experiences in Nigeria and India. Hear why a global perspective matters at the local level, what to do when you don't “feel ready,” and how participating in SWE's Global Women Engineers Affinity Group has grown their networks and leadership skills. Learn more and get involved with the SWE Global Women Engineers Affinity Group: https://affinitygroups.swe.org/global-women-engineers/ — The Society of Women Engineers is a powerful, global force uniting 50,000 members of all genders spanning 85 countries. We are the world's largest advocate and catalyst for change for women in engineering and technology. To join and access all the exclusive benefits to elevate your professional journey, visit membership.swe.org.
In this episode, Abigail Mizzi, master's student in aeronautical and astronautical engineering at Purdue University, and Steven Collicott, Ph.D., professor in Purdue's School of Aeronautics and Astronautics, share the story behind Purdue 1 — a groundbreaking university-led spaceflight mission set for 2027. Abigail is poised to become the first graduate student to conduct her thesis research in space, operating her fluids experiment during three minutes of microgravity. Dr. Collicott will also fly a human-tended experiment studying how liquids move over surfaces in weightlessness — research that can't be replicated on Earth. In conversation with FY26 SWE President Inaas Darrat, hear how Purdue 1 became a reality, what it takes to prepare mentally and physically for suborbital flight, and how SWE has shaped Abigail's STEM journey — including receiving the Outstanding Collegiate Member award. — The Society of Women Engineers is a powerful, global force uniting 50,000 members of all genders spanning 85 countries. We are the world's largest advocate and catalyst for change for women in engineering and technology. To join and access all the exclusive benefits to elevate your professional journey, visit membership.swe.org.
What does responsible AI actually look like, and who gets to shape it? Live from the WE25 Diverse Podcast Studio in New Orleans, Lisa Thee, TEDx speaker and global AI thought leader, discusses using AI for social good and building technology that puts people first. Lisa shares her unexpected journey from industrial engineering at the University of Michigan to becoming an accidental entrepreneur, and the moment in 2015 when she realized AI could help combat human trafficking — including a collaboration that helped law enforcement recover 130 missing children in its first month. In conversation with Larry Guthrie, director of content strategy at SWE, hear three practical tips to use AI ethically, where bias enters AI systems, and how SWE and its annual conferences have repeatedly shaped Lisa's career. — The Society of Women Engineers is a powerful, global force uniting 50,000 members of all genders spanning 85 countries. We are the world's largest advocate and catalyst for change for women in engineering and technology. To join and access all the exclusive benefits to elevate your professional journey, visit membership.swe.org.
Emily finds months' worth of missing lip products buried deep in the couch. Mysterious glowing orbs set off the alarms at Matt's work, completely derail Kaitlin's week, and leave the entire SWE team questioning reality (ghosts? government? the afterlife? all of the above?). Kaitlin's face-brushing journey fails, Matt and Emily's dishwasher and oven both die, and the comforting reality that all of our food is killing us. Kaitlin turns down a potential interview guest for the first time ever. Then Caroline and Hannah call in and everyone shares what their biggest first-date red flag would be. Follow SWE on Instagram → @so.what.else Follow Kaitlin on Instagram → @kaitlingraceelliott https://www.kaitlinelliott.com/
Our 237th episode with a summary and discussion of last week's big AI news!
Recorded on 03/13/2026
Hosted by Andrey Kurenkov and Jeremie Harris
Feel free to email us your questions and feedback at andreyvkurenkov@gmail.com and/or hello@gladstone.ai
Read our text newsletter and comment on the podcast at https://lastweekin.ai/
In this episode:
* Perplexity announced “Personal Computer,” a local Mac-based AI agent positioned as a safer alternative to OpenAI's computer-use agents, while Anthropic added GitHub PR code review, pricing reviews at $15–$25, and Cursor launched trigger-based “Automations” for always-on coding agents.
* ChatGPT introduced interactive math/science visuals and Anthropic added in-chat interactive charts/diagrams; Nvidia released open weights for its 120B-parameter Nemotron 3 Super hybrid Transformer–Mamba latent-MoE model trained natively at 4-bit for Blackwell GPUs.
* Nvidia halted H200 production for China amid customs blocks and domestic chip pressure; xAI saw major co-founder departures; Anthropic previewed a Claude Marketplace for enterprise procurement; Yann LeCun's AMI Labs raised $1.03B; humanoid robot maker Sunday reached a $1.15B valuation.
* Anthropic sued the Pentagon over a “supply chain risk” designation as memos ordered removal within 180 days; research covered models resisting activation steering, limits of chain-of-thought control, inference-scaling boosting cyber-task success, low-probability risky actions, weaknesses in SWE-bench, multimodal pretraining, long-context RNN memory caching, context-parallel training efficiency, RL for CUDA kernel optimization, and latent introspection detecting concept injection.
A thank you to our current sponsors:
Box - visit Box.com/AI to learn more
ODSC AI - go to odsc.ai/east and use promo code LWAI for an additional 15% off your pass to ODSC AI East 2026.
Factor - head to factormeals.com/lwai50off and use code lwai50off to get 50 percent off and free breakfast for a year
Timestamps:
(00:00:10) Intro / Banter
(00:01:23) Response to listener comments
Tools & Apps
(00:02:06) Perplexity's Personal Computer turns your spare Mac into an AI agent | The Verge
(00:04:22) Anthropic launches code review tool to check flood of AI-generated code | TechCrunch
(00:08:08) Cursor is rolling out a new kind of agentic coding tool | TechCrunch
(00:11:14) ChatGPT can now create interactive visuals to help you understand math and science concepts | TechCrunch
(00:11:56) Anthropic's Claude AI can respond with charts, diagrams, and other visuals now | The Verge
Projects & Open Source
(00:13:54) Introducing Nemotron 3 Super: An Open Hybrid Mamba-Transformer MoE for Agentic Reasoning | NVIDIA Technical Blog
Applications & Business
(00:21:22) Nvidia halts H200 production as China backs Huawei AI chips
(00:28:33) Another XAI Cofounder Has Left, and Another Says He's Leaving. - Business Insider
(00:34:04) Anthropic's Claude Marketplace allows customers to buy third-party cloud services | TechRadar
(00:37:57) Yann LeCun's AMI Labs raises $1.03 billion to build world models | TechCrunch
(00:44:52) Humanoid robotics maker Sunday reaches $1.15B valuation to build household robots | TechCrunch
Policy & Safety
(00:46:09) Anthropic Sues Department of Defense Over ‘Supply Chain Risk' Label - The New York Times + Google and OpenAI Just Filed a Legal Brief in Support of Anthropic
(00:53:24) Internal Pentagon memo orders military commanders to remove Anthropic AI technology from key systems - CBS News
(00:58:15) Endogenous Resistance to Activation Steering in Language Models
(01:06:27) Reasoning Models Struggle to Control their Chains of Thought
(01:09:52) ‘It means missile defence on datacentres': drone strikes raise doubts over Gulf as AI superpower
(01:14:57) Evidence for inference scaling in AI cyber tasks: Increased evaluation budgets reveal higher success rates
(01:18:24) Frontier Models Can Take Actions at Low Probabilities
Research & Advancements
(01:24:20) Research note: Many SWE-bench-Passing PRs Would Not Be Merged into Main
(01:28:26) [2603.03276] Beyond Language Modeling: An Exploration of Multimodal Pretraining
(01:40:09) Memory Caching: RNNs with Growing Memory
(01:48:47) Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking
(01:58:41) CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
(02:08:57) Latent Introspection: Models Can Detect Prior Concept Injections
(02:16:45) Physics of RL: Toy scaling laws for the emergence of reward-seeking
See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.
Guys I think its red I saw them vent... Justtt kidding nobody is ever an imposter in SWE!! Explore imposter syndrome with us as we share our experiences being a woman in male dominated fields. You belong where you are and you are always worth it. If you ever feel down and dumb come talk to us and we'll hype you up! Stay as gorgeous and smart as you are, you're always deserving and worth it. Love, Emma and Ava
Paige Feikert, research and technology engineer with Spirit AeroSystems, joins us live from the WE25 Diverse Podcast Studio to break down why communication can make or break your impact as an engineer — and how to get better at it. In conversation with Larry Guthrie, director of content strategy at SWE, Paige shares her unconventional career path from biomedical engineering student, to TV news producer at a CBS affiliate, and back into engineering. Hear why storytelling matters in technical work, actionable advice to help you communicate to executives and non-technical audiences, and what producing live TV news taught Paige about teamwork, deadlines, and handling pressure. — The Society of Women Engineers is a powerful, global force uniting 50,000 members of all genders spanning 85 countries. We are the world's largest advocate and catalyst for change for women in engineering and technology. To join and access all the exclusive benefits to elevate your professional journey, visit membership.swe.org.
As part of SWE's “Un Cafecito With a Woman in STEM” series, spotlighting Latina voices across the globe, this special episode of Diverse is presented in Spanish. Claudia Guerrero, SWE global ambassador and service program leader at GE Aerospace, sits down with host Doris Moreno Maldonado, process engineer at Kellogg's and lead of the SWE Latinos Affinity Group, for a conversation on antifragility, authenticity, and creating community wherever you are. Recorded live at WE25 in New Orleans, Claudia shares her journey as one of the first women in her family to study engineering in Mexico, her role in growing the SWE affiliate in Querétaro, and surviving a life-altering medical crisis. Hear how to move beyond resilience toward antifragility, build your personal board of directors, and lean into your unique authentic strengths. — The Society of Women Engineers is a powerful, global force uniting 50,000 members of all genders spanning 85 countries. We are the world's largest advocate and catalyst for change for women in engineering and technology. To join and access all the exclusive benefits to elevate your professional journey, visit membership.swe.org.
Google just dropped Gemini 3 Flash—a model that outperforms Gemini 2.5 Pro (their last top model) while running 3x faster at less than 1/4 the cost. It's frontier-level reasoning at Flash-level speed, and it's rolling out globally right now.
We're sitting down with Logan Kilpatrick from Google DeepMind to explore what this actually means for developers, knowledge workers, and anyone trying to figure out how AI fits into their workflow.
What we'll cover:
It is the eve of the USANZ ASM here in Melbourne (which Renu is convening!), and some of the 27 international guests are already in town. So we grabbed Anders Bjartell (Lund University/University Hospital Malmo, SWE) for a sit-down chat in the studio. Anders has been a huge name in prostate cancer research in Europe for the past couple of decades, including as a senior investigator on TITAN, so we decided to pick his brain on three topical areas in prostate cancer:
1. ADT/ARPI doublets in mHSPC - can we adopt an intermittent approach before the trials read out? What about docetaxel triplet therapy?
2. The SPARC consensus on PSMA PET/CT reporting - Anders led this initiative, which was recently published in Eur Urol. What is this and why does it matter?
3. Organised Prostate Testing - what is the difference between screening and organised prostate testing?
With usual hosts Renu Eapen and Declan Murphy
This is a Themed Podcast supported by our Gold Partners, Johnson & Johnson.
Links: SPARC consensus paper in European Urology
Carolina Caro, WE Local Portland keynote and CEO of Conscious Leadership Partners, breaks down communication blind spots for engineers and explores how your communication style shapes your leadership impact in this episode. As a scientist by training, Carolina reflects on why technical expertise isn't enough and speaks to the platinum rule, where you meet people where they are and communicate in the way they can best receive. In conversation with FY26 SWE President Inaas Darrat, hear how to strengthen your communication without compromising your authenticity and why communication style is an often-overlooked dimension of diversity. Carolina will expand on these insights as the opening keynote at WE Local Portland, taking place Feb. 27-28. WE Local conferences bring together engineers and technologists for networking opportunities, professional development, and inspirational speakers. Register at welocal.swe.org to join SWE at an upcoming WE Local near you! — The Society of Women Engineers is a powerful, global force uniting 50,000 members of all genders spanning 85 countries. We are the world's largest advocate and catalyst for change for women in engineering and technology. To join and access all the exclusive benefits to elevate your professional journey, visit membership.swe.org.
CoreStory is building code intelligence platforms that address the fundamental limitation of today's coding agents: their inability to navigate complex enterprise codebases. While foundation models excel at greenfield development, they fail at real-world engineering tasks in systems spanning millions of lines of code. CoreStory's context layer delivers a 44% improvement on SWE-bench, the industry's standard benchmark for measuring coding agent effectiveness on actual GitHub issues. In this episode of BUILDERS, I sat down with Anand Kulkarni, CEO of CoreStory, to explore how his team is enabling the shift to AI-native engineering and seeding the category of spec-driven development across Microsoft, GitHub, and Amazon.
Topics Discussed:
Building with GPT-3 API 18 months before ChatGPT went public
Why even GPT-5 and Opus 4.5 struggle with enterprise codebases on SWE-bench
The narrative shift required when selling AI pre- and post-ChatGPT
CoreStory's 44% improvement in coding agent performance through context intelligence
How "spec-driven development" got adopted by Microsoft, GitHub, and Amazon without formal analyst relations
The parallel between JIRA monetizing Agile and CoreStory enabling AI-native engineering
Three-channel distribution: direct enterprise, coding agent partnerships via MCP, and hyperscaler/GSI routes
Why specs become the source of truth while code becomes disposable in the AI era
GTM Lessons For B2B Founders:
Match your narrative precision to technical depth: CoreStory deploys three distinct positioning strategies based on audience sophistication. For AI practitioners tracking benchmarks, they lead with "44% SWE-bench improvement"—a metric that immediately signals meaningful progress on the hardest problem in the space. For engineering leaders aware of AI tooling but not deep in the research, they focus on velocity gains and ROI metrics. For executives, they describe reverse-engineering codebases into machine-readable specs. The key insight: technical audiences dismiss vague value props, while non-technical audiences get lost in benchmark details. Map your positioning to how your audience measures success in their world.
Seed category language through earned adoption, not manufactured consensus: Anand initially called their approach "requirements-driven development" before simplifying to "spec-driven development." Rather than pitching analysts, they used the term consistently in customer conversations, gave talks at GitHub Universe, and shipped demos showing the workflow. When customers naturally adopted the language and community leaders began using similar terminology independently, Microsoft and GitHub followed with their own implementations (like GitHub's SpecKit). The lesson: category language sticks when practitioners choose to use it because it clarifies their work, not because a vendor pushed it. Focus on customer adoption as proof of concept before seeking broader market validation.
Position against emergent practices, not just incumbent products: CoreStory doesn't position against legacy code analysis tools—they position as the enabler of AI-native engineering, the discipline that will displace Agile. Anand's insight from watching JIRA's success: "People don't love JIRA. What they love is Agile as a way to move away from waterfall." CoreStory is betting that 10x velocity gains from AI-native practices will drive the same categorical shift. When you're early in a technology wave, attach to the practice change (how teams will work differently) rather than feature comparisons with existing tools. Movements create markets.
Design channel strategy around customer problem awareness: CoreStory's three channels map to different stages of buyer sophistication. Direct enterprise comes from teams already deep in AI engineering who've hit the context limitation wall. Coding agent partnerships (via MCP integration with tools like Cognition and Factory) serve builders wanting better AI tooling who haven't diagnosed the context problem yet. Hyperscalers and GSIs distribute into modernization and maintenance projects where AI enablement is emerging as a requirement. Each channel serves a distinct buyer journey stage. Don't force one go-to-market motion—design multiple paths based on where different customer segments are in understanding the problem you solve.
Navigate pre-legitimacy markets by hiding the breakthrough: Before ChatGPT, selling anything AI-driven faced immediate skepticism about whether it was "real" or just smoke and mirrors. Anand couldn't lead with AI without triggering disbelief. CoreStory focused on delivered outcomes—"here's what you'll be able to do"—with AI as the mechanism, not the message. Post-ChatGPT, the challenge flipped: everyone expects AI, but now the differentiation question becomes harder. If you're building on emerging technology before market consensus forms, deemphasize the technology until buyers have context to evaluate it. Once the market validates the technology category, shift to demonstrating your specific technical advantage within it.
// Sponsors:
Front Lines — We help B2B tech companies launch, manage, and grow podcasts that drive demand, awareness, and thought leadership. www.FrontLines.io
The Global Talent Co. — We help tech startups find, vet, hire, pay, and retain amazing marketing talent that costs 50-70% less than the US & Europe. www.GlobalTalent.co
// Don't Miss: New Podcast Series — How I Hire
Senior GTM leaders share the tactical hiring frameworks they use to build winning revenue teams. Hosted by Andy Mowat, who scaled 4 unicorns from $10M to $100M+ ARR and launched Whispered to help executives find their next role. Subscribe here: https://open.spotify.com/show/53yCHlPfLSMFimtv0riPyM
Julie Daugherty, engineering associate and process project leader at Corning Incorporated, joins us live at WE25 to discuss how introverts can embrace their strengths and turn them into career superpowers. In conversation with Larry Guthrie, director of content strategy at SWE, hear how planning ahead builds confidence, how to survive “mandatory fun” networking events, and why finding extroverted champions can be key to career growth in STEM. You'll also learn how managers can support introverted engineers, what it means to be an ambivert, and why personality diversity leads to stronger teams and better problem-solving. — The Society of Women Engineers is a powerful, global force uniting 50,000 members of all genders spanning 85 countries. We are the world's largest advocate and catalyst for change for women in engineering and technology. To join and access all the exclusive benefits to elevate your professional journey, visit membership.swe.org.
In this episode of Kulturbarnen, Pontus talks about the project he has been quietly running for almost a year, re/works SWE for Epidemic Sound, in which he invited a large number of Swedish artists to reinterpret music from a large song library dedicated to sync and sound design for the internet. Welcome!
Zoe, SWENext Influencer and FIRST Robotics leader, and Charlee, a member of the 2025 STEM Next Flight Crew youth ambassador program, join us live from the WE25 Diverse Podcast Studio in New Orleans to share their experiences as young STEM leaders. In conversation with Larry Guthrie, director of content strategy at SWE, they reflect on how Invent It. Build It. sparked powerful connections, what leadership looks like as a precollege student, and how they navigate bias, responsibility, and pressure as “the first” in their communities. Hear their experiences starting STEM clubs and expanding access in rural areas, plus why a strong community is critical to shaping a more inclusive future in engineering. — The Society of Women Engineers is a powerful, global force uniting 50,000 members of all genders spanning 85 countries. We are the world's largest advocate and catalyst for change for women in engineering and technology. To join and access all the exclusive benefits to elevate your professional journey, visit membership.swe.org.
This is a link post. Improving model performance by scaling up inference compute is the next big thing in frontier AI. But the charts being used to trumpet this new paradigm can be misleading. While they initially appear to show steady scaling and impressive performance for models like o1 and o3, they really show poor scaling (characteristic of brute force) and little evidence of improvement between o1 and o3. I explore how to interpret these new charts and what evidence for strong scaling and progress would look like. From scaling training to scaling inference The dominant trend in frontier AI over the last few years has been the rapid scale-up of training — using more and more compute to produce smarter and smarter models. Since GPT-4, this kind of scaling has run into challenges, so we haven't yet seen models much larger than GPT-4. But we have seen a recent shift towards scaling up the compute used during deployment (aka 'test-time compute' or ‘inference compute'), with more inference compute producing smarter models. You could think of this as a change in strategy from improving the quality of your employees' work via giving them more years of training in which acquire [...] --- First published: February 2nd, 2026 Source: https://forum.effectivealtruism.org/posts/zNymXezwySidkeRun/inference-scaling-and-the-log-x-chart Linkpost URL: https://www.tobyord.com/writing/inference-scaling-and-the-log-x-chart --- Narrated by TYPE III AUDIO.
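As a rough illustration of why a straight line on a log-x chart can signal poor scaling, the sketch below assumes a made-up score that rises linearly in log10 of inference compute. The baseline, slope, and function names are hypothetical and are not taken from the post or from any o1/o3 chart; the point is only that, on such a curve, every fixed gain in score costs a constant multiple of compute.

```python
import math

# Hypothetical curve: score = BASE + SLOPE * log10(compute).
# BASE and SLOPE are invented for illustration only.
BASE = 20.0
SLOPE = 10.0


def score(compute: float) -> float:
    """Score of the hypothetical model at a given inference compute budget."""
    return BASE + SLOPE * math.log10(compute)


def compute_needed(target_score: float) -> float:
    """Invert the curve: compute required to reach a target score."""
    return 10.0 ** ((target_score - BASE) / SLOPE)


# Each additional 10 points of score multiplies the required compute by 10.
for target in (30, 40, 50, 60):
    print(f"score {target}: compute x{compute_needed(target):,.0f}")
```

A line that looks steady on the log-scaled axis therefore corresponds to exponentially growing cost on a linear axis, which is the pattern the summary above describes as characteristic of brute force.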
This episode is sponsored by BD. Showing up as your authentic self in engineering isn't always easy. In this conversation, Christine Kearney Hawkins, senior staff R&D engineer in BD's Peripheral Intervention business and SWE life member, shares her 20+ year journey navigating authenticity, leadership, and innovation in STEM. From being told that she couldn't be both an engineer and a mom, to learning that her bubbly enthusiasm is a strength and not a liability, Christine reflects on how embracing who she is shaped her career and impact. In conversation with host Sam East, hear how authentic leadership fuels better innovation outcomes, what to do when workplace feedback conflicts with your core values, and practical advice to create cultures where people feel safe bringing their whole selves to work. — The Society of Women Engineers is a powerful, global force uniting 50,000 members of all genders spanning 85 countries. We are the world's largest advocate and catalyst for change for women in engineering and technology. To join and access all the exclusive benefits to elevate your professional journey, visit membership.swe.org.
Stepping into STEM leadership doesn't require a senior title or having everything figured out. In this episode, Katie Ashley, volunteer coordinator of the SWE Early Career Professionals Affinity Group (SWE ECP AG), is joined by Zoe Husted, ECP AG conferences and awards co-chair and president of the SWE Golden Gate Section, and Kathryn Wittek, ECP AG design coordinator and president of the SWE Baltimore-Washington Section, to explore what leadership can look like in the beginning years of an engineering career. Drawing from their experiences in SWE, team sports, and technical roles, Zoe and Kathryn share when they started seeing themselves as leaders, how they navigate leading more experienced colleagues, and why learning and leading often happen at the same time. Hear their tips to find community as an early-career engineer, plus how the skills they have developed through SWE have translated into the workplace. The SWE ECP AG was formed to equip individuals with the support, resources, and inclusive community to excel in the first ten years of their career. Get involved and find out about upcoming ECP AG events at https://earlycareerprofessionalsag.swe.org/. — The Society of Women Engineers is a powerful, global force uniting 50,000 members of all genders spanning 85 countries. We are the world's largest advocate and catalyst for change for women in engineering and technology. To join and access all the exclusive benefits to elevate your professional journey, visit membership.swe.org.
From creating SWE-bench in a Princeton basement to shipping CodeClash, SWE-bench Multimodal, and SWE-bench Multilingual, John Yang has spent the last year and a half watching his benchmark become the de facto standard for evaluating AI coding agents—trusted by Cognition (Devin), OpenAI, Anthropic, and every major lab racing to solve software engineering at scale. We caught up with John live at NeurIPS 2025 to dig into the state of code evals heading into 2026: why SWE-bench went from ignored (October 2023) to the industry standard after Devin's launch (and how Walden emailed him two weeks before the big reveal), how the benchmark evolved from Django-heavy to nine languages across 40 repos (JavaScript, Rust, Java, C, Ruby), why unit tests as verification are limiting and long-running agent tournaments might be the future (CodeClash: agents maintain codebases, compete in arenas, and iterate over multiple rounds), the proliferation of SWE-bench variants (SWE-bench Pro, SWE-bench Live, SWE-Efficiency, AlgoTune, SciCode) and how benchmark authors are now justifying their splits with curation techniques instead of just "more repos," why Tau-bench's "impossible tasks" controversy is actually a feature not a bug (intentionally including impossible tasks flags cheating), the tension between long autonomy (5-hour runs) vs. interactivity (Cognition's emphasis on fast back-and-forth), how Terminal-bench unlocked creativity by letting PhD students and non-coders design environments beyond GitHub issues and PRs, the academic data problem (companies like Cognition and Cursor have rich user interaction data, academics need user simulators or compelling products like LMArena to get similar signal), and his vision for CodeClash as a testbed for human-AI collaboration—freeze model capability, vary the collaboration setup (solo agent, multi-agent, human+agent), and measure how interaction patterns change as models climb the ladder from code completion to full codebase reasoning. 
We discuss: John's path: Princeton → SWE-bench (October 2023) → Stanford PhD with Diyi Yang and the Iris Group, focusing on code evals, human-AI collaboration, and long-running agent benchmarks The SWE-bench origin story: released October 2023, mostly ignored until Cognition's Devin launch kicked off the arms race (Walden emailed John two weeks before: "we have a good number") SWE-bench Verified: the curated, high-quality split that became the standard for serious evals SWE-bench Multimodal and Multilingual: nine languages (JavaScript, Rust, Java, C, Ruby) across 40 repos, moving beyond the Django-heavy original distribution The SWE-bench Pro controversy: independent authors used the "SWE-bench" name without John's blessing, but he's okay with it ("congrats to them, it's a great benchmark") CodeClash: John's new benchmark for long-horizon development—agents maintain their own codebases, edit and improve them each round, then compete in arenas (programming games like Halite, economic tasks like GDP optimization) SWE-Efficiency (Jeffrey Maugh, John's high school classmate): optimize code for speed without changing behavior (parallelization, SIMD operations) AlgoTune, SciCode, Terminal-bench, Tau-bench, SecBench, SRE-bench: the Cambrian explosion of code evals, each diving into different domains (security, SRE, science, user simulation) The Tau-bench "impossible tasks" debate: some tasks are underspecified or impossible, but John thinks that's actually a feature (flags cheating if you score above 75%) Cognition's research focus: codebase understanding (retrieval++), helping humans understand their own codebases, and automatic context engineering for LLMs (research sub-agents) The vision: CodeClash as a testbed for human-AI collaboration—vary the setup (solo agent, multi-agent, human+agent), freeze model capability, and measure how interaction changes as models improve — John Yang SWE-bench: https://www.swebench.com X: https://x.com/jyangballin Chapters 00:00:00 
Introduction: John Yang on SWE-bench and Code Evaluations 00:00:31 SWE-bench Origins and Devin's Impact on the Coding Agent Arms Race 00:01:09 SWE-bench Ecosystem: Verified, Pro, Multimodal, and Multilingual Variants 00:02:17 Moving Beyond Django: Diversifying Code Evaluation Repositories 00:03:08 CodeClash: Long-Horizon Development Through Programming Tournaments 00:04:41 From Halite to Economic Value: Designing Competitive Coding Arenas 00:06:04 Ofir's Lab: SWE-ficiency, AlgoTune, and SciCode for Scientific Computing 00:07:52 The Benchmark Landscape: TAU-bench, Terminal-bench, and User Simulation 00:09:20 The Impossible Task Debate: Refusals, Ambiguity, and Benchmark Integrity 00:12:32 The Future of Code Evals: Long Autonomy vs Human-AI Collaboration 00:14:37 Call to Action: User Interaction Data and Codebase Understanding Research
Karen Horting, Executive Director and CEO of the Society of Women Engineers, talks about SWE's archives at the Reuther Library and shares how the 75-year-old organization leverages its history to advocate for the inclusion of women in science, technology, engineering, and mathematics (STEM). Related Resources: Society of Women Engineers 75th Anniversary SWE Archives Virtual Tour [Part 1] SWE Archives Virtual Tour [Part 2] Related Collections: Society of Women Engineers Records (LR001539) Society of Women Engineers Publications (LR002487) Episode Credits Interviewee: Karen Horting Producers: Dan Golodner and Troy Eller English Music: Bart Bealmear
Emily breaks down her 2025 Holiday Gift Guide through the lens of the five pillars of sexual intelligence—embodiment, health, self-knowledge, self-acceptance, and collaboration. She explores how shifting from performance to presence can transform your relationship with pleasure, offering curated tools and practices that help you slow down, feel your body, and understand yourself as a sexual being. This episode is for anyone ready to move beyond quick fixes and embrace a more holistic approach to sexuality, turning intimacy into something that nourishes your whole system. In this episode, you'll learn: • The five pillars of Sex IQ create a holistic framework for understanding your sexuality and taking responsibility for your pleasure—moving you from performance to presence • Embodiment practices like breathwork and sonic wave toys help you overcome the mind-body disconnect during sex by bringing you back to physical sensations instead of staying in your head worrying about how you look or what you need to do • Self-acceptance isn't about waiting for your body to change—it's about recognizing and reframing the pleasure thieves (stress, trauma, and shame) that tell you you're unworthy, and actively replacing negative thought patterns with affirmations that honor your body right now More Dr. Emily: • The 2025 Sex With Emily Holiday Gift Guide • Apply for Emily's 1:1 Coaching Opportunity HERE or reach out to enrollment@sexwithemily.com for more information • Shop With Emily! Explore Emily's favorite toys, pleasure accessories, bedroom essentials, and more — designed to support your pleasure and confidence. Free shipping on orders $99+ (some exclusions apply). • Join the SmartSX Membership: Access exclusive sex coaching, live expert sessions, community building, and tools to enhance your pleasure and relationships with Dr. Emily Morse. • Yes! No! Maybe? 
List & Other Sex With Emily Guides: Explore pleasure, deepen connections, and enhance intimacy using these Sex With Emily downloadable guides. • The only sex book you'll ever need: Smart Sex: How to Boost Your Sex IQ and Own Your Pleasure • Want more? Visit the Sex With Emily Website • Let's get social: Instagram | X | Facebook | TikTok | Threads | YouTube • Let's text: Sign up here • Want me to slide into your email inbox? Sign Up Here for sex tips on the regular. Shop the Holiday Gift Guide now! (See the full Gift Guide HERE) Embodiment: • LELO SONA 3 - Use code EMILY20 for 20% on top of ongoing sales at lelo.com • Common Confidential Massage Butter - Use code SEXWITHEMILY for 15% off at Commonconfidential.com. • Cornbread Hemp - Use code SWE for a discount at cornbreadhemp.com/swe Health: • Bathmate Hydro Pump - Use SWE10 for 10% off your order at bathmatedirect.com • HigherDOSE Infrared Sauna Blanket - Head to higherdose.com • V-Health Vaginal Rejuvenation Gel - Use code EMILY10 for 10% off at getvhealth.com • Kroma Wellness Beauty Matcha Latte - Use code SEXWITHEMILY for 15% off at kromawellness.com Collaboration: • Promescent Delay Spray - Head to promescent.com • LELO F2S - Use code EMILY20 for 20% on top of ongoing sales at lelo.com • Crave Leather Handcuffs - Head to shop.sexwithemily.com/crave Self-Knowledge: • Magic Wand Mini - Head to shop.sexwithemily.com/magicwand • Je Joue Hera Flex Rabbit Vibrator - Go to sexwithemily.com/hera and use code EMILY20 for 20% off • SmartSX - Head to sexwithemily.com/smartsx Self-Acceptance: • The Class - Head to theclass.com • Droplette - Head to droplette.io This episode is sponsored by… Biolouve - Get 15% off with code EMILYPOWERMOVE @ https://biolouve.com/ Timestamps 0:00 - Introduction 1:25 - The 5 Pillars of Sex IQ Framework 2:36 - Pillar 1: Embodiment 10:23 - Pillar 2: Health 17:52 - Pillar 3: Collaboration 24:44 - Pillar 4: Self-Knowledge 29:39 - Pillar 5: Self-Acceptance 34:21 - Wrap-Up
This episode is sponsored by Bechtel. Josefina Alvarez and Kira McKay, supplier quality representatives at Bechtel, sit down with SWE President-Elect Kerrie Greenfelder to discuss how they landed their first jobs through the SWE and SHPE career fairs and also discovered a side of engineering they never knew existed. Recorded live at the WE25 Diverse Podcast Studio in New Orleans, hear about the behind-the-scenes components of engineering — from supply chain to quality systems — and how these roles make iconic projects possible. Kira and Josefina share candid advice for engineering students and new grads, what they've learned inside the Bechtel Supplier Quality and Expediting (BSQE) program, and how mentorship, curiosity, and saying “yes” to unfamiliar paths shaped their early careers. — The Society of Women Engineers is a powerful, global force uniting 50,000 members of all genders spanning 85 countries. We are the world's largest advocate and catalyst for change for women in engineering and technology. To join and access all the exclusive benefits to elevate your professional journey, visit membership.swe.org.
This episode features Dianne Na Penn, a senior product leader at Anthropic, discussing the launch of Claude Opus 4.5 and the evolution of frontier AI models. The conversation explores how Anthropic approaches model development, balancing ambitious capability roadmaps with user feedback, making strategic bets on areas like agentic coding and computer use while deliberately avoiding others like image generation. Dianne shares insights on the shifting nature of AI evaluation (moving beyond saturated benchmarks like SWE-bench toward more open-ended measures), the evolution of scaffolding from "training wheels" to intelligence amplifiers, and why she believes we're closer to transformative long-running AI than most people think. She also discusses Anthropic's distinctive culture of authenticity, the underappreciated benefits of model alignment for producing independent-thinking AI, and why the real bottleneck to AI agents isn't model capability anymore but product innovation. (0:00) Intro (0:57) Starting the Work on Opus 4.5 (2:04) Model Capabilities and Surprises (5:59) Computer Use and Practical Applications (7:21) Pricing and Positioning (10:02) Customer Feedback and Early Access (16:44) The Reality of Enterprise Agents (18:47) Future of AI and Long-Running Intelligence (28:06) Anthropic's Culture and Decision Making (30:31) Key Decisions and Fun Moments (33:45) Quickfire With your co-hosts: @jacobeffron - Partner at Redpoint, Former PM Flatiron Health @patrickachase - Partner at Redpoint, Former ML Engineer LinkedIn @ericabrescia - Former COO GitHub, Founder Bitnami (acq'd by VMware) @jordan_segall - Partner at Redpoint
The common advice for career growth is to “move up” into management — but what if your true passion lies in staying close to the technology itself? In this episode, host Sam East speaks with Deb Whitis, Ph.D., and Amrita Maguire, both of the SWE Technical Career Path Affinity Group, about what it means to grow, lead, and make an impact as engineers while staying on the technical career path. From developing nickel-based superalloys that power jet engines to advancing ergonomic standards and AI-enabled design, Deb and Amrita reflect on their careers and share how technical leadership can be just as influential as managing people. They also highlight the work of SWE's Technical Career Path Affinity Group, including a new mentorship program helping women chart their own path as innovators, inventors, and subject matter experts. Get involved with the SWE Technical Career Path Affinity Group: https://affinitygroups.swe.org/technical-career-path/ — The Society of Women Engineers is a powerful, global force uniting 50,000 members of all genders spanning 85 countries. We are the world's largest advocate and catalyst for change for women in engineering and technology. To join and access all the exclusive benefits to elevate your professional journey, visit membership.swe.org.
Google just released Gemini 3, and I tested it with 3 real business use cases: prepping for a $200K sales call, building an interactive sales dashboard, and creating a complete website from scratch, all in minutes. In this video, I show you LIVE examples of: ✅ Gemini Agent prepping an entire sales call (research, pitch angles, objection handling, follow-ups) ✅ Dynamic dashboards with real-time calculations and interactive sliders ✅ Full website generation with 921 lines of working code in under 60 seconds ✅ Custom image generation and prompt engineering This isn't theory: I'm screen recording everything as it happens so you can see the actual speed and quality. ⏱️ TIMESTAMPS: 0:00 - Intro: Why Gemini 3 is a Big Deal 1:15 - Benchmark Breakdown (Why This Matters) 2:27 - Use Case #1: $200K Sales Call Prep 4:46 - Visual Layout & Interactive Infographic Demo 7:03 - Use Case #2: Building a Website from Scratch (Live Code) 9:27 - Use Case #3: Interactive Q4 Sales Dashboard 11:51 - Testing the Prompt Generator Live 12:45 - Final Thoughts & Why This Changes Everything
Grappling Rewind: Breakdowns of Professional BJJ and Grappling Events
This week on the show Maine recaps the 2025 ADCC North American trials live from Orlando, Florida. We recap all the divisions, starting with the men's: -66.0 kg Dorian Olivarez, -77.0 kg Jacob Bornemann, -88.0 kg Jon Blank, -99.0 kg Achilles Rocha, +99.0 kg Brandon Reed. Then we move into the recap of the women's divisions: -55.0 kg Ana Mayordomo, -65.0 kg Morgan "Mo" Black, +65.0 kg Maia Matalon. Recorded 11-17-2025
In honor of National Native American Heritage Month, SWE CEO and Executive Director Karen Horting sits down with Sarah EchoHawk, president and CEO of Advancing Indigenous People in STEM (AISES), to discuss visibility, allyship, and access for Indigenous engineers. Sarah shares her family's deep legacy of public service, the role of tribal colleges in reclaiming education, and how Indigenous knowledge systems — from fire science to environmental stewardship — can help solve global challenges. Plus, hear how employers, educators, and organizations like SWE can strengthen partnerships with AISES to ensure Indigenous voices are included in the future of STEM. — The Society of Women Engineers is a powerful, global force uniting 50,000 members of all genders spanning 85 countries. We are the world's largest advocate and catalyst for change for women in engineering and technology. To join and access all the exclusive benefits to elevate your professional journey, visit membership.swe.org.
This episode is sponsored by Resideo. In this episode, two engineers share how curiosity, empathy, and innovation shape their work creating smart home technologies. Helen Meza, systems integration lead engineer at Resideo, shares her inspiring path from Peru to the U.S. and discusses how she leads global teams with empathy and adaptability. Kyra Neal, mechanical design engineer at Resideo, reflects on how she went from designing 3D printers to designing thermostats and dehumidifiers that make tangible differences in people's lives — and how curiosity played a role in her STEM career. In conversation with Larry Guthrie, director of content strategy at SWE, hear Helen's insights on incorporating cultural awareness into leadership, Kyra's lessons on finding growth through every challenge, and how Resideo fosters an environment where engineers can make an impact. — The Society of Women Engineers is a powerful, global force uniting 50,000 members of all genders spanning 85 countries. We are the world's largest advocate and catalyst for change for women in engineering and technology. To join and access all the exclusive benefits to elevate your professional journey, visit membership.swe.org.
Jessica reports LIVE from Jakarta on all the details from day two of women's podium training. World Championships Headquarters Videos, Interviews, Podcasts, Fantasy, Guides Extended Episode + Live Q&A (Members) +30 extra minutes of analysis, behind-the-scenes secret stories, plus member questions. Here's how to ask questions live. Can't make it live? Add Club bonus episodes to your favorite podcast player (instructions here). Chapters 00:00 – Show Intro 01:02 – Zhang Qingying beam world champion prediction 03:00 – FIG Press Conference recap: AI D-scores and visa issue 08:40 – Spencer's updates: where to watch & fantasy game deadlines 11:45 – U.S. Women's Team podium training report (Josc, Skye, Dulcy, Leanne) 17:20 – Can Josc vault? Exclusive Olympic Channel interview 19:45 – Equipment update: white mats and “China mat overlay” 22:10 – Mixed Zone highlights (Malabuyo, South Africa, Asia's coach impression) 25:05 – Italy updates: Perotti, Asia D'Amato, Fioravanti AA potential 29:45 – Melnikova and Russia (AIN) podium impressions 31:30 – Flavia Saraiva's 10.0 leotard and Brazilian updates 33:10 – Funniest & coolest skills of the day (Chile, India, Portugal) 33:55 – BTS Teaser begins 34:00 – Embarrassing moments & Watanabe press conference story 36:40 – Beam fall hilarity (NZL gymnast) 38:15 – Opposite of Canadian medical intervention 40:00 – The great Indonesian tampon saga 42:25 – Sub 4: NZL, LIE, USA, CRO, BAN, GBR, POL 45:10 – Ruby Evans Amanar, GB bars, Alia Leat injury update 47:05 – Sub 5: MAS, SUI, ITA, FRA, VIE, ISL, MAR 49:00 – Thelma's floor, Osyssek's beam, Ming Van Eijken vaults 51:05 – Sub 6: AUS, EGY, BEL, LAT, ROU, MGL, SWE, CRC 53:00 – Voinea full Gothic mode, Golgota AA, Romanian updates 56:20 – Sub 7: INA, TUN, COL, PHI, MEX, SYR 58:00 – Finnegan & Malabuyo AA, Seema Tello debut 1:00:10 – Sub 8: NOR, BRA, QAT, IND, RSA, CHI 1:02:15 – Flavia & Brazil updates, Rooskrantz, Chilean grandmas 1:05:00 – Sub 9: AIN, NAM, POR, THA, BUL, SLO, CMR 1:07:25 – 
Melnikova Cheng, Cameroon floor joy, AIN medal watch 1:10:10 – Sub 10: ESP, AIN, HUN, HKG, CHN, KZN, CZE 1:12:25 – Zhou Yaqin & Zhang Qingying on beam, Deng Yalan vault 1:15:30 – Alba Petisco all-around standout 1:17:10 – Feedback: listener comments from Dr. Ben & Absolutely Not 1:21:20 – Show Close: Women's qualifying preview & thanks How Do I Watch the Competition? All sessions of the competition will be streamed on Eurovision Sport. Follow along here! Gymnastics Indonesia's YouTube channel will stream all qualification sessions Live scores from the FIG and Swiss Timing Check out NBC's behind-the-scenes mini-doc on the US Women's World Trials Headlines What happened at podium training today? Should we be worried about the US women? From the Olympic Channel: Joscelyn Roberson has been struggling to "find her block" on vault Skye's HUGE front-handspring front on beam Who else from Florida came to join the 2025 World Championships party? Giulia Perotti (Italy) looks ready to win all the medals Who will be the second Italian competing all-around? The D'Amato vs. Fioravanti dilemma Angelina Melnikova is so back How did her vaults look? WE NEED TO TALK ABOUT BRAZIL'S GENIUS LEOS Flavia showed beam and floor - how'd it go? Who wins the award for coolest/best/most fun skill from podium training? What were Jessica's mixed zone highlights? The FIG held a press conference today. What information did we learn? The FIG announced that "spectators will be able to see AI D-scores," but what does this mean? The FIG addressed the visa vs. FIG rules issue. What did FIG president Watanabe have to say? Jakarta Updates GymCastic Updates Subscribe to our YouTube Channel Coming Up 6 days of LIVE podcasts at World Championships in Jakarta Club members get extended coverage and can join us live to ask questions immediately after the meet Play our World Championships Fantasy Game! 
Win a Club Gym Nerd Scholarship: Go to our Forum > Show Stuff > GymCastic Scholarship We are matching every new sponsorship If you would like access to the club content, but aren't currently in a position to purchase a membership, all you need to do is fill out the form that's linked in our message board If you would also like to sponsor a scholarship, please email editor@gymcastic.com. Thank you! Support Our Work Club Gym Nerd: Join Here Become a Sponsor: GymCastic is matching all donations Nearly 50 scholarships have been awarded so far Learn More Headstand Game: Play Now Forum: Start Chatting Merch: Shop Now Thank you to our Sponsors Gymnastics Medicine Beam Queen Bootcamp's Overcoming Fear Workshop Resources Jakarta schedule & times: See our live podcast times on the Worlds HQ schedule Guides: Download the quick-reference guide on the Jakarta Headquarters page The Balance Beam Situation: Spencer's GIF Code of Points Gymnastics History and Code of Points Archive from Uncle Tim Kensley's men's gymnastics site Neutral Deductions
Ken Rosenthal's Bad Look • Letter From Beyond The Grave • He Must Be Really Good! • TEMU For QB's • We're Not Doing Stonehenge Tonite • We Talkin' About Practice! • About That Injury Report Thing • Herm Edwards, Your Rant Is Safe • Another Reason To Loathe The Chiefs • A Day Later, Sorry • Do You Even Kicker, Bro? • That Play You Saw? No You Didn't • Joe Burrow To The Slaughter • Go To the Game, It'll Be Fun! • Cracker Barrel Total Surrender! Our Sponsors: * Check out Hims: https://hims.com/CZABE * Check out Indeed: https://indeed.com/CZABE Advertising Inquiries: https://redcircle.com/brands Privacy & Opt-Out: https://redcircle.com/privacy
Join us in this exciting episode of The Edge of Show as we dive deep into the world of Bitcoin DeFi with two industry leaders: MacLane Wilkison, co-founder and CEO of Threshold Labs, and Jameel Khalfan, head of ecosystem development at Sui Foundation. In this episode, we explore: • The transformative potential of TBDC (Threshold Bitcoin Decentralized Currency) and how it is powering Bitcoin's breakout moment on the Sui ecosystem. • The rapid evolution of Bitcoin DeFi and the seamless bridging and yield strategies that are emerging. • Insights into the future of Bitcoin as a dynamic financial asset and its role in the broader DeFi landscape. Discover how the integration of Bitcoin with decentralized finance is reshaping the financial ecosystem, making it more accessible and user-friendly. We also discuss the importance of security, liquidity, and the collaborative efforts between Threshold and Sui to foster innovation and growth in the Bitcoin space. Whether you're a seasoned crypto enthusiast or just starting your journey, this episode is packed with valuable insights and exciting developments in the world of Bitcoin and DeFi. Don't forget to like, subscribe, and hit the notification bell to stay updated on our latest episodes! Support us through our Sponsors! ☕