Podcasts about mellanox

Israeli-American multinational supplier of computer networking products

  • 66 PODCASTS
  • 118 EPISODES
  • 37m AVG DURATION
  • INFREQUENT EPISODES
  • Apr 30, 2025 LATEST


Best podcasts about mellanox

Latest podcast episodes about mellanox

Unholy: Two Jews on the news
77 and counting - with Eyal Waldman (featuring Emily Damari)

Unholy: Two Jews on the news

Play Episode Listen Later Apr 30, 2025 63:54


Unholy is going live in London! Join Yonit Levi and Jonathan Freedland for a special night of news and great guests: Yuval Noah Harari, Andy Nyman and Mira Awad, live on stage, June 8th 2025. If you've ever wanted to see the podcast come to life, now's your chance. Reserve your seat now via the link; space is limited, and we'd love to see you there! https://bit.ly/UnholyLondonLive

Join our Patreon community to get access to bonus episodes, discounts on merch and more: https://bit.ly/UnholyPatreon

Visiting London or Tel Aviv? We've got special edition T-shirts in the Unholy Store! https://bit.ly/UnholyStore

As literal winds of fire sweep across Israel, the country marks its 77th Independence Day. But even in a week that should have offered unity and reflection, the political winds howled as well: a public clash between the head of the Shin Bet and the prime minister came to a head, and internal divisions between left and right spilled over, alarmingly, into acts of violence, even amid the solemnity of Memorial Day.

Looking for a note of hope as Israel enters its 78th year, we turned to Eyal Waldman: tech visionary, founder of Mellanox, which was sold to Nvidia for $6.9 billion, and, more recently, a father in mourning. His daughter, Danielle, was murdered at the Nova music festival on October 7th. Eyal speaks candidly about grief, Israel's fracturing political landscape, whether he sees a role for himself in public life, and about the perils and promise of artificial intelligence.

And in a special moment: a cameo from the remarkable Emily Damari, who offers a few heartfelt words to our listeners, and a reminder of what binds us.

China Daily Podcast
English News | The antitrust probe into Nvidia is lawful and justified

China Daily Podcast

Play Episode Listen Later Dec 11, 2024 5:25


No one should question China's resolve to continue to promote high-quality opening-up through the rule of law. It is mainly thanks to the consistent legislative efforts China has made over the years to safeguard the legitimate rights and interests of foreign enterprises that the country has become one of the world's top destinations for foreign direct investment.

Those who want to use the antitrust probe the Chinese authorities have launched into US chipmaker Nvidia as a pretext to accuse China of politicizing the business environment for foreign companies are not only turning a blind eye to the facts of the case, but also applying a double standard to antitrust scrutiny.

The investigation that China's top market regulator initiated on Monday into Nvidia for suspected monopolistic behavior has attracted wide media attention, given the company's high-profile position as the world's leading chipmaker and a key player driving the artificial intelligence revolution.

While much of the Western media have portrayed the probe as a tactical move in the Sino-US trade war ahead of the new US administration taking office, or have tried to link it to what they hype as an intensified geopolitical rivalry over AI dominance between the two countries, the actual reason is more prosaic. According to the State Administration for Market Regulation, Nvidia is suspected of violating China's Anti-Monopoly Law, as well as the commitments it made in 2020 in connection with its 2019 acquisition of Israeli chip manufacturer Mellanox Technologies.

The merger further strengthened Nvidia's market dominance in the semiconductor field, with the potential effect of excluding or restricting competition in the global and Chinese markets for GPU accelerators, dedicated network interconnection equipment and high-speed Ethernet adapters. Nvidia therefore submitted to China measures to resolve these competition concerns and made clear commitments, including that it would continue to supply Nvidia GPU accelerators, Mellanox high-speed network interconnection equipment and related software and accessories to the Chinese market after the deal, on "fair, reasonable and nondiscriminatory" principles. On that basis, China approved the transaction.

Yet Nvidia has stopped supplying a number of GPU accelerator products to China in recent years, citing the US government's export controls, an action that has infringed upon the legitimate rights and interests of the relevant Chinese enterprises. It is therefore not surprising that Nvidia is under investigation for allegedly violating antitrust law. Article 46 of the Anti-Monopoly Law authorizes antitrust enforcement agencies to investigate and take action against suspected monopolistic behavior. Effectively implementing the conditions attached to the merger approval is both a proactive commitment by Nvidia and a legal obligation.

In fact, Nvidia is also facing an antitrust investigation in the United States, where the Justice Department is looking into claims that Nvidia may be cornering the market and pressuring its customers to unfairly retain business, including allegations that Nvidia threatened to punish customers who buy from both it and its competitors. The European Union's antitrust regulators are likewise investigating Nvidia for possible unfair sales practices.

That Nvidia has so far responded in a low-key way, saying only "we are happy to answer any questions regulators may have about our business", points to its confidence in China's legal environment. The company treats China as one of its key global markets: about 16 percent of its revenue comes from the country, second only to its US-generated revenue, according to data firm FactSet. Nvidia's Chief Executive Officer Jensen Huang has called China "a very important market for the technology industry", and warned there would be "enormous damage" to US companies if they were unable to trade with China.

China has made attracting and using foreign investment one of its top priorities. That it attracted 1.13 trillion yuan ($158.7 billion) in foreign investment in 2023, the third-highest figure on record, compared with 941.52 billion yuan in 2019, indicates that the country still enjoys strong competitiveness in the global investment market.

Contrary to any attempt to use the Nvidia investigation to discredit China's efforts to create a level playing field for foreign businesses, the probe shows that China's business environment operates under the law. As it has affirmed on many occasions, the country will continue to develop a market-oriented, legalized and internationalized first-class business environment in which foreign companies can enter the Chinese market and share the country's development dividends.

The Daily Crunch – Spoken Edition
ByteDance asks appeals court to temporarily block sell-or-ban law, Apple sued over abandoning CSAM detection for iCloud ... and more tech news

The Daily Crunch – Spoken Edition

Play Episode Listen Later Dec 10, 2024 9:25


ByteDance and TikTok filed an emergency motion on Monday asking an appeals court to temporarily block the law that would ban TikTok in the U.S. unless the social network divests from Chinese ownership by January 19. Also, Apple is being sued over its decision not to implement a system that would have scanned iCloud photos for child sexual abuse material (CSAM); the lawsuit argues that by not doing more to prevent the spread of this material, Apple is forcing victims to relive their trauma, according to The New York Times. A U.S. breast-screening program claims to demonstrate the potential benefits of using artificial intelligence (AI) in mammography screening, with women who paid for AI-enhanced scans 21% more likely to have cancer detected; DeepHealth, an AI firm owned by radiology giant RadNet, presented its findings at the annual meeting of the Radiological Society of North America. By market capitalization, Nvidia is currently the second-biggest public company in the world, behind Apple, which is why all eyes are on Nvidia these days. And now, as Bloomberg spotted, China Central Television, a public TV broadcaster, is reporting that China's market regulator has opened a probe into Nvidia's acquisition of Mellanox. Finally, Nikola Corp., a producer of battery and hydrogen-electric trucks, has taken several steps to repay its debts and raise equity, including offering up to $100 million in a common stock sale.

Morgans Financial Limited
Morgans AM: Wednesday, 11 December 2024

Morgans Financial Limited

Play Episode Listen Later Dec 10, 2024 6:06


US equity markets declined ahead of the latest inflation figures tonight AEST. The Dow retreated for a fourth straight session, down 154 points, or 0.35%. Nvidia Corp fell 2.69%, extending its two-day slide to more than 5%, after China's State Administration for Market Regulation said it was investigating the company over possible violations of the country's antimonopoly law, opening an investigation into the chipmaker in relation to its acquisition of Mellanox and some agreements made during the acquisition. Nvidia's revenue in China totalled US$13.5B over the past four quarters, accounting for roughly 12% of its global total, according to The Wall Street Journal (WSJ). Caterpillar Inc (-2.72%) and Merck & Co Inc (-2.69%) both fell more than 2.5%. Boeing Co rallied 4.50% after the aerospace giant said it had restarted production of its 737 MAX jets; production had been paused for more than 12 weeks because of a seven-week labour strike that began in mid-September and was settled in early November.

NY to ZH Täglich: Börse & Wirtschaft aktuell
Nvidia schwach, Palantir stark | New York to Zürich Täglich | Swissquote

NY to ZH Täglich: Börse & Wirtschaft aktuell

Play Episode Listen Later Dec 9, 2024 15:39


Last week closed out with very convincing numbers, and the new week is off to a muted start. With no economic data out of the US, market participants are instead digesting the weaker economic data from China and news from the Middle Kingdom concerning NVIDIA. Its shares came under pressure on Monday after a Chinese regulator said it was investigating the chipmaker for possible violations of the country's antitrust law. The State Administration for Market Regulation has opened an investigation into the chipmaker in connection with its acquisition of Mellanox, the Chinese government announced on Monday. The US has barred Nvidia and other major semiconductor makers from selling their most advanced AI chips to China in order to prevent the strengthening of the country's military. The stock is down a good 2% in premarket trading. Palantir is partnering with autonomous weapons systems maker Anduril; its stock is up 7.2% premarket. Subscribe to the podcast so you never miss an episode! ____ Follow us to stay up to date: • Facebook: http://fal.cn/SQfacebook • Twitter: http://fal.cn/SQtwitter • LinkedIn: http://fal.cn/SQlinkedin • Instagram: http://fal.cn/SQInstagram

Wall Street mit Markus Koch
NVIDIA unter Druck | Palantir mit neuem AZH?

Wall Street mit Markus Koch

Play Episode Listen Later Dec 9, 2024 27:46


EXCLUSIVE NordVPN deal ➼ https://nordvpn.com/Wallstreet Try it risk-free now with a 30-day money-back guarantee! +++ All discount codes and info on our advertising partners can be found here: https://linktr.ee/wallstreet_podcast +++ A podcast featured by Handelsblatt. Last week closed out with very convincing numbers, and the new week is off to a muted start. With no economic data out of the US, market participants are instead digesting the weaker economic data from China and news from the Middle Kingdom concerning NVIDIA. Its shares came under pressure on Monday after a Chinese regulator said it was investigating the chipmaker for possible violations of the country's antitrust law. The State Administration for Market Regulation has opened an investigation into the chipmaker in connection with its acquisition of Mellanox, the Chinese government announced on Monday. The US has barred Nvidia and other major semiconductor makers from selling their most advanced AI chips to China in order to prevent the strengthening of the country's military. The stock is down a good 2% in premarket trading. Palantir is partnering with autonomous weapons systems maker Anduril; its stock is up 7.2% premarket.

Morgans Financial Limited
Morgans AM: Tuesday, 10 December 2024

Morgans Financial Limited

Play Episode Listen Later Dec 9, 2024 8:51


US equity markets retreated, with the S&P 500 and Nasdaq pulling back from record closing highs set last Friday (6 December). The Dow fell 241 points, or 0.54%, with International Business Machines (IBM) Corp (down 3.38%) and Travelers Companies (-3.53%) both down more than 3%. Nvidia Corp fell 2.55% after China's State Administration for Market Regulation said it was investigating the company over possible violations of the country's antimonopoly law, opening an investigation into the chipmaker in relation to its acquisition of Mellanox and some agreements made during the acquisition.

Zonebourse
Nvidia: AI genius, or a bubble ready to burst?

Zonebourse

Play Episode Listen Later Dec 3, 2024 10:02


How did Nvidia revolutionize the tech industry and reach the top of the stock market? Discover the incredible journey of the company founded by Jensen Huang, from its difficult beginnings to its current dominance in gaming and artificial intelligence. In this video, I explain how GPUs, originally designed for graphics, became essential to fields such as AI, research and data centers. We will also talk about CUDA, a key software innovation, and strategic decisions such as the acquisition of Mellanox that allowed Nvidia to build an almost unbeatable ecosystem. Finally, I will look at the challenges the company faces: competition, dependence on TSMC, and potential overvaluation. Is Nvidia still a good deal for investors?

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

It's return guest season here at Latent Space! We last talked to Kanjun in October and Jonathan in May (and December, post Databricks acquisition): Imbue and Databricks are back for a rare treat: a double-header interview talking about DBRX from Databricks and Imbue 70B, a new internal LLM that "outperforms GPT-4o" zero-shot on a range of reasoning and coding-related benchmarks and datasets, while using 7x less data than Llama 3 70B.

While Imbue, being an agents company rather than a model provider, are not releasing their models today, they are releasing almost everything else:

* Cleaned-up and extended versions of 11 of the most popular NLP reasoning benchmarks
* An entirely new code-focused reasoning benchmark
* A fine-tuned 70B model, built with Meta Llama 3, to identify ambiguity
* A new dataset of 450,000 human judgments about ambiguity
* Infrastructure scripts for bringing a cluster from bare metal to robust, high-performance training
* Their cost-aware hyperparameter optimizer, CARBS, which automatically and systematically fine-tunes all hyperparameters to derive optimum performance for models of any size

As well as EXTREMELY detailed posts on the infrastructure needs, hyperparameter search, and clean versions of the sorry state of industry-standard benchmarks. This means for the FIRST TIME (perhaps since Meta's OPT-175B in 2022?) you have this level of educational detail into the hardware and ML nitty-gritty of training extremely large LLMs, and if you are in fact training LLMs of this scale you now have evals, optimizers, scripts, and human data/benchmarks you can use to move the industry forward together with Imbue.

We are busy running the sold-out AI Engineer World's Fair today, and so are unable to do our usual quality writeup; however, please enjoy our show notes and the excellent conversation!
Thanks also to Kanjun, Ashley, Tom and the rest of team Imbue for setting up this interview behind the scenes.

Video pod

Timestamps:

* [00:00:00] Introduction and catch up with guests
* [00:01:55] Databricks' text to image model release
* [00:03:46] Details about the DBRX model
* [00:05:26] Imbue's infrastructure, evaluation, and hyperparameter optimizer releases
* [00:09:18] Challenges of training foundation models and getting infrastructure to work
* [00:12:03] Details of Imbue's cluster setup
* [00:18:53] Process of bringing machines online and common failures
* [00:22:52] Health checks and monitoring for the cluster
* [00:25:06] Typical timelines and team composition for setting up a cluster
* [00:27:24] Monitoring GPU utilization and performance
* [00:29:39] Open source tools and libraries used
* [00:32:33] Reproducibility and portability of cluster setup
* [00:35:57] Infrastructure changes needed for different model architectures
* [00:40:49] Imbue's focus on text-only models for coding and reasoning
* [00:42:26] CARBS hyperparameter tuner and cost-aware optimization
* [00:51:01] Emergence and CARBS
* [00:53:18] Evaluation datasets and reproducing them with high quality
* [00:58:40] Challenges of evaluating on more realistic tasks
* [01:06:01] Abstract reasoning benchmarks like ARC
* [01:10:13] Long context evaluation and needle-in-a-haystack tasks
* [01:13:50] Function calling and tool use evaluation
* [01:19:19] Imbue's future plans for coding and reasoning applications
* [01:20:14] Databricks' future plans for useful applications and upcoming blog posts

Transcript

SWYX [00:00:00]: Welcome to the Latent Space Podcast, another super special edition. Today, we have sort of like a two-header. Jonathan Frankle from Mosaic Databricks, or Databricks Mosaic, and Josh Albrecht from Imbue. Welcome.

JOSH [00:00:12]: Hey, glad to be here.

SWYX [00:00:14]: Thank you for having us. Hey, so both of you are kind of past guests.
Jonathan, you were actually one of the most popular episodes from last year, talking about MPT-7B. Remember the days when we trained large models and there was 7B?

JONATHAN [00:00:30]: Yeah, back when reproducing LLaMA 1 7B was considered a huge accomplishment for the field. Those are the good old days. I miss that.

SWYX [00:00:38]: As the things have accelerated a lot. Actually, let's do a quick catch up and Josh, you can chime on in as well. So Databricks got acquired. I talked to you at New York.

JONATHAN [00:00:45]: Mosaic got acquired, although sometimes it feels like Mosaic acquired Databricks because, you know, we're having a lot of fun being here. But, you know, yeah.

SWYX [00:00:52]: Yeah. I mean, you are chief scientist now of Databricks.

JONATHAN [00:00:55]: Chief AI scientist. Careful with the title. As much as I would love to understand how Spark works, I'm going to have to defer that to much smarter people than me.

SWYX [00:01:03]: Got it. And I don't know about like what you would highlight so far as a post-acquisition, but the most recent news is that you guys released DBRX. Is that the thing that most people should be aware of?

JONATHAN [00:01:13]: Actually, that's no longer the most recent news. Honestly, the most recent news, we announced this, but it was at our Data and AI Summit last week. So it was announced among like 100,000 other things, is that we finally released our text to image model, which has been a year in the making through a collaboration directly with Shutterstock. There was a lot of work put into finding a dataset that we were comfortable with working on and trying to build a model that honestly, I felt like I could trust and that others might be able to trust to put out in the world. So that model was released last week. It's unfortunately just available via API due to the fact that the data is quite sensitive and quite valuable.
It's Shutterstock's entire business in a lot of ways, but I'm still really excited that there's now a model that is trained on a dataset where the provenance of every single image is known, and it's a damn good model. So I'm really proud of the team on that.

SWYX [00:01:55]: Yeah, amazing. Josh, do you have any thoughts on image model questions?

JOSH [00:01:59]: That is not my area of expertise, but I was excited to see the release of it last week as well, and very happy that you guys did a nice job on the data side of everything there. So that was cool to see.

SWYX [00:02:09]: I think what's unusual is like, I think Shutterstock's doing multiple deals in multiple labs. So what is the Shutterstock model? Like, I guess, is this the house model for Shutterstock? Is this Databricks' version of the Shutterstock model? Like, what is this?

JONATHAN [00:02:22]: The way that I would think about it is that Shutterstock is doing an amazing business in AI across the board. Their dataset is kind of widely known to be the best stock photos dataset in the world, the most comprehensive, the biggest. When you think about like, what dataset am I going to train a multimodal model on? You call Shutterstock. And I, at least I've heard in the news, like OpenAI, Google, Meta, Apple have all called Shutterstock and made those deals. So a lot of models have had Shutterstock data incorporated into them. But this is the only model I know of so far where it was, you know, exclusively and specifically trained just on the vanilla Shutterstock data. There was nothing else mixed in. We didn't go and scrape the web and find other data or combined datasets or anything like that. And so this is, in some sense, the house blend. But the other piece is that it's just a dataset where the provenance of every image is known in public. Where did the data come from? It is the Shutterstock collection. That's it. You know, nothing less, nothing more.
And certainly being at Databricks, if I've learned one thing, I've learned about enterprise customers and what they want out of AI. And one of the things they ask for most is just, what can you tell me about the data the model was trained on? And here, especially for text to image models, where images are just tricky subject matter, there's been a lot of kind of legal conversation about images, especially. It's nice to just have something where I can point to it and say, you know, if you want to know where the images came from, these are what they are and this is how they got there.

SWYX [00:03:36]: I will talk a little bit about Databricks because it's relevant to the rest of today's episode. So Databricks, sorry, I keep misspeaking. It's DBRX.

JONATHAN [00:03:46]: DBRX, actually, there's been a pronunciation update. It is now D-B-Rex. So we have decided to add a dinosaur mascot because what model doesn't like a mascot? So literally, I wish I could pull it up. There is a little plush dinosaur that we had made. It's like the world's cutest dinosaur, but it is the official mascot of D-B-Rex. And there's a little dinosaur logo that, you know, you'll probably see around a little bit more because DBRX is a mouthful, but D-B-Rex, like, you know, it's just kind of...

SWYX [00:04:13]: Rolls off the tongue. I love mascots. Like every company should have a mascot. And I think Hugging Face got it right. You need an emoji mascot because that's the minimal viable image.

JONATHAN [00:04:21]: I probably shouldn't talk at all about, you know, Velociraptor, but, you know, that's a, maybe that's something we can talk about later in the summer. I'll just leave it at that.

SWYX [00:04:28]: Okay. That's a hint to names. I feel like your names leak a lot of alpha.
So just to quickly cover the headline details: DBRX is a Mixture of Experts model that's fairly big, 132 billion total parameters, so 36 billion active on any input, pre-trained on 12 trillion tokens of text and code, and did really well on evals to the point where you had to dye your hair blue. That's my high level conclusion.

JONATHAN [00:04:53]: Never make a bet with your team two weeks out from model launch, even when, you know, human eval is looking quite bad. Because if you set some bar, even if it's arbitrary and you think there's no way in hell they're going to hit it, apparently money doesn't motivate people anymore. Humiliating their boss motivates people. So Josh, you should really take a hint from this. You know, you cannot pay someone enough money to make up for you dyeing your hair blue.

JOSH [00:05:15]: I'll keep that in mind for our next model.

SWYX [00:05:17]: It works. So speaking of Imbue's next model, perhaps Josh, you want to actually just say hi to the general sort of latent space audience and talk about what we're releasing today. Yeah.

JOSH [00:05:26]: I'm Josh, CTO of Imbue, and we're not releasing the model. We're not releasing the weights, but we are releasing a bunch of different things that should make it easier for other people to make their own models. So I think right now, training foundation models from scratch is like a very difficult, time-consuming, expensive, kind of risky endeavor, especially for smaller companies. And the things that we're releasing hopefully make that at least a little bit easier. So the things that we're releasing fall into kind of three different buckets. One is infrastructure and scripts for dealing with the kind of hardware and hardware failures and understanding how well is the actually lowest level of thing actually working so that you can actually do your training at all and at a reasonable speed without having to constantly restart, etc. So infrastructure and training scripts.
A second set of things is around the evaluation. So after you've trained it, like how well is this actually working and how do you know how well it's working? We're releasing a whole bunch of different data there, a new benchmark about code, reasoning, understanding, as well as our own private versions of 11 different open source benchmarks. So things like BoolQ or ANLI, where we've gone through and kind of cleaned up the data as much as possible by looking at all the ones that models get wrong or that are flagged for ambiguity, and also our own kind of private reproductions of those where we've done like a kind of clean room black box, like, okay, this is what the data set is supposed to be. Here are some examples. Let's make our own version of this to make sure that there is no data contamination, etc. To make sure that we're actually, you know, not testing on train. And then I think a final thing that we're releasing there is around 450,000 human judgments about ambiguity and question quality, which we used in the process of cleaning these evaluations and we also hope will be helpful for other people training kind of similar models. And then the third thing is CARBS, our cost-aware hyperparameter optimizer, which was especially helpful for being able to experiment at much smaller scales and then scale those experiments up to the much larger scale kind of on the first try without having to retry it. You don't want to be training, you know, 10, 20 different 70B models. You really want to get these larger models

SWYX [00:07:30]: right on the first try.

JOSH [00:07:30]: And so the ability to kind of tune things very precisely and learn scaling laws, not just for, you know, the like data and flops, but also for learning rate and all the other hyperparameters, and see like how should you scale these things up was extremely valuable to us as we were training the larger models. Yeah, that's a lot of stuff.

SWYX [00:07:49]: Yeah, exactly.
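[Editor's note: CARBS itself is released by Imbue; as a toy illustration of the cost-aware idea Josh describes (run cheap small-scale experiments, fit the scaling behavior, extrapolate before committing to the expensive run), and not CARBS's actual algorithm, one could sketch:]

```python
import numpy as np

# Toy sketch of cost-aware scaling-law extrapolation (NOT the real CARBS
# algorithm): fit loss(C) = a * C**(-b) to cheap small-scale runs in
# log-log space, then predict the loss of a large run before paying for it.

def fit_power_law(compute, loss):
    """Least-squares fit of loss = a * compute**(-b) in log-log space."""
    slope, log_a = np.polyfit(np.log(compute), np.log(loss), 1)
    return np.exp(log_a), -slope  # (a, b)

def predict_loss(a, b, compute):
    return a * compute ** (-b)

# Synthetic "small run" results that follow 4.0 * C**-0.1 exactly.
small_compute = np.array([1e18, 2e18, 4e18, 8e18])
small_loss = 4.0 * small_compute ** -0.1

a, b = fit_power_law(small_compute, small_loss)
big_loss = predict_loss(a, b, 1e21)  # extrapolate to the big training run
```

In practice CARBS does far more (it searches learning rate and other hyperparameters jointly, cost-aware); this only sketches why small-scale fits let you "get the larger models right on the first try."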
So there's a bunch of stuff

JOSH [00:07:50]: we'll have to go through all of it.

JONATHAN [00:07:52]: Yeah, I just want to throw in how excited I am about this. This is the stuff that nobody ever talks about. That is the difference between success and failure in this stuff. Like, can you get your cluster to run? Can you get software on your cluster? Can you figure out what broke? Because fault tolerance is still not really built into any of the fundamental primitives of training models. And so if something breaks, you have to go figure out what broke, your job stops, you have to restart your job. It is a nightmare just to get to the point where anything can train on the cluster. A basic MPI hello world that has the GPUs talk to each other is hard enough, let alone actually training a model, let alone getting good performance out of the GPUs, let alone actually getting a model that converges to anything interesting. There's so many levels of things you have to accomplish. This is the kind of stuff that matters. I think to a point that Josh made earlier, before we got on here, there are plenty of weights out there. Nobody's released this.

JOSH [00:08:46]: Yeah, that was part of the motivation actually is that there are lots of other things that are complimentary, but I have not seen nearly as much discussion about some of these other things that we think are pretty important. I mean, in some sense,

SWYX [00:08:56]: I'm very excited to have Jonathan on because this is a little bit your bread and butter with Mosaic. And I think you've released some part with Composer. And I think it's just really interesting to see like a different take, basically a full stack take that's kind of open source today.

JONATHAN [00:09:18]: Yeah, it's really kind of, it's been an ordeal to figure this out. And every time something changes, whether it's a new GPU or even a new driver update, you get new creative errors and new things go wrong.
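[Editor's note: the restart loop Jonathan describes, catch the failure, recover the last checkpoint, resume, can be sketched roughly as below. `HardwareFault` and the toy job are illustrative stand-ins, not any real framework's API.]

```python
import time

# Sketch of an outer fault-tolerance loop: training primitives don't
# handle faults, so a wrapper catches failures, resumes from the last
# checkpoint, and gives up after too many restarts.

class HardwareFault(RuntimeError):
    """Stand-in for a detected hardware failure during training."""

def resilient_train(run_training, max_restarts=3, backoff_s=0.0):
    """Call run_training(resume_step), restarting on hardware faults."""
    step = 0
    for attempt in range(max_restarts + 1):
        try:
            return run_training(step)
        except HardwareFault as fault:
            # Resume from the checkpoint the failed run reported, if any.
            step = getattr(fault, "checkpoint_step", step)
            if attempt == max_restarts:
                raise
            time.sleep(backoff_s)

# Toy job: fails once at step 100, then runs to completion at step 200.
calls = []
def toy_job(resume_step):
    calls.append(resume_step)
    if len(calls) == 1:
        err = HardwareFault("GPU fell off the bus")
        err.checkpoint_step = 100
        raise err
    return resume_step + 100

final_step = resilient_train(toy_job)
```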
And, you know, we've dealt with the weirdest things from, you know, our InfiniBand cables getting stolen from the data center twice, like in boxes before they arrived at the data center. Like, you know, Porch Pirate basically had stolen our InfiniBand cables back when those were hard to come by. To like, you know, weird recalls of switches to like the strangest stuff has happened. I have my favorite GPU failures I've seen, like ones where the GPU doesn't fail, it has a correctable memory issue and the memory correction causes the GPU to become a straggler and hold up the whole job. Like weird stuff happens and figuring out how to not just identify all of that, but then eventually productize it, is in some sense, the entire story of Mosaic and now Databricks in terms of our ML offering. Really, the thing we offer is we have gone through this suffering and figured out how to even productize that. It has been a pain in the butt.SWYX [00:10:20]: Yeah, it's a lot of work.JOSH [00:10:20]: I think my favorite failure was GPU is just giving wrong math. Like if they give errors, great, because you can see the errors, but if they just give you the wrong math back, not so fun.SWYX [00:10:30]: When did they give you wrong math?JOSH [00:10:32]: Like literally you could just, you know, add two things. For example, the numbers come back. They're not the numbers that they're supposed to be.JONATHAN [00:10:40]: I think it's important to say at this stage, just because like it, I think it goes without saying for Josh and I, but it's worth saying here, this isn't to say that like anything is wrong with us. It's not like NVIDIA did a bad job or, you know, Mellanox did a bad job or the like the server builder, the data center operator, the cloud provider, like the million other parties that are involved in building this. 
We are running these insane chips that are huge and complicated and built on tiny transistors at insane frequencies with insane heat in data centers that for the most part, were not built remotely for this kind of power or heat and have been retrofitted for this. Like failures happen on a good day with normal CPUs. And this is not a good day and not a normal CPU for the most part. It's fun to joke about all the weird things we see. This is not to say anybody's done anything wrong. This is just kind of part and parcel of working on a massive cluster running at multiple megawatts of power at a time.SWYX [00:11:32]: It's crazy. Yeah.JONATHAN [00:11:33]: So optical cables, like all sorts, like everything.SWYX [00:11:37]: I'll take the opportunity to start going to the sort of infra piece. There's just like a description of the infra just to give people a sense of what we talk about when we talk about massive clusters. So I'm just going to read off the blog post here. This post is about one cluster that has 4,092 H100 GPUs spread across 511 computers. They use Unified Fabric Manager nodes, which manage the InfiniBand network. And you talk a little bit about your networking. Is there anything unusual about this setup that you'll call out to people?JOSH [00:12:03]: Yeah, actually this particular cluster is a little bit non-standard. The normal, like vanilla setup for these large clusters as vanilla as it can be is what's normally like a 127 node cluster. So closer to like 1024 GPUs instead of 4,000. Here we have a larger cluster. As you start to get into the larger clusters, the networking becomes a little bit more custom. It's a little bit more, it's a little bit trickier. It's a little bit more difficult to get these things to all be able to talk to each other at the same speed. And so this has, in this particular case, this is a three tier network architecture instead of two tiers, kind of the normal one. So most of the clusters are a little bit smaller.
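The two-tier versus three-tier jump falls out of switch radix math. A rough sketch, assuming an idealized non-blocking fat-tree built from 64-port switches (an assumption; real InfiniBand fabrics vary and are often oversubscribed):

```python
def max_ports(radix: int, tiers: int) -> int:
    """Max end-host ports in an idealized non-blocking fat-tree."""
    if tiers == 2:
        return radix * radix // 2   # k^2/2: each leaf uses half its ports down, half up
    if tiers == 3:
        return radix ** 3 // 4      # classic k-ary fat-tree: k^3/4 hosts
    raise ValueError("unsupported tier count")

RADIX = 64  # ports per switch (assumed)
for gpus in (1024, 4096):
    tiers = 2 if gpus <= max_ports(RADIX, 2) else 3
    print(f"{gpus} GPUs -> {tiers}-tier fabric (radix {RADIX})")
```

Under these assumptions a ~1024-GPU cluster fits comfortably in two tiers (2,048 ports), while 4,000+ GPU ports forces a third tier, which matches the setup described here.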
As you get to even larger scales, then this becomes even much more complicated,SWYX [00:12:43]: much more expensive.JOSH [00:12:43]: So we chose this particular scale, kind of knowing our own workloads and kind of what we wanted to do. This was kind of the right size for us. But yeah, I think it's not exactly vanilla already. It's already getting into kind of the custom territory.SWYX [00:12:54]: So my understanding is that there, and is there any part of this that comes with the Voltage Park deal that you guys had? Is that part of the hardware that you got from the deal with them?JOSH [00:13:04]: Yeah, so we worked really closely with Voltage Park to set up all their clusters and infrastructure and everything and kind of decide even like what to order, how should the networking work? Like we were very involved in kind of the construction and bring up of this. And that's what this post is about, is about that process of like bringing up all these, there's like different clusters in different places of different scales. So in this particular post, we're talking about this one 4096 GPU, but there are other clusters that they have as well. And we were very closely involved with figuring out the exact architecture and kind of the trade-offs that go along with picking, you know, those exact components. You really don't want to like place the wrong order because it takes months to get it and it's very expensive. So yeah, we were happy to help out with that.JONATHAN [00:13:43]: And then your InfiniBand cables get stolen.SWYX [00:13:44]: Yeah, yeah, exactly.JOSH [00:13:47]: We wanted to make sure that we ended up with compute that would work for us and that would also work for their other customers. And so we kind of helped design something so that we would get exactly what we were looking for.
We knew that these kinds of details would be super important and that getting down to the level of the hardware and like having these good scripts and everything was going to be a core part of like actually getting this to work. I'm very glad that we did that. I don't think that most companies kind of take that full stack approach, but for us, it certainly paid off.SWYX [00:14:12]: Yeah, it's basically sort of built to spec. It's interesting that relationship because you usually, for the rest of us who don't operate at your scale, we take whatever we can get from cloud providers, but you are basically co-designing from the single machine up. And you described that a little bit. Do you want to take us through the process that you described here?JOSH [00:14:27]: Yeah, so for the actual, like the blog post and kind of bringing these machines online.SWYX [00:14:32]: Yeah.JOSH [00:14:32]: So yeah, I think the process, as we have it broken down in the blog post, there's kind of a few different layers. First is like getting the individual machines to work at all and then getting the machines to actually be able to talk to each other. So getting the InfiniBand networking to work and then getting to a point where, you know, not just the machines are working and they can talk to each other, but everything is actually working correctly. There's a big gap between like it's working at all to it's working perfectly correctly. And then after you have all this stuff working perfectly correctly, nice and healthy, then now you get into kind of the software data, like training issues. And then after that, you're still not done. Like now, even once you're training at full speed, things are going to fail over time. Things are going to change. There's going to be new, you know, firmware updates. 
Like how do you kind of deal with this change and flux over time without going crazySWYX [00:15:16]: and pulling your hair out,JOSH [00:15:16]: trying to like reproduce things or understand why there were regressions. And so there's a lot of work to kind of automate the infrastructure tooling as well. And kind of the first step, like bringing these things online in the first place, you know, you have hundreds of machines at this point. So you don't necessarily want to be like walking around with like a CD-ROM or a USB drive, like plugging it in with your keyboard, like hitting next, next, next on the OS install. That's not how this works. You do that for one machine. And then you use, we use this thing called Metal as a Service to bring up all the other machines. So it's a kind of server that can kind of install the operating system on these other machines. So most like when you're talking about these machines, like each machine is, you know, on the order of hundreds of thousands of dollars. So they usually come with a kind of out-of-band management interface as well. So they don't, they have their InfiniBand networking. They have their normal 100 gigabit per second Ethernet networking. These are like dual, redundant, et cetera. And then you also have this extra out-of-band management network. So you can log in and you can see like the boot screen or you can see the blue screen of death. You can like get in there and actually see what was wrong, which is pretty fun. And it makes it like possible to automate a lot of this work. So the beginning of that, and the blog post goes into much more detail about like exactly how we set these up and kind of the other errors that we ran into. When you're bringing these online, you'll definitely have failures. Even if they all worked in the factory, they get shipped, some parts come loose, something fails, something goes wrong. So when you're bringing them online, there'll be some that don't quite work for all sorts of reasons. 
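The out-of-band management workflow Josh describes can be scripted rather than done by hand. A hedged sketch: `ipmitool` is a real CLI (flags per its manual, with `-E` reading the password from the `IPMI_PASSWORD` environment variable), but the BMC addresses below are placeholders and nothing here reflects Imbue's actual tooling, so it defaults to a dry run.

```python
import subprocess

def ipmi_power_status_cmd(bmc_host: str, user: str = "admin") -> list:
    """Build an ipmitool invocation for one machine's BMC.

    -E reads the password from the IPMI_PASSWORD environment variable,
    so credentials never appear on the command line."""
    return ["ipmitool", "-I", "lanplus", "-H", bmc_host, "-U", user, "-E",
            "chassis", "power", "status"]

def check_fleet(bmc_hosts, dry_run=True):
    """Collect power status for every BMC; dry_run just shows the commands."""
    results = {}
    for host in bmc_hosts:
        cmd = ipmi_power_status_cmd(host)
        if dry_run:
            results[host] = " ".join(cmd)
        else:
            proc = subprocess.run(cmd, capture_output=True, text=True)
            results[host] = proc.stdout.strip()
    return results

# Placeholder BMC addresses, for illustration only.
fleet = [f"10.0.{rack}.{node}" for rack in range(2) for node in range(1, 3)]
for host, cmd in check_fleet(fleet).items():
    print(host, "->", cmd)
```

The same pattern extends to power-cycling a stuck node or pointing it at a network boot image, which is what makes bring-up at hundreds of machines tractable.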
As you start to be working with machines at this scale, like if something happens one in a thousand times, you're like pretty likely to see it. And so you can get pretty rare, weird things, especially since we had fairly early builds and fairly early versions of this hardware. Like these are some of the like first machines that were ever produced, some of the first GPUs. So you've got some extra special things there. We definitely worked with Dell, for example, on making fixes in the firmware level to be like, okay, like this thing is wrong. Like we need to update this at the firmware to like actually fix this particular thing. So we worked pretty closely with Dell and Nvidia. Yeah, that's what I'm saying. Like this stuff gets complicated. And the thing is like, you know, taking a step back, the whole reason we're doing this, right, is that we knew that this was going to be complicated. There would be these kinds of failures. And if we're just using, you know, AWS or some other cloud provider, these errors are still gonna be there and you're gonna have no way to know and no way to debug this and no way to diagnose what's going wrong. And so we would much rather be able to like call up Dell and say, hey, this isn't working. And they're like, yep, okay, cool. Let's debug it together. Oh, I see. Yeah, cool. We'll ship a firmware update and actually fix this for you. That was a much better experience than like, great, just magically fails. I guess we restart and hope that that machine goes away. Like that's not a very good place to be. So yeah, that's kind of the first place is getting to a place where like GPU training is working on your single node machines. You can observe stuff. 
We have tons of tooling around like, you know, Prometheus and all sorts of other tools for understanding what's going on in these machines because you don't want to be like logging into each one and looking at the temperature or something; you really need to have tooling to collect all these metrics, et cetera. Unfortunately, all of the scripts that we have for this are like for this entire cluster and for all this infrastructure are a little bit like special purpose for our particular thing. So it's not that every script that we have, it's not that you can just like take this and plug this in. Even if we did open source all the tooling that we have, you'd still have to do like a lot of work to make use of it. What we are releasing is as many of the things that we can that are going to be useful for other people. You're still going to have to have some way of kind of managing these things, making your own like logging aggregators, et cetera, et cetera. So that's kind of bringing them up to the like, you know, the single nodes that are working. From there, it goes into, I'm happy to keep going if you want. Well, I just want to leave the opportunity for JohnSWYX [00:18:53]: to comment if there's anything that's different from how he runs things.JONATHAN [00:18:57]: Oh, I mean, all I'll say is I'll endorse this and say this s**t is hard. Like this is really, really hard. And, you know, I have special props to, you know, the folks at Imbue because they were building this from the ground up. You know, at Databricks and at Mosaic, we typically work with cloud providers because some of this stuff is just, there's too much to handle. It's complicated. There's a lot to deal with. And this doesn't even get into things like physical security, you know, securing power if you're the data center operator. Like this gets infinitely complicated and you have to abstract somewhere.
Like, you know, and then you get to the folks who are literally building their own custom chips and like, good God.SWYX [00:19:36]: Like, oh my God, that's, you know,JONATHAN [00:19:38]: if you're one of those folks, you're having, you know, pour one out for the infra people at some of the AI chip startups who are having a really, really interesting time right now. But this stuff is really hard. And I don't think we talk about it much because there's so many other things that are hard. But the other hard things, I think everybody's becoming pretty familiar with at this point. This is something that I don't think there's ever really been a comprehensive discussion of, at least not that I've seen.SWYX [00:20:00]: Yeah, so my impression is that you guys, Mosaic, have your own software for sort of spinning up and down machines, just like Imbue had to build. But Imbue probably, it sounds like Imbue, you guys went fuller stack. I don't know how to describe it. Like Mosaic is not working with Dell on like their firmware.JONATHAN [00:20:21]: No, no, we're typically working with like, you know, pick your cloud provider on their Dell firmware or what have you. Like, it's kind of, I think one of the things, I don't know, Josh, you can correct me on this. It's kind of impossible if you're doing training to not go all the way through the entire stack, regardless of what happens. Like somehow I'm still chatting with cloud providers about power contracts, even though the whole point of dealing with the cloud provider is not to have to think about power contracts. Somehow I'm still asking them about which InfiniBand provider they used this time to see if this is part of the bad batch of cables I encountered on that cloud provider or what have you. Or like, we're still talking about a firmware update from pick your provider. You can't not do this. 
It's convenient that they have data center staff who are worrying about what to send back to which provider when, and they have people who can go and wait for the InfiniBand cables so they don't get stolen outside. But, you know, it's kind of, it's impossible not to really go full stack if you're thinking about the infrastructure at all. I don't know, Josh, correct me. No, I think that's right.JOSH [00:21:17]: That's what we expected from the beginning as well, is that we would inevitably have to get into the details here. And I'm glad that we kind of just planned for it. I think it made it a lot easier from our perspective to have direct control over this. Instead of having to go to the cloud provider that goes to the data center, that goes to the supplier, we could just go direct to NVIDIA or DellSWYX [00:21:37]: or the data center,JOSH [00:21:37]: whoever was responsible and be like, hey, this thing needs to change. And they're like, oh, okay. Yeah, that is our responsibility. Great, we can fix that. So it was just a lot easier for us to fix these bugs than if we had to go through an extra layer of email.SWYX [00:21:48]: Something we discussed in the pre-show was that you had a rule of thumb for your cluster of reliability. You say here in the post, by and large, you expect around 3% of your machines to break every week. So you're basically going to turn through all your machines in a year.JOSH [00:22:04]: As it says in the post. So that would be true if it was a uniform failure like that. But as it says in the post, it's usually these kind of problematic nodes. And to be clear, that is the number that we've heard from other people is like they're having about 3%. I don't think we're experiencing failure rates that are that high. 
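As a sanity check on that 3%-per-week rule of thumb, the arithmetic (assuming, unrealistically, that failures were uniform and independent) does support the "churn through all your machines in a year" intuition:

```python
# If 3% of machines failed each week, independently and uniformly:
weekly_failure = 0.03
weeks_per_year = 52
survive_year = (1 - weekly_failure) ** weeks_per_year

print(f"P(a machine survives a year) = {survive_year:.1%}")                    # -> 20.5%
print(f"expected failures/week on 511 machines = {511 * weekly_failure:.1f}")  # -> 15.3
```

In practice, as Josh notes next, failures cluster on a few problematic nodes rather than being uniform, which is exactly why chasing root causes pays off.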
I think ours is actually quite a bit lower than that, probably because we've taken the time to like dig into a large, maybe larger number than we should have of these failures and get to the root cause of it and be like, oh, okay, like that's exactly what's going wrong.SWYX [00:22:33]: How do we fix this?JOSH [00:22:33]: How do we prevent this from happening? How do we make automated checks for this so that if it does happen, it just goes back to whoever owns that particular part of the process and they can fix it immediately.SWYX [00:22:43]: And that's part of what you're also open sourcing, which is the health checks, right? You got the NIC health checks, GPU health check, disk space health check, Docker, dmesg. I don't know what that is.JOSH [00:22:52]: That one is just a lot of stuff.SWYX [00:22:54]: Yeah.JOSH [00:22:55]: That one is one where we realized that actually like when these machines boot, sometimes they wouldn't actually boot cleanly all the way. Or when they rebooted, they had problems that they didn't have when they were working before, which was kind of frustrating. Like usually if you restart your computer,SWYX [00:23:08]: it gets better.JOSH [00:23:08]: Here you restart. It did not get better.SWYX [00:23:10]: It got worse.JOSH [00:23:10]: That was very frustrating. So this health check looks at every particular line we've ever seen from the boot, like in dmesg, like every single log line that your computer emitsSWYX [00:23:21]: and says like,JOSH [00:23:21]: have we ever seen this before?SWYX [00:23:23]: Is this expected?JOSH [00:23:23]: Is this in the right order? Or is there something out of place? If there's anything out of place, let me say, okay, great. Like now it goes into this, like longer, more triage list of like, all right, great. Like, is this acceptable?SWYX [00:23:33]: Should we flag this?JOSH [00:23:33]: Like, should someone take a look at this?
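The dmesg health check Josh describes boils down to an allowlist comparison: every kernel log line from boot is matched against patterns seen on known-healthy boots, and anything novel gets surfaced for triage. A minimal sketch of that idea (the patterns and log lines below are illustrative, not Imbue's actual allowlist):

```python
import re

# Patterns observed on known-good boots (illustrative examples).
KNOWN_GOOD = [
    r"^Linux version ",
    r"^Command line: ",
    r"usb \d+-\d+: new high-speed USB device",
    r"EXT4-fs \(\w+\): mounted filesystem",
]

def novel_lines(dmesg_lines, known_patterns=KNOWN_GOOD):
    """Return every log line that matches none of the known-good patterns."""
    compiled = [re.compile(p) for p in known_patterns]
    return [ln for ln in dmesg_lines
            if not any(p.search(ln) for p in compiled)]

boot_log = [
    "Linux version 5.15.0 (gcc ...)",
    "Command line: BOOT_IMAGE=/vmlinuz ...",
    "EXT4-fs (nvme0n1p2): mounted filesystem with ordered data mode",
    "pcieport 0000:a0:01.0: AER: Corrected error received",  # novel -> triage
]
novel = novel_lines(boot_log)
print(novel)  # only the AER line survives the filter
```

The real check also cares about ordering and repetition, but the core loop is this: anything you have never seen on a healthy boot is guilty until triaged.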
So we're looking down at a very, very granular detail level, what's happening on these computers to make sure that nothing is out of place. And that's critical because without that, if you're running your training, as Jonathan said, and this thing is slow, like what are you supposed to do? Right?SWYX [00:23:49]: Like you really,JOSH [00:23:49]: you really want to be very certain that like all 4,000 of these GPUs are working like they're supposed to.SWYX [00:23:54]: We know that.JOSH [00:23:54]: And so if it's slow, it's because like we messed up the config or something else and not because of this earlier thing that's like really hard to detect in software later.JONATHAN [00:24:01]: Yeah. I think the, I'm just curious to ask,SWYX [00:24:03]: like, you know,JONATHAN [00:24:03]: suppose you were to set up another, let's say another H100 cluster and it were at a different data center. And instead of the vendor being Dell, it was super micro or what have you. How much of this would be repeatable? And how much of this would you have to redo? I, you know, I genuinely don't know.SWYX [00:24:18]: A decent amount.JOSH [00:24:19]: I think it would go a lot faster the second time. I think there's lots of learnings that we had. And also the blog post,SWYX [00:24:24]: you know, yes,JOSH [00:24:24]: we are releasing the health checks, releasing some scripts, but a lot of the valuable stuff is also in the blog post itself, in the details and kind of the, you know, the learnings that we've had and the sort of errors that we run into. We tried to as much as possible surface those to other peopleSWYX [00:24:36]: could learn from thoseJOSH [00:24:36]: and avoid the same mistakes or failures as well. But I think it would go a lot faster.SWYX [00:24:41]: Although, yes,JOSH [00:24:41]: there would certainly be some things that'd be a little bit different. 
I mean, there'd probably be different CPUsSWYX [00:24:46]: or whatever,JOSH [00:24:46]: but I think a lot of that stuff is less,SWYX [00:24:49]: it's less,JOSH [00:24:49]: that's the like, that's less variable. I think most of it would apply the second time around. Although I'm sure next timeSWYX [00:24:56]: we're building one,JOSH [00:24:56]: it'll probably be, you know, at a scale that's 10x as big with a different chip or something like this.SWYX [00:25:00]: And then who knows?JOSH [00:25:01]: Yeah, with ConnectX-8,JONATHAN [00:25:02]: that will have its own fun behavior and all that good stuff. Yeah.SWYX [00:25:06]: Perhaps there's something that people don't discuss about, and you don't even talk about this in the blog, but I always wonder is what is the timeline that's like kind of reasonable for this amount of work, at least the initial stages? And also what does the team composition look like for setting up a cluster, right? Like what are the mix of skills that you typically would require to get all this going?JOSH [00:25:27]: I'm, I can't really speak to typical. One thing I am very proud of is how much we accomplished with such a ridiculously small team. Like our infrastructure team is like, you know, fluctuates from week to week, depending on like how many things are on fire and how much we need to build. But it's like between like three and six people, like it's small. It's not like some huge team of like tons and tons of engineers. But those people are very, very good at what they do. And so that has allowed us to get a lot of mileage out of these things. I think it's not that we're building everything, right? It's not that three to six people build this whole thing. I definitely want to like, you know, say thanks very much to Dell and H5 and NVIDIA and the other people that have done a lot of the work, like to bring up this cluster, you know, with 4000 GPUs and a three tier networking architecture, you have 12,000 cables.
So that's 24,000 things that need to be plugged in. Like that's just a lot of stuff to plug in, right? And you don't want to mess it up. Like each one needs to be done correctly. Like it's a little bit loose. Like it doesn't really work.SWYX [00:26:23]: If you break it,JOSH [00:26:23]: you need to replace it. Like there's a lot of workSWYX [00:26:26]: that goes into this.JOSH [00:26:27]: Yeah.SWYX [00:26:28]: And then, you know,JOSH [00:26:28]: that's just like that's it. That's if you were to do everything right the first time.SWYX [00:26:32]: And if you didn'tJOSH [00:26:32]: have to fix anything. But inevitably, you know, you will have to replace something, which means like taking all the wires out, pulling the thing out, taking all the GPUs out, going and fixing some cable, putting it all back correctly, putting it back in, doing this every time. So there were a lot of people at Dell, NVIDIA and at H5 that all helped a ton with this stuff. I don't know the exact size of the Dell team. It also fluctuated over time.SWYX [00:26:55]: Yeah, excellent. And then, you know, you so you have all the hardware set up and now you're firing it up for a single node. There's a long description that you guys have about just like monitoring the MFU, right? And what each situation might look might be indicative of. One of the most interesting things to me that I saw from here is like, you know, if training immediately starts off at 60 to 80% MFU, something's wrong.SWYX [00:27:24]: But like, you know, like what what are like, you know, some anecdotes or, you know, notable scenarios here that you might you might call out as maybe counterintuitive or super interesting.JOSH [00:27:36]: There's just so many of them. I mean, one of them, which I think is probably pretty common, like common knowledge by this point. But like we did have a sort of likeSWYX [00:27:46]: which one was this exactly?JOSH [00:27:47]: I think for the MFU, like gradually getting worse over time. 
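For reference, the MFU being monitored here is just achieved training FLOPs divided by the hardware's peak. A rough sketch, using the common ~6N FLOPs-per-token rule for dense transformer training and the H100's ~989 TFLOPs BF16 dense peak; the throughput figure below is made up for illustration:

```python
def mfu(tokens_per_sec, n_params, n_gpus, peak_flops_per_gpu=989e12):
    """Model FLOPs utilization: achieved useful FLOPs / hardware peak.

    Uses the ~6N FLOPs-per-token approximation for a dense transformer
    (forward + backward); peak defaults to H100 SXM BF16 dense."""
    flops_per_token = 6 * n_params
    achieved = tokens_per_sec * flops_per_token
    return achieved / (n_gpus * peak_flops_per_gpu)

# Hypothetical: a 70B dense model on 4096 H100s at 3.2M tokens/sec.
u = mfu(tokens_per_sec=3.2e6, n_params=70e9, n_gpus=4096)
print(f"MFU = {u:.1%}")  # -> 33.2%
```

This is why a sudden MFU dip is such a useful canary: the formula has no moving parts, so if tokens/sec drops, something in the hardware or software underneath changed.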
I think that one, when we saw that the first time we were like, what the heck is going on? Like, why does it get just like a little bit worse? This is so strange. Like, what is it getting lazy or tired or something? Like, is it heat? Like what's going on? And in this particular case, it was memory fragmentation. Because you have hundreds of machines, they're doing garbage collection slightly different times. And then they get slightly further apart and slightly more and more jittered until eventually they're all happening kind of at random times. And just like really messing up each one of your steps. So you just turn off garbage collection and call it a day, basically,SWYX [00:28:20]: to be honest.JOSH [00:28:20]: There's other things you can do if you want to be a little bit more sophisticated about it. But you can also just manuallyJONATHAN [00:28:25]: have it all garbage collect on some interval. Like that's what we've done. We just have a garbage collection callback that just runs. But I've seen the exact same thing.JOSH [00:28:33]: Yeah, yeah, exactly. So I thought that one was kind of funny. And we did trace that one down and look and we did find the actual call. Like, again, this goes to like having good tools. So we had really good tools where we could look at a bunch of like actual traces in C and be like, OK, cool. This is the thing that's taking a lot of time. Or like, you know, this is the thing that doesn't quite line up here. Like, oh, I guess it's garbage collection. OK, cool.SWYX [00:28:52]: Interesting.JOSH [00:28:52]: Yeah, let's just try taking it off.SWYX [00:28:54]: OK, great.JOSH [00:28:54]: That's what it was. Now we can fix it. So for each of them, like basically bugs are not hard if you have good tools. But if you don't have good tools, bugs can be very, very hard. So similarly for like heat, another thing that we saw was like, oh, you know, the CPU is getting throttled. 
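The garbage-collection fix both of them describe, disabling the automatic collector so hundreds of ranks stop pausing at random, de-synchronized times, then collecting manually on a fixed step interval so every rank pauses together, is a few lines in practice. A minimal sketch (the training step is a stub):

```python
import gc

gc.disable()  # stop automatic, randomly-timed collections on every rank

GC_EVERY_N_STEPS = 100
collections = 0

def training_step(step: int) -> None:
    global collections
    # ... forward / backward / optimizer update would go here ...
    if step % GC_EVERY_N_STEPS == 0:
        gc.collect()      # every rank pauses at the same step boundary
        collections += 1

for step in range(1, 301):
    training_step(step)

print("synchronized collections:", collections)  # -> 3
```

Since the jitter came from ranks collecting at slightly different times, the fix is less about collecting less and more about collecting in lockstep.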
OK, well, it's easy to see if you're monitoring the CPU throttling or monitoring the heat. If you're not monitoring that, it's really hard to know why it's just suddenly one of them is going slower. I noticed also in the pieceSWYX [00:29:17]: that you mentioned FSDP with ZeRO-3. Actually, we met, I went to ICLR and Guanhua from the DeepSpeed team was there presenting ZeRO++. I was wondering if you want to make any call outs to, you know, particular open source or open library or open whatever implementation teams that were super helpful in your process. I think we ended up actuallyJOSH [00:29:39]: pulling from a whole bunch of different ones to pull things into our own particular pipeline. So we use things from NVIDIA's, you know, Megatron stuff. We use stuff from probably DeepSpeed. I think we pulled in a bunch of different pieces from a bunch of different places. So it was really nice to see all these working open source like examples. I think I really appreciate all the effort that has gone into actually tuning these things because you can tune them, but it's a lot of work to like tune this stuff and do all this stuff from scratch. It's really nice to have like a working example. I think those are probably the two biggest ones, DeepSpeed and Megatron alone, but there are probably other ones as well.SWYX [00:30:13]: Is there a particular thing in the ecosystem where you would call out as like, you know, there should be something here that is open source, but like it's not really, it's like everyone kind of builds it on their own. I want to say something with the file system because everyone talks about the file system eventually.JOSH [00:30:28]: The file system actually was,SWYX [00:30:30]: I mean, we did somethingJOSH [00:30:31]: kind of dumb there.
Like we have our own sort of local mirror so that we can, you know, like a crappy version of S3SWYX [00:30:38]: that's local,JOSH [00:30:38]: but it's just a pretty simple script, right?SWYX [00:30:41]: Like I think we run likeJOSH [00:30:41]: a little web server that just like serves files and then, you know, it can upload themSWYX [00:30:45]: and download them.JOSH [00:30:45]: Okay, great. And part of the reason we did that is that our internet connectionSWYX [00:30:50]: in the beginningJOSH [00:30:50]: was not the like full speedSWYX [00:30:52]: one that we wouldJOSH [00:30:52]: eventually have. And so we are a little bit more kind of bottlenecked in terms of internet bandwidth. And so we had this. I think we looked at a bunch of services out there like Minio and some other ones, but a lot of these like come with a lot of extra overhead and maintenance. And since we already have so much infrastructureSWYX [00:31:09]: to deal with,JOSH [00:31:09]: we kind of didn't want to, you know, bring in a whole other like cloud provider, virtualize something, something.SWYX [00:31:14]: We just wanted something simple.JOSH [00:31:14]: So we went with that, which has been quite helpful. Like our toolsSWYX [00:31:19]: are usually quite simple.JOSH [00:31:19]: It's like Bash and Python and SSH and Docker. Like we'd like to keep things simple so that's easier to debug, like less layers of infrastructure, less layers of abstraction, make it a lot easier to work with. Like we don't use Kubernetes,SWYX [00:31:30]: for example,JOSH [00:31:30]: and we just directly launch these things. And it's just been much easier to debug this way. One tool actually that does come into mind that I will call out is Kraken from Uber. That was great. We love that tool. We were a little bit skeptical. What is it?SWYX [00:31:44]: I'm sorry. 
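The "crappy local S3" Josh mentions, a little web server that serves and accepts files, really can be that simple. A toy version of the idea using only the standard library (this is an illustration, not Imbue's actual script; a real deployment would at least add authentication):

```python
import http.server
import os
import tempfile
import threading
import urllib.request

class MirrorHandler(http.server.SimpleHTTPRequestHandler):
    """Serve files from a directory; accept uploads via HTTP PUT."""
    def do_PUT(self):
        length = int(self.headers["Content-Length"])
        path = os.path.join(self.directory, self.path.lstrip("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(self.rfile.read(length))
        self.send_response(201)
        self.end_headers()

root = tempfile.mkdtemp()
server = http.server.ThreadingHTTPServer(
    ("127.0.0.1", 0), lambda *a, **kw: MirrorHandler(*a, directory=root, **kw))
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# Upload then download a blob, like a tiny object store.
req = urllib.request.Request(
    f"http://127.0.0.1:{port}/ckpt/step100.bin", data=b"weights", method="PUT")
urllib.request.urlopen(req)
blob = urllib.request.urlopen(f"http://127.0.0.1:{port}/ckpt/step100.bin").read()
print(blob)  # -> b'weights'
server.shutdown()
```

The design choice matches the philosophy stated above: fewer layers of infrastructure means fewer layers to debug when something inevitably misbehaves.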
Yeah.JOSH [00:31:45]: So Kraken is this, yeah, it's a distributed like Docker registry, basically, that uses BitTorrent to like transfer things between the machines in a sort of nice optimal way. Like in the very beginning, the naive way is like you have this one Docker registry, which was outside of the cluster. So every time we change an image, you know, there's many gigabytes that each of the 500 machines needs to download.SWYX [00:32:07]: So that just takesJOSH [00:32:07]: a really long time. So what this thing does is like just one of them downloads it and then like they all sort of broadcast all the pieces to each other. And it was just like a really nice, fast way of getting these images down. And it was very robust.SWYX [00:32:19]: Like there's a lotJOSH [00:32:19]: going on under the hood, but I think it's a pretty cool tool that we haven't really had any bugs with it at all. Amazing.SWYX [00:32:26]: Yeah. I mean, that's all my questions, I guess, for the info piece. I don't know if, John, you had something that you were sort of burning to ask or.JONATHAN [00:32:33]: No, all I can say is just sameSWYX [00:32:36]: in a lot of places, like, you know, and they're done thatJONATHAN [00:32:38]: seeing this plus one. I think the one big difference, you know, perhaps in philosophies is we've tried to basically standardize on as much commodity stuff as possible, just because, you know, I think the reason I asked about trying to do thisSWYX [00:32:50]: on multiple differentJONATHAN [00:32:50]: pieces of infrastructure is like, I think we're running on like six or seven different clouds right now. And everybody has done something slightly different. And my gosh, the little differences add up as you know, you've seen. And so, you know,SWYX [00:33:04]: our philosophy has been like, whatever the hellJONATHAN [00:33:05]: we can standardize, please let's standardize it. 
Like vanilla off the shelf FSDP.SWYX [00:33:10]: And like, you know,JONATHAN [00:33:10]: we wrote our own data loader, but we've tried to make that as much of a standard as we can across our infrastructure and in Databricks, because things just start getting really complicatedSWYX [00:33:18]: or like we useJONATHAN [00:33:18]: Kubernetes extensively because it at least gives us a uniform set of APIs. Like that's our hardware abstraction layer to a certain extent for everything else. So it's just, you know, a difference in philosophy there. But otherwise, like, yeah, this stuff is really, really hard. And I feel like we take for granted how much of this, you know, is done for us when you go and you just query ChatGPT, for example. Like, oh my God, everything going on underneath that, you know, it's kind of a miracle that the machines boot up, let alone that you can like query a giant language model that's probably doing inference across multiple machines and was trained across thousands of machines. Like, you know, minor miracle.SWYX [00:33:54]: Yeah, it is an awesome amount of power that we invoke with a single API call that we take for granted these days. It's absurd. Yeah, I mean, like Kubernetes, like that point about Kubernetes, I will say as a former AWS employee, like it seems like it would be ideal for Imbue to at some point make it more abstracted or agnostic because you're going to want to, you know, replicate your setup. We do have our ownJOSH [00:34:19]: sort of replacement. It's just a much simpler version of Kubernetes. Kubernetes is really designed for running services, not for running experiments. Like that's not its like main architecture. And so for us, like we have everything that's like, cool, you're going to run an experiment. So you want it to run to completion, right?SWYX [00:34:34]: OK, great.JOSH [00:34:34]: Like the primitives are sort of built around a slightly different style.
And that makes it a lot easier, just a lot simpler, because it fits the nature of the work: these machines are going to disappear, they will need to be rebooted for infrastructure upgrades, something will happen to the GPUs. Failure is baked in as a core part of our infrastructure. So it's not that we don't have an abstraction; it's that it's a simpler, more tailored abstraction for the particular work that we're doing.
JONATHAN [00:34:58]: Yeah, I think it all depends on what your goals are. I think the challenge in a lot of the deep learning stuff right now is that people often build things that are more complicated than necessary to get the job done, and complication is the enemy of everything. Don't use a fancier parallelism strategy than you have to. Don't use a fancier set of libraries than you have to. Don't do anything that you don't have to do, because it's hard enough as it is. Don't overcomplicate your own life. Don't try to bring in more tools or more fancy architecture tweaks if you absolutely don't have to. Get to the minimum necessary to get the job done. And it's really tempting to try to use everything, so I totally understand that one.
SWYX [00:35:37]: I think the last piece I'll call out, and I'm just going to weave this in because I see the opportunity to do it: are there any infrastructure shifts that need to arise because of changing architectures? For example, you're announcing a dense model, a 70B dense model, whereas John just worked on DBRX and the image-to-text model, which presumably have different bottlenecks.
JONATHAN [00:36:10]: That's correct for us. You know, we train both dense and mixture-of-experts models.
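The "run to completion" primitive Josh describes can be sketched in a few lines. This is a minimal illustration, not Imbue's actual scheduler: the job resumes from its last checkpoint after each failure, and node loss is handled as a normal event rather than a fatal one.

```python
# Minimal sketch of an experiment runner with failure baked in.
# Not a real scheduler; checkpointing is reduced to a step counter.
import random

def run_experiment(step_fn, total_steps, load_ckpt, save_ckpt, max_restarts=50):
    """Drive step_fn until total_steps complete, restarting on failure."""
    restarts = 0
    while True:
        step = load_ckpt()  # resume wherever the last attempt left off
        try:
            while step < total_steps:
                step_fn(step)
                step += 1
                save_ckpt(step)
            return step  # ran to completion
        except RuntimeError:  # stand-in for a lost GPU or rebooted node
            restarts += 1
            if restarts > max_restarts:
                raise

# Simulated flaky cluster: roughly 30% of steps hit a "hardware" failure.
random.seed(0)
state = {"step": 0}

def flaky_step(step):
    if random.random() < 0.3:
        raise RuntimeError("simulated GPU failure")

completed = run_experiment(
    flaky_step, 20,
    load_ckpt=lambda: state["step"],
    save_ckpt=lambda s: state.update(step=s),
)
```

The point of the design is that the failure path is the common path: restarts cost only the work since the last checkpoint, not the whole run.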
The one we happened to get permission to open source was a mixture-of-experts model. And those models are very demanding when it comes to network bandwidth, at least if you're training them in FSDP, ZeRO-3 style, where there's just a lot of parameters getting shuffled back and forth. And your ratio of compute to the amount of data that you have to shuffle back and forth becomes a lot worse, because you're only using a fraction of the parameters for every token instead of all the parameters. So we had to really push the envelope on getting all the stuff to the right places on time, and actually the networking part of DBRX was the single hardest thing, I think, of the entire process: just getting MoE training working at scale across a big cluster. We still managed, I think, to do it all with commodity parts, which was very exciting. We were using FSDP, and we eventually moved to HSDP, a version of FSDP where you have multiple smaller replicas, each sharded internally, and you do data parallel across those replicas. That helped a lot with the network latency issues we were running into, just because we were transmitting so much data for every single part of the process. I think it was actually instructive, personally, for how Google designs their hardware and software together. They're training, as far as I understand, using kind of a ZeRO-3 style of training, and have been for a while. They also train mixture-of-experts models. TPUs have a very different network-bandwidth-to-compute ratio. They have a lot more bandwidth, just objectively, and TPUs per chip tend to be a little bit less compute-intensive and have a little bit less memory. It's just a different design choice. So the ratio of flops to bandwidth is very different.
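Jonathan's FSDP-vs-HSDP point can be made concrete with a back-of-envelope model. This is a simplification (real FSDP buckets and overlaps its collectives, and the byte counts and parameter figure below are illustrative assumptions): with ZeRO-3-style sharding across the whole cluster, the forward and backward all-gathers plus the gradient reduce-scatter all cross the slow inter-node fabric, while with HSDP sharding confined to a node, the gathers ride NVLink and only the gradient all-reduce crosses nodes.

```python
# Rough per-worker traffic crossing the inter-node fabric each step.
# Simplified model with assumed bf16 (2-byte) parameters.

def fsdp_internode_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """ZeRO-3 sharded across nodes: fwd gather + bwd gather + reduce-scatter."""
    return 3 * params_billion * bytes_per_param

def hsdp_internode_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Sharded within a node: only the gradient all-reduce crosses nodes."""
    return 2 * params_billion * bytes_per_param

# For a DBRX-scale model (roughly 132B total parameters):
full_shard = fsdp_internode_gb(132)    # 792 GB over the slow fabric
hybrid_shard = hsdp_internode_gb(132)  # 528 GB over the slow fabric
```

The absolute numbers matter less than where the bytes flow: HSDP moves the biggest collectives onto the fast intra-node links, which is exactly the latency relief described above.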
JONATHAN [00:37:54]: And that means it's much easier for Google to pull off some of this stuff. They also have an interesting torus-style network architecture, torus-style as in the literal network, not the model.
SWYX [00:38:02]: Is this the sort of block attention? I forgot what you call it.
JONATHAN [00:38:07]: Yeah, this is more, not the ring attention, but the ring all-reduces. You have three different dimensions of rings, because they kind of put you in these three-dimensional toruses, from what I understand. So Google's infrastructure, in some sense, is, I wouldn't say built for this, but maybe the way that Google trains models is built for a slightly different bit of infrastructure that they have, and it's kind of neat to think about that. One thing that NVIDIA announced for both the GH200 and the GB200 is this hybrid networking, where you'll have blocks of NVLink-connected chips. For the GB200, I think it's groups of 72 GPUs that will all have NVLink to each other, so higher bandwidth, and then you'll have normal networking of some kind, InfiniBand or RoCE or what have you, between these blocks. It's a change due to the fact that it's hard to build really high-bandwidth networks over very large groups, but it is now a blocked networking, and you have to think about how you architect your model and your parallelism differently. You also have to think about fault tolerance differently, because it now matters where you lose a GPU, whereas it didn't before.
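The ring all-reduce Jonathan mentions can be sketched in pure Python: a reduce-scatter pass followed by an all-gather pass around a ring of ranks. A 3D torus runs rings like this along each of its dimensions; the sketch below models a single ring and checks that every rank ends up with the elementwise sum.

```python
# Pure-Python simulation of ring all-reduce (one ring, n ranks, n chunks).

def ring_allreduce(vectors):
    """vectors[r] is rank r's local array; returns each rank's final array."""
    n = len(vectors)
    length = len(vectors[0])
    assert length % n == 0, "length must divide evenly into n chunks"
    c = length // n
    # buf[r][k] = rank r's current copy of chunk k
    buf = [[list(v[k * c:(k + 1) * c]) for k in range(n)] for v in vectors]

    # Reduce-scatter: at step s, rank r sends chunk (r - s) % n to its
    # neighbour, which accumulates it. After n-1 steps, rank r owns the
    # fully reduced chunk (r + 1) % n.
    for s in range(n - 1):
        for r in range(n):
            k = (r - s) % n
            dst = (r + 1) % n
            buf[dst][k] = [a + b for a, b in zip(buf[dst][k], buf[r][k])]

    # All-gather: circulate each finished chunk the rest of the way round.
    for s in range(n - 1):
        for r in range(n):
            k = (r + 1 - s) % n
            dst = (r + 1) % n
            buf[dst][k] = list(buf[r][k])

    return [[x for k in range(n) for x in buf[r][k]] for r in range(n)]

ranks = [[1, 2, 3, 4, 5, 6, 7, 8],
         [10, 20, 30, 40, 50, 60, 70, 80],
         [100, 200, 300, 400, 500, 600, 700, 800],
         [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000]]
result = ring_allreduce(ranks)  # every rank: [1111, 2222, ..., 8888]
```

Each rank sends and receives only 2·(n-1)/n of the data, which is why the pattern is bandwidth-optimal, and why a torus that can run it along three dimensions at once is such a good fit.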
So it's just all really interesting and really fun, speaking personally, but it's going to mean new nightmares when we all move to that generation and have to think about new versions of these problems.
JOSH [00:39:20]: As you go up to larger scales, it gets quite different. Right now, let's say you experience a GPU failure every day. That's fine.
SWYX [00:39:31]: Just restart.
JOSH [00:39:31]: If you make your thing 24 times as big, now it's once an hour, and it stops being quite as easy to just restart, right? So now you have to bake in a sort of redundancy that you didn't have before. So I think as you go up in scale, you end up running into a lot of really interesting problems that also inform the actual design.
SWYX [00:39:52]: Yeah, I mean, as an orchestration guy, this is why I always emphasize very cheap storage or very fast storage, so you can checkpoint more. But that's probably not the best solution for fast training.
JONATHAN [00:40:05]: Which works fine when you're doing language, and then you move to vision or video. Then you have multi-petabyte datasets, and getting cheap, fast, multi-petabyte storage starts to bite. I've certainly encountered issues where the literal data center where my GPUs were did not have enough object store to fit the datasets that people wanted to bring into that data center, from whichever users were trying to bring them in. And then you get to a whole different world of hurt, where you have to keep your data in a different region because the region is just out of storage. So things get fun really fast.
SWYX [00:40:39]: Speaking of vision, Josh: Imbue is an agents company, but you're announcing a text-only model.
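Josh's once-a-day-to-once-an-hour arithmetic follows directly from failures being roughly independent: the cluster's mean time between failures shrinks linearly with its size. The 4,096-GPU baseline below is a made-up number chosen only to make the ratios land on his figures.

```python
# Cluster MTBF shrinks linearly with size under independent failures.
# The baseline cluster size and per-GPU MTBF are illustrative assumptions.

def cluster_mtbf_hours(n_gpus: int, per_gpu_mtbf_hours: float) -> float:
    """Expected hours between failures anywhere in the cluster."""
    return per_gpu_mtbf_hours / n_gpus

PER_GPU_MTBF = 4096 * 24.0  # chosen so the baseline cluster fails once a day

baseline = cluster_mtbf_hours(4096, PER_GPU_MTBF)      # 24.0 hours
scaled = cluster_mtbf_hours(24 * 4096, PER_GPU_MTBF)   # 1.0 hour
```

This is also why checkpoint interval matters so much at scale: expected lost work per failure is roughly half the interval, and failures arrive 24 times as often.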
Where does the vision side come in?
JOSH [00:40:49]: We've actually done a lot of work in the past, and people can see our blog posts about self-supervised learning and some other vision-related stuff as well, so we're very familiar with that stuff. But our main focus right now is on, as we say, coding and reasoning. And there's certainly a visual component to some problems, but it's not necessarily required for all problems. Actually, we found that for most of the code-writing and reasoning problems that we care about, the visual part isn't really a hugely important part. Sometimes, if you really need to, you can describe the thing, and there are other multimodal models that you can use off the shelf to plug in for those particular pieces that you need. If something is driving a browser or whatever, you can sometimes get away with not having that baked into the original model. So our focus: in one sense, we do a lot across the stack, working on our own infrastructure and pre-training and RL and fine-tuning and products and everything. But in another sense, we're very narrowly focused on the application side, so all of the stuff across the stack is going toward a very particular purpose, and that particular purpose right now doesn't really need vision. We think that people are going to make all sorts of really cool image models, like Jonathan, right? And all sorts of interesting multimodal models in the future. We'll let them go do that. That's great. We'll take advantage of that and partner with those people in the future.
And right now we're really focused on the core reasoning and coding capabilities and aspects of the model.
SWYX [00:42:14]: I wanted to go into CARBS, since that's kind of the next layer of the stack. We talked about CARBS in the first episode with Kanjun, because you actually had a blog post about it, like, a couple of years ago. Maybe let's introduce it.
JONATHAN [00:42:26]: Has that been a couple of years now?
JOSH [00:42:28]: No, it must have been at least one year. Hopefully it's not multiple years.
SWYX [00:42:32]: Sorry, I'm counting AI time. Yeah, yeah.
JONATHAN [00:42:35]: Yeah, I was going to say, you're making me feel really old right now.
SWYX [00:42:39]: I count everything before the Generally Intelligent rename as prehistory, and now sort of modernity, right? So I actually thought CARBS was more about hyperparameter optimization, in the sense of hyperparameter search. Whereas when you introduced it, especially in this blog post, it's more about scaling laws and predictability: are we in the right ballpark before we scale things up? Maybe recount the history of CARBS.
JOSH [00:43:10]: Yeah, so it really is a little bit of both. So CARBS is, it's maybe a backronym, but it stands for Cost-Aware Pareto-Region Bayesian Search. That's technically how it works, but also, you know, we like pastries and stuff. So great, why not? The point is that it's a cost-aware hyperparameter tuner. With most hyperparameter tuners, you say: OK, here's this objective function, and I want you to make this number as big as possible or as small as possible, whichever direction you want to go. So yeah, just go make this number as small as possible.
OK, so it'll try a bunch of different hyperparameters, a bunch of different configurations, to figure out how to tweak your network and architecture, et cetera, to get the best performance it possibly can. Usually that means almost all of these hyperparameter configurations will, let's say, use the same number of GPUs or the same number of nodes, and run for the same amount of time. You can do that, you get a number out, and that's great. But what CARBS does is say: OK, actually, what if we relax that constraint? For each of these different points, we're going to model how expensive it will be to sample this configuration. What if we train with just one one-hundredth of the data? How well can we do? What if we train with one-tenth of the data? What if we train with all the data? That way you can understand: as we get more and more data, as we spend more and more compute, as we make a bigger and bigger network, how does performance change with the things that change, including how expensive it is to even explore a given data point? So by doing that, we can see the scaling laws, and not just the scaling laws from, you know, the Chinchilla paper, but scaling laws for all the parameters. We can see: how should the number of layers change? How should the learning rate change? How should the various types of regularization change? So you can see these nice scaling laws, and as you're going across costs, how these should be changing as you're scaling up your model.
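The "Pareto region" idea Josh is describing can be illustrated with a toy. This is not the actual CARBS algorithm, just the core bookkeeping it rests on: every run is an observation of (training cost, loss), and only configurations that are not dominated, meaning no cheaper-or-equal run achieved lower-or-equal loss, stay on the cost/performance frontier that the search explores.

```python
# Toy cost/loss Pareto frontier; not the real CARBS search.

def pareto_frontier(observations):
    """observations: iterable of (cost, loss). Returns the frontier by cost."""
    best_at_cost = {}
    for cost, loss in observations:
        best_at_cost[cost] = min(loss, best_at_cost.get(cost, float("inf")))
    frontier, best = [], float("inf")
    for cost in sorted(best_at_cost):
        if best_at_cost[cost] < best:  # strictly better than anything cheaper
            best = best_at_cost[cost]
            frontier.append((cost, best))
    return frontier

# Hypothetical runs at three budgets (cost units, loss):
runs = [(1, 3.1), (1, 2.9), (10, 2.9), (10, 2.4), (100, 2.6), (100, 2.0)]
print(pareto_frontier(runs))  # [(1, 2.9), (10, 2.4), (100, 2.0)]
```

Fitting a curve through frontier points like these is what lets you extrapolate how each hyperparameter should move as the budget scales up, which is the prediction being checked against the 14B run below.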
So that, coupled with the kind of metric that we chose, which is a very precise way of measuring performance, allowed us to really hone in on parameters that worked really well, and to understand how we want to scale those up, especially as we're changing things about the network. For example, one of the things that we did is use a custom tokenizer, and changing the tokenizer changes a bunch of other things about the model. So how should we scale up with this entirely new tokenizer? No one has ever made a model this large with this tokenizer before. So how do we want to change all these things? CARBS kind of shows you: look, as you change these parameters, these other ones are dependent on them; these are the relationships between them. So you can better understand: if I'm going to scale this up 10x or 100x, where do I want to be? You can only go so far. And so we did run, I think, one at 14B to check. We had a bunch of runs at 1B, and then I think just one at 14B, and then the 70B, so we got to check: is this on the curve? Is this where we expect? And it was right there. So then: great, go on to the next one.
SWYX [00:45:56]: Yeah, I mean, that makes a lot of sense. I wonder, and correct me if I'm wrong, but usually people do search, or do their evals, just based on loss. You actually evaluate based on the sort of end-state evals that people might expect, like HellaSwag and LAMBADA, whatever. What is the norm here? Is there a norm?
JOSH [00:46:20]: Yeah, I don't know if there's a hundred percent a norm.
SWYX [00:46:21]: I don't know.
I only see loss on most people's reports.
JOSH [00:46:25]: Loss is very nice because it's very precise. It will tell you very fine-grained differences between really small changes in your hyperparameters or network architecture. Whereas, especially at the smaller scales, if you're looking at accuracy, it's very noisy: it might be zero or a hundred, or fluctuating by 10 or 20 percentage points, which makes it really hard to tell whether a change actually meant anything. So our metric is sort of a combination of the two. Instead of saying let's just look at perplexity, we say: let's look at perplexity on the tasks that we care about, formulated as multiple-choice questions, and look at the loss, the perplexity, for the particular answer tokens. That ends up being something that's both targeted at what you actually care about and also very precise. The nice thing about this, though, is that it's independent of the data that you train on. One thing that's annoying about perplexity, about loss, is that as you change your dataset, it fundamentally changes your loss, so you can't tell how to tweak your dataset. But because we have this held-out evaluation data
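The evaluation style Josh describes can be sketched as follows: score each multiple-choice option by the mean negative log-likelihood of its answer tokens only, which is as precise as loss but targeted at the task. The log-probs below are made up for illustration; a real harness would obtain them from the model.

```python
# Sketch of multiple-choice scoring by per-answer-token loss.
# The log-probabilities are fabricated for illustration.

def option_nll(answer_token_logprobs):
    """Mean negative log-likelihood per answer token (lower is better)."""
    return -sum(answer_token_logprobs) / len(answer_token_logprobs)

def pick_answer(options):
    """options: dict mapping option label -> list of answer-token log-probs."""
    return min(options, key=lambda label: option_nll(options[label]))

fake_logprobs = {
    "A": [-0.2, -0.4],   # mean NLL 0.3  -> the model's preferred answer
    "B": [-1.5, -2.0],   # mean NLL 1.75
    "C": [-0.9],         # mean NLL 0.9
}
print(pick_answer(fake_logprobs))  # A
```

Averaging per token (rather than summing) keeps longer answer options from being penalized purely for their length, and the continuous NLL avoids the 0-or-1 noise of raw accuracy at small scales.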

Kan English
Entrepreneur and bereaved father Eyal Waldman fights for Israel Prize

Kan English

Play Episode Listen Later May 13, 2024 7:36


Independence Day is the day the Israel Prize is handed out. This year's Israel Prize for Entrepreneurship and Technological Innovation is being awarded to Eyal Waldman. One of the leading entrepreneurs in Israel, he was the founder of Mellanox, which produces components for high-speed communication networks. Waldman has been a harsh critic of the government, and even more so since his daughter Danielle was murdered in the massacre at the Nova party on October 7th. Education Minister Yoav Kisch tried to keep the prize from Waldman, but he pushed ahead to receive the award. He spoke with reporter Arieh O'Sullivan about his efforts and his vision for Israel. (photo: Flash90)
See omnystudio.com/listener for privacy information.

Chip Stock Investor Podcast
Episode 108: Nvidia's Blowout Victory Against AI Chip Competitors (Nvidia Unveils Blackwell and GTC)

Chip Stock Investor Podcast

Play Episode Listen Later Mar 20, 2024 14:36


Nvidia CEO Jensen Huang unveiled the new Blackwell “GPU platform” for all things AI at the company's annual GTC 2024 event, and it didn't disappoint. But for investors unfamiliar with chips and accelerated computing systems, all of the products used to build these massive AI factories can be confusing. Chip Stock Investors Nick and Kasey are here to break down all the pieces that make up a Blackwell “GPU,” and explain how the largely overlooked acquisition of Mellanox (announced in 2019) paved the way for Nvidia's dominance today. Access to all of our show notes and slides can be found on the Ko-Fi Shop, or via a subscription to our Discord server. And stay tuned later this week for more about semiconductor manufacturing, potential Nvidia competitors, and why not all “FOMO” is bad.

Podcast Notes Playlist: Latest Episodes
NVIDIA CEO Jensen Huang

Podcast Notes Playlist: Latest Episodes

Play Episode Listen Later Mar 17, 2024 89:47


Acquired Key Takeaways:
- The trick of entrepreneurship is convincing yourself that it is not that hard, even when it is
- "Your organization should be the architecture of the machinery of building the product." – Jensen Huang
- Pave the way to future opportunities; you cannot wait until the opportunity is in front of you to reach out for it
- Many of the great all-time tech companies began by building products and services for developers
- You don't have to be that perfect if you can position yourself near opportunities
- Every person should learn how to use an AI to augment their productivity
- Nvidia tries to position itself in a way that serves a need that is yet to emerge
- Push your chips in when you know it is going to work
- Don't do everything; prioritize your time, and make sacrifices, but realize that there is plenty of time in life
Read the full notes @ podcastnotes.org
We finally sit down with the man himself: Nvidia Cofounder & CEO Jensen Huang. After three parts and seven+ hours of covering the company, we thought we knew everything but — unsurprisingly — Jensen knows more. A couple teasers: we learned that the company's initial motivation to enter the datacenter business came from perhaps not where you'd think, and the roots of Nvidia's platform strategy stretch back beyond CUDA all the way to the origin of the company. We also got a peek into Jensen's mindset and calculus behind "betting the company" multiple times, and his surprising feelings about whether he'd go on the founder journey again if he could rewind time. We can't think of any better way to tie a bow on our Nvidia series (for now). Tune in!
Editorial Note: We originally recorded this episode before the horrific terrorist attacks in Israel. It feels wrong to release this episode — where the nation of Israel and the Mellanox team are discussed — without sharing our profound sadness for all the families who had innocent loved ones or friends killed, injured, or taken hostage.
Our hearts go out to everyone coping through this dark moment in history.
Sponsors: Thanks to our fantastic partners, any member of the Acquired community can now get:
- Your product growth powered by Statsig
- Scalable, clean and low-cost cloud AI compute from Crusoe (and listen to our recent ACQ2 interview with CEO Chase Lochmiller)
- Free access to Jensen's favorite business books on Blinkist, plus our favorites on Ben & David's Bookshelf
More Acquired!:
- Get email updates with hints on next episode and follow-ups from recent episodes
- Join the Slack
- Subscribe to ACQ2
- Become an LP and support the show. Help us pick episodes, Zoom calls and more
- Merch Store!
Note: Acquired hosts and guests may hold assets discussed in this episode. This podcast is not investment advice, and is intended for informational and entertainment purposes only. You should do your own research and make your own independent decisions when considering any financial transactions.

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Christopher Lochhead Follow Your Different™
335 Inside Israel with Dr. Giora Yaron, former Chairman of Tel Aviv University

Christopher Lochhead Follow Your Different™

Play Episode Listen Later Oct 27, 2023 72:12


Today on Christopher Lochhead: Follow Your Different, we have a special episode featuring a good friend of mine, Dr. Giora Yaron. We talk about what's happening in Israel now, the realities of the situation Israel faces, and what's likely to happen next. Dr. Giora Yaron is considered a legend in the startup tech world. He's known as one of the key players in creating the tech startup VC ecosystem. He started his career as a Senior Executive at National Semiconductor in the United States, and subsequent to that he has founded, co-founded, and/or been the chairman of more than 25 deep-tech startups. He's also the former chairman of Tel Aviv University. Dr. Yaron is also a decorated Israeli Defense Forces combat officer, and today he serves as a strategic adviser to the Israeli Ministry of Defense. No matter what you think about this war, no matter how much you think you might know, there's a lot to learn in this riveting, captivating, in-depth, no-BS conversation with a living Israeli legend. Also, it's important to note this episode was recorded on October 26, 2023. You're listening to Christopher Lochhead: Follow Your Different. We are the real dialogue podcast for people with a different mind. So get your mind in a different place, and hey ho, let's go.

Dr. Giora Yaron on the current situation in Israel
Christopher Lochhead and Dr. Giora Yaron discuss the situation in Israel. Dr. Yaron shares how his family was safe living far from conflict zones, although they hosted affected families initially. He mentions the challenges faced by the IDF, with a significant number drafted, and the delicate balance between completing the mission and saving hostages. Dr. Yaron also highlights past incidents, comparing the current situation to previous attacks in 1973 and 2002. He expresses concerns about dealing with barbarian savages and the challenge of maintaining Israeli values while addressing the crisis.

Dr. Giora Yaron on the conflict's impact on civilians
The conversation then shifts to the topic of the recent conflict in Israel and its impact on civilians. Dr. Yaron discusses the strategic and moral dilemmas faced by Israel in dealing with groups like Hamas and the challenges in differentiating between combatants and civilians. He emphasizes the need to combat extremist groups aiming to establish an Islamic state and the importance of military action to achieve this. Christopher notes that many veterans, like Colin Powell, become peacemakers later in life, and discusses the heroic efforts of civilians in the conflict. But Dr. Yaron responds that the situation isn't about pursuing peace but dealing with an ongoing conflict.

Dr. Giora Yaron on cultural differences and how they affect perception in the West
Dr. Yaron shares his concerns about the disconnect between Western sympathies for Palestinians and the harsh realities faced by Israelis due to terrorist attacks. He emphasizes the need for a practical approach and shares personal experiences, such as Mellanox's tragic incident, to illustrate the challenges faced in pursuing peace in the region. He further underscores the complexities of the situation and the clash between idealistic hopes for peace and the harsh realities on the ground. To hear more from Dr. Giora Yaron and the clash of ideals in Israel, download and listen to this episode.

Bio
Dr. Giora Yaron is the former Chairman of Tel Aviv University (Executive Council) and on the board of Amdocs (DOX). Dr. Yaron serves on the advisory board of the Israeli Ministry of Defense. He is also an active founding investor and founder of a group of high-tech and med-tech companies: P-Cube (acquired by Cisco), PentaCom (acquired by Cisco), Qumranet (acquired by Red Hat), Comsys (acquired by Conexant, Texas Instruments), Exanet (acquired by Dell), Hyperwise Security (acquired by Check Point), Qwilt, Itamar Medical, Excelero, Equalum and Aqua Security. Dr. Yaron has been serving as board member and/or Chairman of the Boards of these com...

Squawk Pod
Grief, Loss, & Hope in Israel & Gaza 10/24/23

Squawk Pod

Play Episode Listen Later Oct 24, 2023 25:32


Apple supplier Foxconn is reportedly facing a tax investigation in China. Apple's other issue these days: Chinese e-commerce retailers are offering iPhone 15s at a steep discount, raising questions about demand for Apple products. Professor, former CEO, and biographer of both Steve Jobs and Elon Musk Walter Isaacson discusses the tricky terrain for a corporation like Apple, navigating Chinese and American geopolitics. Plus, a pivot to the Middle East. One of many, a tech entrepreneur lost his daughter in the Israel-Hamas War. Eyal Waldman, co-founder of Nvidia-acquired Mellanox, grieves for his daughter, who was among those killed in the Hamas attack on the Nova Music Festival. He shares his perspective on the conflict and on the road forward, hoping for a two-state road to a peaceful future.
Walter Isaacson - 01:39
Eyal Waldman - 13:39
In this episode:
Walter Isaacson, @walterisaacson
Andrew Ross Sorkin, @andrewrsorkin
Joe Kernen, @JoeSquawk
Becky Quick, @BeckyQuick
Katie Kramer, @Kramer_Katie

The Data Center Frontier Show
VAST Data's Andy Pernsteiner On the Underpinnings of Data-Intensive AI/ML Compute Strategies

The Data Center Frontier Show

Play Episode Listen Later Oct 24, 2023 37:06


For this episode of the Data Center Frontier Show Podcast, we sat down for a chat with Andy Pernsteiner, Field CTO of VAST Data. The VAST Data Platform embodies a revolutionary approach to data-intensive AI computing which the company says serves as "the comprehensive software infrastructure required to capture, catalog, refine, enrich, and preserve data" through real-time deep data analysis and deep learning. In September, VAST Data announced a strategic partnership with CoreWeave, whereby CoreWeave will employ the VAST Data Platform to build a global, NVIDIA-powered accelerated computing cloud for deploying, managing and securing hundreds of petabytes of data for generative AI, high performance computing (HPC) and visual effects (VFX) workloads. That announcement followed news in August that Core42 (formerly G42 Cloud), a leading cloud provider in the UAE, and VAST Data had joined forces in an ambitious strategic partnership to build a central data foundation for a global network of AI supercomputers that will store and learn from hundreds of petabytes of data. This week, VAST Data has announced another strategic partnership with Lambda, an Infrastructure-as-a-Service and compute provider for public and private NVIDIA GPU infrastructure, that will enable a hybrid cloud dedicated to AI and deep learning workloads. The partners will build an NVIDIA GPU-powered accelerated computing platform for Generative AI across both public and private clouds. Lambda selected the VAST Data Platform to power its On-Demand GPU Cloud, providing customer GPU deployments for LLM training and inference workloads. The Lambda, CoreWeave and Core42 announcements represent three burgeoning AI cloud providers within the short space of three months who've chosen to standardize with VAST Data as the scalable data platform behind their respective clouds.
Such key partnerships position VAST Data to innovate through a new category of data infrastructure that will build the next-generation public cloud, the company contends As Field CTO at VAST Data, Andy Pernsteiner is helping the company's customers to build, deploy, and scale some of the world's largest and most demanding computing environments. Andy spent the past 15 years focused on supporting and building large scale, high performance data platform solutions. As recounted by his biographical statement, from his humble beginnings as an escalations engineer at pre-IPO Isilon, to leading a team of technical ninjas at MapR, Andy has consistently been on the frontlines of solving some of the toughest challenges that customers face when implementing big data analytics and new-generation AI technologies. Here's a timeline of key points discussed on the podcast: 0:00 - 4:12 - Introducing the VAST Data Platform; recapping VAST Data's latest news announcements; and introducing VAST Data's Field CTO, Andy Pernsteiner. 4:45 - History of the VAST Data Platform. Observations on the growing "stratification" of AI computing practices. 5:34 - Notes on implementing the evolving VAST Data managed platform, both now and in the future. 6:32 - Andy Pernsteiner: "It won't be for everybody...but we're trying to build something that the vast majority of customers and enterprises can use for AI/ML and deep learning." 07:13 - Reading the room, when very few inside that have heard of "a GPU..." or know what its purpose and role is inside AI/ML infrastructure. 07:56 - Andy Pernsteiner: "The fact that CoreWeave exists at all is proof that the market doesn't yet have a way of solving for this big gap between where we are right now, and where we need to get tom in terms of generative AI and in terms of deep learning." 08:17 - How VAST started as a data storage platform, and was extended to include an ambitious database geared for large-scale AI training and inference. 
09:02 - How another aspect of VAST is consolidation, "considering what you'd have to do to stitch together a generative AI practice in the cloud." 09:57 - On how the biggest customer bottleneck now is partly the necessary infrastructure, but also partly the necessary expertise. 10:25 - "We think that AI shouldn't just be for hyperscalers to deploy" - and how CoreWeave fits that model. 11:15 - Additional classifications of VAST Data customers are reviewed. 12:02 - Andy Pernsteiner: "One of the unique things that CoreWeave does is they make it easy to get started with GPUs, but also have the breadth and scale to achieve a production state - versus deploying at scale in the public cloud." 13:15 - VAST Data sees themselves bridging the gap between on-prem and in the cloud. 13:35 - Can we talk about NVIDIA for a minute? 14:13 - Notes on NVIDIA's GPU Direct Storage, which VAST Data is one of only a few vendors to enable. 15:10 - More on VAST Data's "strong, fruitful" years-long partnership with NVIDIA. 15:38 - DCF asks about the implications of recent reports that NVIDIA has asked about leasing data center space for its DGX Cloud service. 16:39 - Bottom line: NVIDIA wants to give customers an easy way to use their GPUs. 18:13 - Is VAST Data being positioned as a universally adopted AI computing platform? 19:22 - Andy Pernsteiner: "The goal was always to evolve into a company and into a product line that would allow the customer to do more than just store the data." 20:24 - Andy Pernsteiner: "I think that in the space that we're putting much of our energy into, there isn't really a competitor." 21:12 - How VAST Data is unique in its support of both structured and unstructured data. 22:08 - Andy Pernsteiner: "In many ways, what sets companies like CoreWeave apart from some of the public cloud providers is they focused on saying, we need something extremely high performance for AI and deep learning. 
The public cloud was never optimized for that - they were optimized for general purpose. We're optimized for AI and deep learning, because we started from a place where performance, cost and efficiency were the most important things."
23:03 - Andy Pernsteiner: "We're unique in this aspect: we've developed a platform from scratch that's optimized for massive scale, performance and efficiency, and it marries very well with the deep learning concept."
24:20 - DCF revisits the question of bridging the perceptible gap in industry knowledge surrounding AI infrastructure readiness.
25:01 - Comments on the necessity of VAST partnering with organizations to build out infrastructure.
26:12 - Andy Pernsteiner: "It's very fortunate that Nvidia acquired Mellanox in many ways, because it gives them the ability to be authoritative on the networking space as well. Because something that's often overlooked when building out AI and deep learning architectures is that you have GPUs and you have storage, but in order to feed it, you need a network that's very high speed and very robust, and that hasn't been the design for most data centers in the past."
27:43 - Andy Pernsteiner: "One of the unique things that we do, is we can bridge the gap between the high performance networks and the enterprise networks."
28:07 - Andy Pernsteiner: "No longer do people have to have separate silos for high performance and AI and for enterprise workloads. They can have it in one place, even if they keep the segmentation for their applications, for security and other purposes. We're the only vendor that I'm aware of that can bridge the gaps between those two worlds, and do so in a way that lets customers get the full value out of all their data."
28:58 - DCF asks: Armed with VAST Data, is a company like CoreWeave ready to go toe-to-toe with the big hyperscale clouds - or is that not what it's about?
30:38 - Andy Pernsteiner: "We have an engineering organization that's extremely large now that is dedicated to building lots of new applications and services. And our focus on enabling these GPU cloud providers is one of the top priorities for the company right now."
32:26 - DCF asks: Does a platform like VAST Data's address the power availability dilemma that's going to be involved with data centers' widespread uptake of AI computing?

Here are some links to some recent related DCF articles:
Nvidia is Seeking to Redefine Data Center Acceleration
Summer of AI: Hyperscale, Colocation Data Center Infrastructure Focus Tilts Slightly Away From Cloud
AI and HPC Drive Demand for Higher Density Data Centers, New As-a-Service Offerings
How Intel, AMD and Nvidia are Approaching the AI Arms Race
Nvidia is All-In on Generative AI

Acquired
NVIDIA CEO Jensen Huang

Acquired

Play Episode Listen Later Oct 16, 2023 89:48


We finally sit down with the man himself: Nvidia Cofounder & CEO Jensen Huang. After three parts and seven+ hours of covering the company, we thought we knew everything but — unsurprisingly — Jensen knows more. A couple teasers: we learned that the company's initial motivation to enter the datacenter business came from perhaps not where you'd think, and the roots of Nvidia's platform strategy stretch back beyond CUDA all the way to the origin of the company. We also got a peek into Jensen's mindset and calculus behind "betting the company" multiple times, and his surprising feelings about whether he'd go on the founder journey again if he could rewind time. We can't think of any better way to tie a bow on our Nvidia series (for now). Tune in!

Editorial Note: We originally recorded this episode before the horrific terrorist attacks in Israel. It feels wrong to release this episode — where the nation of Israel and the Mellanox team are discussed — without sharing our profound sadness for all the families who had innocent loved ones or friends killed, injured, or taken hostage. Our hearts go out to everyone coping through this dark moment in history.

Sponsors: Thanks to our fantastic partners, any member of the Acquired community can now get:
Your product growth powered by Statsig
Scalable, clean and low-cost cloud AI compute from Crusoe (and listen to our recent ACQ2 interview with CEO Chase Lochmiller)
Free access to Jensen's favorite business books on Blinkist, plus our favorites on Ben & David's Bookshelf

More Acquired!:
Get email updates with hints on next episode and follow-ups from recent episodes
Join the Slack
Subscribe to ACQ2
Become an LP and support the show. Help us pick episodes, Zoom calls and more
Merch Store!

Note: Acquired hosts and guests may hold assets discussed in this episode. This podcast is not investment advice, and is intended for informational and entertainment purposes only.
You should do your own research and make your own independent decisions when considering any financial transactions.

The Alan Sanders Show
A little background and a timeline to counter anti-Israel lies and Americans suffering through Bidenomics

The Alan Sanders Show

Play Episode Listen Later Oct 16, 2023 69:01


Today opens with a short history and timeline of the State of Israel. I realized, over the weekend, the only reason so much pro-Palestinian and anti-Israeli sentiment exists is from a dreadful lack of basic facts and information. I cram a lot in the first half of the show, but I suggest sharing it with those who use any of the following propagandized and factually incorrect talking points:
• The land of Israel was historically Muslim land
• Israel is responsible for how the land has been divided among inhabitants
• Israel colonized Arab lands and expelled all of the Arabs
• Israel is an apartheid state
I then play a soundbite of a Palestinian woman who lives freely in Canada. It is now easy to see why so much propaganda leads people to cheer for Hamas as if they are the resistance. Eyal Waldman is an Israeli billionaire, high-tech magnate and founder of Mellanox. He built R&D centers in the West Bank and the Gaza Strip to employ Palestinian developers in order to build better Israeli-Palestinian relations. Hamas murdered his daughter, Danielle, at the music festival. On a separate note, a Wisconsin couple who lived in Kfar Aza were pro-Palestinian supporters and were against Israeli policies in Gaza. Turns out, when Hamas stormed their home, they didn't ask them about their pro-Palestinian philosophy. They killed them all the same. At a UNHRC meeting, a former member of Hamas, Mosab Hassan Yousef, addressed his former group and called them out for the terrorists they are. If you doubt me or anyone else, why would you doubt someone who was once a member of that terror organization? Closer to home, a record 49% of Americans say high prices are eroding their standards of living. The world is in such chaos because our President is inept at everything he does, thinks and believes. Rep. Ritchie Torres (D-NY) put out a blistering op-ed in the NY Post over the weekend.
The title of the piece pretty much says it all: Democratic socialists are "indoctrinating" young Americans with anti-Israel hate in a "moral monstrosity." On a positive note, it seems Pfizer had to revise its forecast down for lack of people wanting to use their anti-Covid drugs. It seems only 2% of Americans have even taken the "fall" Covid shot. And praise for Rep. Cory Mills (R-FL) who, along with his team, has now rescued 96 Americans from Israel, 77 of whom he rescued himself. While our White House is a joke on the world stage, we have a lawmaker who remembers what the role of government is supposed to be. Take a moment to rate and review the show and then share the episode on social media. You can find me on Facebook, X, Instagram, GETTR and TRUTH Social by searching for The Alan Sanders Show. You can also support the show by visiting my Patreon page!

Emmy 追劇時間
Understand the background of the Hamas terror attack on Israel in seconds! How will the Israel-Palestine war affect the world and Taiwan's industries, politics, and economy? Iran, China, North Korea, Russia: who is the shadow power behind Palestine? Could striking what the enemy must defend ease the quagmire of the Russia-Ukraine war? Hezbollah in Lebanon gladly pockets military aid! An axis of evil takes shape

Emmy 追劇時間

Play Episode Listen Later Oct 13, 2023 18:29


Hamas launched a lightning terror assault, and Israel swiftly mobilized a massive army to strike back. The great powers are all watching for their moment, and a larger conflict is primed to erupt, but who is really pulling the strings behind the scenes? The shadow war grows ever fiercer: besides Iran and Russia... China, it's you again! You're in on it too! China has been actively positioning itself in the Middle East: beyond supporting Iran, it issued a joint statement with Palestine announcing the establishment of a strategic partnership. An axis of evil has already taken shape, shaking America's leadership. How will the United States respond? Taiwan is in fact Israel's tenth-largest export market, and the two countries are remarkably alike in their semiconductor industries. Besides Intel, Nvidia has also invested and set up operations in Israel! And why can Dov Frohman be called the "Morris Chang of Israel"? Under the impact of war, how will the international economic situation shift? Israel's central bank is selling off dollars to rescue its currency, the Israeli stock market has slumped, and an oil crisis seems about to replay. What investment opportunities can we spot in all this? A wider war between Israel and the Arab world could break out at any moment, and the axis of evil is stirring. To find out who is really behind it all, with an exclusive decoding of the world's political and economic situation, tune in to Emmy 追劇時間! Become a member of this channel to receive perks: https://www.youtube.com/channel/UCUkwvRrpvWkocNdk9qIpRSw/join

Chip Stock Investor Podcast
Episode 37: Nvidia's Secret AI Business (NVDA), TSMC Stock, and 1 AI Chip Stock's Dismal 2023 (AMBA)

Chip Stock Investor Podcast

Play Episode Listen Later Aug 30, 2023 31:22


Mellanox was a little-followed acquisition that Nvidia made in 2020, and it's now paying off big time. Nick and Kasey do a little digging and discuss just how much of an effect the acquisition has had on recent earnings. TSMC (Taiwan Semiconductor Manufacturing) is a key chip manufacturer and system packaging partner for NVIDIA, so that means its growth will be tied to the biggest AI chip designer, right? Well, maybe, but not just yet. Why is that? Computer vision chip designer Ambarella had a lackluster earnings update, and the outlook for the second half of 2023 was pretty bad. A few months ago, we said this stock was on our watchlist. It appears elevated inventory, and perhaps intense competition from peers like indie Semiconductor, are eating into Ambarella's progress. With the company once again moving on from earlier design work to focus on a new trend (computer vision AI inference), we're no longer interested here at Chip Stock Investor. All this and more in this episode of Chip Stock Investor. Content in this video is for general information or entertainment only and is not specific or individual investment advice. Forecasts and information presented may not develop as predicted and there is no guarantee any strategies presented will be successful. All investing involves risk, and you could lose some or all of your principal. Nicholas and Kasey Rossolillo own shares of Nvidia.

泰度Voice
S2E4丨Building a high-speed rail network for data flow: can DPUs make computing power "soar"?

泰度Voice

Play Episode Listen Later Jun 12, 2023 47:03


Every leap in technology and applications amid the tide of the digital economy depends on strong computing power. Today, as the two technological shifts of accelerated computing and generative AI converge, improving data interconnection between computing clusters has become a major challenge in intelligent-computing scenarios. The solution offered by this episode's guest, Yan Guihai, is to use DPUs to build a "high-speed rail network" between CPUs and GPUs, raising the efficiency of data flow between compute nodes and thereby lowering the "data center tax." In the process of bringing this technology to market, he turned from a scientist and university teacher into a founder and entrepreneur.

In this episode, Liu Cheng, investment director at Huatai Innovation, and Yan Guihai, founder and CEO of 中科驭数, hold a hardcore, brain-burning tech conversation around computing power, one of the three pillars of AI. They discuss not only the technical principles and application scenarios of the DPU and the difficulties and industrial significance of boosting computing power, but also Yan's journey as a scientist-turned-founder and his views on chip investment cycles. Welcome to listen and leave a comment.

In conversation:
Yan Guihai, founder and CEO of 中科驭数
Liu Cheng, investment director at Huatai Innovation and head of Investment Department II

Timeline:
06:46 CPUs and GPUs alone are not enough; DPUs form the "high-speed rail system" that connects the dots into a network
09:10 Gains in computing power have greatly increased the convenience and richness of everyday life
12:27 Energy efficiency is not the only yardstick; computing power has multiple evaluation dimensions
17:41 Two things thought through before founding the company: market demand and deployment scenarios, and the conditions for industrialization
18:23 Even when idle, data center CPU utilization runs as high as 20% to 30%
25:20 Making a CPU do a DPU's work is like making a company's R&D staff handle administration
27:08 Achieving near-limit "low latency" through hardware-software co-design
37:03 Although I am a teacher, I don't like playing life mentor
44:51 Bubbles, or hot and cold sentiment, reflect disagreement on one hand but also attention

Glossary:
DPU: data processing unit. First proposed by Silicon Valley startup Fungible, and repackaged and redefined by Nvidia after it acquired networking vendor Mellanox. The mainstream definition comes from Nvidia: a DPU is a general-purpose processor that integrates data center infrastructure onto a chip.
Data center tax: in large data centers, traffic processing accounts for about 30% of compute; this "wasted" computing power is called the data center tax.
NDPP: Nano-latency Data Processing Platform, a computing development platform built on 中科驭数's standard ultra-low-latency foundation and offered through a software-programmable framework.
DMA: Direct Memory Access, a mechanism for fast data transfer.
Gartner Hype Cycle: proposed by the consultancy Gartner in 1995, also known as the technology cycle curve. Gartner holds that a new technology passes through several stages on its way to commercial maturity: the innovation trigger, the peak of inflated expectations, the trough of disillusionment, the slope of enlightenment, and the plateau of productivity.

Production team:
Editor-in-chief: Yuan Ruiyang
Project coordination: Wei Ye
Production: Gao Haibo
Sound design: Ma Ruochen, Lu Jiajie
Program operations: Xiaomili

This episode was recorded on May 16, 2023. This podcast does not guarantee that the data and information cited in the program are timely, accurate, or complete at the time of release.

Legal statement:
This podcast is not a platform for publishing research reports of Huatai Securities Co., Ltd. ("Huatai Securities"). It is intended to provide the public with commentary on macro, industry, and market topics, and does not constitute securities investment consulting by Huatai Securities or any investment advice or investment analysis opinion. This podcast does not form the basis of any contract or commitment, and merely subscribing to it does not make the subscriber a client of Huatai Securities. Before subscribing, any reader should assess for themselves the appropriateness of receiving the relevant content, and when using any content herein should seek the guidance and interpretation of a professional investment advisor.

The content may involve Huatai Securities analysts' interpretation of research reports Huatai Securities has already published, or forward or excerpt parts of the content and views of such published reports; a complete analysis should be based on the full research report as of its publication date. Subscribers who rely solely on this podcast may misread the content for lack of the complete report or its interpretation; for the full content, please refer to the complete reports published by Huatai Securities.

Regarding guest remarks involved in this podcast, Huatai Securities has reminded guests in advance that their remarks and information sources must be lawful and compliant, must not disclose inside information, material non-public information of listed companies, or other sensitive information, and must not infringe any lawful rights and interests of third parties. Guest remarks in this podcast represent only the guests' personal opinions, do not represent Huatai Securities' position, and do not constitute investment advice to listeners.

Huatai Securities makes no express or implied guarantee as to the accuracy, reliability, timeliness, or completeness of the information carried in this podcast in text, audio, image, link, or other forms. The opinions, views, and forecasts expressed reflect judgments only as of the recording date and may be changed at any time without notice.

Under no circumstances does the information carried in this podcast in text, audio, image, link, or other forms constitute investment advice to anyone. Subscribers should not rely solely on this podcast in place of their own independent judgment; they should make investment decisions independently and bear the investment risks themselves. Neither Huatai Securities nor the program guests assume any form of liability for any consequences caused by relying on or using the content of this podcast.

The copyright of all content of this podcast belongs to Huatai Securities. Without the written permission of Huatai Securities, no institution or individual may forward, reprint in whole or in part, publish, or quote any content of this podcast in any form.

This program is produced by Huatai Securities, made by JustPod, and available simultaneously on 小宇宙, 喜马拉雅, and Apple Podcasts.

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
Debugging the Internet with AI agents – with Itamar Friedman of Codium AI and AutoGPT

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later May 25, 2023 62:36


We are hosting the AI World's Fair in San Francisco on June 8th! You can RSVP here. Come meet fellow builders, see amazing AI tech showcases at different booths around the venue, all mixed with elements of traditional fairs: live music, drinks, games, and food! We are also at Amplitude's AI x Product Hackathon and are hosting our first joint Latent Space + Practical AI Podcast Listener Meetup next month!We are honored by the rave reviews for our last episode with MosaicML! They are also welcome on Apple Podcasts and Twitter/HN/LinkedIn/Mastodon etc!We recently spent a wonderful week with Itamar Friedman, visiting all the way from Tel Aviv in Israel: * We first recorded a podcast (releasing with this newsletter) covering Codium AI, the hot new VSCode/Jetbrains IDE extension focused on test generation for Python and JS/TS, with plans for a Code Integrity Agent. * Then we attended Agent Weekend, where the founders of multiple AI/agent projects got together with a presentation from Toran Bruce Richards on Auto-GPT's roadmap and then from Itamar on Codium's roadmap* Then some of us stayed to take part in the NextGen Hackathon and won first place with the new AI Maintainer project.So… that makes it really hard to recap everything for you. But we'll try!Podcast: Codium: Code Integrity with Zero BugsWhen it launched in 2021, there was a lot of skepticism around Github Copilot. Fast forward to 2023, and 40% of all code is checked in unmodified from Copilot. Codium burst on the scene this year, emerging from stealth with an $11m seed, their own foundation model (TestGPT-1) and a vision to revolutionize coding by 2025.You might have heard of "DRY” programming (Don't Repeat Yourself), which aims to replace repetition with abstraction. Itamar came on the pod to discuss their “extreme DRY” vision: if you already spent time writing a spec, why repeat yourself by writing the code for it? 
If the spec is thorough enough, automated agents could write the whole thing for you.

Live Demo Video Section
This is referenced in the podcast about 6 minutes in. Timestamps, show notes, and transcript are below the fold. We would really appreciate if you shared our pod with friends on Twitter, LinkedIn, Mastodon, Bluesky, or your social media poison of choice!

Auto-GPT: A Roadmap To The Future of Work
Making his first public appearance, Toran (perhaps better known as @SigGravitas on GitHub) presented at Agents Weekend. Lightly edited notes for those who want a summary of the talk:
* What is AutoGPT? AutoGPT is an AI agent that utilizes a Large Language Model to drive its actions and decisions. It can be best described as a user sitting at a computer, planning and interacting with the system based on its goals. Unlike traditional LLM applications, AutoGPT does not require repeated prompting by a human. Instead, it generates its own 'thoughts', criticizes its own strategy and decides what actions to take next.
* AutoGPT was released on GitHub in March 2023, and went viral on April 1 with a video showing automatic code generation. 2 months later it has 132k+ stars, is the 29th highest ranked open-source project of all-time, a thriving community of 37.5k+ Discord members, 1M+ downloads.
* What's next for AutoGPT? The initial release required users to know how to build and run a codebase. They recently announced plans for a web/desktop UI and mobile app to enable nontechnical/everyday users to use AutoGPT. They are also working on an extensible plugin ecosystem called the Abilities Hub, also targeted at nontechnical users.
* Improving Efficacy. AutoGPT has many well documented cases where it trips up: getting stuck in loops, using placeholders instead of actual content in commands, and making obvious mistakes like execute_code("write a cookbook").
The plan is a new design called Challenge Driven Development. Challenges are goal-oriented tasks or problems that Auto-GPT has difficulty solving or has not yet been able to accomplish. These may include improving specific functionalities, enhancing the model's understanding of specific domains, or even developing new features that the current version of Auto-GPT lacks. (AI Maintainer was born out of one such challenge.) Itamar compared this with Software 1.0 (Test Driven Development), and Software 2.0 (Dataset Driven Development).
* Self-Improvement. Auto-GPT will analyze its own codebase and contribute to its own improvement. AI Safety (aka not-kill-everyone-ists) people like Connor Leahy might freak out at this, but for what it's worth we were pleasantly surprised to learn that Itamar and many other folks on the Auto-GPT team are equally concerned and mindful about x-risk as well.
The overwhelming theme of Auto-GPT's roadmap was accessibility - making AI Agents usable by all instead of the few.

Podcast Timestamps
* [00:00:00] Introductions
* [00:01:30] Itamar's background and previous startups
* [00:03:30] Vision for Codium AI: reaching "zero bugs"
* [00:06:00] Demo of Codium AI and how it works
* [00:15:30] Building on VS Code vs JetBrains
* [00:22:30] Future of software development and the role of developers
* [00:27:00] The vision of integrating natural language, testing, and code
* [00:30:00] Benchmarking AI models and choosing the right models for different tasks
* [00:39:00] Codium AI spec generation and editing
* [00:43:30] Reconciling differences in languages between specs, tests, and code
* [00:52:30] The Israeli tech scene and startup culture
* [01:03:00] Lightning Round

Show Notes
* Codium AI
* Visualead
* AutoGPT
* StarCoder
* TDD (Test-Driven Development)
* AST (Abstract Syntax Tree)
* LangChain
* ICON
* AI21

Transcript
Alessio: [00:00:00] Hey everyone. Welcome to the Latent Space podcast. This is Alessio, Partner and CTO-in-Residence at Decibel Partners.
I'm joined by my co-host, Swyx, writer and editor of Latent Space.
Swyx: Today we have a special guest, Itamar Friedman, all the way from Tel Aviv, CEO and co-founder of Codium AI. Welcome.
Itamar: Hey, great being here. Thank you for inviting me.
Swyx: You like the studio? It's nice, right?
Itamar: Yeah, they're awesome.
Swyx: So I'm gonna introduce your background a little bit and then we'll learn a bit more about who you are. So you graduated from the Technion, Israel Institute of Technology, kind of like the MIT of Israel. You did a BS in CS, and then you also did a Master's in Computer Vision, which is kind of relevant. You had other startups before this, but your sort of claim to fame is Visualead, which you started in 2011 and got acquired by Alibaba Group. You showed me your website, with the sort of QR codes with different forms of visibility. And in China that's a huge, huge deal. It's starting to become a bigger deal in the West. My favorite anecdote that you told me was something about how much in sales you saved or something. I forget what the number was.
Itamar: Generally speaking, like, there's a lot of peer-to-peer transactions going on, like payments, in China with QR codes. So basically if, for example, 5% of the scanning does not work and with our scanner we [00:01:30] reduce it to 4%, that's a lot of money. Could be tens of millions of dollars a day.
Swyx: And at the scale of Alibaba, it serves all of China. It's crazy. You did that for seven years and you were at Alibaba until 2021, when you took some time off and then hooked up with Dedy, who you've known for 25 years, to start Codium AI, and you just raised your $11 million seed round with TLV Partners and Vine Ventures. Congrats. Should we go right into Codium? What is Codium?
Itamar: So we are an AI coding assistant / agent to help developers reach zero bugs. We don't do that today. Right now, we help to reduce the amount of bugs.
Actually you can see people commenting on our marketplace page saying that they found bugs with our tool, and that's like our premise. Our vision is, like Tesla's zero emissions or something like that, for us it's zero bugs.
We started with building an IDE extension, either in VS Code or in JetBrains. And that actually works alongside the main panel where you write your code, and I can show later what we do: analyze the code, whether you started writing it or you completed it. Like, you can go both TDD (Test-Driven Development) or classical coding. And we offer analysis, tests, whether they pass or not; we further self-debug [00:03:00] them and make suggestions, eventually helping to improve the code quality, specifically on code logic testing.
Alessio: How did you get there? Obviously it's a great idea. Like, what was the idea maze? How did you get here?
Itamar: I'll go back long. So, yes, I was two and a half times a VC-backed startup CTO, where we talked about the last one that I sold to Alibaba. But basically, it's weird to say, I'm by now 20 years an R&D manager. I'm not like the best programmer, because like you mentioned, I'm coming more from the machine learning / computer vision side, one of the main applications, but a lot of optimization. So I'm not necessarily the best coder, but I am like a 20-year R&D manager. And I found that verifying code logic is a very hard thing, and one of the things that really makes it difficult to increase development velocity. So you have tools related to checking performance. You have tools for vulnerabilities and security; Israelis are really good at that. But do you have a tool that actually helps you test code logic? I think we have like dozens or hundreds, even thousands, that help you on the end-to-end, maybe on the microservice integration system. But when you talk about the code level, there isn't anything. So that was the pain I always had, especially since I did have tools for that for the hardware.
Like, I worked at Mellanox, later sold to Nvidia, as a student, and we had formal tools, et cetera. [00:04:30] So that's one part.
The second thing is that after being sold to Alibaba, the team and I, quite a big team, worked on machine learning, large language models, et cetera, building developer tools related with LLMs throughout the golden years of 2017 to 2021, 2022. And we saw how powerful they became. So basically, if I frame it this way: because we developed it for so many use cases, we saw that if you're able to take a problem and put a framework of a language around it, whether it's analyzing browsing behavior, or DNA, or etc., if you can put a framework of a language, then LLMs take you really far.
And then I thought: this problem that I have with code logic testing is basically a combination of a few languages: natural language, specification language, technical language, even visual language to some extent. And then I quit Alibaba and took a bit of time to maybe wrap things up and rest a bit after 20 years of startup and corporate, and joined with my partner Dedy Kredo, who was my very first employee. And that's how we, like, came to this idea.
Alessio: The idea has obviously been around and most people have done AST analysis, kinda like an abstract syntax tree, but it's kind of hard to get there with just that. But I think these models now are getting good enough where you can mix that and also traditional logical reasoning.
Itamar: Exactly.
Alessio: Maybe talk a little bit more about the technical implementation of it. You mentioned the agent [00:06:00] part. You mentioned some of the model part, like what happens behind the scenes when Codium gets in your code base?
Itamar: First of all, I wanna mention I think you're really accurate. If you try to take like a large language model as is and try to ask it, can you like, analyze, test the code, etc., it'll not work so good.
By itself it's not good enough. On the other side are all the traditional techniques we've been inventing since the Greek times. You know, logical stuff; you mentioned ASTs, but there's also dynamic code analysis, mutation testing, etc. There are a lot of techniques out there, but they have inefficiencies. And a lot of those inefficiencies actually match AI capabilities. Let me give you one example. Let's say you wanna do fuzz testing or mutation testing.
Mutation testing means that you either mutate the test, like the input of the test, the code of the test, etc., or you mutate the code, in order to check how good your test suite is. For example, if I mutate some equation in the application code and the tests find the bug, and they do that at a really high rate, like out of 100 mutations I [00:07:30] find all 100 problems in the tests, it's probably a very strong test suite. Now the problem is that there are so many options for what to mutate in the data, in the tests. And this is where, for example, AI could help, like pointing out the best thing that you can mutate. Actually, I think it's a very good use case. Why? Because even if AI is not 100% accurate, even if it's 80% accurate, it could really take you quite far rather than just randomly selecting things.
So if I wrap up, just to go back high level: I think LLMs by themselves cannot really do the job of verifying code logic, and neither can the traditional techniques, so you need to merge them. But then one more thing, before maybe you tell me where to double click. I think with code logic there's also a philosophy question here. Logic is different from performance or quality. If I did three for-in loops, like I loop over three things, I could fold them with some vector operation in Python or something like that. We need to get into the mind of the developer. What was the intention? Like, what is bad code? Not code whose logic doesn't work, but code that's not according to the specification.
So I think, like, one more thing AI could really help with is matching: if there is some natural language description of the code, we can match it. Or if there's missing information in the natural language that needs [00:09:00] to be asked for, the AI could help by asking the user. It's not like a closed solution; rather it's open, leaving the developer as the lead, just moving the developer from being the coder to being like a pilot that clicks a button and says, ah, this is what I meant, or this is the fix, rather than actually writing all the code.
Alessio: That makes sense. I think I talked about it on the podcast before, but like the switch from syntax to semantics: developers used to be focused on the syntax and not the meaning of what they're writing. So now you have the models that are really good at the syntax, and you as a human are supposed to be really good at the semantics of what you're trying to build. How does it practically work? So I'm a software developer, I want to use Codium: how do I start, and then how do you make that happen in the background?
Itamar: So, like I said, Codium right now is an IDE extension. For example, I'm showing VS Code. And if you just install it, you'll have a few access points to start Codium AI, whether this sidebar, or, above every component or class that we think is very good to check with Codium, you'll have this small button. There's another way: you can mark specific code, right click, and run Codium. But this one is my favorite, because we actually choose above which components we suggest to use Codium. So once I click it, Codium starts analyzing this class. Not only this class, but almost everything that is [00:10:30] being used by the call center class, everything the call center is calling. And so we do like a static code analysis, et cetera, what we talked about. And then Codium provides the code analysis. It's right now static, like you can't change it.
You can't edit it yet, and maybe later we'll talk about it. This is what we call the specification, and we're going to make it editable so you can add additional behaviors and then create, accordingly, tests that will not pass, and then the code will change accordingly. So that's one entrance point, like via natural language description. That's one of the things that we're working on right now. What I'm showing you, by the way, could be downloaded as is. It's what we have in production.
The second thing that we show here is like a full test suite. There are six tests by default, but you can just generate more, almost as many as you want. Every time we'll try to cover something else, like a happy path, an edge case, et cetera. You can talk with specific tests, okay? Like you can suggest, I want this in Spanish, or give a few languages, or I want many more employees. I didn't go over what's a call center, but basically it manages, like, a call center. So you can imagine, I can ask to make it more rigorous, etc., but I don't wanna complicate it, so I'm keeping it as is.
I wanna show you the next one, which is run all tests. First, we verify that you're okay that we're gonna run it. I don't know, maybe we are connected to the environment that is currently [00:12:00] configured in the IDE; I don't know if it's production for some reason, or I don't know what. So we're making sure that you're aware we're gonna run the code, and then once we run, we show if it passes or fails.
I hope that we'll have one fail, but I'm not sure it's that interesting, so I'll go to another example soon. But just to show you what's going on here: we actually give an example of what's a problem. We give the log of the error, and then you can do whatever you want. You can fix it by yourself, or you can click reflect-and-fix, and what's going on right now is a bit of a longer process, where we do, like, chain of thought, or reflect and fix, and we can suggest a solution. You can run it, and in this case it passes. Just an example; this is a very simple example. Maybe later I'll show you a bug. I think I'll do that: I'll show you a bug and how we actually recognize that it's not a problem in the test, it's a problem in the code, and then suggest you fix the code instead of the test. I think you see where I'm getting at.
The other thing is that there are a few code suggestions, and there could be a dozen types, that could be related to performance, modularity, or, as I see in this case, maintainability. There could also be vulnerabilities or best practices, or even suggestions for bugs, like if we think one of the tests, for example, is failing because of a bug. So that's presented in the code suggestions. You can choose a few, for example, if you like, and then prepare a code change (I didn't show you which exactly). We're making a diff now that you can apply on your code. So basically what we're seeing here is that [00:13:30] there are three main tabs: the code, the tests, and the code analysis, let's call it the spec. And then there's a fourth tab, which is code suggestions, if you wanna look at analytics, etc. Mm-hmm. Right now, okay, this is the change, or quite a big change, I probably clicked on something. So that's the basic demo.
Right now let's be frank. Like, I wanted to show a simple example. So it's a call center. All the inputs to the class are relatively simple. There is no JSON input; like, if you're Expedia or whatever, you have a JSON with the hotels, Airbnb, you know, so the tests will be almost too simple or not covering enough. If you don't provide your code with some valuable input, like a JSON with all the information, or YAML or whatever, the tests stay simple. So you can actually add input data, and the AI or model (it's actually, by the way, a set of models and algorithms) will use that input to create interesting tests. And another thing is many people have some reference tests that they already made.
It could be because they already made them, or because they want something very specific; they have an idea of how they imagine the test. So they just write one, and then you add it as a reference, and that will inspire all the rest of the tests. And also you can give hints. [00:15:00] This is, by the way, planned to be dynamic hints: for different types of code we will provide different hints. So we can help you become a bit more knowledgeable about how to test your code. So you can ask for a given-when-then style, or you can have the funny pirate one, like make a different joke for each test, for example.

Swyx: I'm curious, why did you choose that one? This is the pirate one. Yeah.

Itamar: Interesting choice to put in your product. It could be like 11:00 PM, people sitting around, let's choose one funny thing.

Swyx: And yeah. So two serious ones and one funny one. Just for the listening audience, can you read out the other hints that you decided on as well?

Itamar: Yeah, so specifically for this case, a relatively very simple class, there's not much to do, but I'm gonna go to one more thing here in the configuration. The first one is the given-when-then style; it's one of the best practices in testing. Even when I report a bug in someone else's code, usually I wanna say: given, use this environment or use it this way; when I run this function, et cetera; then this happens. That's a very full report. And it's very common to use that in unit testing.

Swyx: I have never been shown this format.

Itamar: I love that you mentioned that, because if you go to a CS undergrad you take so many courses in development, but probably none of them in testing, and it's so important. And you don't go to Udemy or [00:16:30] whatever and do a testing course, right? Like, it's boring. People either don't do component-level testing because they hate it, or they do it and they hate it.
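For listeners who haven't seen the style, a given-when-then unit test looks roughly like this. The `CallCenter` class and its behavior here are made up for illustration; this is not the demo's actual code or Codium AI's generated output:

```python
# Illustrative given/when/then unit test; CallCenter is a toy stand-in
# for the call-center example from the demo.
class CallCenter:
    def __init__(self, agents):
        self.agents = list(agents)   # agents available to take calls
        self.queue = []              # callers waiting for an agent

    def route_call(self, caller):
        """Assign the call to the first free agent, or queue it."""
        if not self.agents:
            self.queue.append(caller)
            return None
        return self.agents[0]


def test_call_routed_to_available_agent():
    # Given: a call center with one available agent
    center = CallCenter(agents=["alice"])
    # When: a call comes in
    assigned = center.route_call("caller-1")
    # Then: the call is routed to that agent and nothing is queued
    assert assigned == "alice"
    assert center.queue == []
```

The comments mirror the bug-report structure Itamar describes: state the setup, the action, and the expected outcome.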
And I think part of it is because they're missing tools to make it fun. Also, usually you don't get yourself educated about it, because you wanna write your code. And part of what we're trying to do here is help people get smarter about testing and make it easy. So this one is very common, and the idea is that for different types of code, we'll suggest different types of hints to make you more knowledgeable. We're doing it in an educational way inside the app; we wanna help developers become smarter, more knowledgeable about this field.

And another one is mocks. So right now, our model decided that there's no need for mocks here, which is a good decision. But if we would go to a real-world case, like, I'm part of the AutoGPT community, and there's all sorts of tooling going on there, right? Maybe when I want to test a specific component, and it's relatively clear that it goes to the web, does some search, and comes back, I don't really need to do that part. I know what I expect, so I can mock the part that crawls the web. At a certain confidence, like around 90%, we will decide this is worth mocking and we will inject it. I can click it now and force our system to mock this, but you'll see a bit of a stupid mock, because it really doesn't make sense here.

So I chose this pirate stuff, like add funny pirate doc strings, make a different joke for each test, and I forced it to add mocks. [00:18:00] The tests were deleted and now we're creating six new tests. And you see, here's the shiver me timbers, the test checks the call is successful, probably there's some joke at the end. So in this case, even when you try to force it to mock, it didn't happen, because there's nothing to mock; or we might find stuff here that it mocked that really doesn't make sense, because there's nothing to mock here.

So that's one thing. I can show a demo where we actually catch a bug.
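The mocking Itamar describes, stubbing out the part that would crawl the web so a test stays fast and deterministic, might be sketched like this. The function names here are hypothetical, not from AutoGPT or Codium AI:

```python
# Sketch of mocking a web-crawling dependency in a test.
# search_web / summarize_topic are invented names for illustration.
from unittest.mock import patch


def search_web(query):
    # Imagine this really crawls the web; tests should never hit the network.
    raise RuntimeError("no network access in tests")


def summarize_topic(query):
    results = search_web(query)
    return f"Top result for {query!r}: {results[0]}"


def test_summarize_with_mocked_search():
    # Inject a canned search result instead of crawling the web.
    with patch(f"{__name__}.search_web", return_value=["AutoGPT docs"]):
        assert summarize_topic("autogpt") == "Top result for 'autogpt': AutoGPT docs"
```

The point of the confidence threshold Itamar mentions is deciding when a dependency like `search_web` is worth replacing with a stub at all.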
And I really love that. You know how it is when you're building developer tools: the best thing you can see is developers that you don't know giving you five stars and sharing a few things. We have a Discord with thousands of users, but I love to see the individual reports the most. This was one of my favorites: it helped me to find two bugs. I mentioned our vision is to reach zero bugs. If you may say, we want to clean the internet from bugs.

Swyx: So debugging the internet. I have my podcast title.

Itamar: So I think, if we move to another example...

Swyx: Yes, yes, please. This is great.

Itamar: I'm moving to a different example: the bank account. By the way, you can ask me later what's the difference between Codium AI and just using ChatGPT; I'm giving you this hard question in advance. So if you ask ChatGPT to give you an example of code to test, it might give you this bank account. It's the 101 stuff, right? And one of the reasons I chose it is because it's easy to inject bugs here that are easy to understand [00:19:30] anyway.

And what I'm gonna do right now with this bank account is change the deposit from plus to minus, as an example. And then I'm gonna run it similarly to how I did before; it suggests to do that for the entire class. And then there is the code analysis. Soon, and we'll announce it very soon as part of this podcast, it's going to have more features here in the code analysis; we're gonna talk about it. And then there are the tests that I can run. And the question is whether we're gonna catch the bug by running the tests. Because who knows, maybe this implementation is the right one, right? You need to converse with the developer.
Maybe in this weird bank, you deposit and the bank takes money from you. And we could talk about how this happens, but actually you can see already here that we are already suggesting a hint that something is wrong, and here's a suggestion to change it from minus back to plus. And we'll try to reflect and fix, and then we'll actually see the model telling you: hey, maybe this is not a bug in the test, maybe it's in the code.

Swyx: I wanna stay on this a little bit. First of all, this is very impressive, and I think it's very valuable. What user numbers can you disclose? You launched it and it's got fairly organic growth. You told me something off the air, but I just wanted to show people that this is being adopted at quite a large scale.

Itamar: [00:21:00] First of all, I'm a relatively transparent person. Even as a manager, I think I was in the top percentile for transparency at Alibaba. It wasn't five out of five, which is a good thing, because that's extreme; well, some people would claim it's a bad thing. Like, for example, if my CTO at Alibaba would tell me you did really badly, and it might cut your entire budget by 30% if in half a year you're not gonna do much better, and this and that, then I'd come back to the team and tell them what's going on without trying to smooth things out: we need to solve it together, and if not, you're not fitting in this team. So that's my point of view. And the same thing here; one of the fun things I like about building for developers is they kind of want that from you, to be transparent.

So we are in the high thousands of weekly active users. Now, if you convert from 50,000 downloads to high thousands of weekly active users, it means a lot of those that actually try us keep using us weekly. I'm not even talking about monthly, like weekly. And that was like one of our best-case expectations, because you don't test your code every day.
Right now, you can see it's mostly focused on testing, so you probably use it like once a week. We wanted to make it so smooth with your development methodology and development lifecycle that you use it every day; at the moment we hope for it to be used weekly, and that's what we're getting. And the growth is that about every two to three weeks we double the amount of weekly active users and downloads. It's still very early, like seven weeks, so I don't know if it'll keep going that way, but we hope so. [00:22:30] Actually, I hope it'll be much more than double every two to three weeks, maybe thanks to the podcast.

Swyx: Well, yeah, we'll add a few thousand, hopefully. The reason I ask is because I think there's a lot of organic growth, that people are sharing it with their friends, and also I think you've learned a lot since your earliest days in the private beta test. What have you learned since launching about how people want to use these testing tools?

Itamar: One thing I didn't share with you is that when you say virality, there is inter-virality and intra-virality, okay? Like, outside the company and within the company. So which teams are using us? I can't say, but I can tell you that a lot of San Francisco companies are using us. And one of the things I'm really surprised by: in one team, I saw one user two weeks ago and I was so happy, and then I came yesterday and I saw 48 users from that company. So what I'm trying to say, to be frank, is that we see more intra-virality right now than inter-virality. I don't see videos being shared all around Twitter, saying see what's going on here. But I do see people share it within the company: you need to use it, because it's really helpful for productivity. And the [00:24:00] inter-virality is something that we will work on. But to be frank, first I wanna make sure that it's helpful for developers.
So I care more about intra-virality, and that we see working really well, because it means the tool is useful. Me telling my colleague about it means it's useful; sharing it on Twitter means I also feel it will make me look cool, and that's something that maybe still needs, like, testing.

Swyx: Well, you're working on that.

Itamar: We're gonna announce something like that. Yeah.

Swyx: You are generating these tests, based on what I saw there, basically from the names of the functions and the doc strings, I guess?

Itamar: So I think if you obfuscate the entire code, our accuracy will drop by 50%. So that's right: we're using a lot of the hints that you see there, like the function name, the doc string, the variable names, et cetera. It doesn't have to be perfect, but it has a lot of hints. By the way, in some cases in the code suggestions, we will actually suggest renaming some of the stuff, things that will help us. There's a renaming suggestion, for example: in this case, instead of calling this variable client, you'll see preferred client, because basically it gets a different commission. So we do suggest it, because if you accept it, it also means it will be easier for our model or system to keep improving.

Swyx: Is that a different model?

Itamar: Okay, that brings us a bit to the topic of model properties. I'll share it really quickly, although it might take us off road. I think [00:25:30] different models are better on different properties: for example, how obedient a model is to instructions, how good it is at format forcing (I want the results in a certain format), how accurate it is, or how good it is at understanding code. There are so many calls happening here to models, by the way. Just by clicking one thing: hey, Codium AI.
Can you help me with this bank account? We do a dozen different calls, and each feature you click could be like that, with reflect and fix, and then we choose the best one. I'm not talking about hundreds of models, but we could use different APIs of OpenAI, for example, and other models, et cetera. So basically, different models are better on different aspects. Going back to what we talked about: all the models will benefit from having those hints, whether in the code itself or the documentation, et cetera. And also the code analysis: we consider the code analysis to be the ground truth to some extent, and soon we're also going to allow you to edit it, and we will use that as well.

Alessio: Yeah, maybe talk a little bit more about how you actually get all these models to work together. I think there are a lot of people that have only been exposed to Copilot so far, which is one use case: just complete what I'm writing. You're doing a lot more things here. A lot of people listening are engineers themselves, some of them build these tools, so they would love to [00:27:00] hear more about how you orchestrate them, how you decide which model does what, stuff like that.

Itamar: So I'll start with the end, because that has a very deterministic answer: we benchmark different models. Like, every time there's a new model in town... recently it's already old news: StarCoder. It's already so old news, from a few days ago.

Swyx: No, no, no. Maybe you want to fill in what StarCoder is?

Itamar: StarCoder is a new, up-and-coming model. We immediately test it on different benchmarks and see if it's better on some properties, et cetera. Like we talked about with chain of thought: different parts of the chain would benefit from different properties.
If I wanna do code analysis and convert it to natural language, maybe one model would be better. If I want to output a result in a certain format, maybe another model is better at forcing that format; you probably saw on Twitter, et cetera, people talk about how hard it is to ask a model to output JSON and so on. So basically we predefine: for different tasks, we use different models. And I think this is something for individual developers to check too. Try to think: the test that you're now working on, what is most important for you to get? Is semantic understanding most important? Are you asking for a very specific [00:28:30] output? Is it just a chat, or are you asking it to output code and only code, no description, or with a description in the top doc string and nothing else? And then we use different models.

We are aiming to have our own models in 2024, being independent of any third party like OpenAI. But since our product is very challenging, it has UI/UX challenges, engineering challenges, static and dynamic analysis, and AI. As an entrepreneur, you need to choose your battles, and we thought it's better for us to focus on everything around the model; one day, when we think we have the right UX/UI, engineering, et cetera, we'll focus on model building. This is also, by the way, what we did at Alibaba. Even when I had half a million dollars a month for training one foundational model, I would never start this way. You always try first using the best model you can for your product, then understanding what's the glass ceiling for that model, then fine-tuning a foundation model to reach a higher glass ceiling, and then training your own. That's what we're aiming for, and that's what I suggest to other developers: don't necessarily take a model and say, oh, it's so easy these days to do RLHF, et cetera.
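The per-task model selection described here could be pictured as a simple routing table. The task names and model identifiers below are invented for illustration; they are not Codium AI's actual configuration:

```python
# Toy sketch: route each task to the model whose properties suit it
# (format forcing, code understanding, reasoning, ...). All names invented.
TASK_TO_MODEL = {
    "explain_code": "model-good-at-code-understanding",
    "emit_json": "model-good-at-format-forcing",
    "reflect_and_fix": "model-good-at-reasoning",
}


def pick_model(task, default="general-model"):
    """Return the model configured for a task, falling back to a default."""
    return TASK_TO_MODEL.get(task, default)
```

A real system would presumably populate this table from benchmark results per property, as described above, rather than by hand.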
Like, I see it's only $600. Yeah, but what are you trying to optimize for? The properties. Don't try out certain models first; organize your challenges, understand the [00:30:00] properties you're aiming for, and start playing with that. And only then go train your own model.

Alessio: Yeah. And when you say benchmark, you know, we did a one-hour-long episode on benchmarks; there are many of them. Are you building some unique evals for your own problems? How are you doing that? That's also groundwork for your future model building, obviously, having good benchmarks.

Itamar: Yeah, that's very interesting. So first of all, with all due respect, we've been dealing with ML benchmarks for hundreds of years now... I'm kidding, but for tens of years, right? Benchmarking statistical creatures is something we've been doing for a long time. I think what's new here is the generative part. It's an open challenge to some extent, and therefore maybe we need to rethink some of the ways we benchmark. And one of the notions that I really believe in (I don't have proof for it) is creating a benchmark in levels. Let's say you create a benchmark from level one to ten, and it's a property-based benchmark. Let's say I have a WebGPT: I ask it something from the internet and it should fetch it for me. So challenge level one could be: I ask it and it brings me something. Level two could be: I ask it and the answer has a certain structure. Let's say, for example, I want to test AutoGPT, okay, and I'm asking it to summarize what's the best cocktail I could have this season in San Francisco. So [00:31:30] I would expect, for example, that model to go search the internet and do a certain thing. Level three could be that I check that, as part of this request, it uses certain tools. Level five, you can add to that:
I expect that it'll bring me back something relevant. And level nine: it actually prints the cocktail for me, I taste it, and it's good. So how I see it is that we need datasets similar to before, and we need to make sure that we're not fine-tuning the model the same way we test it. So we have some challenges that we fine-tune over, right, and a few challenges that we don't. And the new concept, maybe, is having those levels, which are property-based, which is something we know from software testing and less from ML. And this is where I think these two concepts merge.

Swyx: Maybe Codium can do ML testing in the future as well.

Itamar: Yeah, that's a good idea.

Swyx: Okay. I wanted to cover a little bit more about Codium in the present, and then we'll go into the slides that you have. So you have some UI/UX stuff, and obviously VS Code has the majority market share of IDEs at this point, but you also have IntelliJ, right?

Itamar: JetBrains in general.

Swyx: Yeah. Anything you learned supporting the JetBrains stuff? You were very passionate about this one user who left you a negative review. What is the challenge of that? How do you think about the market? Maybe you should focus on VS Code, since it's so popular?

Itamar: Yeah. [00:33:00] So currently the VS Code extension is leading over JetBrains, and it has been for a long time. And when I tell you a long time, it could be like two or three weeks: with version 0.5.x in VS Code versus 0.4 or so in JetBrains, we really saw the difference in how people react.
So we also knew that 0.5 was much more meaningful. And one of the developers left three stars on JetBrains, and I really remember that. I love that. What do you want to get at our stage? Yes, you want that indication; you know, the worst thing is getting nothing. I'm actually not sure it's not better to get even the bad indication than only good ones, to be frank, at least at our stage. We're a nine-, ten-month-old startup.

So I think, generally speaking, we find it easier and more fun to develop the VS Code extension versus JetBrains, although JetBrains has a very nice property: when you develop an extension for one of the IDEs, it usually works well for all the others; it's one extension for PyCharm, et cetera. But I think there's even more flexibility in VS Code. For example, this app is a React extension, as opposed to the JetBrains one we're using, which is native.

What I learned is that it's basically almost like [00:34:30] developing for Android and iOS, where you want a lot of the best practices: one backend, with all the software development best practices that come with it. Like, one backend version v1 supports both the Android and the iOS apps, not different backends, because that's crazy. And then you need all the methodology: what does it mean to move from 1.0 to 1.1 on the backend, what supports what. You know what I'm talking about if you've developed things like that in the past. So it's important.

And then, like with Android and iOS, you relatively want it to be the same, because you don't want one developer on the same team working with JetBrains and another with VS Code, and they're talking: whoa, that's not what I'm seeing. What are you talking about? And in the future we're also gonna have a teams offering for collaboration. Right now, if you close the Codium tab, everything is lost except the test code; if I go back to a test suite and do open as a file, now you have a test file with everything, which you can just save, but all the goodies here are lost.
One day we're gonna have a platform where you can save all that, collaborate with people, have it as part of your PR, like having suggestions as part of your PR. And then you wanna have some alignment. So one of the UX/UI challenges is that when you think about a feature, it should, some way or another, fit both platforms. Because, by the way, I think on iOS and Android you sometimes don't care about parity, but here you're talking about developers that might be on the same [00:36:00] team, so you do care a lot about that.

Alessio: Obviously this is a completely different way for developers to work. I'm sure this is not everything you wanna build, and you have some hints. So maybe take us through what you see the future of software development looking like.

Itamar: Well, that's great, and it's also related to our announcement, what we're working on. Part of it you already started seeing in my demo before, but now I'll put it into a framework; I'll be clearer. So I think the software development world in 2025 is gonna look very different from 2020. Very different. By the way, I think 2020 is different from 2000; like, for web development in '95 you needed to choose GeoCities and things like that, and today it's much easier to build a web app on one of the clouds. But I think 2025 is gonna look very different from 2020 for traditional coding, and that's a paradigm that I don't think has changed too much in the last few years. And I'm gonna go over that as I talk. So, just to focus: I'm gonna show you how I think the intelligent software development world will look, but I'm gonna put it through the lens of Codium AI. We are focused on code integrity. We care that, with all this advancement of code generation, et cetera, developers can code fast with confidence, that they have confidence in the generated code and in the AI they are using. That's our focus.
So I'm gonna put that lens on when I explain. So I think traditional development today works like this: you create some spec (for different companies and different development teams that could mean something else: something in Figma, something in Google Docs, something in Jira), [00:37:30] and then usually you jump directly to code implementation, and then, if you have the time or patience or will, you do some testing.

And some people would say it's better to do TDD; not everyone. Some would say: write your spec, write your tests, make sure they're red, that they do not pass, and write your implementation until your tests pass. Most people do not practice it, I think for a few reasons; let me mention two. One, it's tedious: I wanna write my code before my tests. And the second is, I think, that we're missing the tools to make it possible. And what we are advocating, what I'm going to explain, is actually neither. Okay? And I want to say this is very important.

So here's how we think the future development pipeline, or process, is gonna look. I'll do it in steps. First, I wanna say that there are gonna be coding assistants and coding agents. An assistant is like Copilot, for example, and an agent is something that you give a goal or a task, and it actually chains a few tasks together to complete your goal. Let's keep that in mind.

So what's happening right now, when you saw our demo, is what I presented a few minutes ago: you start with an implementation, and we create a spec for you and tests for you. And that was an agent: you didn't converse with it, you just [00:39:00] clicked a button, and we did a chain of thought to create these; that's why it's an agent. And then we gave you an assistant to change tests: you can converse with it, et cetera. So that's what I presented today.
What we're announcing is a vision that we call DRY: don't repeat yourself. I'm gonna get to that when I show you the entire vision, but first I wanna show you an intermediate step that we're going to release. So right now you can write your code, or part of it, like just a class abstract or so, with a coding assistant like Copilot, and maybe in the future a Codium AI coding assistant. And then you can create a spec, as I already presented to you. The next thing is that you're going to have a spec assistant to generate a technical spec, helping you fill it in quickly, focused on that. This is something we're working on, and we're going to release the first feature very soon as part of the announcement. And it's gonna be very lean, okay? We're a startup going bottom-up: lean features growing into more and more comprehensive ones.

And then, once you have the spec and implementation, you can either go from implementation to tests, and then run the tests and fix them, like I presented to you; but you can also create tests from the spec, okay? From the spec directly to tests. [00:40:30]

So now you have a really interesting thing going on here: you can start from spec, create tests, create code. You can start from tests, create code. You can start from implementation: from code, create spec and tests. And actually, we think the future is a very flexible one. You don't need to choose whether you're practicing traditional TDD or whatever else you wanna start with. If you already have some spec, created because one time, in one sprint, you decided to write a spec since you wanted to align on it with your team, et cetera, now you can go and create tests and implementation. Or, if you wanted to run ahead and write your code, creating tests and a spec that align with it will be relatively easy. So what I'm talking about is an extreme DRY concept; DRY is don't repeat yourself.
Until today, when we talked about DRY, it was: don't repeat your code. I claim that there are big parts of the spec, the tests, and the implementation that repeat each other, but it's not a complete repetition, because if the spec were as detailed as the implementation, it would actually be the implementation. The spec is usually in a different language: it could be natural language, it could be visual. And what we're aiming for, our vision, is enabling the DRY concept to the extreme, across all three: you write your tests, we'll help you generate the code and the spec; you write your spec, we'll help you do the tests and implementation.

Now, the developer is the driver, okay? You'll have a lot [00:42:00] of: what do you think about this? Is this what you meant? Yes or no. You wanna fix the code or the test? Click yes or no. But you'll still be the driver; there's just gonna be extreme automation at the DRY level. So that's what we're announcing, what we're aiming for as our vision; and what we're providing these days in our product is the middle, what you see in the middle, which is our code integrity agents working for you right now in your IDE, but soon also as part of your GitHub Actions, et cetera, helping you to align all three.

Alessio: This is great. How do you reconcile the difference in languages? You know, a lot of times the spec is maybe from a PM, somebody who's more at the product level, and some of the implementation details are backend developers for one thing, frontend for another. How do you help translate the language between the two? And then, in one of the posts on your blog, you mentioned that this is also changing maybe how programming languages themselves work. How do you see that changing in the future? Are people gonna start from English? Do you see a lot of them starting from code, and then it figures out the English for them?

Itamar: Yeah.
So first of all, I wanna say that although we're working, as we speak, on supporting front-end frameworks and languages and usage, we are currently focused on the backend. So, for example, as the spec, we won't let you input Figma; but don't be surprised if in 2024 the input for the spec could be a Figma file. You can actually see [00:43:30] demos of that, the pencil drawing from OpenAI when they unveiled GPT-4. So we will have that eventually.

I wrote a blog post where I related to two different claims. One, from a very knowledgeable and respected person, says that English is going to be the new programming language and programming is dead. And another equally respected person, I think, said that English is a horrible programming language. And actually, I think both are correct. That's why, when I wrote the blog, I related to both, and this is what we're saying here: nothing is really fully redundant, but what's annoying is that to align these three, you always need to work very hard, and that's what we want AI to help with. If there's an inconsistency, we'll raise a question: which one is true? And you just click yes or no, test or code, like what you can see in our product, and we'll fix the right one accordingly. So I think English, and the visual language, and code, and the test language (let's call it that for a second): all of them are going to persist, and adding that level of automation, aligning all three, is what we're aiming for.

Swyx: You told me this before, so I'm actually seeing Alessio's reaction to it for the first time.

Itamar: Yeah, like you're absorbing it.

Swyx: No, no.
This is, I mean, you know, you can put your VC hat on: what is the most critical or unsolved question presented by this vision?

Alessio: A lot of these tools, especially ones we've seen in the past, it's the dynamic nature of a lot of this, you know? [00:45:00] Sometimes, as you mentioned, people don't have time to write the tests; sometimes people don't have time to write the spec. So sometimes you end up with things out of sync, or the implementation is moving much faster than the spec, and you need some of these agents to make the call sometimes, to be like: no, okay, the spec needs to change, because clearly, if you change the code this way, it needs to be like this in the future. I think my main question, as a software developer myself, is: what is our role in the future? How much should we intervene? Where should we intervene? I've been coding for like 15 years, but if I'd been coding for two years, where should I spend the next year? Focus on getting better at understanding product and explaining it? Should I get better at syntax, so that I can write code? Would love to hear any thoughts.

Itamar: Yeah. You know, there's gonna be a difference between one to three years, three to six, six to ten, and ten to twenty. Let's, for a second, entertain the idea that programming is solved. Then we're talking about a machine that can actually create any piece of code; we're talking about the singularity, right? If the singularity happens, then we're talking about a whole new set of problems. Let's put that aside. Even if it happens in 2041 (that's my prediction), I'm not sure you should aim your thinking at what you need to do when the singularity happens. So I [00:46:30] would aim at thinking about the future of the next five years or so. That's my recommendation, because it's so crazy anyway.
Maybe not the best recommendation; take it with a grain of salt, and please consult with a lawyer. At least in the scope of the next five years, the idea is that the developer is the driver. He or she actually has amazing team members: agents working for him or her. And because he or she is the driver, you need to understand what you're trying to achieve, and also be able to review what you get. If in five years you're good at the lower levels of programming, I mean real programming languages, then you'll be able to develop more sophisticated software, and you'll work at companies that probably pay more for sophisticated software. And the less skilled you are in the actual programming, the more you would be the programmer of the new era, almost a creator. You'll still maybe look at the code level, testing, et cetera, but what's important for you is being able to convert product requirements into working software with tools like Codium AI. So there will be different types of developers. If you think about it for a second, it's a natural evolution; it's true today as well. If you know Linux internals or assembly really well, you'll probably work on LLVM, at Nvidia, [00:48:00] things like that, right? So I think that will be the next step. I'm talking about the next five years. For 15 years out, it's a new episode, if you'd like to invite me: an episode about how I think the world will look when you really don't need a developer, and we at Codium AI will be there, as you can see.

Alessio: Do we want to dive a little bit into AutoGPT? You mentioned you're part of the community.

Swyx: Obviously "Try, Catch, Finally, Repeat" is also part of the company motto.

Itamar: Yeah. So it actually really.
It relates to what we're doing, and there's a reason we have a strong relationship and connection with the AutoGPT community and are part of it. We've been talking about agents for a few months now, and we are building a designated, specific agent, because we're trying to build a product that works and earns developers' trust. We're talking about code integrity: we need it to work. Even if it doesn't hit 100 percent, and our product is not at 100 percent by the way, the UX/UI should speak the language of: okay, we're not sure here, please take the driver's seat, do you want this or that? But even if we're not close to 100 percent, it still needs to work really well; just throwing out a number, 90 percent. So we're building really designated agents, like ones that create tests from code: they can create tests, run them, and fix them over a few iterations. We really believe in that. We're building a designated agent, while AutoGPT is a swarm of general agents, where supposedly you can ask, "please make me rich", or "make me rich by increasing my net worth", and it's expected to be smart and knowledgeable enough to use a lot of agents and tools to make it work. So I think for the AutoGPT community it was less important to be very accurate at the beginning, and more important to show the promise and start building a framework that aims directly at the endgame and improves from there. What we are doing is the other way around: we're building an agent that works, and building from there toward the target I explained before.
But because of this connection, although we approach it from different sides of the philosophy of how you need to build these things, we really love the general idea. We caught it really early, when Toran, the maker of AutoGPT, was building it, and I immediately started contributing. Guess what I contributed at the beginning: tests, right? I started using Codium AI to build tests for AutoGPT, even finding problems that way, et cetera. So I became, let's say, one of the top ten contributors, and then part of the core management team. I talk very often with Toran on different aspects, and we are even going to have a workshop,

Swyx: a very small [00:49:00] meeting

Itamar: a work meeting, a workshop. And we're going to compete together in hackathons, to show that AutoGPT can be useful while, for example, Codium AI creates the tests for it, et cetera. So I'm part of that community, whether it's my team adding tests to it, or advising, or being in the management team, or helping Toran, really on small things. He is an amazing leader and visionary, and doing really well.

Alessio: What do you think is the future of open source development? This is a good example, right? You have code generating the tests, and in the future code could also implement what the tests want it to do. How do you see that changing? There are obviously not enough open source contributors, and that's one of the main issues. Do you think these agents are maybe going to help us? Nadia Eghbal has this great book called Working in Public, and there's a type of project she calls the stadium model: a lot of people use them, and nobody wants to contribute to them. I'm curious whether there is going to be a lot of noise added by these agents if we let them run on any repo that is open source.
What are the contributing guidelines for humans versus agents? I don't have any of the answers, but those are some of the questions I've been thinking about.

Itamar: Okay. So I want to repeat your question and make sure I understand it: if there are agents, for example, dedicated to improving code, why can't we run them on a full repository and have them fix it? The situation right now is that I don't think AutoGPT would be able to do that for you. Codium AI might, but it's not open sourced right now. In a month or two you'll be able to run it on an entire repository; we move really quickly, our motto is "moving fast with confidence", and we try to release every day or so, even three times a day on the backend, and we'll develop more features. But it's not open source. As for open source tools like AutoGPT or LangChain, I don't think you can really ask "please improve my repository, make it better" and have it work right now, because, let me softly paraphrase Ilya from OpenAI: let's say a certain LLM is 95 percent accurate. Now you're concatenating the results, so the accuracy decays. What you need is more engineering frameworks and work in order to deal with the inaccuracies, et cetera, and that's what we specialize in at Codium. But I'm not saying AutoGPT won't be able to get there: the more tools that get added, and the more prompt engineering dedicated to this idea, the better it gets. By the way, I'm talking with Toran about Codium, for example, being one of the [00:52:30] agents for AutoGPT. Think about it: AutoGPT is there for any goal, like increasing my net worth, though not focused, like us, on fixing or improving code.
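The decay Itamar paraphrases is easy to make concrete. Under the simplifying assumption that each step of a chained pipeline is independently correct with probability p (real steps are correlated, so this is only a back-of-the-envelope model), an n-step chain is fully correct with probability p^n:

```python
# End-to-end accuracy of an n-step chain whose steps are each
# independently correct with probability p.
def chain_accuracy(p: float, n: int) -> float:
    return p ** n

for n in (1, 5, 10, 20):
    print(f"{n:2d} steps -> {chain_accuracy(0.95, n):.1%} end-to-end")
```

Ten 95%-accurate steps come out fully correct only about 60% of the time, which is the gap the engineering frameworks Itamar mentions are meant to close.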
We might be another agent, by the way. We're also working on a plugin for ChatGPT; we're actually almost finished with it. So that's how I think it's going to be done. Again, open sourcing is not something we're planning right now; we want to be really good before we open source it.

Swyx: That was all very impressive. Your vision is actually very encouraging as well, and I'm very excited to try it out myself. I'm just curious about the Israel side of things. You're visiting San Francisco on a two-week trip for this special program, which you can tell us about. I think a lot of American developers have heard that Israel has a really good tech scene, but mostly it's security startups: "I was in some special unit in the IDF, and I came out and I'm doing the same thing again, but for enterprises." Maybe describe, for the rest of the world: what is the Israeli tech scene like? What is this program that you're on, and what should people know?

Itamar: So I think Israel has the most startups per capita; I think we're number one, and I think we're number one per square meter as well. Because of these properties, there is a very strong community, and everyone around you is an entrepreneur or working at a startup. When you go to the bar or for coffee, if it's 2021 you hear people talking about secondaries; if it's 2023, about how amazing GenAI is. Everyone around you is in the scene, and that means a lot of networking and information propagation. It's somewhat similar here, to the Bay Area and San Francisco, and it helps, right? So I think that's one of our strong points. You mentioned some others; I'm not saying they don't help.
And being in the IDF, the army, at the age of 19 you go and start dealing with very advanced technology, and that helps a lot. Then, going back to the community: this community is all over the world. For example, there is this program called ICON. Basically, Israelis in the Valley created a program for Israelis from Israel to come over; it's called Silicon Valley 101, to learn what's going on here. Because, with all due respect to the tech scene in Israel, here is the real thing, right? It's a non-profit organization run by Israelis who moved here. It brings you over, and then brings people from a16z, or Google, or Navon, amazing people from unicorns or up-and-coming startups or accelerators, to give you up-to-date talks and also connect you to relevant people. And that's why I'm here, in addition to, you know, [00:58:30] meeting you and participating in this amazing podcast, et cetera.

Swyx: Yeah. Well, I think there's a lot of exciting tech talent in Tel Aviv, and I'm glad that your office is Israeli.

Itamar: One thing I wanted to say: yes, of course, for the reasons we said, security is a very strong scene, but actually water purification, agriculture tech, there are plenty of other things; usually it comes from necessity. A big part of our country is desert, so there are other things. AI, by the way, is also big in Israel. For example, I think there's an Israeli competitor to OpenAI. I'm not saying it's as big, but AI21 is, I think, one of the ten most profound research labs. I love their work.

Swyx: I think we should try to talk to one of them.
But yeah, when you and I met, we connected a little bit over Singapore. I was in the Singaporean army, you in the Israeli army. We do have a lot of connections between our countries: small countries that don't have a lot of natural resources and have to make do in the world by figuring out other services. I think the Singapore startup scene has not done as well as the Israeli startup scene, so I'm very interested in how small countries can have a world impact, essentially.

Itamar: It's a question we're asked a lot: why? Let's go to the soft skills, for example. I think failing is not treated as a bad thing. Sometimes VCs prefer to [01:00:00] put money on an entrepreneur who failed in his first startup and then succeeded, because now that person knows what it means to fail and is very hungry to succeed. So generally, there are a few reasons; it's hard to put a finger on it exactly, but we talked about a few of them. One other thing: failure is not stigmatized. This is my fourth company. I did one as a teenager; it wasn't a startup, it was a company. Then I had my first startup, my second company, which had an amazing run and then a very beautiful collapse. And then my third company, my second startup, eventually exited successfully to Alibaba. So there is a lot of trial and error, which is appreciated, not suppressed. I guess that's one of the reasons.

Alessio: Wanna jump into the lightning round?

Swyx: Yes. I think we sent you the prep, but there are just three questions now; we've actually reduced it quite a bit.

Alessio: And we can read them out; you can take your time to answer, you don't have to answer right away. First question: what's something that's already possible in AI today that you thought would take much longer?

Itamar: Okay, so I have to, I hope it doesn't sound arrogant,

Arrow Podcast
Nvidia Networking

Arrow Podcast

Play Episode Listen Later Mar 19, 2023 16:36


Many will have heard of Nvidia's acquisition of Mellanox in April 2020, but what does it actually mean for Nvidia's portfolio and future vision? Listen along as Lars Kok and Christian Binow Lassen give a quick overview of how Nvidia Networking optimizes network traffic with intelligent network components.

Interviews: Tech and Business
CTO View: From Cloud to Metaverse

Interviews: Tech and Business

Play Episode Listen Later May 10, 2022 40:22


#metaverse #omniverse #cto #cloudcomputing #datacenter #nvidia

What is the metaverse and what does it mean for you? Michael Kagan, Chief Technology Officer at NVIDIA, explains the metaverse (or omniverse, as NVIDIA calls it) and links concepts around cloud computing, data centers, digital twins, and AI. If you've wondered what the metaverse is or how our digital world is changing, you'll want to watch this interview.

The conversation includes these topics:
-- About Michael Kagan, CTO of NVIDIA
-- What is the metaverse or omniverse?
-- On digital twin applications in the metaverse (or omniverse)
-- On computing platforms and the metaverse (or omniverse)
-- On AI and the metaverse (or omniverse)
-- On collecting data for advanced digital simulations
-- On federated learning and autonomous vehicles
-- On differences between smart manufacturing digital twins and the metaverse (or omniverse)
-- On cloud computing platforms, data centers, and the metaverse (or omniverse)
-- On cryptocurrencies and the metaverse
-- On how the metaverse (omniverse) will change distributed computing and data storage

Subscribe to participate in live shows and ask questions to the guests: https://www.cxotalk.com/subscribe
Read the complete transcript: https://www.cxotalk.com/episode/cto-view-cloud-metaverse

Michael Kagan has been NVIDIA's CTO (Chief Technology Officer) since May 2020. He joined NVIDIA through the acquisition of Mellanox, where he was CTO and a co-founder of the company, founded in April 1999. From 1983 to April 1999, Kagan held a number of architecture and design positions at Intel Corporation. Kagan holds a BSc in Electrical Engineering from the Technion, the Israel Institute of Technology.

alphalist.CTO Podcast - For CTOs and Technical Leaders
#48 - Michael Kagan // CTO at Nvidia

alphalist.CTO Podcast - For CTOs and Technical Leaders

Play Episode Listen Later Mar 31, 2022 45:16


Michael Kagan, the CTO of computing giant Nvidia and co-founder of Mellanox, shares insight into the world of hardware, software, AI, and of course computing. Listen to find out:
- Why you should ditch deep hierarchy and allow those close to the product to make the decisions.

Real Life Superpowers
Episode 44 - Jacques Benkoski (General Partner USVP)

Real Life Superpowers

Play Episode Listen Later Jan 1, 2022 56:43


In this episode we speak with international businessman Jacques Benkoski, General Partner of U.S. Venture Partners (USVP). For those who somehow haven't heard of USVP, it's a very well-known Silicon Valley venture firm that has invested in more than 500 companies, including Box, Check Point, Mellanox, Epsagon, Guidewire, HeartFlow, HotelTonight, Imperva, Inspire Medical, Luminate, Medigate, Omada Health, Pluto TV, Trusteer, Yammer; the list goes on. We discuss his journey, perspective, and experience in helping entrepreneurs execute their vision and grow into significant, sustainable businesses. Some key topics we delved into: • The power of karma • The profound significance of focusing on adding value, (always) doing the right thing, and creating a deep positive impact on the people around you • What legacy truly means • Where money comes into play • Why the CEO must be the one managing the board, and not vice versa • And much more! Check out the episode for some gems on what the top investors in the world are looking for in a startup, the passion that drives it all, and many other valuable insights.

SetNa Mixes
DJ Platon - SetNa 164 `Magistral Mix 21`

SetNa Mixes

Play Episode Listen Later Nov 14, 2021 56:13


Beautiful music and powerful, dynamic drum and bass rhythms in the twenty-first installment of the Magistral Mix series. Enjoy the listen!
1. Phonetic - Falling (Original Mix)
2. Mindhead & Mellanox, Yulua Oreshko - Infinity (Original Mix)
3. Blanke - Breathe
4. Swedish House Mafia - It Gets Better (10xx Drum & Bass Edit)
5. K Motionz & Emily Makis - High Note
6. Kolectiv - Can't Hold Me
7. Kubiks & Lomax - Things To Come
8. Prospa - WANT NEED LOVE (Dimension Remix)
9. Mefjus, Camo & Krooked - Sientelo
10. Pegboard Nerds, More Plastic - The Ride (Original Mix)
11. Zombie Cats - Mutation (Current Value Remix)
12. Holy Goof ft. Takura - Untouchable (Original Mix)
13. Tall Order - Can't Breathe
14. Akylla - Someday (Original Mix)
15. Low;r - Parsley Soda
16. ilLegal Content - GAS
17. Kove - Sweet Music (Extended Mix)
18. Bcee & Solah - Spirals (Original Mix)
19. Rudimental x MJ Cole feat. Josh Barry - Remember Their Names (Original Mix)
20. Maduk feat. Jelvin - Transformations
21. Voicians - Mask Of Joy (Instrumental)
22. Hybrid Minds & Dylan - Blame You (Original Mix)
23. Mazare & Nick Luebke - Spirals (Original Mix)

Moore's Lobby: Where engineers talk all about circuits
Ep. 29 | NVIDIA CTO Michael Kagan on the New Age of AI and Supercomputers

Moore's Lobby: Where engineers talk all about circuits

Play Episode Listen Later Sep 7, 2021 43:36


The advent of AI is forcing us to rethink the way we design hardware and changing the way we think of processing. After all, data-hungry applications are processor-hungry applications.  In this episode of Moore's Lobby, Daniel speaks with Michael Kagan, the CTO of NVIDIA, a tech giant and household name in processing. Kagan's career spans foundational work across Intel, Mellanox, and now NVIDIA as they forge new technologies to enable accelerated compute. Learn about the three core pillars of data center computing (spoilers: “GPU” might not mean what you think it means anymore). And learn why compute will soon need to become service-based as the burden of processing shifts increasingly to supercomputers. And, of course, hear the historic reasons Kagan asserts that “chips without software is just expensive sand.” You won't find a more qualified voice on the intersection between processing, compute-hungry applications, and data centers, so don't miss this episode.

20 Minute Leaders
Ep408: Shai Morag | CEO & Co-founder, Ermetic

20 Minute Leaders

Play Episode Listen Later May 10, 2021 25:23


Previously he co-founded and was CEO of Secdo, a cybersecurity company acquired by Palo Alto Networks, and of Integrity-Project, a company specializing in connectivity, networking, and security solutions that was acquired by Mellanox. Shai served for 10 years as an officer in the IDF Intelligence Corps' Unit 8200, where he held a variety of roles in management and product development and won several national awards for excellence. Shai is a graduate of the Talpiot program and earned an MBA from Tel Aviv University.

20 Minute Leaders
Ep386: Roey Eliyahu | Co-Founder & CEO, Salt Security

20 Minute Leaders

Play Episode Listen Later Apr 27, 2021 21:24


Roey is a veteran of an elite cybersecurity unit, where he led the development of high-end security systems protecting the largest network in Israel, that of the Israel Defense Forces (IDF) and the government. He went on to lead the development of security system projects at Cigol Digital Systems, a military-grade security systems company (acquired by Mellanox). After Cigol Digital Systems, Roey founded a cybersecurity college that trains the next generation of leaders and prepares them for serving in the IDF's elite security units.

Packet Pushers - Full Podcast Feed
Network Break 282: NVIDIA Completes Mellanox Acquisition; SpaceX Sets Date For Satellite Internet Beta Testing

Packet Pushers - Full Podcast Feed

Play Episode Listen Later May 4, 2020 34:59


The latest Network Break podcast examines NVIDIA's just-completed Mellanox acquisition, beta testing dates from SpaceX, the latest version of Cumulus Networks' NetQ management software, plus quarterly financial news for Juniper, Microsoft, and Amazon

Packet Pushers - Network Break
Network Break 282: NVIDIA Completes Mellanox Acquisition; SpaceX Sets Date For Satellite Internet Beta Testing

Packet Pushers - Network Break

Play Episode Listen Later May 4, 2020 34:59


The latest Network Break podcast examines NVIDIA's just-completed Mellanox acquisition, beta testing dates from SpaceX, the latest version of Cumulus Networks' NetQ management software, plus quarterly financial news for Juniper, Microsoft, and Amazon

Packet Pushers - Fat Pipe
Network Break 282: NVIDIA Completes Mellanox Acquisition; SpaceX Sets Date For Satellite Internet Beta Testing

Packet Pushers - Fat Pipe

Play Episode Listen Later May 4, 2020 34:59


The latest Network Break podcast examines NVIDIA's just-completed Mellanox acquisition, beta testing dates from SpaceX, the latest version of Cumulus Networks' NetQ management software, plus quarterly financial news for Juniper, Microsoft, and Amazon

Packet Pushers - Full Podcast Feed
Network Break 280: Nvidia Advances Mellanox Acquisition; Startup Alkira Tackles Multi-Cloud Networking

Packet Pushers - Full Podcast Feed

Play Episode Listen Later Apr 20, 2020 51:17


Take a Network Break! We discuss Nvidia clearing a major hurdle to its Mellanox acquisition, GitHub changes its pricing, the startup Alkira tackles multi-cloud networking, and more tech news analysis. Our guest commentator is Stephen Foskett, founder of Tech Field Day and GestaltIT.

Packet Pushers - Network Break
Network Break 280: Nvidia Advances Mellanox Acquisition; Startup Alkira Tackles Multi-Cloud Networking

Packet Pushers - Network Break

Play Episode Listen Later Apr 20, 2020 51:17


Take a Network Break! We discuss Nvidia clearing a major hurdle to its Mellanox acquisition, GitHub changes its pricing, the startup Alkira tackles multi-cloud networking, and more tech news analysis. Our guest commentator is Stephen Foskett, founder of Tech Field Day and GestaltIT.

Packet Pushers - Fat Pipe
Network Break 280: Nvidia Advances Mellanox Acquisition; Startup Alkira Tackles Multi-Cloud Networking

Packet Pushers - Fat Pipe

Play Episode Listen Later Apr 20, 2020 51:17


Take a Network Break! We discuss Nvidia clearing a major hurdle to its Mellanox acquisition, GitHub changes its pricing, the startup Alkira tackles multi-cloud networking, and more tech news analysis. Our guest commentator is Stephen Foskett, founder of Tech Field Day and GestaltIT.

The Razor's Edge
The Razor's Edge #4: Nvidia's Expanding AI Edge

The Razor's Edge

Play Episode Listen Later Dec 5, 2019 101:27


Akram's Razor was a notable bear on Nvidia in 2018 before closing a short position. Recently, he switched from the sidelines to becoming an owner of shares and a bull. We discuss what changed with the company over the past year, how it's proven itself, and what may be ahead.

Topics covered:
2:00 - Revisiting the short case
12:30 - NVDA's competitive position in artificial intelligence
23:00 - Expanded advantage in artificial intelligence, with the USPS deal as an example
33:00 - How Nvidia's moat plays out for new potential entrants
38:00 - The change in the valuation picture
47:00 - How crypto wreaked havoc on understanding Nvidia's gaming segment
56:00 - The structural questions in the gaming segment
1:11:00 - Mellanox and its import and fit with Nvidia
1:19:00 - Has management regained its credibility?
1:34:00 - Applying a lens to management's views

Packet Pushers - Full Podcast Feed
Network Break 254: Amazon Develops Wireless Gadget Protocol; Mellanox Gear Harmonizes With SONiC

Packet Pushers - Full Podcast Feed

Play Episode Listen Later Oct 1, 2019 62:10


Network Break feasts on a variety of tech news including a new wireless protocol proposed by Amazon, Mellanox support for the SONiC NOS, Palo Alto Networks saying it will build an SD-WAN offering, a dip in cloud infrastructure spending, and more.

Packet Pushers - Network Break
Network Break 254: Amazon Develops Wireless Gadget Protocol; Mellanox Gear Harmonizes With SONiC

Packet Pushers - Network Break

Play Episode Listen Later Oct 1, 2019 62:10


Network Break feasts on a variety of tech news including a new wireless protocol proposed by Amazon, Mellanox support for the SONiC NOS, Palo Alto Networks saying it will build an SD-WAN offering, a dip in cloud infrastructure spending, and more.

Packet Pushers - Fat Pipe
Network Break 254: Amazon Develops Wireless Gadget Protocol; Mellanox Gear Harmonizes With SONiC

Packet Pushers - Fat Pipe

Play Episode Listen Later Oct 1, 2019 62:10


Network Break feasts on a variety of tech news including a new wireless protocol proposed by Amazon, Mellanox support for the SONiC NOS, Palo Alto Networks saying it will build an SD-WAN offering, a dip in cloud infrastructure spending, and more.

Packet Pushers - Full Podcast Feed
Network Break 250: VMware Embraces Kubernetes; Dell Partners With VMware On Datacenters, SD-WAN

Packet Pushers - Full Podcast Feed

Play Episode Listen Later Sep 3, 2019 67:38


It's a heaping helping of Network Break as we try to parse all the Kubernetes pronouncements coming out of VMworld 2019, including Project Pacific and Tanzu Mission Control. Plus we cover new tech and new partnerships between Dell EMC and VMware, new products from Apstra and Mellanox, and HPE's latest financials.

Packet Pushers - Network Break
Network Break 250: VMware Embraces Kubernetes; Dell Partners With VMware On Datacenters, SD-WAN

Packet Pushers - Network Break

Play Episode Listen Later Sep 3, 2019 67:38


It's a heaping helping of Network Break as we try to parse all the Kubernetes pronouncements coming out of VMworld 2019, including Project Pacific and Tanzu Mission Control. Plus we cover new tech and new partnerships between Dell EMC and VMware, new products from Apstra and Mellanox, and HPE's latest financials.

Packet Pushers - Fat Pipe
Network Break 250: VMware Embraces Kubernetes; Dell Partners With VMware On Datacenters, SD-WAN

Packet Pushers - Fat Pipe

Play Episode Listen Later Sep 3, 2019 67:38


It's a heaping helping of Network Break as we try to parse all the Kubernetes pronouncements coming out of VMworld 2019, including Project Pacific and Tanzu Mission Control. Plus we cover new tech and new partnerships between Dell EMC and VMware, new products from Apstra and Mellanox, and HPE's latest financials.

ZION NEWS
Trump Says He'd Easily Win Israeli Election | 3/11/19

ZION NEWS

Play Episode Listen Later Mar 12, 2019 24:14


Zaka Int'l rescue workers head to Ethiopia
A Boeing 737 MAX 8 tragically crashes in Addis Ababa; 157 people killed.

Riots and incendiary balloon attacks continue
A 22-year-old Palestinian rioter who was injured along the Gaza border in February died of his wounds this week, as violence and demonstrations continue across the Palestinian territories.

PA appoints new Prime Minister
The Palestinian Authority announced a series of major changes Sunday as the troubled body continues to struggle for survival under the harsh economic sanctions imposed upon it.

Nvidia acquires Mellanox for $6.9 billion
Amit Kochavi, Strategic Adviser, TMTI, in the ILTV studio speaking about the acquisition of Mellanox by Nvidia.

Infighting reported within the Blue & White party
Infighting is now reportedly taking hold between leaders of the Blue & White list, following frustrations in the wake of the most recent dip in the polls.

Trump says he'd easily win Israeli election
Anti-Semitism and Israel-US ties are becoming more and more entangled in both the Israeli and American campaign trails.

Protesters clash at the Western Wall
The "Women of the Wall" accuse the Israel Police of abandoning them.

Israeli culture minister nixes 'Diaspora Torch'
Culture Minister Miri Regev has decided that no Diaspora Jewry representative will light a torch this year for Independence Day, sparking massive criticism that the move will further harm the Israel-diaspora relationship.

KKL & NBN unveil 'Project Israel 2040'
Efforts to make the periphery a more attractive place to live are critical.

Israel officially unveils Eurovision 2019 song
Kan broadcaster releases the full version of 'Home', performed by Kobi Marimi.

Hebrew word of the day: PERIPHERIA | פריפריה = PERIPHERY
Learn a new Hebrew word every day. Today's word is 'Peripheria', meaning periphery.

See omnystudio.com/listener for privacy information.

Daily Tech News Show
Breaking Up a Corporation is Hard to Do - DTNS 3485

Daily Tech News Show

Play Episode Listen Later Mar 11, 2019 32:52


UK drone laws change, Nvidia acquires Mellanox, and Tesla takes a retail turn. Starring Sarah Lane, Roger Chang, and Lamarr Wilson. See acast.com/privacy for privacy and opt-out information. Become a member at https://plus.acast.com/s/dtns.

TechCrunch
Daily Crunch 3/11/19

TechCrunch

Play Episode Listen Later Mar 11, 2019 3:28


Welcome to TechCrunch daily news, a roundup of the top tech news of the day. Presented by Bose. Say goodnight to sleepless nights with new Bose Sleepbuds. They mask unwanted noises with soothing sounds, so you can get the rest you deserve.
-- NVIDIA is acquiring Mellanox
-- we get an in-depth preview of Niantic's Harry Potter game
-- and misconfigured Box accounts put sensitive data at risk
Here's your Daily Crunch for March 11, 2019.

BSD Now
215: Turning FreeBSD up to 100 Gbps

BSD Now

Play Episode Listen Later Oct 11, 2017 93:35


We look at how Netflix serves 100 Gbps from an Open Connect Appliance, read through the 2nd quarter FreeBSD status report, show you a freebsd-update speedup via nginx reverse proxy, and customize your OpenBSD default shell.

Headlines

Serving 100 Gbps from an Open Connect Appliance (https://medium.com/netflix-techblog/serving-100-gbps-from-an-open-connect-appliance-cdb51dda3b99)

In the summer of 2015, the Netflix Open Connect CDN team decided to take on an ambitious project. The goal was to leverage the new 100GbE network interface technology just coming to market in order to be able to serve at 100 Gbps from a single FreeBSD-based Open Connect Appliance (OCA) using NVM Express (NVMe)-based storage. At the time, the bulk of our flash storage-based appliances were close to being CPU limited serving at 40 Gbps using a single-socket Xeon E5-2697v2. The first step was to find the CPU bottlenecks in the existing platform while we waited for newer CPUs from Intel, newer motherboards with PCIe Gen3 x16 slots that could run the new Mellanox 100GbE NICs at full speed, and for systems with NVMe drives.

Fake NUMA

Normally, most of an OCA's content is served from disk, with only 10-20% of the most popular titles being served from memory (see our previous blog, Content Popularity for Open Connect (https://medium.com/@NetflixTechBlog/content-popularity-for-open-connect-b86d56f613b), for details). However, our early pre-NVMe prototypes were limited by disk bandwidth. So we set up a contrived experiment where we served only the very most popular content on a test server. This allowed all content to fit in RAM and therefore avoid the temporary disk bottleneck. Surprisingly, the performance actually dropped from being CPU limited at 40 Gbps to being CPU limited at only 22 Gbps! The ultimate solution we came up with is what we call "Fake NUMA". This approach takes advantage of the fact that there is one set of page queues per NUMA domain.
All we had to do was to lie to the system and tell it that we have one Fake NUMA domain for every 2 CPUs. After we did this, our lock contention nearly disappeared and we were able to serve at 52 Gbps (limited by the PCIe Gen3 x8 slot) with substantial CPU idle time. After we had newer prototype machines, with an Intel Xeon E5 2697v3 CPU, PCIe Gen3 x16 slots for 100GbE NIC, and more disk storage (4 NVMe or 44 SATA SSD drives), we hit another bottleneck, also related to a lock on a global list. We were stuck at around 60 Gbps on this new hardware, and we were constrained by pbufs. Our first problem was that the list was too small. We were spending a lot of time waiting for pbufs. This was easily fixed by increasing the number of pbufs allocated at boot time by increasing the kern.nswbuf tunable. However, this update revealed the next problem, which was lock contention on the global pbuf mutex. To solve this, we changed the vnode pager (which handles paging to files, rather than the swap partition, and hence handles all sendfile() I/O) to use the normal kernel zone allocator. This change removed the lock contention, and boosted our performance into the 70 Gbps range. As noted above, we make heavy use of the VM page queues, especially the inactive queue. Eventually, the system runs short of memory and these queues need to be scanned by the page daemon to free up memory. At full load, this was happening roughly twice per minute. When this happened, all NGINX processes would go to sleep in vm_wait() and the system would stop serving traffic while the pageout daemon worked to scan pages, often for several seconds. This problem is actually made progressively worse as one adds NUMA domains, because there is one pageout daemon per NUMA domain, but the page deficit that it is trying to clear is calculated globally. 
So if the vm pageout daemon decides to clean, say, 1GB of memory and there are 16 domains, each of the 16 pageout daemons will individually attempt to clean 1GB of memory. To solve this problem, we decided to proactively scan the VM page queues. In the sendfile path, when allocating a page for I/O, we run the pageout code several times per second on each VM domain. The pageout code is run in its lightest-weight mode in the context of one unlucky NGINX process. Other NGINX processes continue to run and serve traffic while this is happening, so we can avoid bursts of pager activity that block traffic serving. Proactive scanning allowed us to serve at roughly 80 Gbps on the prototype hardware. Hans Petter Selasky, Mellanox's 100GbE driver developer, came up with an innovative solution to our problem. Most modern NICs will supply a Receive Side Scaling (RSS) hash result to the host. RSS is a standard developed by Microsoft wherein TCP/IP traffic is hashed by source and destination IP address and/or TCP source and destination ports. The RSS hash result will almost always uniquely identify a TCP connection. Hans' idea was that rather than just passing the packets to the LRO engine as they arrive from the network, we should hold the packets in a large batch, and then sort the batch of packets by RSS hash result (and original time of arrival, to keep them in order). After the packets are sorted, packets from the same connection are adjacent even when they arrive widely separated in time. Therefore, when the packets are passed to the FreeBSD LRO routine, it can aggregate them. With this new LRO code, we were able to achieve an LRO aggregation rate of over 2 packets per aggregation, and were able to serve at well over 90 Gbps for the first time on our prototype hardware for mostly unencrypted traffic. So the job was done. Or was it? The next goal was to achieve 100 Gbps while serving only TLS-encrypted streams.
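The batching-and-sorting trick is easy to illustrate outside the kernel. The sketch below is plain Python with invented field names, not the actual mlx5en driver structures: it sorts a received batch by RSS hash while using arrival order as the tie-breaker, so packets of the same connection end up adjacent for LRO.

```python
# Illustrative sketch of Hans' batch-sorting idea: sort a batch of
# received packets by RSS hash so packets of the same TCP connection
# become adjacent before LRO aggregation. Field names are invented.
from collections import namedtuple

Packet = namedtuple("Packet", ["rss_hash", "seq", "payload"])

def sort_batch_for_lro(batch):
    # Sort by (RSS hash, arrival index). The arrival-index tie-breaker
    # preserves in-order delivery within each connection, which matters
    # because reordering TCP segments would hurt far more than the
    # aggregation helps.
    return sorted(batch, key=lambda p: (p.rss_hash, p.seq))

# Two interleaved connections arriving alternately on the wire:
batch = [
    Packet(0xBEEF, 0, "conn-A #1"),
    Packet(0xCAFE, 1, "conn-B #1"),
    Packet(0xBEEF, 2, "conn-A #2"),
    Packet(0xCAFE, 3, "conn-B #2"),
]
for p in sort_batch_for_lro(batch):
    print(p.payload)
```

After sorting, both conn-A packets are adjacent and can be merged into one large segment by LRO, even though they arrived separated in time.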
By this point, we were using hardware which closely resembles today's 100GbE flash storage-based OCAs: four NVMe PCIe Gen3 x4 drives, 100GbE ethernet, Xeon E5v4 2697A CPU. With the improvements described in the Protecting Netflix Viewing Privacy at Scale blog entry, we were able to serve TLS-only traffic at roughly 58 Gbps. In the lock contention problems we'd observed above, the cause of any increased CPU use was relatively apparent from normal system level tools like flame graphs, DTrace, or lockstat. The 58 Gbps limit was comparatively strange. As before, the CPU use would increase linearly as we approached the 58 Gbps limit, but then as we neared the limit, the CPU use would increase almost exponentially. Flame graphs just showed everything taking longer, with no apparent hotspots. We finally had a hunch that we were limited by our system's memory bandwidth. We used the Intel® Performance Counter Monitor Tools to measure the memory bandwidth we were consuming at peak load. We then wrote a simple memory thrashing benchmark that used one thread per core to copy between large memory chunks that did not fit into cache. According to the PCM tools, this benchmark consumed the same amount of memory bandwidth as our OCA's TLS-serving workload. So it was clear that we were memory limited. At this point, we became focused on reducing memory bandwidth usage. To assist with this, we began using the Intel VTune profiling tools to identify memory loads and stores, and to identify cache misses. Because we are using sendfile() to serve data, encryption is done from the virtual memory page cache into connection-specific encryption buffers. This preserves the normal FreeBSD page cache in order to allow serving of hot data from memory to many connections. One of the first things that stood out to us was that the ISA-L encryption library was using half again as much memory bandwidth for memory reads as it was for memory writes. 
From looking at VTune profiling information, we saw that ISA-L was somehow reading both the source and destination buffers, rather than just writing to the destination buffer. We realized that this was because the AVX instructions used by ISA-L for encryption on our CPUs worked on 256-bit (32-byte) quantities, whereas the cache line size was 512 bits (64 bytes), thus triggering the system to do read-modify-writes when data was written. The problem is that the CPU will normally access the memory system in 64-byte, cache-line-sized chunks, reading an entire 64 bytes to access even just a single byte. After a quick email exchange with the ISA-L team, they provided us with a new version of the library that used non-temporal instructions when storing encryption results. Non-temporals bypass the cache, and allow the CPU direct access to memory. This meant that the CPU was no longer reading from the destination buffers, and so this increased our bandwidth from 58 Gbps to 65 Gbps. At 100 Gbps, we're moving about 12.5 GB/s of 4K pages through our system unencrypted. Adding encryption doubles that to 25 GB/s worth of 4K pages. That's about 6.25 million mbufs per second. When you add in the extra 2 mbufs used by the crypto code for TLS metadata at the beginning and end of each TLS record, that works out to another 1.6M mbufs/sec, for a total of about 8M mbufs/second. With roughly 2 cache line accesses per mbuf, that's 128 bytes * 8M, which is 1 GB/s (8 Gbps) of data that is accessed at multiple layers of the stack (alloc, free, crypto, TCP, socket buffers, drivers, etc). At this point, we're able to serve 100% TLS traffic comfortably at 90 Gbps using the default FreeBSD TCP stack. However, the goalposts keep moving. We've found that when we use more advanced TCP algorithms, such as RACK and BBR, we are still a bit short of our goal.
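The mbuf arithmetic above can be reproduced as a back-of-the-envelope calculation. The sketch below assumes a 16 KB TLS record size (not stated in the text, but it makes the 1.6M metadata-mbuf figure come out); the exact results differ slightly from the rounded numbers quoted above.

```python
# Back-of-the-envelope check of the mbuf accounting described above.
# Assumes 16 KB TLS records; figures are approximate, matching the
# rounded numbers in the text only loosely.
GBPS = 100
payload_bytes_s = GBPS * 1e9 / 8             # 12.5 GB/s of payload
PAGE = 4096                                  # 4K pages moved via sendfile
pages_s = payload_bytes_s / PAGE             # ~3.05M unencrypted pages/s
data_mbufs_s = 2 * pages_s                   # encryption doubles it: ~6.1M/s
TLS_RECORD = 16 * 1024                       # assumed 16 KB TLS record
records_s = payload_bytes_s / TLS_RECORD     # ~763K records/s
meta_mbufs_s = 2 * records_s                 # 2 metadata mbufs/record: ~1.5M/s
total_mbufs_s = data_mbufs_s + meta_mbufs_s  # ~7.6M/s ("about 8M" in the text)
CACHE_LINE = 64
traffic_bytes_s = total_mbufs_s * 2 * CACHE_LINE  # ~1 GB/s of mbuf accesses
print(round(total_mbufs_s / 1e6, 1), "M mbufs/s")
```

Even with the conservative rounding, the conclusion is the same: roughly 1 GB/s (8 Gbps) of memory traffic goes just to touching mbuf headers across the stack's layers.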
We have several ideas that we are currently pursuing, which range from optimizing the new TCP code to increasing the efficiency of LRO to trying to do encryption closer to the transfer of the data (either from the disk, or to the NIC) so as to take better advantage of Intel's DDIO and save memory bandwidth.
FreeBSD April to June 2017 Status Report (https://www.freebsd.org/news/status/report-2017-04-2017-06.html)
FreeBSD Team Reports: FreeBSD Release Engineering Team; Ports Collection; The FreeBSD Core Team; The FreeBSD Foundation; The Postmaster Team
Projects: 64-bit Inode Numbers; Capability-Based Network Communication for Capsicum/CloudABI; Ceph on FreeBSD; DTS Updates
Kernel: Coda revival; FreeBSD Driver for the Annapurna Labs ENA; Intel 10G Driver Update; pNFS Server Plan B
Architectures: FreeBSD on Marvell Armada38x; FreeBSD/arm64
Userland Programs: DTC; Using LLVM's LLD Linker as FreeBSD's System Linker
Ports: A New USES Macro for Porting Cargo-Based Rust Applications; GCC (GNU Compiler Collection); GNOME on FreeBSD; KDE on FreeBSD; New Port: FRRouting; PHP Ports: Help Improving QA; Rust; sndio Support in the FreeBSD Ports Collection; TensorFlow; Updating Port Metadata for non-x86 Architectures; Xfce on FreeBSD
Documentation: Absolute FreeBSD, 3rd Edition; Doc Version Strings Improved by Their Absence; New Xen Handbook Section
Miscellaneous: BSD Meetups at Rennes (France)
Third-Party Projects: HardenedBSD
DPDK, VPP, and the future of pfSense @ the DPDK Summit (https://www.pscp.tv/DPDKProject/1dRKZnleWbmKB?t=5h1m0s) The DPDK (Data Plane Development Kit) conference included a short update from the pfSense project The video starts with a quick introduction to pfSense and the company behind it It covers the issues they ran into trying to scale to 10 Gbps and beyond, and some of the solutions they tried: libuinet, netmap, packet-journey Then they discovered VPP (Vector Packet Processing) The video then covers the architecture of the new pfSense pfSense has launched on EC2, will be on Azure soon, and will launch support for the new Atom C3000 and Xeon hardware with built-in QAT (Quick-Assist crypto offload) in November The future: 100 Gbps, MPLS, VXLANs, and ARM64 hardware support *** News Roundup Local nginx reverse proxy cache for freebsd-update (https://wiki.freebsd.org/VladimirKrstulja/Guides/FreeBSDUpdateReverseProxy) Vladimir Krstulja has created this interesting tutorial on the FreeBSD wiki about a freebsd-update reverse proxy cache Either because you're a good netizen and don't want to repeatedly hammer the FreeBSD mirrors to upgrade all your systems, or you want to benefit from the speed of having a local "mirror" (cache, more precisely), running a freebsd-update reverse proxy cache with, say, nginx is dead simple.
1. Install nginx somewhere
2. Configure nginx for a subdomain, say, freebsd-update.example.com
3. On all your hosts, in all your jails, configure /etc/freebsd-update.conf with the new ServerName
And... that's it. Running freebsd-update will use the ServerName domain, which is your nginx reverse proxy. Note that the comment about using a "nearby" server is not quite true: FreeBSD update mirrors are frequently slow, and running such a reverse proxy cache significantly speeds things up. Caveats: This is a simple cache. That means it doesn't consider the files as a whole repository, which in turn means updates to your cache are not atomic. It'd be advised to nuke your cache before your update run, as its point is only to retain the files in a local cache for the short period of time required for all your machines to be updated. ClonOS is a free, open-source FreeBSD-based platform for virtual environment creation and management (https://clonos.tekroutine.com/) The operating system uses FreeBSD's development branch (12.0-CURRENT) as its base. ClonOS uses ZFS as the default file system and includes web-based administration tools for managing virtual machines and jails.
The project's website also mentions the availability of templates for quickly setting up new containers and web-based VNC access to jails. Puppet, we are told, can be used for configuration management. ClonOS can be downloaded as a disk image file (IMG) or as an optical media image (ISO). I downloaded the ISO file, which is 1.6GB in size. Booting from ClonOS's media displays a text console asking us to select the type of text terminal we are using. There are four options, and most people can probably safely take the default, xterm, option. The operating system, on the surface, appears to be a full installation of FreeBSD 12. The usual collection of FreeBSD packages is available, including manual pages, a compiler and the typical selection of UNIX command line utilities. The operating system uses ZFS as its file system and uses approximately 3.3GB of disk space. ClonOS requires about 50MB of active memory and 143MB of wired memory before any services or jails are created. Most of the key features of ClonOS, the parts which set it apart from vanilla FreeBSD, can be accessed through a web-based control panel. When we connect to this control panel, over a plain HTTP connection, using our web browser, we are not prompted for an account name or password. The web-based interface has a straightforward layout. Down the left side of the browser window we find categories of options and controls. Over on the right side of the window are the specific options or controls available in the selected category. At the top of the page there is a drop-down menu where we can toggle the displayed language between English and Russian, with English being the default. There are twelve option screens we can access in the ClonOS interface and I want to quickly give a summary of each one:
Overview - this page shows a top-level status summary. The page lists the number of jails and nodes in the system. We are also shown the number of available CPU cores and available RAM on the system.
Jail containers - this page allows us to create and delete jails. We can also change some basic jail settings on this page, adjusting the network configuration and hostname. Plus we can click a button to open a VNC window that allows us to access the jail's command line interface.
Template for jails - provides a list of available jail templates. Each template is listed with its name and a brief description. For example, we have a Wordpress template and a bittorrent template. We can click a listed template to create a new jail with a vanilla installation of the selected software included. We cannot download or create new templates from this page.
Bhyve VMs - this page is very much like the Jail containers page, but concerns the creation and management of virtual machines.
Virtual Private Network - allows for the management of subnets
Authkeys - upload security keys for something, but it is not clear for what these keys will be used.
Storage media - upload ISO files that will be used when creating virtual machines and installing an operating system in the new virtual environment.
FreeBSD Bases - I think this page downloads and builds source code for alternative versions of FreeBSD, but I am unsure and could not find any associated documentation for this page.
FreeBSD Sources - download source code for various versions of FreeBSD.
TaskLog - browse logs of events, particularly actions concerning jails.
SQLite admin - this page says it will open an interface for managing a SQLite database. Clicking the link on the page gives a file not found error.
Settings - this page simply displays a message saying the settings page has not been implemented yet.
While playing with ClonOS, I wanted to perform a couple of simple tasks. I wanted to use the Wordpress template to set up a blog inside a jail. I wanted a generic, empty jail in which I could play and run commands without harming the rest of the operating system.
I also wanted to try installing an operating system other than FreeBSD inside a Bhyve virtual environment. I thought this would give me a pretty good idea of how quick and easy ClonOS would make common tasks. Conclusions ClonOS appears to be in its early stages of development, more of a feature preview or proof-of-concept than a polished product. A few of the settings pages have not been finished yet, the web-based controls for jails are unable to create jails that connect to the network, and I was unable to upload even small ISO files to create virtual machines. The project's website mentions working with Puppet to handle system configuration, but I did not encounter any Puppet options. There also does not appear to be any documentation on using Puppet on the ClonOS platform. One of the biggest concerns I had was the lack of security on ClonOS. The web-based control panel and terminal both automatically log in as the root user. Passwords we create for our accounts are ignored, and we cannot log out of the local terminal. This means anyone with physical access to the server automatically gains root access and, in addition, anyone on our local network gets access to the web-based admin panel. As it stands, it would not be safe to install ClonOS on a shared network. Some of the ideas present are good ones. I like the idea of jail templates and have used them on other systems. The graphical Bhyve tools could be useful too, if the limitations of the ISO manager are sorted out. But right now, ClonOS still has a way to go before it is likely to be safe or practical to use. Customize ksh display for OpenBSD (http://nanxiao.me/en/customize-ksh-display-for-openbsd/) The default shell for OpenBSD is ksh, and it looks a little monotonous.
To make its user experience more friendly, I needed to make some customizations: (1) Modify the “Prompt String” to display the user name and current directory: PS1='$USER:$PWD# ' (2) Install the colorls package: pkg_add colorls Use it to replace the shipped ls command: alias ls='colorls -G' (3) Change the LSCOLORS environment variable to pick your favorite colors. For example, I don't want directories displayed in the default blue, so I changed it to magenta: LSCOLORS=fxexcxdxbxegedabagacad For a detailed explanation of LSCOLORS, please refer to the colorls manual. This is my final modification of .profile:
PS1='$USER:$PWD# '
export PS1
LSCOLORS=fxexcxdxbxegedabagacad
export LSCOLORS
alias ls='colorls -G'
DragonFly 5 release candidate (https://www.dragonflydigest.com/2017/10/02/20295.html) Commit (http://lists.dragonflybsd.org/pipermail/commits/2017-September/626463.html) I tagged DragonFly 5.0 (commit message list in that link) over the weekend, and there's a 5.0 release candidate for download (http://mirror-master.dragonflybsd.org/iso-images/). It's RC2 because the recent Radeon changes had to be taken out. (http://lists.dragonflybsd.org/pipermail/commits/2017-September/626476.html) Beastie Bits Faster forwarding (http://www.grenadille.net/post/2017/08/21/Faster-forwarding) DRM-Next-Kmod hits the ports tree (http://www.freshports.org/graphics/drm-next-kmod/) OpenBSD Community Goes Platinum (https://undeadly.org/cgi?action=article;sid=20170829025446) Setting up iSCSI on TrueOS and FreeBSD12 (https://www.youtube.com/watch?v=4myESLZPXBU) *** Feedback/Questions Christopher - Virtualizing FreeNAS (http://dpaste.com/38G99CK#wrap) Van - Tar Question (http://dpaste.com/3MEPD3S#wrap) Joe - Book Reviews (http://dpaste.com/0T623Z6#wrap) ***

BSD Now
127: DNS, Black Holes & Willem

Play Episode Listen Later Feb 3, 2016 129:36


Today on the show, we welcome Allan back from FOSDEM, and enjoy an interview with Willem about DNS and MTU Black Holes. That plus all the week's news; keep it tuned here to BSD Now. This episode was brought to you by Headlines FreeBSD Quarterly Status Report (https://www.freebsd.org/news/status/report-2015-10-2015-12.html) It is that time of year again, reviewing the progress of the FreeBSD project over the last quarter of 2015 There are a huge number of projects that have recently been completed or that are planned to finish in time for FreeBSD 10.3 or 11.0 This is just a sample of the items that stood out most to us: A number of new teams have been created, and existing teams report in. The Issue Triage, bugmeister, jenkins, IPv6 advocacy, and wiki-admin teams are all mentioned in the status report Progress is reported on the i915 project to update the Intel graphics drivers In the storage subsystem: RCTL I/O rate limiting, Warner Losh's CAM I/O Scheduler is progressing, Mellanox iSCSI Extensions for RDMA (iSER) was added, Chelsio iSCSI offload drivers, Mellanox 100 gbit/s drivers In Security: Encrypted crash dumps, OpenBSM updates, and a status report on HardenedBSD For embedded: Support for Ralink/Mediatek MIPS devices, Raspberry Pi VideoCore packages, touch screen support for RPI and BBB, new port to the Marvell Armada38x, and the work on arm64 and RISC-V kib@ rewrote the out-of-memory handler, specifically to perform better in situations where a system does not have swap.
It was tested on systems ranging from 32 MB of memory to 512 GB Various improvements to the tool chain, build system, and nanobsd It was nice to see a bunch of reports from ports committers An overview of the different proposed init replacements, with a report on each *** First timer's guide to FOSS conferences (http://sarah.thesharps.us/2016/02/02/first-timers-guide-to-foss-conferences/) This post provides a lot of good information for those considering going to their first conference The very first item says the most: “Conference talks are great because they teach you new skills or give you ideas. However, what conference talks are really for is giving you additional topics of conversation to chat with your fellow conference goers with. Hanging out after a talk ends to chat with the speaker is a great way to connect with speakers or fellow attendees that are passionate about a particular subject.” The hallway track is the best part of the conference. I've ended up missing as much as 2/3rds of a conference, and still found it to be a very valuable conference, sometimes more so than if I had attended a talk in every slot It is important to remember that missing a talk is not the end of the world; the discussion in the hallway may be much more valuable. Most of the talks end up on YouTube anyway. The point of the conference is being in the same place as the other people at the conference; the talks are just a means to get us all there. There is even a lot of good advice for people with social anxiety, and those like Allan who do not partake in alcohol Know the conference perks and the resources available to you.
The author of the post commented on Twitter about originally being unaware of the resources that some conferences provide for speakers, but also of discounts for students, and travel grants from Google and others like the FreeBSD Foundation There are also tips about swag, including watching out for booth wranglers (not common at BSD events, but many larger conferences have booths where your personal information can be exchanged for swag), as well as advice for following up with the people you meet at conferences. Lastly, it provides thoughts on avoiding “Project Passion Explosion”, or what I call “overcharging your BSD battery”, where after hearing about the interesting stuff other people are doing, or about the things others need, you try to do everything at once, and burn yourself out. I know for myself, there are at least 10 projects I would love to work on, but I need to balance my free time, my work schedule, the FreeBSD release schedule, and which items might be better for someone else to work on. *** FreeBSD 10.1 based WiFi Captive Portal (http://www.unixmen.com/freebsd-10-1-x64-wifi-captive-portal/) Captive portals: the bane of many a traveler's existence, but a necessary evil in the era of war-driving and other potentially nefarious uses of “free wifi”. This week we have an article from the folks at “unixmen”, showing (in great detail) how they set up a FreeBSD 10.1 based captive portal, and yes, those are manual MySQL commands. First up is a diagram showing the layout of their new portal system, using multiple APs for different floors of the apartment / hotel. The walkthrough assumes you have Apache/MySQL and PHP already installed, so you'll need to prep those bits beforehand.
Some Apache configuration is up next, which redirects all port 80 requests over to 443/SSL and the captive portal web login At this point we have to install “pear” from ports or packages and begin the database setup, which is fairly typical if you've done any SQL before: create user / database / table, etc. With the database finished, the article provides a nice and clean rc.conf which enables all the necessary services. Next up is the firewall configuration, which uses IPFW, specifically DUMMYNET/IPALIAS/IPDIVERT and friends. The article does mention compiling a new minimal kernel with these features; if you plan on doing so, I would recommend starting off with that. The article then continues with setting up the DHCP server, SUDO, and the PHP files that will act as the interface between the client and the mysql/firewall rules. When it's all said and done, you end up with a nice web interface for clients, plus a bonus Admin interface to manage creating and removing users. For convenience, at the very end is a link to all the files / configurations used, so grab that and avoid some of the copy-n-paste *** Sailor, a 'wannabe' portable container system {their own words!} (https://github.com/NetBSDfr/sailor) In the world of docker / jails / VMs, containers are all the rage right now, and now we can introduce “Sailor” to this mix A unique thing about this new solution is that it's based upon chroot/pkgin, and available on NetBSD / OSX and CentOS Since it is not using “jail” or another security mechanism, they give us this caveat: “Note that sailor's goal is not to provide bullet-proof security, chroot is definitely not a trustable isolator; instead, sailor is a really convenient way of trying / testing an environment without compromising your workstation filesystem.” Creating a new “ship” is relatively straightforward: a simple shell define file can supply most of the relevant information.
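As a purely hypothetical sketch of what such a shell define file might contain (these variable names are invented for illustration and are not necessarily Sailor's actual syntax; consult the project's examples directory for the real format):

```sh
# Hypothetical Sailor "ship" definition -- variable names are
# illustrative only; see sailor's bundled examples for the real syntax.
shipname="www"                   # name of the chroot container
shippath="/var/ships/www"        # where the chroot tree is built
shippkgs="nginx"                 # packages for pkgin to install
shipmounts="/home/www:ro"        # host path mounted read-only inside
shipip="192.0.2.10"              # IP alias assigned to the container
```

The appeal of this approach is that the whole container definition is a handful of shell variables rather than a daemon-managed image format.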
Nginx, for example, is only a few lines: https://github.com/NetBSDfr/sailor/blob/master/examples/nginx.conf In addition to the basic pkg configuration, it also provides methods to do rw/ro mounts into the chroot, as well as IP aliases and copying of specific host binaries into the container *** Interview - Willem Toorop - willem@nlnetlabs.nl (mailto:willem@nlnetlabs.nl) / @WillemToorop (https://twitter.com/WillemToorop) GetDNS vBSDCon 2015 Talk (https://www.youtube.com/watch?v=73M7h56Dsas) *** News Roundup A Quarter Century of Unix (http://wiki.tuhs.org/doku.php?id=publications:quarter_century_of_unix) An oldie but goodie: the book “A Quarter Century of UNIX” is now available for free download in PDF format. This provides an invaluable look into the history of UNIX, which of course we wouldn't have BSD without. There is also a print version still available via Amazon (link at the above URL also). If you find the book useful, consider buying a copy, since a percentage still goes to the original author *** Bjoern Zeeb has been awarded a grant to finalize VIMAGE fixes (https://www.freebsdfoundation.org/press/2016janupdate.pdf) “Bjoern Zeeb has been awarded a project grant to finalize and integrate the work done to make the VIMAGE network stack virtualization infrastructure production ready.” VIMAGE is the network virtualization kernel component that can be used to give jails their own network interfaces, so they can have their own firewalls, be assigned addresses via DHCP, etc.
Currently, a number of bugs prevent this feature from being enabled by default, or used in production The main areas of focus for the work are: network stack teardown, interface ordering, locking, and addressing the remaining memory leaks at teardown The work is expected to be completed by the end of March and to be included in FreeBSD 11.0 *** Building a smtpd Mail Server on OpenBSD (http://www.openbsd.org/opensmtpd/faq/example1.html) The OpenSMTPd FAQ has been updated with a new walkthrough of a complete installation Following this guide, the resulting installation will be capable of: accepting mail for multiple domains and virtual users; allowing virtual users to authenticate and send mail; applying anti-spam and anti-virus filters on mail; providing IMAP access for the virtual users; and providing log statistics It covers setting up the new filter system, configuring TLS, creating the domain and user tables, configuring spamassassin and clamav, and setting up dovecot There is even a crontab to send you weekly stats on what your email server is doing *** Introduction to the FreeBSD Open Source Operating System LiveLessons (http://www.informit.com/store/introduction-to-the-freebsd-open-source-operating-system-9780134305868) Dr. Kirk McKusick has been one of the foremost authorities on FreeBSD for some time now, as co-author of the D&I of FreeBSD (along with George Neville-Neil and Robert Watson) and the teacher of numerous classes on the same. (Another good reason to come to a *BSD conference) As part of the Addison-Wesley Professional / LiveLessons series, he has made a 10+ hour video lecture you can now purchase to take his class from the comfort of your own home/couch/office/etc Aspiring FreeBSD developers, kernel developers, application developers, and other interested individuals should really consider this invaluable resource in their learning. The video starts with an introduction to the FreeBSD community and explains how it differs from the Linux ecosystem.
The video then goes on to provide a firm background in the FreeBSD kernel. The POSIX kernel interfaces are used as examples where they are defined. Where they are not defined, the FreeBSD interfaces are described. The video covers basic kernel services, locking, process structure, scheduling, signal handling, jails, and virtual and physical memory management. The kernel I/O structure is described showing how I/O is multiplexed and the virtual filesystem interface is used to support multiple filesystems. Devices are described showing disk management and their auto-configuration. The organization and implementation of the fast filesystem is described concluding with a discussion of how to maintain consistency in the face of hardware or software failures. The video includes an overview of the ZFS filesystem and covers the socket-based network architecture, layering and routing issues. The presentations emphasize code organization, data structure navigation, and algorithms. Normally the video will set you back $299, but right now you can pick it up for $239 (USD). We can't recommend this enough, but also don't forget to try and make it out to BSDCan or MeetBSD, where you can usually talk to Dr. McKusick in person. *** BeastieBits Faces of FreeBSD: Sean Bruno (http://freebsdfoundation.blogspot.ca/2016/01/faces-of-freebsd-2016-sean-bruno.html) Support Michael W. 
Lucas writing BSD books, and get your name in the credits (http://blather.michaelwlucas.com/archives/2539) bhyve windows support merged to stable/10 branch, will be included in FreeBSD 10.3 (https://svnweb.freebsd.org/base?view=revision&revision=295124) FreeBSD Outsells Windows by almost 2-1 (http://arstechnica.com/gaming/2016/01/ea-lets-slip-lifetime-xbox-one-and-ps4-consoles-sales/) A rant about the whois protocol (http://fanf.livejournal.com/140505.html) Kris Moore talks about Jails and system management on BSDTalk (http://bsdtalk.blogspot.com/2016/01/bsdtalk261-jails-and-system-management.html) FOSDEM 2016: Slides from the 5 years of IllumOS talk (https://fosdem.org/2016/schedule/event/illumos_overview/attachments/audio/873/export/events/attachments/illumos_overview/audio/873/FOSDEM_2016.pdf) A tweet from the first day of FOSDEM showed only 1 FreeBSD machine. Many of the FreeBSD developers were at a devsummit offsite that day, and more users arrived for the BSD dev room which was on the Sunday (https://twitter.com/pvaneynd/status/693813132649697281) Feedback/Questions Antonio - ZFS Book Formatting (http://pastebin.com/ZWNHgqHQ) Simon - ZFS Corruption? (http://pastebin.com/XW97YSQK) Christian - rm -r^^^OOOPSSS (http://pastebin.com/W7TwWwtE) Phillipp - ZFS Send/Recv (http://pastebin.com/zA2ewPuF) ***