POPULARITY
Yohei Nakajima leads a double life. By day, he's a general partner of a small venture firm, Untapped Capital. By night, he's one of the most prolific internet tinkerers in AI. (He also sometimes works on automating his job as a venture capitalist.) He's the creator of BabyAGI (@babyAGI_), the first open-source autonomous agent that went viral in March 2023. Yohei has since released seven iterations of BabyAGI (each one named after a different animal), a coding agent called Ditto, a framework for building autonomous agents, and, most recently, BabyAGI 2o, a self-building autonomous agent (that follows OpenAI's unfortunate naming convention). Even more incredible, Yohei isn't a professional developer. His day job is as the general partner of Untapped Capital (@UntappedVC). I sat down with Yohei to talk about: What feeds Yohei's drive to create new tools The evolution of BabyAGI into a more powerful version of itself What Yohei learned about himself by tinkering on the internet Yohei's personal philosophy about how the tools we build our extensions of ourselves Why founders in AI should think about their products from a modular lens, by addressing immediate problems while enabling growth in the future Yohei's insight into a future where models will train themselves as you use them We experiment with Ditto live on the show, using the tool to build a game of Snake and a handy scheduling app. Yohei also screenshares a demo of BabyAGI 2o in action. This is a must-watch for anyone curious about autonomous agents, building cool AI tools on the internet, and the future of AI tooling. If you found this episode interesting, please like, subscribe, comment, and share! Want even more? Sign up for Every to unlock our ultimate guide to prompting ChatGPT here: https://every.ck.page/ultimate-guide-to-prompting-chatgpt. It's usually only for paying subscribers, but you can get it here for free. To hear more from Dan Shipper: Subscribe to Every: https://every.to/subscribe Follow him on X: https://twitter.com/danshipper Timestamps: Introduction: (00:00:59) BabyAGI and its evolution into a more powerful tool: (00:02:26) How better models are changing the way Yohei builds: (00:05:00) Using code building agent Ditto to build a game of Snake: (00:08:10) The ins and outs of how Ditto works: (00:13:24) How Yohei gets a lot done in little time: (00:19:21) Yohei's personal philosophy around building AI tools: (00:21:50) How Yohei experiments with AI as a tech-forward parent: (00:33:13) Demo of Yohei's latest release, BabyAGI 2.0: (00:39:29) Yohei's insights on the future of AI tooling: (00:51:24) Links to resources mentioned in the episode: Yohei Nakajima: @yoheinakajima, http://yohei.me Untapped Capital: @UntappedVC, https://www.untapped.vc/ My first interview with Yohei, around the time he released BabyAGI: https://every.to/chain-of-thought/this-vc-is-slowly-automating-their-job The other AI tools Yohei has created: Ditto, BabyAGI 2, BabyAGI 2o The tweet thread about AI bots being let loose on a Discord server: https://x.com/AISafetyMemes/status/1847312782049333701
Yohei Nakajima is an investor by day and coder by night. In particular, one of his projects, an AI agent framework called BabyAGI that creates a plan-execute loop, got a ton of attention in the past year. The truth is that AI agents are an extremely experimental space, and depending on how strict you want to be with your definition, there aren't a lot of production use cases today. Yohei discusses the current state of AI agents and where they might take us. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.
Join Logan Kilpatrick and Nolan Fortman as we dive deep into the future of AI Agents, how they will affect businesses, developers, and the world. Yohei is one of the deepest thinkers in the world of AI agents, as the creator of Baby AGI and the backer of many companies in the agent space. I hope you all enjoy this conversation as much as Nolan and I did. Takeaways - Building AI tools for venture capital can provide value to founders and investors. - The future of AI in venture capital lies in the development of autonomous agents and the integration of AI into VC workflows. - Verticalized AI solutions can capture value quickly by addressing specific tasks and industries. - The adoption of AI in organizations requires a mindset shift and a focus on empowering employees rather than replacing them. - The AI landscape is constantly evolving, and there is still much to be explored and developed. Location can have an impact on attending events and the fear of missing out (FOMO). Being in a different location can provide a different perspective and prevent exhaustion from attending too many events. - Vector databases play a crucial role in enabling AI applications, particularly in semantic search. There is a growing number of companies providing vector database solutions, but there is still room for improvement in fine-tuning embeddings for specific use cases. - Knowledge graphs were gaining traction before the rise of large language models. However, there is potential for knowledge graphs to be integrated with AI and solve complex problems. - The decision to go open source or closed source depends on various factors, including the team's unique strengths, target market, and business goals. It is a strategic decision that reflects the values and philosophy of the company. - Building in public can be a strategic decision that aligns with a company's values and philosophy. It can help gain attention, work with other developers, and establish credibility. - There is a mix of experienced founders and new founders in the AI space. Both have their advantages, with experienced founders bringing valuable expertise and new founders bringing fresh and innovative ideas. - Yohei expresses optimism for the future of AI and technology, hoping to see advancements in autonomous agents, knowledge graphs, and passive AI. He looks forward to the progress and exciting ideas that will emerge in the coming years.
We are reuniting for the 2nd AI UX demo day in SF on Apr 28. Sign up to demo here! And don't forget tickets for the AI Engineer World's Fair — for early birds who join before keynote announcements!About a year ago there was a lot of buzz around prompt engineering techniques to force structured output. Our friend Simon Willison tweeted a bunch of tips and tricks, but the most iconic one is Riley Goodside making it a matter of life or death:Guardrails (friend of the pod and AI Engineer speaker), Marvin (AI Engineer speaker), and jsonformer had also come out at the time. In June 2023, Jason Liu (today's guest!) open sourced his “OpenAI Function Call and Pydantic Integration Module”, now known as Instructor, which quickly turned prompt engineering black magic into a clean, developer-friendly SDK. A few months later, model providers started to add function calling capabilities to their APIs as well as structured outputs support like “JSON Mode”, which was announced at OpenAI Dev Day (see recap here). In just a handful of months, we went from threatening to kill grandmas to first-class support from the research labs. And yet, Instructor was still downloaded 150,000 times last month. Why?What Instructor looks likeInstructor patches your LLM provider SDKs to offer a new response_model option to which you can pass a structure defined in Pydantic. It currently supports OpenAI, Anthropic, Cohere, and a long tail of models through LiteLLM.What Instructor is forThere are three core use cases to Instructor:* Extracting structured data: Taking an input like an image of a receipt and extracting structured data from it, such as a list of checkout items with their prices, fees, and coupon codes.* Extracting graphs: Identifying nodes and edges in a given input to extract complex entities and their relationships. For example, extracting relationships between characters in a story or dependencies between tasks.* Query understanding: Defining a schema for an API call and using a language model to resolve a request into a more complex one that an embedding could not handle. For example, creating date intervals from queries like “what was the latest thing that happened this week?” to then pass onto a RAG system or similar.Jason called all these different ways of getting data from LLMs “typed responses”: taking strings and turning them into data structures. Structured outputs as a planning toolThe first wave of agents was all about open-ended iteration and planning, with projects like AutoGPT and BabyAGI. Models would come up with a possible list of steps, and start going down the list one by one. It's really easy for them to go down the wrong branch, or get stuck on a single step with no way to intervene.What if these planning steps were returned to us as DAGs using structured output, and then managed as workflows? This also makes it easy to better train model on how to create these plans, as they are much more structured than a bullet point list. Once you have this structure, each piece can be modified individually by different specialized models. You can read some of Jason's experiments here:While LLMs will keep improving (Llama3 just got released as we write this), having a consistent structure for the output will make it a lot easier to swap models in and out. Jason's overall message on how we can move from ReAct loops to more controllable Agent workflows mirrors the “Process” discussion from our Elicit episode:Watch the talkAs a bonus, here's Jason's talk from last year's AI Engineer Summit. He'll also be a speaker at this year's AI Engineer World's Fair!Timestamps* [00:00:00] Introductions* [00:02:23] Early experiments with Generative AI at StitchFix* [00:08:11] Design philosophy behind the Instructor library* [00:11:12] JSON Mode vs Function Calling* [00:12:30] Single vs parallel function calling* [00:14:00] How many functions is too many?* [00:17:39] How to evaluate function calling* [00:20:23] What is Instructor good for?* [00:22:42] The Evolution from Looping to Workflow in AI Engineering* [00:27:03] State of the AI Engineering Stack* [00:28:26] Why Instructor isn't VC backed* [00:31:15] Advice on Pursuing Open Source Projects and Consulting* [00:36:00] The Concept of High Agency and Its Importance* [00:42:44] Prompts as Code and the Structure of AI Inputs and Outputs* [00:44:20] The Emergence of AI Engineering as a Distinct FieldShow notes* Jason on the UWaterloo mafia* Jason on Twitter, LinkedIn, website* Instructor docs* Max Woolf on the potential of Structured Output* swyx on Elo vs Cost* Jason on Anthropic Function Calling* Jason on Rejections, Advice to Young People* Jason on Bad Startup Ideas* Jason on Prompts as Code* Rysana's inversion models* Bryan Bischof's episode* Hamel HusainTranscriptAlessio [00:00:00]: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Residence at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.Swyx [00:00:16]: Hello, we're back in the remote studio with Jason Liu from Instructor. Welcome Jason.Jason [00:00:21]: Hey there. Thanks for having me.Swyx [00:00:23]: Jason, you are extremely famous, so I don't know what I'm going to do introducing you, but you're one of the Waterloo clan. There's like this small cadre of you that's just completely dominating machine learning. Actually, can you list like Waterloo alums that you're like, you know, are just dominating and crushing it right now?Jason [00:00:39]: So like John from like Rysana is doing his inversion models, right? I know like Clive Chen from Waterloo. When I started the data science club, he was one of the guys who were like joining in and just like hanging out in the room. And now he was at Tesla working with Karpathy, now he's at OpenAI, you know.Swyx [00:00:56]: He's in my climbing club.Jason [00:00:58]: Oh, hell yeah. I haven't seen him in like six years now.Swyx [00:01:01]: To get in the social scene in San Francisco, you have to climb. So both in career and in rocks. So you started a data science club at Waterloo, we can talk about that, but then also spent five years at Stitch Fix as an MLE. You pioneered the use of OpenAI's LLMs to increase stylist efficiency. So you must have been like a very, very early user. This was like pretty early on.Jason [00:01:20]: Yeah, I mean, this was like GPT-3, okay. So we actually were using transformers at Stitch Fix before the GPT-3 model. So we were just using transformers for recommendation systems. At that time, I was very skeptical of transformers. I was like, why do we need all this infrastructure? We can just use like matrix factorization. When GPT-2 came out, I fine tuned my own GPT-2 to write like rap lyrics and I was like, okay, this is cute. Okay, I got to go back to my real job, right? Like who cares if I can write a rap lyric? When GPT-3 came out, again, I was very much like, why are we using like a post request to review every comment a person leaves? Like we can just use classical models. So I was very against language models for like the longest time. And then when ChatGPT came out, I basically just wrote a long apology letter to everyone at the company. I was like, hey guys, you know, I was very dismissive of some of this technology. I didn't think it would scale well, and I am wrong. This is incredible. And I immediately just transitioned to go from computer vision recommendation systems to LLMs. But funny enough, now that we have RAG, we're kind of going back to recommendation systems.Swyx [00:02:21]: Yeah, speaking of that, I think Alessio is going to bring up the next one.Alessio [00:02:23]: Yeah, I was going to say, we had Bryan Bischof from Hex on the podcast. Did you overlap at Stitch Fix?Jason [00:02:28]: Yeah, he was like one of my main users of the recommendation frameworks that I had built out at Stitch Fix.Alessio [00:02:32]: Yeah, we talked a lot about RecSys, so it makes sense.Swyx [00:02:36]: So now I have adopted that line, RAG is RecSys. And you know, if you're trying to reinvent new concepts, you should study RecSys first, because you're going to independently reinvent a lot of concepts. So your system was called Flight. It's a recommendation framework with over 80% adoption, servicing 350 million requests every day. Wasn't there something existing at Stitch Fix? Why did you have to write one from scratch?Jason [00:02:56]: No, so I think because at Stitch Fix, a lot of the machine learning engineers and data scientists were writing production code, sort of every team's systems were very bespoke. It's like, this team only needs to do like real time recommendations with small data. So they just have like a fast API app with some like pandas code. This other team has to do a lot more data. So they have some kind of like Spark job that does some batch ETL that does a recommendation. And so what happens is each team writes their code differently. And I have to come in and refactor their code. And I was like, oh man, I'm refactoring four different code bases, four different times. Wouldn't it be better if all the code quality was my fault? Let me just write this framework, force everyone else to use it. And now one person can maintain five different systems, rather than five teams having their own bespoke system. And so it was really a need of just sort of standardizing everything. And then once you do that, you can do observability across the entire pipeline and make large sweeping improvements in this infrastructure, right? If we notice that something is slow, we can detect it on the operator layer. Just hey, hey, like this team, you guys are doing this operation is lowering our latency by like 30%. If you just optimize your Python code here, we can probably make an extra million dollars. So let's jump on a call and figure this out. And then a lot of it was doing all this observability work to figure out what the heck is going on and optimize this system from not only just a code perspective, sort of like harassingly or against saying like, we need to add caching here. We're doing duplicated work here. Let's go clean up the systems. Yep.Swyx [00:04:22]: Got it. One more system that I'm interested in finding out more about is your similarity search system using Clip and GPT-3 embeddings and FIASS, where you saved over $50 million in annual revenue. So of course they all gave all that to you, right?Jason [00:04:34]: No, no, no. I mean, it's not going up and down, but you know, I got a little bit, so I'm pretty happy about that. But there, you know, that was when we were doing fine tuning like ResNets to do image classification. And so a lot of it was given an image, if we could predict the different attributes we have in the merchandising and we can predict the text embeddings of the comments, then we can kind of build a image vector or image embedding that can capture both descriptions of the clothing and sales of the clothing. And then we would use these additional vectors to augment our recommendation system. And so with the recommendation system really was just around like, what are similar items? What are complimentary items? What are items that you would wear in a single outfit? And being able to say on a product page, let me show you like 15, 20 more things. And then what we found was like, hey, when you turn that on, you make a bunch of money.Swyx [00:05:23]: Yeah. So, okay. So you didn't actually use GPT-3 embeddings. You fine tuned your own? Because I was surprised that GPT-3 worked off the shelf.Jason [00:05:30]: Because I mean, at this point we would have 3 million pieces of inventory over like a billion interactions between users and clothes. So any kind of fine tuning would definitely outperform like some off the shelf model.Swyx [00:05:41]: Cool. I'm about to move on from Stitch Fix, but you know, any other like fun stories from the Stitch Fix days that you want to cover?Jason [00:05:46]: No, I think that's basically it. I mean, the biggest one really was the fact that I think for just four years, I was so bearish on language models and just NLP in general. I'm just like, none of this really works. Like, why would I spend time focusing on this? I got to go do the thing that makes money, recommendations, bounding boxes, image classification. Yeah. Now I'm like prompting an image model. I was like, oh man, I was wrong.Swyx [00:06:06]: So my Stitch Fix question would be, you know, I think you have a bit of a drip and I don't, you know, my primary wardrobe is free startup conference t-shirts. Should more technology brothers be using Stitch Fix? What's your fashion advice?Jason [00:06:19]: Oh man, I mean, I'm not a user of Stitch Fix, right? It's like, I enjoy going out and like touching things and putting things on and trying them on. Right. I think Stitch Fix is a place where you kind of go because you want the work offloaded. I really love the clothing I buy where I have to like, when I land in Japan, I'm doing like a 45 minute walk up a giant hill to find this weird denim shop. That's the stuff that really excites me. But I think the bigger thing that's really captured is this idea that narrative matters a lot to human beings. Okay. And I think the recommendation system, that's really hard to capture. It's easy to use AI to sell like a $20 shirt, but it's really hard for AI to sell like a $500 shirt. But people are buying $500 shirts, you know what I mean? There's definitely something that we can't really capture just yet that we probably will figure out how to in the future.Swyx [00:07:07]: Well, it'll probably output in JSON, which is what we're going to turn to next. Then you went on a sabbatical to South Park Commons in New York, which is unusual because it's based on USF.Jason [00:07:17]: Yeah. So basically in 2020, really, I was enjoying working a lot as I was like building a lot of stuff. This is where we were making like the tens of millions of dollars doing stuff. And then I had a hand injury. And so I really couldn't code anymore for like a year, two years. And so I kind of took sort of half of it as medical leave, the other half I became more of like a tech lead, just like making sure the systems were like lights were on. And then when I went to New York, I spent some time there and kind of just like wound down the tech work, you know, did some pottery, did some jujitsu. And after GPD came out, I was like, oh, I clearly need to figure out what is going on here because something feels very magical. I don't understand it. So I spent basically like five months just prompting and playing around with stuff. And then afterwards, it was just my startup friends going like, hey, Jason, you know, my investors want us to have an AI strategy. Can you help us out? And it just snowballed and bore more and more until I was making this my full time job. Yeah, got it.Swyx [00:08:11]: You know, you had YouTube University and a journaling app, you know, a bunch of other explorations. But it seems like the most productive or the best known thing that came out of your time there was Instructor. Yeah.Jason [00:08:22]: Written on the bullet train in Japan. I think at some point, you know, tools like Guardrails and Marvin came out. Those are kind of tools that I use XML and Pytantic to get structured data out. But they really were doing things sort of in the prompt. And these are built with sort of the instruct models in mind. Like I'd already done that in the past. Right. At Stitch Fix, you know, one of the things we did was we would take a request note and turn that into a JSON object that we would use to send it to our search engine. Right. So if you said like, I want to, you know, skinny jeans that were this size, that would turn into JSON that we would send to our internal search APIs. But it always felt kind of gross. A lot of it is just like you read the JSON, you like parse it, you make sure the names are strings and ages are numbers and you do all this like messy stuff. But when function calling came out, it was very much sort of a new way of doing things. Right. Function calling lets you define the schema separate from the data and the instructions. And what this meant was you can kind of have a lot more complex schemas and just map them in Pytantic. And then you can just keep those very separate. And then once you add like methods, you can add validators and all that kind of stuff. The one thing I really had with a lot of these libraries, though, was it was doing a lot of the string formatting themselves, which was fine when it was the instruction to models. You just have a string. But when you have these new chat models, you have these chat messages. And I just didn't really feel like not being able to access that for the developer was sort of a good benefit that they would get. And so I just said, let me write like the most simple SDK around the OpenAI SDK, a simple wrapper on the SDK, just handle the response model a bit and kind of think of myself more like requests than actual framework that people can use. And so the goal is like, hey, like this is something that you can use to build your own framework. But let me just do all the boring stuff that nobody really wants to do. People want to build their own frameworks, but people don't want to build like JSON parsing.Swyx [00:10:08]: And the retrying and all that other stuff.Jason [00:10:10]: Yeah.Swyx [00:10:11]: Right. We had this a little bit of this discussion before the show, but like that design principle of going for being requests rather than being Django. Yeah. So what inspires you there? This has come from a lot of prior pain. Are there other open source projects that inspired your philosophy here? Yeah.Jason [00:10:25]: I mean, I think it would be requests, right? Like, I think it is just the obvious thing you install. If you were going to go make HTTP requests in Python, you would obviously import requests. Maybe if you want to do more async work, there's like future tools, but you don't really even think about installing it. And when you do install it, you don't think of it as like, oh, this is a requests app. Right? Like, no, this is just Python. The bigger question is, like, a lot of people ask questions like, oh, why isn't requests like in the standard library? Yeah. That's how I want my library to feel, right? It's like, oh, if you're going to use the LLM SDKs, you're obviously going to install instructor. And then I think the second question would be like, oh, like, how come instructor doesn't just go into OpenAI, go into Anthropic? Like, if that's the conversation we're having, like, that's where I feel like I've succeeded. Yeah. It's like, yeah, so standard, you may as well just have it in the base libraries.Alessio [00:11:12]: And the shape of the request stayed the same, but initially function calling was maybe equal structure outputs for a lot of people. I think now the models also support like JSON mode and some of these things and, you know, return JSON or my grandma is going to die. All of that stuff is maybe to decide how have you seen that evolution? Like maybe what's the metagame today? Should people just forget about function calling for structure outputs or when is structure output like JSON mode the best versus not? We'd love to get any thoughts given that you do this every day.Jason [00:11:42]: Yeah, I would almost say these are like different implementations of like the real thing we care about is the fact that now we have typed responses to language models. And because we have that type response, my IDE is a little bit happier. I get autocomplete. If I'm using the response wrong, there's a little red squiggly line. Like those are the things I care about in terms of whether or not like JSON mode is better. I usually think it's almost worse unless you want to spend less money on like the prompt tokens that the function call represents, primarily because with JSON mode, you don't actually specify the schema. So sure, like JSON load works, but really, I care a lot more than just the fact that it is JSON, right? I think function calling gives you a tool to specify the fact like, okay, this is a list of objects that I want and each object has a name or an age and I want the age to be above zero and I want to make sure it's parsed correctly. That's where kind of function calling really shines.Alessio [00:12:30]: Any thoughts on single versus parallel function calling? So I did a presentation at our AI in Action Discord channel, and obviously showcase instructor. One of the big things that we have before with single function calling is like when you're trying to extract lists, you have to make these funky like properties that are lists to then actually return all the objects. How do you see the hack being put on the developer's plate versus like more of this stuff just getting better in the model? And I know you tweeted recently about Anthropic, for example, you know, some lists are not lists or strings and there's like all of these discrepancies.Jason [00:13:04]: I almost would prefer it if it was always a single function call. Obviously, there is like the agents workflows that, you know, Instructor doesn't really support that well, but are things that, you know, ought to be done, right? Like you could define, I think maybe like 50 or 60 different functions in a single API call. And, you know, if it was like get the weather or turn the lights on or do something else, it makes a lot of sense to have these parallel function calls. But in terms of an extraction workflow, I definitely think it's probably more helpful to have everything be a single schema, right? Just because you can sort of specify relationships between these entities that you can't do in a parallel function calling, you can have a single chain of thought before you generate a list of results. Like there's like small like API differences, right? Where if it's for parallel function calling, if you do one, like again, really, I really care about how the SDK looks and says, okay, do I always return a list of functions or do you just want to have the actual object back out and you want to have like auto complete over that object? Interesting.Alessio [00:14:00]: What's kind of the cap for like how many function definitions you can put in where it still works well? Do you have any sense on that?Jason [00:14:07]: I mean, for the most part, I haven't really had a need to do anything that's more than six or seven different functions. I think in the documentation, they support way more. I don't even know if there's any good evals that have over like two dozen function calls. I think if you're running into issues where you have like 20 or 50 or 60 function calls, I think you're much better having those specifications saved in a vector database and then have them be retrieved, right? So if there are 30 tools, like you should basically be like ranking them and then using the top K to do selection a little bit better rather than just like shoving like 60 functions into a single. Yeah.Swyx [00:14:40]: Yeah. Well, I mean, so I think this is relevant now because previously I think context limits prevented you from having more than a dozen tools anyway. And now that we have million token context windows, you know, a cloud recently with their new function calling release said they can handle over 250 tools, which is insane to me. That's, that's a lot. You're saying like, you know, you don't think there's many people doing that. I think anyone with a sort of agent like platform where you have a bunch of connectors, they wouldn't run into that problem. Probably you're right that they should use a vector database and kind of rag their tools. I know Zapier has like a few thousand, like 8,000, 9,000 connectors that, you know, obviously don't fit anywhere. So yeah, I mean, I think that would be it unless you need some kind of intelligence that chains things together, which is, I think what Alessio is coming back to, right? Like there's this trend about parallel function calling. I don't know what I think about that. Anthropic's version was, I think they use multiple tools in sequence, but they're not in parallel. I haven't explored this at all. I'm just like throwing this open to you as to like, what do you think about all these new things? Yeah.Jason [00:15:40]: It's like, you know, do we assume that all function calls could happen in any order? In which case, like we either can assume that, or we can assume that like things need to happen in some kind of sequence as a DAG, right? But if it's a DAG, really that's just like one JSON object that is the entire DAG rather than going like, okay, the order of the function that return don't matter. That's definitely just not true in practice, right? Like if I have a thing that's like turn the lights on, like unplug the power, and then like turn the toaster on or something like the order doesn't matter. And it's unclear how well you can describe the importance of that reasoning to a language model yet. I mean, I'm sure you can do it with like good enough prompting, but I just haven't any use cases where the function sequence really matters. Yeah.Alessio [00:16:18]: To me, the most interesting thing is the models are better at picking than your ranking is usually. Like I'm incubating a company around system integration. For example, with one system, there are like 780 endpoints. And if you're actually trying to do vector similarity, it's not that good because the people that wrote the specs didn't have in mind making them like semantically apart. You know, they're kind of like, oh, create this, create this, create this. Versus when you give it to a model, like in Opus, you put them all, it's quite good at picking which ones you should actually run. And I'm curious to see if the model providers actually care about some of those workflows or if the agent companies are actually going to build very good rankers to kind of fill that gap.Jason [00:16:58]: Yeah. My money is on the rankers because you can do those so easily, right? You could just say, well, given the embeddings of my search query and the embeddings of the description, I can just train XGBoost and just make sure that I have very high like MRR, which is like mean reciprocal rank. And so the only objective is to make sure that the tools you use are in the top end filtered. Like that feels super straightforward and you don't have to actually figure out how to fine tune a language model to do tool selection anymore. Yeah. I definitely think that's the case because for the most part, I imagine you either have like less than three tools or more than a thousand. I don't know what kind of company said, oh, thank God we only have like 185 tools and this works perfectly, right? That's right.Alessio [00:17:39]: And before we maybe move on just from this, it was interesting to me, you retweeted this thing about Anthropic function calling and it was Joshua Brown's retweeting some benchmark that it's like, oh my God, Anthropic function calling so good. And then you retweeted it and then you tweeted it later and it's like, it's actually not that good. What's your flow? How do you actually test these things? Because obviously the benchmarks are lying, right? Because the benchmarks say it's good and you said it's bad and I trust you more than the benchmark. How do you think about that? And then how do you evolve it over time?Jason [00:18:09]: It's mostly just client data. I actually have been mostly busy with enough client work that I haven't been able to reproduce public benchmarks. And so I can't even share some of the results in Anthropic. I would just say like in production, we have some pretty interesting schemas where it's like iteratively building lists where we're doing like updates of lists, like we're doing in place updates. So like upserts and inserts. And in those situations we're like, oh yeah, we have a bunch of different parsing errors. Numbers are being returned to strings. We were expecting lists of objects, but we're getting strings that are like the strings of JSON, right? So we had to call JSON parse on individual elements. Overall, I'm like super happy with the Anthropic models compared to the OpenAI models. Sonnet is very cost effective. Haiku is in function calling, it's actually better, but I think they just had to sort of file down the edges a little bit where like our tests pass, but then we actually deployed a production. We got half a percent of traffic having issues where if you ask for JSON, it'll try to talk to you. Or if you use function calling, you know, we'll have like a parse error. And so I think that definitely gonna be things that are fixed in like the upcoming weeks. But in terms of like the reasoning capabilities, man, it's hard to beat like 70% cost reduction, especially when you're building consumer applications, right? If you're building something for consultants or private equity, like you're charging $400, it doesn't really matter if it's a dollar or $2. But for consumer apps, it makes products viable. If you can go from four to Sonnet, you might actually be able to price it better. Yeah.Swyx [00:19:31]: I had this chart about the ELO versus the cost of all the models. And you could put trend graphs on each of those things about like, you know, higher ELO equals higher cost, except for Haiku. Haiku kind of just broke the lines, or the ISO ELOs, if you want to call it. Cool. Before we go too far into your opinions on just the overall ecosystem, I want to make sure that we map out the surface area of Instructor. I would say that most people would be familiar with Instructor from your talks and your tweets and all that. You had the number one talk from the AI Engineer Summit.Jason [00:20:03]: Two Liu. Jason Liu and Jerry Liu. Yeah.Swyx [00:20:06]: Yeah. Until I actually went through your cookbook, I didn't realize the surface area. How would you categorize the use cases? You have LLM self-critique, you have knowledge graphs in here, you have PII data sanitation. How do you characterize to people what is the surface area of Instructor? Yeah.Jason [00:20:23]: This is the part that feels crazy because really the difference is LLMs give you strings and Instructor gives you data structures. And once you get data structures, again, you can do every lead code problem you ever thought of. Right. And so I think there's a couple of really common applications. The first one obviously is extracting structured data. This is just be, okay, well, like I want to put in an image of a receipt. I want to give it back out a list of checkout items with a price and a fee and a coupon code or whatever. That's one application. Another application really is around extracting graphs out. So one of the things we found out about these language models is that not only can you define nodes, it's really good at figuring out what are nodes and what are edges. And so we have a bunch of examples where, you know, not only do I extract that, you know, this happens after that, but also like, okay, these two are dependencies of another task. And you can do, you know, extracting complex entities that have relationships. Given a story, for example, you could extract relationships of families across different characters. This can all be done by defining a graph. The last really big application really is just around query understanding. The idea is that like any API call has some schema and if you can define that schema ahead of time, you can use a language model to resolve a request into a much more complex request. One that an embedding could not do. So for example, I have a really popular post called like rag is more than embeddings. And effectively, you know, if I have a question like this, what was the latest thing that happened this week? That embeds to nothing, right? But really like that query should just be like select all data where the date time is between today and today minus seven days, right? What if I said, how did my writing change between this month and last month? Again, embeddings would do nothing. But really, if you could do like a group by over the month and a summarize, then you could again like do something much more interesting. And so this really just calls out the fact that embeddings really is kind of like the lowest hanging fruit. And using something like instructor can really help produce a data structure. And then you can just use your computer science and reason about the data structure. Maybe you say, okay, well, I'm going to produce a graph where I want to group by each month and then summarize them jointly. You can do that if you know how to define this data structure. Yeah.Swyx [00:22:29]: So you kind of run up against like the LangChains of the world that used to have that. They still do have like the self querying, I think they used to call it when we had Harrison on in our episode. How do you see yourself interacting with the other LLM frameworks in the ecosystem? Yeah.Jason [00:22:42]: I mean, if they use instructor, I think that's totally cool. Again, it's like, it's just Python, right? It's like asking like, oh, how does like Django interact with requests? Well, you just might make a request.get in a Django app, right? But no one would say, I like went off of Django because I'm using requests now. They should be ideally like sort of the wrong comparison in terms of especially like the agent workflows. I think the real goal for me is to go down like the LLM compiler route, which is instead of doing like a react type reasoning loop. I think my belief is that we should be using like workflows. If we do this, then we always have a request and a complete workflow. We can fine tune a model that has a better workflow. Whereas it's hard to think about like, how do you fine tune a better react loop? Yeah. You always train it to have less looping, in which case like you wanted to get the right answer the first time, in which case it was a workflow to begin with, right?Swyx [00:23:31]: Can you define workflow? Because I used to work at a workflow company, but I'm not sure this is a good term for everybody.Jason [00:23:36]: I'm thinking workflow in terms of like the prefect Zapier workflow. Like I want to build a DAG, I want you to tell me what the nodes and edges are. And then maybe the edges are also put in with AI. But the idea is that like, I want to be able to present you the entire plan and then ask you to fix things as I execute it, rather than going like, hey, I couldn't parse the JSON, so I'm going to try again. I couldn't parse the JSON, I'm going to try again. And then next thing you know, you spent like $2 on opening AI credits, right? Yeah. Whereas with the plan, you can just say, oh, the edge between node like X and Y does not run. Let me just iteratively try to fix that, fix the one that sticks, go on to the next component. And obviously you can get into a world where if you have enough examples of the nodes X and Y, maybe you can use like a vector database to find a good few shot examples. You can do a lot if you sort of break down the problem into that workflow and executing that workflow, rather than looping and hoping the reasoning is good enough to generate the correct output. Yeah.Swyx [00:24:35]: You know, I've been hammering on Devon a lot. I got access a couple of weeks ago. And obviously for simple tasks, it does well. For the complicated, like more than 10, 20 hour tasks, I can see- That's a crazy comparison.Jason [00:24:47]: We used to talk about like three, four loops. Only once it gets to like hour tasks, it's hard.Swyx [00:24:54]: Yeah. Less than an hour, there's nothing.Jason [00:24:57]: That's crazy.Swyx [00:24:58]: I mean, okay. Maybe my goalposts have shifted. I don't know. That's incredible.Jason [00:25:02]: Yeah. No, no. I'm like sub one minute executions. Like the fact that you're talking about 10 hours is incredible.Swyx [00:25:08]: I think it's a spectrum. I think I'm going to say this every single time I bring up Devon. Let's not reward them for taking longer to do things. Do you know what I mean? I think that's a metric that is easily abusable.Jason [00:25:18]: Sure. Yeah. You know what I mean? But I think if you can monotonically increase the success probability over an hour, that's winning to me. Right? Like obviously if you run an hour and you've made no progress. Like I think when we were in like auto GBT land, there was that one example where it's like, I wanted it to like buy me a bicycle overnight. I spent $7 on credit and I never found the bicycle. Yeah.Swyx [00:25:41]: Yeah. Right. I wonder if you'll be able to purchase a bicycle. Because it actually can do things in real world. It just needs to suspend to you for off and stuff. The point I was trying to make was that I can see it turning plans. I think one of the agents loopholes or one of the things that is a real barrier for agents is LLMs really like to get stuck into a lane. And you know what you're talking about, what I've seen Devon do is it gets stuck in a lane and it will just kind of change plans based on the performance of the plan itself. And it's kind of cool.Jason [00:26:05]: I feel like we've gone too much in the looping route and I think a lot of more plans and like DAGs and data structures are probably going to come back to help fill in some holes. Yeah.Alessio [00:26:14]: What do you think of the interface to that? Do you see it's like an existing state machine kind of thing that connects to the LLMs, the traditional DAG players? Do you think we need something new for like AI DAGs?Jason [00:26:25]: Yeah. I mean, I think that the hard part is going to be describing visually the fact that this DAG can also change over time and it should still be allowed to be fuzzy. I think in like mathematics, we have like plate diagrams and like Markov chain diagrams and like recurrent states and all that. Some of that might come into this workflow world. But to be honest, I'm not too sure. I think right now, the first steps are just how do we take this DAG idea and break it down to modular components that we can like prompt better, have few shot examples for and ultimately like fine tune against. But in terms of even the UI, it's hard to say what it will likely win. I think, you know, people like Prefect and Zapier have a pretty good shot at doing a good job.Swyx [00:27:03]: Yeah. You seem to use Prefect a lot. I actually worked at a Prefect competitor at Temporal and I'm also very familiar with Dagster. What else would you call out as like particularly interesting in the AI engineering stack?Jason [00:27:13]: Man, I almost use nothing. I just use Cursor and like PyTests. Okay. I think that's basically it. You know, a lot of the observability companies have... The more observability companies I've tried, the more I just use Postgres.Swyx [00:27:29]: Really? Okay. Postgres for observability?Jason [00:27:32]: But the issue really is the fact that these observability companies isn't actually doing observability for the system. It's just doing the LLM thing. Like I still end up using like Datadog or like, you know, Sentry to do like latency. And so I just have those systems handle it. And then the like prompt in, prompt out, latency, token costs. I just put that in like a Postgres table now.Swyx [00:27:51]: So you don't need like 20 funded startups building LLM ops? Yeah.Jason [00:27:55]: But I'm also like an old, tired guy. You know what I mean? Like I think because of my background, it's like, yeah, like the Python stuff, I'll write myself. But you know, I will also just use Vercel happily. Yeah. Yeah. So I'm not really into that world of tooling, whereas I think, you know, I spent three good years building observability tools for recommendation systems. And I was like, oh, compared to that, Instructor is just one call. I just have to put time star, time and then count the prompt token, right? Because I'm not doing a very complex looping behavior. I'm doing mostly workflows and extraction. Yeah.Swyx [00:28:26]: I mean, while we're on this topic, we'll just kind of get this out of the way. You famously have decided to not be a venture backed company. You want to do the consulting route. The obvious route for someone as successful as Instructor is like, oh, here's hosted Instructor with all tooling. Yeah. You just said you had a whole bunch of experience building observability tooling. You have the perfect background to do this and you're not.Jason [00:28:43]: Yeah. Isn't that sick? I think that's sick.Swyx [00:28:44]: I mean, I know why, because you want to go free dive.Jason [00:28:47]: Yeah. Yeah. Because I think there's two things. Right. Well, one, if I tell myself I want to build requests, requests is not a venture backed startup. Right. I mean, one could argue whether or not Postman is, but I think for the most part, it's like having worked so much, I'm more interested in looking at how systems are being applied and just having access to the most interesting data. And I think I can do that more through a consulting business where I can come in and go, oh, you want to build perfect memory. You want to build an agent. You want to build like automations over construction or like insurance and supply chain, or like you want to handle writing private equity, mergers and acquisitions reports based off of user interviews. Those things are super fun. Whereas like maintaining the library, I think is mostly just kind of like a utility that I try to keep up, especially because if it's not venture backed, I have no reason to sort of go down the route of like trying to get a thousand integrations. In my mind, I just go like, okay, 98% of the people use open AI. I'll support that. And if someone contributes another platform, that's great. I'll merge it in. Yeah.Swyx [00:29:45]: I mean, you only added Anthropic support this year. Yeah.Jason [00:29:47]: Yeah. You couldn't even get an API key until like this year, right? That's true. Okay. If I add it like last year, I was trying to like double the code base to service, you know, half a percent of all downloads.Swyx [00:29:58]: Do you think the market share will shift a lot now that Anthropic has like a very, very competitive offering?Jason [00:30:02]: I think it's still hard to get API access. I don't know if it's fully GA now, if it's GA, if you can get a commercial access really easily.Alessio [00:30:12]: I got commercial after like two weeks to reach out to their sales team.Jason [00:30:14]: Okay.Alessio [00:30:15]: Yeah.Swyx [00:30:16]: Two weeks. It's not too bad. There's a call list here. And then anytime you run into rate limits, just like ping one of the Anthropic staff members.Jason [00:30:21]: Yeah. Then maybe we need to like cut that part out. So I don't need to like, you know, spread false news.Swyx [00:30:25]: No, it's cool. It's cool.Jason [00:30:26]: But it's a common question. Yeah. Surely just from the price perspective, it's going to make a lot of sense. Like if you are a business, you should totally consider like Sonnet, right? Like the cost savings is just going to justify it if you actually are doing things at volume. And yeah, I think the SDK is like pretty good. Back to the instructor thing. I just don't think it's a billion dollar company. And I think if I raise money, the first question is going to be like, how are you going to get a billion dollar company? And I would just go like, man, like if I make a million dollars as a consultant, I'm super happy. I'm like more than ecstatic. I can have like a small staff of like three people. It's fun. And I think a lot of my happiest founder friends are those who like raised a tiny seed round, became profitable. They're making like 70, 60, 70, like MRR, 70,000 MRR and they're like, we don't even need to raise the seed round. Let's just keep it like between me and my co-founder, we'll go traveling and it'll be a great time. I think it's a lot of fun.Alessio [00:31:15]: Yeah. like say LLMs / AI and they build some open source stuff and it's like I should just raise money and do this and I tell people a lot it's like look you can make a lot more money doing something else than doing a startup like most people that do a company could make a lot more money just working somewhere else than the company itself do you have any advice for folks that are maybe in a similar situation they're trying to decide oh should I stay in my like high paid FAANG job and just tweet this on the side and do this on github should I go be a consultant like being a consultant seems like a lot of work so you got to talk to all these people you know there's a lot to unpackJason [00:31:54]: I think the open source thing is just like well I'm just doing it purely for fun and I'm doing it because I think I'm right but part of being right is the fact that it's not a venture backed startup like I think I'm right because this is all you need right so I think a part of the philosophy is the fact that all you need is a very sharp blade to sort of do your work and you don't actually need to build like a big enterprise so that's one thing I think the other thing too that I've kind of been thinking around just because I have a lot of friends at google that want to leave right now it's like man like what we lack is not money or skill like what we lack is courage you should like you just have to do this a hard thing and you have to do it scared anyways right in terms of like whether or not you do want to do a founder I think that's just a matter of optionality but I definitely recognize that the like expected value of being a founder is still quite low it is right I know as many founder breakups and as I know friends who raised a seed round this year right like that is like the reality and like you know even in from that perspective it's been tough where it's like oh man like a lot of incubators want you to have co-founders now you spend half the time like fundraising and then trying to like meet co-founders and find co-founders rather than building the thing this is a lot of time spent out doing uh things I'm not really good at. I do think there's a rising trend in solo founding yeah.Swyx [00:33:06]: You know I am a solo I think that something like 30 percent of like I forget what the exact status something like 30 percent of starters that make it to like series B or something actually are solo founder I feel like this must have co-founder idea mostly comes from YC and most everyone else copies it and then plenty of companies break up over co-founderJason [00:33:27]: Yeah and I bet it would be like I wonder how much of it is the people who don't have that much like and I hope this is not a diss to anybody but it's like you sort of you go through the incubator route because you don't have like the social equity you would need is just sort of like send an email to Sequoia and be like hey I'm going on this ride you want a ticket on the rocket ship right like that's very hard to sell my message if I was to raise money is like you've seen my twitter my life is sick I've decided to make it much worse by being a founder because this is something I have to do so do you want to come along otherwise I want to fund it myself like if I can't say that like I don't need the money because I can like handle payroll and like hire an intern and get an assistant like that's all fine but I really don't want to go back to meta I want to like get two years to like try to find a problem we're solving that feels like a bad timeAlessio [00:34:12]: Yeah Jason is like I wear a YSL jacket on stage at AI Engineer Summit I don't need your accelerator moneyJason [00:34:18]: And boots, you don't forget the boots. But I think that is a part of it right I think it is just like optionality and also just like I'm a lot older now I think 22 year old Jason would have been probably too scared and now I'm like too wise but I think it's a matter of like oh if you raise money you have to have a plan of spending it and I'm just not that creative with spending that much money yeah I mean to be clear you just celebrated your 30th birthday happy birthday yeah it's awesome so next week a lot older is relative to some some of the folks I think seeing on the career tipsAlessio [00:34:48]: I think Swix had a great post about are you too old to get into AI I saw one of your tweets in January 23 you applied to like Figma, Notion, Cohere, Anthropic and all of them rejected you because you didn't have enough LLM experience I think at that time it would be easy for a lot of people to say oh I kind of missed the boat you know I'm too late not gonna make it you know any advice for people that feel like thatJason [00:35:14]: Like the biggest learning here is actually from a lot of folks in jiu-jitsu they're like oh man like is it too late to start jiu-jitsu like I'll join jiu-jitsu once I get in more shape right it's like there's a lot of like excuses and then you say oh like why should I start now I'll be like 45 by the time I'm any good and say well you'll be 45 anyways like time is passing like if you don't start now you start tomorrow you're just like one more day behind if you're worried about being behind like today is like the soonest you can start right and so you got to recognize that like maybe you just don't want it and that's fine too like if you wanted you would have started I think a lot of these people again probably think of things on a too short time horizon but again you know you're gonna be old anyways you may as well just start now you knowSwyx [00:35:55]: One more thing on I guess the um career advice slash sort of vlogging you always go viral for this post that you wrote on advice to young people and the lies you tell yourself oh yeah yeah you said you were writing it for your sister.Jason [00:36:05]: She was like bummed out about going to college and like stressing about jobs and I was like oh and I really want to hear okay and I just kind of like text-to-sweep the whole thing it's crazy it's got like 50,000 views like I'm mind I mean your average tweet has more but that thing is like a 30-minute read nowSwyx [00:36:26]: So there's lots of stuff here which I agree with I you know I'm also of occasionally indulge in the sort of life reflection phase there's the how to be lucky there's the how to have high agency I feel like the agency thing is always a trend in sf or just in tech circles how do you define having high agencyJason [00:36:42]: I'm almost like past the high agency phase now now my biggest concern is like okay the agency is just like the norm of the vector what also matters is the direction right it's like how pure is the shot yeah I mean I think agency is just a matter of like having courage and doing the thing that's scary right you know if people want to go rock climbing it's like do you decide you want to go rock climbing then you show up to the gym you rent some shoes and you just fall 40 times or do you go like oh like I'm actually more intelligent let me go research the kind of shoes that I want okay like there's flatter shoes and more inclined shoes like which one should I get okay let me go order the shoes on Amazon I'll come back in three days like oh it's a little bit too tight maybe it's too aggressive I'm only a beginner let me go change no I think the higher agent person just like goes and like falls down 20 times right yeah I think the higher agency person is more focused on like process metrics versus outcome metrics right like from pottery like one thing I learned was if you want to be good at pottery you shouldn't count like the number of cups or bowls you make you should just weigh the amount of clay you use right like the successful person says oh I went through 100 pounds of clay right the less agency was like oh I've made six cups and then after I made six cups like there's not really what are you what do you do next no just pounds of clay pounds of clay same with the work here right so you just got to write the tweets like make the commits contribute open source like write the documentation there's no real outcome it's just a process and if you love that process you just get really good at the thing you're doingSwyx [00:38:04]: yeah so just to push back on this because obviously I mostly agree how would you design performance review systems because you were effectively saying we can count lines of code for developers rightJason [00:38:15]: I don't think that would be the actual like I think if you make that an outcome like I can just expand a for loop right I think okay so for performance review this is interesting because I've mostly thought of it from the perspective of science and not engineering I've been running a lot of engineering stand-ups primarily because there's not really that many machine learning folks the process outcome is like experiments and ideas right like if you think about outcome is what you might want to think about an outcome is oh I want to improve the revenue or whatnot but that's really hard but if you're someone who is going out like okay like this week I want to come up with like three or four experiments I might move the needle okay nothing worked to them they might think oh nothing worked like I suck but to me it's like wow you've closed off all these other possible avenues for like research like you're gonna get to the place that you're gonna figure out that direction really soon there's no way you try 30 different things and none of them work usually like 10 of them work five of them work really well two of them work really really well and one thing was like the nail in the head so agency lets you sort of capture the volume of experiments and like experience lets you figure out like oh that other half it's not worth doing right I think experience is going like half these prompting papers don't make any sense just use chain of thought and just you know use a for loop that's basically right it's like usually performance for me is around like how many experiments are you running how oftentimes are you trying.Alessio [00:39:32]: When do you give up on an experiment because a StitchFix you kind of give up on language models I guess in a way as a tool to use and then maybe the tools got better you were right at the time and then the tool improved I think there are similar paths in my engineering career where I try one approach and at the time it doesn't work and then the thing changes but then I kind of soured on that approach and I don't go back to it soonJason [00:39:51]: I see yeah how do you think about that loop so usually when I'm coaching folks and as they say like oh these things don't work I'm not going to pursue them in the future like one of the big things like hey the negative result is a result and this is something worth documenting like this is an academia like if it's negative you don't just like not publish right but then like what do you actually write down like what you should write down is like here are the conditions this is the inputs and the outputs we tried the experiment on and then one thing that's really valuable is basically writing down under what conditions would I revisit these experiments these things don't work because of what we had at the time if someone is reading this two years from now under what conditions will we try again that's really hard but again that's like another skill you kind of learn right it's like you do go back and you do experiments you figure out why it works now I think a lot of it here is just like scaling worked yeah rap lyrics you know that was because I did not have high enough quality data if we phase shift and say okay you don't even need training data oh great then it might just work a different domainAlessio [00:40:48]: Do you have anything in your list that is like it doesn't work now but I want to try it again later? Something that people should maybe keep in mind you know people always like agi when you know when are you going to know the agi is here maybe it's less than that but any stuff that you tried recently that didn't work thatJason [00:41:01]: You think will get there I mean I think the personal assistance and the writing I've shown to myself it's just not good enough yet so I hired a writer and I hired a personal assistant so now I'm gonna basically like work with these people until I figure out like what I can actually like automate and what are like the reproducible steps but like I think the experiment for me is like I'm gonna go pay a person like thousand dollars a month that helped me improve my life and then let me get them to help me figure like what are the components and how do I actually modularize something to get it to work because it's not just like a lot gmail calendar and like notion it's a little bit more complicated than that but we just don't know what that is yet those are two sort of systems that I wish gb4 or opus was actually good enough to just write me an essay but most of the essays are still pretty badSwyx [00:41:44]: yeah I would say you know on the personal assistance side Lindy is probably the one I've seen the most flow was at a speaker at the summit I don't know if you've checked it out or any other sort of agents assistant startupJason [00:41:54]: Not recently I haven't tried lindy they were not ga last time I was considering it yeah yeah a lot of it now it's like oh like really what I want you to do is take a look at all of my meetings and like write like a really good weekly summary email for my clients to remind them that I'm like you know thinking of them and like working for them right or it's like I want you to notice that like my monday is like way too packed and like block out more time and also like email the people to do the reschedule and then try to opt in to move them around and then I want you to say oh jason should have like a 15 minute prep break after form back to back those are things that now I know I can prompt them in but can it do it well like before I didn't even know that's what I wanted to prompt for us defragging a calendar and adding break so I can like eat lunch yeah that's the AGI test yeah exactly compassion right I think one thing that yeah we didn't touch on it before butAlessio [00:42:44]: I think was interesting you had this tweet a while ago about prompts should be code and then there were a lot of companies trying to build prompt engineering tooling kind of trying to turn the prompt into a more structured thing what's your thought today now you want to turn the thinking into DAGs like do prompts should still be code any updated ideasJason [00:43:04]: It's the same thing right I think you know with Instructor it is very much like the output model is defined as a code object that code object is sent to the LLM and in return you get a data structure so the outputs of these models I think should also be code objects and the inputs somewhat should be code objects but I think the one thing that instructor tries to do is separate instruction data and the types of the output and beyond that I really just think that most of it should be still like managed pretty closely to the developer like so much of is changing that if you give control of these systems away too early you end up ultimately wanting them back like many companies I know that I reach out or ones were like oh we're going off of the frameworks because now that we know what the business outcomes we're trying to optimize for these frameworks don't work yeah because we do rag but we want to do rag to like sell you supplements or to have you like schedule the fitness appointment the prompts are kind of too baked into the systems to really pull them back out and like start doing upselling or something it's really funny but a lot of it ends up being like once you understand the business outcomes you care way more about the promptSwyx [00:44:07]: Actually this is fun in our prep for this call we were trying to say like what can you as an independent person say that maybe me and Alessio cannot say or me you know someone at a company say what do you think is the market share of the frameworks the LangChain, the LlamaIndex, the everything...Jason [00:44:20]: Oh massive because not everyone wants to care about the code yeah right I think that's a different question to like what is the business model and are they going to be like massively profitable businesses right making hundreds of millions of dollars that feels like so straightforward right because not everyone is a prompt engineer like there's so much productivity to be captured in like back office optim automations right it's not because they care about the prompts that they care about managing these things yeah but those would be sort of low code experiences you yeah I think the bigger challenge is like okay hundred million dollars probably pretty easy it's just time and effort and they have the manpower and the money to sort of solve those problems again if you go the vc route then it's like you're talking about billions and that's really the goal that stuff for me it's like pretty unclear but again that is to say that like I sort of am building things for developers who want to use infrastructure to build their own tooling in terms of the amount of developers there are in the world versus downstream consumers of these things or even just think of how many companies will use like the adobes and the ibms right because they want something that's fully managed and they want something that they know will work and if the incremental 10% requires you to hire another team of 20 people you might not want to do it and I think that kind of organization is really good for uh those are bigger companiesSwyx [00:45:32]: I just want to capture your thoughts on one more thing which is you said you wanted most of the prompts to stay close to the developer and Hamel Husain wrote this post which I really love called f you show me the prompt yeah I think he cites you in one of those part of the blog post and I think ds pi is kind of like the complete antithesis of that which is I think it's interesting because I also hold the strong view that AI is a better prompt engineer than you are and I don't know how to square that wondering if you have thoughtsJason [00:45:58]: I think something like DSPy can work because there are like very short-term metrics to measure success right it is like did you find the pii or like did you write the multi-hop question the correct way but in these workflows that I've been managing a lot of it are we minimizing churn and maximizing retention yeah that's a very long loop it's not really like a uptuna like training loop right like those things are much more harder to capture so we don't actually have those metrics for that right and obviously we can figure out like okay is the summary good but like how do you measure the quality of the summary it's like that feedback loop it ends up being a lot longer and then again when something changes it's really hard to make sure that it works across these like newer models or again like changes to work for the current process like when we migrate from like anthropic to open ai like there's just a ton of change that are like infrastructure related not necessarily around the prompt itself yeah cool any other ai engineering startups that you think should not exist before we wrap up i mean oh my gosh i mean a lot of it again it's just like every time of investors like how does this make a billion dollars like it doesn't i'm gonna go back to just like tweeting and holding my breath underwater yeah like i don't really pay attention too much to most of this like most of the stuff i'm doing is around like the consumer of like llm calls yep i think people just want to move really fast and they will end up pick these vendors but i don't really know if anything has really like blown me out the water like i only trust myself but that's also a function of just being an old man like i think you know many companies are definitely very happy with using most of these tools anyways but i definitely think i occupy a very small space in the engineering ecosystem.Swyx [00:47:41]: Yeah i would say one of the challenges here you know you call about the dealing in the consumer of llm's space i think that's what ai engineering differs from ml engineering and i think a constant disconnect or cognitive dissonance in this field in the ai engineers that have sprung up is that they are not as good as the ml engineers they are not as qualified i think that you know you are someone who has credibility in the mle space and you are also a very authoritative figure in the ai space and i think so and you know i think you've built the de facto leading library i think yours i think instructors should be part of the standard lib even though i try to not use it like i basically also end up rebuilding instructor right like that's a lot of the back and forth that we had over the past two days i think that's the fundamental thing that we're trying to figure out like there's very small supply of MLEs not everyone's going to have that experience that you had but the global demand for AI is going to far outstrip the existing MLEs.Jason [00:48:36]: So what do we do do we force everyone to go through the standard MLE curriculum or do we make a new one? I'
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Anthropic AI made the right call, published by bhauth on April 16, 2024 on LessWrong. I've seen a number of people criticize Anthropic for releasing Claude 3 Opus, with arguments along the lines of: Anthropic said they weren't going to push the frontier, but this release is clearly better than GPT-4 in some ways! They're betraying their mission statement! I think that criticism takes too narrow a view. Consider the position of investors in AI startups. If OpenAI has a monopoly on the clearly-best version of a world-changing technology, that gives them a lot of pricing power on a large market. However, if there are several groups with comparable products, investors don't know who the winner will be, and investment gets split between them. Not only that, but if they stay peers, then there will be more competition in the future, meaning less pricing power and less profitability. The comparison isn't just "GPT-4 exists" vs "GPT-4 and Claude Opus exist" - it's more like "investors give X billion dollars to OpenAI" vs "investors give X/3 billion dollars to OpenAI and Anthropic". Now, you could argue that "more peer-level companies makes an agreement to stop development less likely" - but that wasn't happening anyway, so any pauses would be driven by government action. If Anthropic was based in a country that previously had no notable AI companies, maybe that would be a reasonable argument, but it's not. If you're concerned about social problems from widespread deployment of LLMs, maybe you should be unhappy about more good LLMs and more competition. But if you're concerned about ASI, especially if you're only concerned about future developments and not LLM hacks like BabyAGI, I think you should be happy about Anthropic releasing Claude 3 Opus. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Link to original articleWelcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Anthropic AI made the right call, published by bhauth on April 16, 2024 on LessWrong. I've seen a number of people criticize Anthropic for releasing Claude 3 Opus, with arguments along the lines of: Anthropic said they weren't going to push the frontier, but this release is clearly better than GPT-4 in some ways! They're betraying their mission statement! I think that criticism takes too narrow a view. Consider the position of investors in AI startups. If OpenAI has a monopoly on the clearly-best version of a world-changing technology, that gives them a lot of pricing power on a large market. However, if there are several groups with comparable products, investors don't know who the winner will be, and investment gets split between them. Not only that, but if they stay peers, then there will be more competition in the future, meaning less pricing power and less profitability. The comparison isn't just "GPT-4 exists" vs "GPT-4 and Claude Opus exist" - it's more like "investors give X billion dollars to OpenAI" vs "investors give X/3 billion dollars to OpenAI and Anthropic". Now, you could argue that "more peer-level companies makes an agreement to stop development less likely" - but that wasn't happening anyway, so any pauses would be driven by government action. If Anthropic was based in a country that previously had no notable AI companies, maybe that would be a reasonable argument, but it's not. If you're concerned about social problems from widespread deployment of LLMs, maybe you should be unhappy about more good LLMs and more competition. But if you're concerned about ASI, especially if you're only concerned about future developments and not LLM hacks like BabyAGI, I think you should be happy about Anthropic releasing Claude 3 Opus. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Sie heißen Devin, SIMA oder BabyAGI: autonome KI-Agenten gelten als die nächste Generation künstlicher Intelligenz. Statt nur Texte zusammenzufassen, erledigen diese "AI Agents” selbstständig komplexe Aufgaben mit mehreren Arbeitsschritten. Für Gregor und Fritz ist das Grund genug, die Frage unserer allerersten Folge noch einmal aufzugreifen: Haben KI-Agenten das Zeug, eines Tages unsere Jobs besser zu machen als wir selbst? Über die Hosts: Gregor Schmalzried ist freier Tech-Journalist und Berater, er arbeitet u.a. für den Bayerischen Rundfunk und Brand Eins. Fritz Espenlaub ist freier Journalist und Moderator beim Bayerischen Rundfunk und 1E9 mit Fokus auf Technologie und Wirtschaft. In dieser Folge: 00:00 Intro 03:15 Was sind KI-Agenten? 06:50 Von Westworld bis Devin: Beispiele für KI-Agenten 15:19 Was, wenn KI-Agenten wirklich ALLES besser können als wir Menschen? 26:37 Was haben wir diese Woche mit KI gemacht? Links: Devin: https://www.cognition-labs.com/introducing-devin SIMA: https://deepmind.google/discover/blog/sima-generalist-ai-agent-for-3d-virtual-environments/ Westworld und co.: https://towardsdatascience.com/4-autonomous-ai-agents-you-need-to-know-d612a643fa92 Warum bessere KI mehr Jobs für Menschen bedeuten könnte: https://youtu.be/k3KzKJV6osw?si=UTJshbH0m8axsnPr Redaktion und Mitarbeit: David Beck, Cristina Cletiu, Chris Eckardt, Fritz Espenlaub, Marie Kilg, Mark Kleber, Gudrun Riedl, Christian Schiffer, Gregor Schmalzried Kontakt: Wir freuen uns über Fragen und Kommentare an podcast@br.de. Unterstützt uns: Wenn euch dieser Podcast gefällt, freuen wir uns über eine Bewertung auf eurer liebsten Podcast-Plattform. Abonniert den KI-Podcast in der ARD Audiothek oder wo immer ihr eure Podcasts hört, um keine Episode zu verpassen. Und empfehlt uns gerne weiter!
In today's episode Yohei Nakajima, GP at Untapped Capital and Creator of BabyAGI, returns to the show. Together with Nathan, they watch Yohei's TEDAI Talk, unpacking themes on collective intelligence, identity, and how AI can help us better understand one another. If you need an ecommerce platform, check out our sponsor Shopify: https://shopify.com/cognitive for a $1/month trial period. We're hiring across the board at Turpentine and for Erik's personal team on other projects he's incubating. He's hiring a Chief of Staff, EA, Head of Special Projects, Investment Associate, and more. For a list of JDs, check out: eriktorenberg.com. -- LINKS: - Yohei's TEDAI Talk: https://www.ai-event.ted.com/panelists/yohei-nakajima - Listen to Yohei Nakajima Part 1 on The Cognitive Revolution here: https://www.cognitiverevolution.ai/ai-agents-vc-insights-on-ai-and-building-in-public-with-yohei-nakajima-creator-of-babyagi/ SPONSORS: The Brave search API can be used to assemble a data set to train your AI models and help with retrieval augmentation at the time of inference. All while remaining affordable with developer first pricing, integrating the Brave search API into your workflow translates to more ethical data sourcing and more human representative data sets. Try the Brave search API for free for up to 2000 queries per month at https://brave.com/api Shopify is the global commerce platform that helps you sell at every stage of your business. Shopify powers 10% of ALL eCommerce in the US. And Shopify's the global force behind Allbirds, Rothy's, and Brooklinen, and 1,000,000s of other entrepreneurs across 175 countries.From their all-in-one e-commerce platform, to their in-person POS system – wherever and whatever you're selling, Shopify's got you covered. With free Shopify Magic, sell more with less effort by whipping up captivating content that converts – from blog posts to product descriptions using AI. Sign up for $1/month trial period: https://shopify.com/cognitive Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work customized across all platforms, with a click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off www.omneky.com NetSuite has 25 years of providing financial software for all your business needs. More than 36,000 businesses have already upgraded to NetSuite by Oracle, gaining visibility and control over their financials, inventory, HR, eCommerce, and more. If you're looking for an ERP platform ✅ head to NetSuite: http://netsuite.com/cognitive and download your own customized KPI checklist. X/SOCIALS: @labenz (Nathan) @yoheinakajima (Yohei) @CogRev_Podcast TIMESTAMPS: (06:39) Yohei's AI Creation Process (09:50) Differences between models (15:46) Sponsor - Brave Search API | Shopify (20:10) Knowledge Graphs (24:30) Shift to philosophical discussion (27:40) GPT Store (30:28) Sponsor - NetSuite | Omneky (35:40) Eastern Philosophy and Identity (36:22) How different societies will relate to Ai (37:00) Appreciate how much philosophy is embedded into language (39:03) Japan's policy towards AI and individual privacy (40:15) Eastern philosophy view of the brain (40:51) AI is a tool much like real intelligence (42:20) AI in China (42:46) Collective, extended cognition (44:10) Emergence in ants (44:51) Language model performance in different languages (47:06) Nationalism and AI (48:31) Tokenization, unicode, and cost structure between languages (56:54) Accepting, not understanding (59:24) AI bridging connection
In this episode, Yohei Nakajima, creator of BabyAGI and GP at Untapped Capital, chats with Nathan about the opportunity for AI to strengthen human understanding, AI agents, and his insights on investing in AI projects. If you need an ERP platform, check out our sponsor NetSuite: http://netsuite.com/cognitive. SPONSORS: NetSuite | Omneky NetSuite has 25 years of providing financial software for all your business needs. More than 36,000 businesses have already upgraded to NetSuite by Oracle, gaining visibility and control over their financials, inventory, HR, eCommerce, and more. If you're looking for an ERP platform ✅ head to NetSuite: http://netsuite.com/cognitive and download your own customized KPI checklist. Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work customized across all platforms, with a click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off. LINKS: BabyAGI: https://github.com/yoheinakajima/babyagi Yohei's site: https://yoheinakajima.com/ X/SOCIAL @labenz (Nathan) @yoheinakajima (Yohei) @CogRev_Podcast TIMESTAMPS: (00:10:59) - A large part of AI is trying to map human cognitive processes onto our software stack (00:11:14) - Yohei's TED AI talk: the opportunity for AI to strengthen human understanding (00:13:29) - Yohei's journey to build AI projects and building in public (00:14:04) - Yohei's lifelong interest in building tools and automating workflows for efficiency (00:16:29) - Sponsors: Netsuite | Omneky (00:22:05) - Building projects as a fun and exploratory practice (00:23:53) - Single use software (00:24:18) - Yohei's home assessment price contesting project (00:24:58) - The origins of BabyAGI as an autonomous startup founder prototype (00:27:44) - Create as many paper clips as possible (00:30:10) - Thoughts on going all in on a project and turning it into a company vs pursuing it on the side (00:33:00) - Forking different versions of Baby AGI based on animal names (00:34:24) - Yohei's learnings about agents after iterating on six versions of Baby AGI (00:36:24) - Yohei's thoughts on generalist agents (00:37:28) - Replit (00:40:54) - Begin exploring AI automation with tasks you wish you had time for, not critical workflows (00:42:32) - Automating outbound sales and recruiting workflows with AI (00:46:19) - Where should people start learning about AI agents? (00:48:23) - Will agents come online with GPT-4V? (00:52:40) - The unique moment where both executives and engineers both want to implement AI (00:54:59) - AI layers: hardware, software, tooling (00:57:20) - Taxonomy of AI business models (01:10:47) - Designing AI to provide value for senior citizens (01:15:14) - The responsibility to nudge people positively with influential AI assistants (01:19:24) - Advice on when and how startups should build in public (01:22:24) - Perspective on existential risk from AI vs. its potential benefits (01:15:30) - Responsibility to handle AI-human emotional connections well (01:19:24) - Advice on when and how startups should build in public (01:22:24) - Perspective on existential risk from AI vs. its potential benefits The Cognitive Revolution is brought to you by the Turpentine Media network. Producer: Vivian Meng Executive Producers: Amelia Salyers, and Erik Torenberg Editor: Graham Bessellieu For inquiries about guests or sponsoring the podcast, please email vivian@turpentine.co
AI for EveryoneHit me up on mylesdhillon@gmail.com - I would love to hear from listeners, I read all my emails and reply to ... pretty much all of them! Segment 1: Autonomous Military Drones in UkraineUkrainian drones utilize AI for autonomous strikes on Russian targets. #AIWarfare #AutonomousDronesThe Saker Scout quadcopter drone represents the first use of AI for lethal force in history. #SakerScout #AIWeaponsAI agents in drones are a new development in the kill-chain. #AIInMilitary #UkraineConflictThe United Nations aims to establish a legally binding agreement to ban lethal autonomous weapons without human oversight by 2026. #UNAIWarfareSegment 2: ChatGPT Autonomous AgentsChatGPT autonomous agents enhance AI interactions and automation. #ChatGPT #AIApplicationsAuto-GPT, BabyAGI, and Microsoft Jarvis are examples of autonomous agents. #AutoGPT #AIProgressThese agents have the potential to revolutionize various tasks and daily life. #AIRevolution #ChatGPTAgentsSegment 3: Flying Taxis in ChinaEhang's autonomous passenger-carrying drone obtains an airworthiness certificate. #Ehang #AirTaxiAir taxis are set to become common in cities within approximately five years. #FlyingTaxis #FutureTransportationChina represents the largest and fastest-growing market for air taxis. #ChinaAviation #TransportationTrendsEhang anticipates overseas expansion, simplifying airworthiness certification processes. #GlobalExpansion #EhangOverseasGet intouch with Myles at mylesdhillon@gmail.com
"Digital Marketing Legend Leaks" is the most popular AI-Powered Digital Marketing Podcast on Spreaker. All episodes can be found here - https://www.spreaker.com/show/digital-marketing-legend-leaksAlso, visit https://www.bookspotz.com/ to read mind-blowing articles on AI Digital Marketing, Mind-Cloning, Immortality Research, Self-Driving Flying Cars etc.Watch the full-fledged AI Digital Marketing Course - https://youtu.be/-m_0zi7K5-wBuild your own ChatGPT without code in minutes: https://youtu.be/e7eDX0bO_-UDigital Marketing Legend Srinidhi answers a Crying Fan: https://youtu.be/vZr5lrjSzm8Watch the trailer of Bookspotz by clicking here - https://youtu.be/cvM3MlxOknwEnter the new world ruled by Automation and Artificial Intelligence (AI).Digital Marketing Legend "Srinidhi Ranganathan" is the CEO of Bookspotz (A World-Changing Publication Powered by AI technology: https://youtu.be/svJW5eolKrUWorld-Wide Remote Jobs List Leaked: https://youtu.be/mVfvHK1U6X0Legend Srinidhi's Biggest Fan: https://youtu.be/L-AoyU1pyIw100 Free Coding Resources: https://youtu.be/MXQEkZ3KyiwLegend talks to more fans: https://youtu.be/eM1YVX7VwD4Subscribe to Legend https://www.youtube.com/channel/UCXP3bY7BbMt1pXK0tPp8G4QThis Podcast covers AI in Digital Marketing New-Age Trends and Technologies and other creative stories from Bookspotz.Digital Marketing Legend "Srinidhi Ranganathan", founder and CEO of FirstLookAi leaks the futuristic secrets of Digital Marketing powered by Artificial Intelligence (AI) in this amazing podcast.Step into the future of digital marketing with "Digital Marketing Legend Leaks," a captivating podcast hosted by the visionary entrepreneur, Srinidhi Ranganathan. Get ready to unveil the transformative potential of artificial intelligence (AI) as it revolutionizes the digital marketing landscape. In each episode, Srinidhi, the esteemed founder and CEO of Bookspotz, shares exclusive insights and groundbreaking strategies that are shaping the industry.Whether you're a seasoned digital marketer or a business owner seeking to stay ahead of the curve, this podcast offers an unrivalled opportunity to explore cutting-edge trends and technologies in AI-driven digital marketing. With a wealth of experience, Srinidhi opens the doors to a world of possibilities.Discover the untapped potential of chatbots, voice search optimization, machine learning algorithms, predictive analytics, and other game-changing techniques that can elevate your marketing efforts." Digital Marketing Legend Leaks" goes beyond theory and offers actionable advice to help you leverage the power of AI.Through engaging interviews with industry experts and real-world examples, you'll gain practical insights and strategic frameworks to drive your business forward. Stay on top of the latest developments and embrace the future of digital marketing.Subscribe to "Digital Marketing Legend Leaks" today and embark on a journey of innovation and success alongside Srinidhi Ranganathan.Unleash the potential of AI and revolutionize your marketing strategies like never before.
Ron Bodkin is a self-described “serial entrepreneur focused on beneficial uses of AI”. Ron founded ChainML in April 2022 to make it easier to integrate AI models into applications. The AI we know today is immature in so many ways and many of them relate to how crude the tooling is for traditional developers building AI-first features. The ChainML protocol is a cost-efficient, decentralized network built for compute-intensive applications running on blockchain technology. Prior to founding ChainML Ron had a distinguished entrepreneurial career having founded Think Big Analytics before it was eventually acquired by Teradata after which he spent three years in applied AI at Google. Ron is also an active investor and advisor and has degrees in Computer Science from McGill and MIT.Listen and learn...What led Ron to focus on how AI can have a positive impact on the worldWhy Hinton's right when he says "we've invented a superior form of learning"Where the current toolstack for building LLM apps is incredibly immatureHow to control the cost and performance of LLM appsWhy human brains are inefficientWhy the "effective cost of computing" is being reduced by 50% every yearHow we may get to AGI within 20 yearsWhy proprietary datasets and commercial issues will slow down AI innovationThe right way to regulate AIReferences in this episode...Meredith Broussard, professor and author, on AI and the Future of WorkAttorney relies on court cases made up by ChatGPTThe Microsoft Sparks of AGI paper
This week, host Anna Rose (https://twitter.com/annarrose) and co-host Kobi Gurkan (https://twitter.com/kobigurk) chat with Yohei Nakajima (https://twitter.com/yoheinakajima), General Partner at Untapped Capital (https://www.untapped.vc/) and creator of BabyAGI (https://babyagi.org/). They cover a wide variety of topics from the world of AGIs and agents to building no-code software in public. They kick-off with a chat about how Yohei's interest in NFTs led him down the AI ‘rabbit hole' and how he started to build out experiments in public that have inspired a new group of AI tools and projects. They wrap up with a discussion about the possible impacts of some of this AI tech, how ZK may help mediate the challenges it introduces and more. Here's some additional links for this episode; ReAct: Synergizing Reasoning and Acting in Language Models by Yao and Cao (https://ai.googleblog.com/2022/11/react-synergizing-reasoning-and-acting.html) Episode 279: Intro to zkpod.ai with Anna and Kobi (https://zeroknowledge.fm/279-2/) Bonus: zkpod.ai & Attested Audio Experiment with Daniel Kang (https://zeroknowledge.fm/bonus-zkpod-ai-attested-audio-experiment-with-daniel-kang/) BabyAGI GitHub (https://github.com/yoheinakajima/babyagi) Auto-GPT (https://auto-gpt.ai/) PixelBeasts (https://www.pixelbeasts.co/about) Stable Diffusion (https://stability.ai/blog/stable-diffusion-public-release) DALL·E 2 (https://openai.com/dall-e-2) Midjourney (https://www.midjourney.com/home/?callbackUrl=%2Fapp%2F) OpenAI (https://openai.com/) Playground AI (https://playgroundai.com/) LangChain (https://python.langchain.com/docs/get_started/introduction.html) LlamaIndex (https://llamaindex.ai) Dust (https://dust.tt/) Universal Paperclips: the game by Frank Lantz (https://www.decisionproblem.com/paperclips/index2.html) AI and the Paperclip Problem (https://cepr.org/voxeu/columns/ai-and-paperclip-problem) Check out the Modular Summit here: https://modularsummit.dev/ (https://modularsummit.dev/) zkSummit 10 is happening in London on September 20, 2023! Apply to attend now -> zkSummit 10 Application Form (https://9lcje6jbgv1.typeform.com/zkSummit10) Anoma's (https://anoma.net/) first fractal instance, Namada (https://namada.net/), is launching soon! Namada is a proof-of-stake L1 for interchain asset-agnostic privacy. Namada natively interoperates with fast-finality chains via IBC and with Ethereum via a trustless two-way bridge. For privacy, Namada deploys an upgraded version of the multi-asset shielded pool (MASP) circuit that allows all assets (fungible and non-fungible) to share a common shielded set – this removes the size limits of the anonymity set and provides the best privacy guarantees possible for every user in the multichain. The MASP circuit's latest update enables shielded set rewards directly in the shielded set, a novel feature that funds privacy as a public good. Follow Namada on twitter @namada (https://twitter.com/namada) for more information and join the community on Discord discord.gg/namada (https://discord.com/invite/namada) If you like what we do: * Find all our links here! @ZeroKnowledge | Linktree (https://linktr.ee/zeroknowledge) * Subscribe to our podcast newsletter (https://zeroknowledge.substack.com) * Follow us on Twitter @zeroknowledgefm (https://twitter.com/zeroknowledgefm) * Join us on Telegram (https://zeroknowledge.fm/telegram) * Catch us on YouTube (https://zeroknowledge.fm/)
Big thanks to our sponsors for this episode! Index.dev: https://www.index.dev/hire?utm_medium=referral&utm_source=HQLA&utm_campaign=podcast Newnique: https://www.newnique.com.au/contact In Episode 31 of HQLA, the hosts discuss various topics related to AI and technology. They start with a discussion on NFTs, particularly a song created using AI deep fake technology featuring Drake and The Weeknd, which sparks debates on intellectual property. They also mention musician Grimes' innovative but risky decision to split AI-generated music royalties. The hosts then move on to Apple's venture into banking, highlighting their partnership with Goldman Sachs for a high-yield savings account and buy now, pay later service, which is seen as a smart move for the tech giant. The episode delves into the world of AI, focusing on two new developments: AutoGPT and BabyAGI. AutoGPT is discussed in detail, as it automates tasks like website building and continuous self-improvement through a feedback loop, potentially replacing the need for prompt engineers. The hosts also mention AgentGPT as a solution to AutoGPT's lack of user interface. The future applications of AI are speculated upon, emphasizing cost and effort reduction in website creation, content generation, customized user experiences, real-time iteration of webpages based on user interactions, and on-demand entertainment. The hosts suggest that startups may need to reassess their business models due to these AI advancements, as many services can be more efficiently handled by AI, making traditional startup solutions less cost-effective. The requirements for setting up AutoGPT are explained, including a Python setup, IDE, and OpenAI subscription. In other news, Nike collaborates with RTFKT to release an NFT collection of Airforce 1 designs. Customers can redeem the NFTs for physical sneakers during a limited forge period on the RTFKT website. These NFTs are available on secondary markets, with prices ranging from $500 to $8,500 USD, and the shoes are set to ship in Q4 2023. Revere, a London-based company, promoted their new design and R-Tec spray shop through a marketing stunt involving a tagged Range Rover left outside Harrods in London. The incident was falsely claimed by the TV show 'Cheaters,' and the R-Tec team sprayed another vehicle with 'REVERENOTREVENGE'. Skittles staged a Broadway show called 'Skittles Commercial: The Broadway Musical' during Super Bowl Sunday in 2019, focusing on the manipulative nature of advertising. In 2017, Skittles streamed a Super Bowl ad exclusively to one teenage fan, with the world watching his reaction live on Skittles' Facebook page. Go Puff promoted its app in 2017 by sharing weird orders from actual customers on their Twitter page. Milka, upon entering the German and French markets in 2013, left out one square of chocolate from their bars, allowing purchasers to claim the missing piece or send it to a loved one using a code on the packaging. In 2016, the Art Institute of Chicago partnered with Airbnb to create an immersive artwork experience, re-creating Van Gogh's 'Bedroom in Arles'. The room was advertised for rent, and several respondents were able to rent it on Airbnb for $10 per night.
I denne episode skal vi se nærmere på ChatGPTs fætter, AutoGPT.Auto står for 'automatisering' og har ikke noget med biler at gøre. AutoGPT kan nemlig i en vis udstrækning handle selvstændigt, og peger måske på en fremtid med rigtige digitale assistenter, der kan lave ting for os i baggrunden.Til at fortælle mere om AutoGPT har jeg haft besøg i studiet af Adam Hede, der arbejder med generative AI i Implement Consulting Group, både internt og med firmaets kunder.Han er på seneste især blevet nysgerrig på de noget mere selvstændige alternativer som AutoGPT, BabyAGI, Godmode og flere andre projekter fra samme skuffe. Og Adam har også taget en version af AutoGPT med i studiet, så vi kan prøve at se hvad vi kan få sådan en AutoGPT til at hjælpe os med.LINKSNYHEDERDræbte en drone in operatør i en simulation?Endnu en udtalelse om truslen fra AINVIDIA er blevet MANGE penge værdKan OpenAI finde på at trække sig ud af EU?AUTOGPTAutoGPTs projektsiteWireds test af AutoGPTImplement Consulting GroupAdam Hede på LinkedIn
This week, Anna Rose (https://twitter.com/annarrose) and Kobi Gurkan (https://twitter.com/kobigurk) continue their conversation with a full episode dedicated to their new side project zkpod.ai (https://twitter.com/zkpodai). They cover the origin of the project and discuss their motivations and concerns around its uses as well as how they could see the project developing and what this could mean for the field of zk. Kobi shares his thoughts on the current tooling available, where this could progress and potential limitations it holds. Additional links mentioned in this episode: * zkpod.ai Website (https://zkpod.ai/) * Anna Rose's zkpod.ai Blog Post (https://hackmd.io/@Arrose/H1uPQ78gn) * Yohei Nakajima Twitter (https://twitter.com/yoheinakajima) * LangChain Website (https://python.langchain.com/en/latest/) * Weaviate Website (https://weaviate.io/?utm_source=google&utm_medium=cpc&utm_campaign=18671086036&utm_content=145710453274&utm_term=vector%20based%20database&gclid=Cj0KCQjwj_ajBhCqARIsAA37s0zMQwc1jPKwWLGcjDnz2PdUGz4s387sd-bYeObjUx5SlY2oeu355F8aAk3xEALw_wcB) * Anthropic Product ‘Claude' (https://www.anthropic.com/index/introducing-claude) * Universal Paperclips: the game by Frank Lantz (https://www.decisionproblem.com/paperclips/index2.html) * Large Language Models can be used to effectively scale Spear Phishing campaigns by Hazell (https://arxiv.org/pdf/2305.06972.pdf) * Creating Word Embeddings: Coding the Word2Vec Algorithm in Python using Deep Learning (https://towardsdatascience.com/creating-word-embeddings-coding-the-word2vec-algorithm-in-python-using-deep-learning-b337d0ba17a8) Check out the ZK Jobs Board here: ZK Jobs (https://jobsboard.zeroknowledge.fm/). Find your next job working in ZK! Anoma's (https://anoma.net/) first fractal instance, Namada (https://namada.net/), is launching soon! Namada is a proof-of-stake L1 for interchain asset-agnostic privacy. Namada natively interoperates with fast-finality chains via IBC and with Ethereum via a trustless two-way bridge. For privacy, Namada deploys an upgraded version of the multi-asset shielded pool (MASP) circuit that allows all assets (fungible and non-fungible) to share a common shielded set – this removes the size limits of the anonymity set and provides the best privacy guarantees possible for every user in the multichain. The MASP circuit's latest update enables shielded set rewards directly in the shielded set, a novel feature that funds privacy as a public good. Follow Namada on twitter @namada (https://twitter.com/namada) for more information and join the community on Discord discord.gg/namada (http://discord.gg/namada). Zero-knowledge is changing the world and until now, building ZK applications meant learning new, chain-specific languages and complex cryptography. But no more! With SnarkyJS, the easiest to use zk SDK, developers can add the magic of zk to their apps using TypeScript! Whether you're targeting Mina (https://minaprotocol.com/about), the leading zk-native blockchain, or off-chain applications, SnarkyJS (snarkyjs.o1labs.org) from O(1) Labs has you covered. With support for infinite recursion, in-browser proving, and so much more, the full power of zk is available to everyone. Visit snarkyjs.o1labs.org (snarkyjs.o1labs.org) to get started. If you like what we do: * Find all our links here! @ZeroKnowledge | Linktree (https://linktr.ee/zeroknowledge) * Subscribe to our podcast newsletter (https://zeroknowledge.substack.com) * Follow us on Twitter @zeroknowledgefm (https://twitter.com/zeroknowledgefm) * Join us on Telegram (https://zeroknowledge.fm/telegram) * Catch us on YouTube (https://zeroknowledge.fm/)
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: [AISN #5]: Geoffrey Hinton speaks out on AI risk, the White House meets with AI labs, and Trojan attacks on language models, published by Center for AI Safety on May 9, 2023 on The Effective Altruism Forum. Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. Subscribe here to receive future versions. Geoffrey Hinton is concerned about existential risks from AI Geoffrey Hinton won the Turing Award for his work on AI. Now he says that part of him regrets his life's work, as he believes that AI poses an existential threat to humanity. As Hinton puts it, “it's quite conceivable that humanity is just a passing phase in the evolution of intelligence.” AI is developing more rapidly than Hinton expected. In 2015, Andrew Ng argued that worrying about AI risk is like worrying about overpopulation on Mars. Geoffrey Hinton also used to believe that advanced AI was decades away, but recent progress has changed his views. Now he says that AI will become “smarter than a human” in “5 to 20 years, but without much confidence. We live in very uncertain times.” The AI race is heating up, but Hinton sees a way out. In an interview with MIT Technology Review, Hinton argues that building AI is “inevitable” given competition between companies and countries. But he argues that “we're all in the same boat with respect to existential risk,” so potentially “we could get the US and China to agree like we could with nuclear weapons.” Similar to climate change, AI risk will require coordination to solve. Hinton compared the two risks by saying, "I wouldn't like to devalue climate change. I wouldn't like to say, 'You shouldn't worry about climate change.' That's a huge risk too. But I think this might end up being more urgent." When AIs create their own subgoals, they will seek power. Hinton argues that AI agents like AutoGPT and BabyAGI demonstrate that people will build AIs that choose their own goals and pursue them. Hinton and others have argued that this is dangerous because “getting more control is a very good subgoal because it helps you achieve other goals.” Other experts are speaking up on AI risk. Demis Hassabis, CEO of DeepMind, recently said that he believes some form of AGI is “a few years, maybe within a decade away” and recommended “developing these types of AGI technologies in a cautious manner.” Shane Legg, co-founder of DeepMind, thinks AGI is likely to arrive around 2026. Warren Buffet compared AI to the nuclear bomb, and many others are concerned about advanced AI. White House meets with AI labs Vice President Kamala Harris met at the White House on Thursday with leaders of Microsoft, Google, Anthropic, and OpenAI to discuss risks from artificial intelligence. This is an important step towards AI governance, though it's a bit like inviting oil companies to a discussion on climate change—they have the power to solve the problem, but incentives to ignore it. New executive action on AI. After the meeting, the White House outlined three steps they plan to take to continue responding to the challenges posed by AI: To evaluate the risks of generative AI models, the White House will facilitate a public red-teaming competition. The event will take place at the DEF CON 31 conference and will feature cutting-edge models provided by leading AI labs. The White House continues to support investments in AI research, such as committing $140M over 5 years to National AI Research Institutes. Unfortunately, it's plausible that most of this investment will be used to accelerate AI development without being directed at making these systems more safe. The Office of Management and Budget will release guidelines for federal use of AI. Federal agencies promise enforcement action on AI. Four federal agencies iss...
What's up everyone,If you haven't checked out our previous 3 episodes in our AI series you might want to before this episode, we give you a lot of context around some of the events that have happened and will shape the conversation today.So basically How fast could AI change or replace marketing jobs? How marketers can stay informed and become AI fluent Exploring new paths to future-proof your marketing career in the age of AI Today we're diving into specific tools… there's a lot of noise out there right now.What tools you should play around withIn TMW #107 | ChatGPT and the artificial marketer, Juan Mendoza explains that“...generative AI tools are already everywhere. From text generation to video and audio production, to image creation, there's a thriving industry of technologies taking small slices out of our creative talents, packaging them up, and selling them as a SaaS product on a recurring revenue model. If you're wanting to stay relevant five years from now in the marketing technology industry, you're probably going to have to learn some of these platforms. In 2010 we used to say: “there's an app for that”. In 2023, we will be saying: “there's an AI for that.””OutlineHere are some of the topics for this third AI episode: Key AI technology definitions and how to differentiate real AI tools vs all the noise out there Deep dive into tools Content marketing tools Email and marketing automation tools Predictive analytics tools Text to presentation and pitch deck tools 3D animation tools for product marketers Sales and outreach tools Text to website creator tools Ad and social creative tools AutoGPT and AI agents And a bunch of other tools like conversational search engines, 1-1 convos with celebrities and an even longer list of honorable mentions Here's today's main takeaway:The key to future proofing your marketing career with the ever changing AI landscape is to stay curious, get your hands dirty and experiment fearlessly: Fill out some forms, spin up free trials, get on wait lists, and give new AI tools a chance. It's only by actually getting your hands dirty that you'll discover which tools truly work for you and which are just part of the ever growing sea of gimmicky AI tools.Definition of tech termsI'll be using some of these terms throughout my analysis of some of these tools so here's a primer explaining the three most common AI technologies used for marketing applications: MLMachine Learning): ML is a way to teach computers to learn by themselves, without having to be programmed for every task. They learn from examples and data patterns to make predictions or decisions. Applications include segmentation, predictive analytics and propensity models. NLPNatural Language Processing: NLP is a subset of ML and focuses on enabling computers to understand, interpret, and generate human language. Includes sentiment analysis, machine translation, named entity recognition, text summarization, and more. NLP techniques usually helps computers understand and communicate with humans using everyday language. GNNGraph Neural Network: GNN also a subset of ML is a type of neural network that aims to handle graph-structured data, data organized like a network or web of connected points. Applications include analyzing relationships between different things like users in a social network or users in your database or recommending additional products based on past purchase history. Real AI vs noisePart of the reason AI gets a really bad rep, especially in martech, is that anything that's built on if statements or simple Javascript logic gets called AI. There's still plenty of AI startups that shout about their proprietary AI when it's probably just a few decision trees and a few interns running spreadsheets.Now though, you have an even bigger bucket of noise that's essentially “slight tweak on Chat-GPT”. Developing AI that was comparable to human performance was a challenging feat prior to GPT's arrival. To achieve this level of sophistication, a company would have had to: make a substantial investment, amounting to millions of dollars developing its own algorithms performing extensive data cleanup But it's so easy now because GPT is so good out of the box. Allen Cheng puts it simply. Starting a new AI venture can be achieved by simply assembling a few elements: a product developed on GPT-4's user-friendly API a website, and a marketing campaign. This is why we're seeing hundreds of AI tolls pop up every week.A lot of these GPT-based products are pretty much indistinguishable from one another. Maybe a handful have a significant advantage over others but most are gimmicky. And over the next few months, every tool is going to be integrating ChatGPT features inside their products in the hopes of making it stickier.The threat of GPT-nThe part that I find trickiest and the most discouraging about building anything on top of GPT is that any progress you make on fine tuning GPT-4 will totally be wiped out by GPT-5 or GPT-n… Kind of like we talked about in a previous episode with all the tools GPT's plugins killed. So let's cut through the noise and dive into legit AI tools, the ones you should be playing with and experimenting. Content marketing toolsCopy.ai and Jasperhttps://copy.ai/ https://jasper.ai/ AI text generators are very common these days, the two most popular tools, especially for marketers are Copy.ai and Jasper. Both allow you to bypass the initial stage of writing where you face a blank page. The promise of these tools is that they help you in generating ideas, saving time on brainstorming and drafting, and ensuring a consistent production flow, freeing you to focus on higher-level strategic tasks, original research, and connecting with your audience.I've played around with both Jasper and Copy.ai before ChatGPT came out… and they were super unique. But both Copy.ai and Jasper are built on top of GPT, they essentially rent usage of the platform. So they built a pretty nice UI on top of GPT… but now that ChatGPT came out, I'm sure they've seen a drop in usage. Plus GPT-4 is 3 times more expensive.They still offer marketing specific value though and can get you up to speed faster than using CGPT in the form of templates, prompts and workflows. Both are super powerful, you could make a case that Jasper outshines its counterpart a bit, especially on the longer content format but it's also way more expensive. Miguel Rebelo from Zapier has a solid breakdown comparison here https://zapier.com/blog/jasper-vs-copy-ai/ GramarlyGOhttps://www.grammarly.com/grammarlygo Grammarly, the popular spelling and grammar checker which has been using AI for over a decade already, also entered the generative AI niche last month unveiling GrammarlyGO. You guessed it, built on GPT. It has a particular advantage because Grammarly is already widely used and this is just an extra feature so to speak. Instead of just checking your grammar it can now also help you with drafting documents, composing and responding to emails, editing writing for tone, clarity, and length, and brainstorming ideas or outlines for projects within the apps you're already working in. Lots of tools are going the Grammarly route in adding GPT features to their product, like Notion and more recently Superhuman.Other AI writing toolsSome of these specialize in SEO, some in long form content, some in short form… they all do similar things:Copysmith https://copysmith.ai/ Anyword https://anyword.com/ Writesonic https://writesonic.com/Copymatic https://copymatic.ai/ Yaara https://www.yaara.ai/ Rytr https://rytr.me/ Frase https://frase.io/ EmailEmail is just a channel of potential generative content tools so it's not totally distinguishable from the tools we chatted about in the content category. Chances are that the Marketing Automation platform or the customer engagement platform you're already using as a suite of features they are throwing AI next to. Most of these though are just ML. Some tools like Iterable and Braze have propensity models to find users that are likely to perform a purchase event, that's ML, and it's only based on your data set. Some tools like Seventh Sense throw AI in front of Send Time Optimization features, these have been around for a long time and are only based on your recipients. This is also ML. Some tools throw AI in front of Sentiment Analysis features allowing you to analyze and classify the emotional tone of text into useful data. This is a subset of NLP that uses ML. Some tools like 6sense throw AI in front of conversational email assistants that are supposed to be a step up from static drip emails sequences. This is also a form of NLP and generative AI. You're likely to start seeing copy assistants and additional AI features powered on GPT inside of these tools. I wanted to chat about one product that I feel like stands out from others in terms of being built around AI rather than it simply being a feature on the side.Persadohttps://www.persado.com/ One example I've seen (but haven't used myself) of a CEP using more advanced AI is a tool called Persado. The two co-founders are former founders of Upstream in the mobile marketing space. Similar to 6sense's conversational email platform, they've been around for a decade and they claim to leverage NLP + ML to create, optimize, and personalize engagement messaging. So they essentially analyze a bunch of data and identify the most effective emotions, CTAs, language, phrases to drive engagement. It's worth calling it out because it's more than just predicting user behavior and optimizing the best time to send a message, it takes it a step further by also incorporating NLP techniques, understanding the nuances of human language, and generating custom marketing messages that resonate with a unique user. One thing that makes Persado unique is it's not built on GPT, it has its own model that's trained on more than 100 billion digital business language impressions across industries. Potentially less data points than GPT but arguably better and more relevant impressions. As Lisa Spira, VP of Content Intelligence at Persado explains in an interview with Martechseries, “models like OpenAI's ChatGPT are trained on the entire “Wild West” of the Internet, so their results, while delivered confidently by the model, can be riddled with inaccuracies, or even offensive content”. She adds that “Generative AI tools might be capable of writing anything, but we're able to cut through that noise, and train our generative AI to write in a way that motivates recipients to take actions: to open an email, convert on a shopping cart webpage, or stop scrolling and click on a social media ad.”And not just generative AI. Persado is pushing a term called Motivation AI and they have a really cool example of it. Let's say you're in D2C and you're selling sweaters. Generative AI gives you level 1 and 2 content which is Intelligibility and Relevance so it might spit out content like: Sweaters are a knitted garment that usually have long sleeves and are designed to be worn over other clothing In the fall, sweaters provide an extra layer of warmth in the cooler temperatures. They are stylish choice for fall fashion because they come in a variety of colors and styles But the next stage of this is including Emotion to get an output like this:You're going to love these cozy sweaters for fallAnd the following stage, where Persado claims to play is Motivation where you would get an output like this:A cozy invitation: indulge in our luxuriously soft sweaters this fallNow this might all be semantics. I'd argue that with great prompting you can get to generative content that includes motivation and emotion. Optimovehttps://www.optimove.com/ This tool could actually go in the CRM, the CDP or even the email category with the acquisitions they've made in the last few years but another name that I've heard a few times is Optimove. They play in the enterprise arena and primarily serve retailers and gaming operators but they have an interesting suite of AI tools for marketers. I've personally not used the tool but they claim to provide personalized recommendation campaign orchestration with their AI-based marketing bot (Optibot). It provides what they call self-optimizing campaigns that are essentially advanced A/B/n campaign testing that automatically serves the best treatment to individual customers through the use of ML. Source: https://www.optimove.com/pdf/Optimove-Product-Features.pdf Predictive AnalyticsPredictive analytics is a super cool niche of AI application. It essentially enables us to anticipate trends and patterns of consumer behavior based on a bunch of user data. Done right, you can do cool things like uncover users ready to buy amongst a sea of tire kickers, find free users primed for upsells and anticipate customers about to churn.Vidora Cortex (mParticle Intelligent Attributes)https://www.vidora.com/platform/ I wanted to start by calling out a few no-code / low-code predictive analytics / ML pipeline platforms. There's not a ton in this space but it has been growing over the past few years. Many of the big analytics players like Qlik's AutoML, IBM's Watson Studio, Amazon SageMaker, Google's AutoML, have a ML specific tool that does this but it's built for data scientists. Vidora is worth calling out here because it was built more for knowledge workers.Vidora was founded by 3 academics while pursuing their Ph.Ds Caltech and Berkeley, they built systems that utilized machine learning to convert raw data into informed decisions. Their initial vision was to democratize machine learning capabilities. They are a small 20 person startup in SF but their product offers huge potential, so much so that they were acquired by mParticle, a popular CDP. Vidora's platform, Cortex, provides you with the ability to build distinct pipelines for your customer data, so you can then produce various predictive attributes tailored for particular applications. For example if your goal is to identify the customers with the highest probability of purchasing your product in the next 30 days, the pipeline allows you to enhance your retargeting ROI by focusing on this user segment. Similar: https://www.pecan.ai/ https://www.akkio.com/ 6sensehttps://6sense.com/It's worth highlighting 6sense here as well. Not specifically an ML tool but it has a wider set of use cases that are a mix of intent data and predictive analytics and a slice of lead scoring… but they've been around for a decade. They use real-time data about your buyers to predict their buying intent and what their current purchasing stage might be. They uncover some of the magic in a blog post about their predictive engine here. They claim to be using: Intent data, web visits, content engagement (1st party) and 3rd party user behavior data from all over the Internet Historical data, events that led up to sales in the past ICP, firmographic and technographic data So they layer intent data on top of historical data and filter it through your ICP and the output is a target list of accounts that show intent to buy and are a good fit for your solution.The secret sauce here is really the 3rd party intent data. This is hidden data that they are able to assign to leads and accounts. Part of the source here is 6sense's proprietary intent network and other B2B review sites like G2, TrustRadius, and PeerSpot.Kumohttps://kumo.ai/ Founded by Jure Leskovec, former Stanford University computer science professor who's known for his bold assertion that AI would eventually be capable of predicting the future. He partnered with the former CTO of Pinterest and Airbnb as well as the former Head of Growth AI at LinkedIn to build Kumo.Using Kumo, companies can not only analyze past events but also predict future opportunities. Kumo uses GNNs (Graph Neural Networks) to identify patterns and relationships in complex data sets that cannot be easily analyzed using traditional statistical or machine learning techniques. This essentially allows marketers to anticipate customer behavior (how much they will spend, which new products they will be more interested in, things that would make them leave for a competitor) and offer personalized product recommendations, promotions, and communication.Predictive analytics isn't a new concept though. We talked a bit about this in our first episode when we mentioned propensity models which tons of larger companies employ today. But this is operationalizing it a step further and not just on your company's datasets. And imagine unlocking this capability for startups. So the idea is that marketing operations teams would change their focus to future customer behaviors. PresentationTomehttps://beta.tome.app/ There are countless AI productivity tools that are all super similar. But one that's been buzzing lately is Tome. They are founded by product leads from Facebook and Instagram, and recently got $43 million in their Series B funding round. They launched an innovative document-to-presentation AI tool, which utilizes GPT-4 to generate text and images and transform a document into compelling presentations, narratives, and stories. Tome's tile system sets it apart from other generative PowerPoint tools on the market and gives Microsoft a run for its money. Not only does it offer robust functionality, but it also boasts a sleek and impressive design.https://gamma.app/ 3D AnimationSpline AIhttps://spline.design/ai Spline AI was created by Alejandro León, it's a YC21 startup that's building a 3D design platform comparable to Figma, it basically allows you to convert text to 3D. For decades, creating in 3D has been super hard. Spline is changing this.You can easily create objects and scenes, edit objects, colors, and properties, add physics and randomness, create animations and events, generate style alternatives, collaborate with others in real-time, and much more. The applications for videos and product marketing are endless here, see a demo here. SalesAI-powered sales tools are popping up every week. Especially in the niche of email outreach. Personalization, subject line optimization, send-time optimization, sales rep coaching, auto suggest follow-up cadences… just a few of the areas where AI can enhance your email campaigns. NLP can analyze email responses and suggest appropriate next steps, helping your sales team respond efficiently. There's tons of players in this space like Cresta, Lyne, Regie. Cresta was funded out of the AI Grant program organized by some of the brightest tech minds in AI. Their founder, S. Zayd Enam chose to leave his PhD program at Stanford to establish the startup. They specifically provide AI guidance software and support that elevates each representative, behavioral mentoring of agents to enhance performance and locate solutions and areas to streamline monotonous tasks.AI Sales Email Coach. It assists you in real-time. Get more positive replies and write better emails faster.https://www.lavender.ai/ https://cresta.com/ https://lyne.ai/cold-email-ai/ https://www.regie.ai/ Web creatorsButternut AIhttps://butternut.ai/ Tagline says it all: Create a website for your business in 20 seconds with AI. Butternut.ai uses generative AI technology that allows users to create websites by simply answering text prompts. The AI technology is designed to function like a personal developer, providing users with an efficient and cost-effective solution for creating and editing websites.Butternut.ai is continuously improving its website builder and is currently working on its v2 version that will offer more design and editing functionalities. Users can even command the AI to perform specific tasks like creating a pricing page with a CTA to sign up.http://Stunning.so https://10web.io Ad and social creativeAlbert.aihttps://albert.ai/ Albert is an autonomous and self-learning digital marketing tool that uses ML + NLP as well as analytics to automate and optimize your paid ad campaigns, specifically Google's search and programmatic channels, as well as Facebook, Instagram, YouTube and Bing. It can automate bidding and budget allocation, keyword and audience targeting, as well as creative testing and optimization. So obviously the part that stands out here is that unlike many other campaign AI tools that just spit out recommendations and then a marketer takes the action, Albert claims to be one of the first tools that's an autonomous AI, it does the action also. Not only that, it's also making adjustments and improvements constantly. You seem to be able to set guardrails of course. They also claim the ability to identify trends, uncover new audience segments, and optimize ad placements. In their docs they say they are more useful in large data sets and B2C environments. AdCreativehttps://adcreative.ai/ AutoGPT and AI agentsI don't think we can go a full episode about AI tools without talking about AutoGPT and AI agents.Essentially you can assign an objective to an AI agent and they work on tasks that lead to accomplish this objective. It's making prompting a bit easier, instead of giving full instructions, the AI identifies the necessary steps to achieve your goal and some of the more advanced ones generate additional AI to assist. You may have seen this on Twitter, I think the first true demo was Yohei Nakajima's impressive demonstration of babyAGI.Things started blowing up with AutoGPT, released by SigGravitas just last week. Described as a self-running AI agent that can write its own code, heal itself from errors, and access the internet via Google search. It does sound really cool for several reasons: Internet access, long-term and short-term memory management, text generation, and integration with 11 Labs for AI text to speech generation. It's arguable that all of this will be possible with GPT plugins but this is doable today and has different applications, let's unpack a few examples:Social media is primed for automation, lots of the work here can already be automated. But imagine setting an AI agent loose with the objective of creating content, scheduling posts (not just posting them) and replying to comments and questions. Replying to questions is a huge part of customer service, you could argue this is another area that's already being automated with chat bots, especially the better ones connected to help docs. But imagine setting an AI agent loose with the objective of really understanding customer questions (not just picking up a keyword and recommending a help article), it could speak any language and you could train it on your brand tone.We're still early but there's already been a ton of innovation here.Here are some of the tools you can play around with today without reading a GitHub repo:Cognosyshttps://www.cognosys.ai/ Cognosys allows you to operate your personal AutoGPT Agent in-browser which is currently available and free to utilize. I've personally played around with it quite a bit and it's pretty wild, it's not perfect yet but it's really cool seeing how it will add new tasks to get to the objective. The beta is still on GPT-3.5. Here's my favorite example:Main objective was to: Build a list of AI tools that have a marketing application, that were released in the last 3 months. Here were the follow-up tasks it added to get to the objective: 1. Retrieve a list of AI tools that were released in the last 3 months. 2. Filter the list to only include tools that have a marketing application. 3. Compile the filtered list into a final list of AI tools with marketing applications released in the last 3 months. The final list of AI tools with marketing applications released in the last 3 months is as follows: 1. Receptiviti: A tool that uses AI to analyze consumer emotions and sentiment in order to optimize marketing strategies. 2. Cortex: A platform that uses AI to help marketers personalize content and improve customer engagement. 3. Adext AI: A tool that uses AI to optimize ad campaigns and improve ad performance. 4. Acoustic: An AI-powered marketing platform that helps businesses automate and optimize their marketing strategies. 5. Pathmatics: A tool that uses AI to provide insights into ad spending and campaign performance for competitive analysis. It then added a few extra tasks because it didn't seem satisfied with the list. Research and compile a list of AI tools that have a marketing application and were released in the last 6 months. The result was 6 tools, but not all 5 from the first task were there. Identify the unique features and functionalities of each AI tool on the list and create a comparison chart to evaluate which tools are most suitable for specific marketing applications. Conduct a survey of marketing professionals to gather feedback and opinions on the identified AI tools and their effectiveness in improving marketing strategies.But then it apologized and said it cannot perform that task… Research and compile a list of case studies showcasing the successful implementation of AI tools in marketing campaigns.It spun up a case study about Coke using "Albert" to optimize their social media advertising campaigns and Amazon using "Rekognition" to analyze customer behavior and personalize their shopping experience. Create a step-by-step guide on how to integrate an AI tool into a marketing strategy, including best practices and potential challenges.The step by step wasn't too bad. 1: Define the problem or opportunity, 2: Identify the right AI tool by researching and comparing different tools, 3: Understand the data requirements like type of data, format, frequency updates, 4: Prepare the data for integration like cleaning and formatting 5: Integrate the AI tool Research and compile a list of AI tools that have a marketing application and were released in the last 2 months.Not sure why it did this for 2 months but this time it gave me tools with release dates which was pretty cool but also obvious that I wasn't getting tools released in the last 2 months, I was only getting tools released in the last 2 months since GPT's latest batch of data which was mid 2021. Create a comparison chart to evaluate the unique features and functionalities of each AI tool on the list and determine which tools are most suitable for specific marketing applications. Also try: https://agentgpt.reworkd.ai/ https://app.hyperwriteai.com/earlyAccess https://godmode.space/ OtherRewind.aihttps://twitter.com/dsiroker/status/1646895452317700097?s=20 https://www.rewind.ai/ You may have seen their public investor pitch on Twitter, the founder is the former creator of Optimizely and his team built a way for you to record and store everything you've said or heard or seen and they make it searchable. Obviously there's huge privacy considerations with something like this. But people don't seem to care haha they went from 0 to 700k in ARR in 3 months. Perplexity AI https://www.perplexity.ai/ Perplexity is a startup that's changing the way we search the web. With their conversational search engine, you can ask questions in plain English and get accurate answers from various sources. And with $26 million in Series A funding, (including investments from Ed Gil) they're looking to revolutionize the search engine game.Their AI technology sets it apart from traditional search engines like Google, and their chatbot-like interface is user-friendly and intuitive, it is built on top of GPT.Perplexity's focus on accuracy is a breath of fresh air in an era where search engines can be manipulated by advertisers and search engine optimization. The Series A funds will be used to optimize their knowledge database and expand their reach. All in all, Perplexity AI is definitely one to watch in the coming years!Character AIhttps://beta.character.ai/ Imagine being able to have a one-on-one conversation with your favorite celebrity or fictional character - well, that's now a possibility with Character.ai, an innovative AI website created by two former Google engineers. The platform has been growing in popularity since its launch last September, offering users the opportunity to chat with a wide range of characters for free. What's more, the interactions are so seamless that some users have reported spending hours chatting with their chosen personality, almost forgetting that they're talking to a machine.However, there's a catch - the interactions are not real, and the AI's responses may not always be accurate or reliable. Despite this, Character.ai has been embraced by fans who are looking for new ways to engage with their favorite personalities, particularly online. This is especially true for fan fiction lovers who are excited about the tool's potential for creating new experiences and making the barrier to entry for writing fan fiction much lower.But as with any new technology, there are concerns about its impact on real-world relationships, especially if users spend more time on it than with the people they care about. Furthermore, the AI technology could be used by stans to go after a perceived enemy of their favorite star, which could be a problem if it leads to harmful interactions or behavior towards other users.Despite these concerns, Character.ai represents a new frontier for fan culture, offering a new way for fans to engage with their favorite personalities and characters. The platform's growth in popularity is a testament to the human need for connection, and while it may not substitute actual interactions, it provides a unique and exciting way for fans to engage with their favorite personalities.Magichttps://magic.dev/ Magic, a startup developing an AI-driven tool to help software engineers write, review, debug, and plan code changes, just raised $23 million in a funding round led by Alphabet's CapitalG. The platform's CEO, Eric Steinberger, was inspired by AI at a young age and is using his experience as an AI researcher to create a tool that will communicate with users in natural language, helping to improve the speed and cost of developing software.Steinberger claims that Magic can understand legacy code and collaborate with users on code changes, operating like a pair programmer that learns more about the context of both coding projects and developers. The AI colleague that Magic offers will be able to understand code and can help developers navigate it, allowing companies to scale the impact of their current employees and train new employees with less personal coaching.What sets it apart is that it allows developers to describe what they want in English, and the AI will understand it and collaborate with them on the changes. This means that developers can work with Magic like a colleague and send messages telling it what changes they want to be made to the code. This one isn't built on GPT, the company has developed its own neural network architecture that can rapidly read code and is capable of detecting, warning about and overcoming potential bugs in the code. Honorable mentionsGalileo AI - Create editable UI designs from text descriptionhttps://usegalileo.ai/ Notocat - Write your newsletters in Notion and send them to your subscribershttps://twitter.com/razvanilin/status/1643565644061999104 Brainf FM - music made by AI that's scientifically proven to increase focushttps://www.brain.fm/ Meeting notes and transcription appshttps://tldv.io/ https://vowel.com/ Image Vectorizer - Turn small images to vectorhttps://vectorizer.io/Speech-to-text generator for podcasts that creates notes, timestamps and summary contenthttps://castmagic.io/ https://podsqueeze.com/ Text-to-speech AI voice generatorhttps://murf.ai/ https://beta.elevenlabs.io/ Text-to-music AI music generator https://mubert.com/ Text-to-SQL query, connect your database, ask a question, get an answerhttps://usechannel.com/ Teachable's AI course curriculum generatorhttps://teachable.com/ai-curriculum-generator https://twitter.com/teachable/status/1641154548419969048?s=20 The opportunities are endless with AI tools and applications go far beyond marketing. I see too many people on Twitter dunking on using it to help you generate text or as a search engine or fact checker… and they're missing the big picture. People are using AI to Help them build custom meal plans Custom exercise plans Help them build sleeping plans Help them build routines and schedules with their newborns Planning road trips Planning date ideas Acting as a therapist Getting movie and book recommendations Planning a party or event Designing personal budgets Assisting with resume and cover letter writing Summarizing long articles and youtube videos Write SQL queries Explain python and CSS code in plain English ConclusionIt's not like our marketing jobs are gonna vanish overnight, but the shift is happening faster than many of us realize. AI's no longer just a loosely backed buzzword; it's doing things today that we used to think were impossible. So, as marketers, we've gotta take this tech seriously.There's 4 main things marketers should be doing to future-proof their careers: Instead of asking if AI's gonna replace our roles in marketing, we should be talking about how quickly it could happen and what it'll look like if it does. Staying informed and learning from some of the best podcasts and newsletters about AI. Ben's Bites, No Priors Podcast, A Guide for Thinking Humans and the AI Exchange are great resources. Now is the time to figure out if you need to make changes to your current area of speciality. Ask yourself if you should double down on additional areas like data and API services, getting closer to product and customers or starting to learn about ethics and data privacy. Stay curious, get your hands dirty and experiment fearlessly: Fill out some forms, spin up free trials, get on wait lists, and give new AI tools a chance. It's only by actually getting your hands dirty that you'll discover which tools truly work for you and which are just part of the ever growing sea of gimmicky AI tools. ✌️--Intro music by Wowa via UnminusCover art created with Midjourney
In this episode of Partnering Leadership, Mahan Tavakoli first shares some of his perspectives on the transformation organizations and leaders will face due to advancements in artificial intelligence. Then Mahan welcomes Tom Taulli, AI author, investor, and advisor, to discuss the letter asking for a pause on the development of generative AI, the potential disruptive impact of generative AI, and the potential of AI agents built on top of Open AI's Chat GPT. They also discussed the need for serious conversations around AI ethics and the need for responsible AI. Next, Tom Taulli talked about the potential of generative AI, such as Chat GPT, for automation, customer service, and cybersecurity. He mentioned the concept of hallucination, where the AI can give false answers that seem convincing, and the importance of setting up prompts to get better responses. He also mentioned using generative AI for coding, image creation, and summarizing information. Mahan and Tom further discussed the implications of AI chat agents and voice cloning technology, including the potential for AI to deceive people and the need for regulation to protect privacy. Finally, Tom Taulli addressed some potential uses of AI in the workplace and what professionals need to do to stay ahead of the many changes resulting from AI applications in the workplace. Some Highlights:- Generative AI and its Potential Disruptions- Implications of the letter asking for a six months pause on the development of advanced generative AI- AI agents such as AutoGPT, BabyAGI, and their potential uses - Applications of generative AI in marketing, customer service, image creation, and cybersecurity- Rise of chat agents and potential social consequences - Implications of voice and video cloning technologies- The necessity for human input in AI systems- How and why professionals can develop new competencies as a result of advancements in artificial intelligence Connect with Tom Taulli:Tom Taulli Website Tom Taulli on LinkedIn Connect with Mahan Tavakoli: Mahan Tavakoli Website Mahan Tavakoli on LinkedIn Partnering Leadership Website
The AI Breakdown: Daily Artificial Intelligence News and Discussions
Meet semi-autonomous specialized agents (SASAs), more descreet, focused implementations of AutoGPT and BabyAGI. The AI Breakdown helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Breakdown wherever you listen: https://pod.link/1680633614
Dive into the AI revolution with Episode 26 of "AI For Humans" as hosts Gavin Purcell and Kevin Pereira explore the latest innovations transforming creativity! Hear about Grimes' groundbreaking decision to let AI use her voice, and how this is paving the way for the future of music. Discover Bark, the new, free AI voice tool that's making high-quality audio accessible to creators everywhere. Journey into fantastical realms with an AI story-engine that combines Dungeons & Dragons with BabyAGI technology to build immersive story worlds. And finally, get an exclusive demo of Wonder Dynamics' AI-powered GFX tools, bringing Hollywood-style visual effects to your home studio. Subscribe now and unlock the full potential of AI-driven creativity!
The AI Breakdown: Daily Artificial Intelligence News and Discussions
For the last 3 weeks, AutoGPT has massively captured the attention of the AI community. But how useful is it really? Some are starting to ask whether it really lives up to the hype. Watch the original video: https://www.youtube.com/@TheAIBreakdown
Girişimci Muhabbeti'nin 3. Sezon 10. Bölümünde Barış, Tuna ve Samican, Türkiye'deki teknokentlerde bulunan şirketlere uygulanan vergi düzenlemelerini ve GPT-4 furyasının devamını ele aldık.Öncelikle, Türkiye'deki teknokentlerde faaliyet gösteren şirketlere uygulanan %10 ek kurumlar vergisi ve emisyon primine uygulanan ek vergi hakkında konuştuk. Bu vergi düzenlemelerinin girişimler ve teknoloji ekosistemi üzerinde yarattığı olumsuz etkileri değerlendirdik ve bu durumun gelişmekte olan teknoloji sektörü için ne anlama geldiğine dair düşüncelerimizi paylaştık.Daha sonra, GPT-4 etrafında oluşan yeni sistemler olan Langchain, AutoGPT ve BabyAGI gibi projeleri ve bunların yapay zeka ekosistemine etkilerini ele aldık. Bu projelerin sunduğu fırsatlar, zorluklar ve potansiyel uygulama alanları hakkında fikirlerimizi paylaştık, aynı zamanda bu yeni nesil yapay zeka sistemlerinin gelecekteki yönünü ve etkisini değerlendirdik.Bu bölümde, Türkiye'deki teknokentlerde bulunan şirketler için uygulanan vergi düzenlemelerinin etkilerini ve GPT-4 etrafında oluşan heyecan verici yeni sistemleri tartıştık. Umarız siz de bu bölümümüzü beğenir ve bizimle birlikte teknoloji ve yapay zeka dünyasındaki gelişmeleri takip etmeye devam edersiniz. Hepinize iyi bayramlar ve iyi dinlemeler!Referanslar:https://swipeline.co/2022de-yatirim-almis-yerli-girisimler-aldiklari-yatirimin-10-unu-devlete-vergi-olarak-odeyebilir/https://swipeline.co/2022de-yatirim-almis-girisimlerden-10-ek-vergi-alinacagi-kesinlesti/https://autogpt.net/amazing-use-cases-for-auto-gpt-on-twitter/https://twitter.com/VarunMayya/status/1643902198164717569?s=20
In zweiten Teil sprechen wir über KI-Agenten wie #AutoGPT und #BabyAGI und was sie derzeit können oder noch nicht können.Dies ist Teil der Livestream-Aufzeichnung vom 20. April 2023.Folgt uns auf:Twitter: https://twitter.com/KIundMenschTwitch: https://www.twitch.tv/kiundmenschYoutube: https://www.youtube.com/@kiundmensch*Highlights*
This week we are discussing Google Search and Large Language Models. We'll be exploring how LLMs and Transformers are encroaching on the world of search engines. We discuss how AI is changing the way we find information and interact with the internet from BabyAGI to AutoGPT. We touch on Filter bubbles, Hallucinations (and how nobody has solved this yet). News from this week in AI Reddit Wants to Get Paid for Helping to Teach Big A.I. Systems https://www.theverge.com/2023/4/18/23688463/reddit-developer-api-terms-change-monetization-ai Google Working On New More Personalized Search Engine & Project Magi https://www.seroundtable.com/google-new-search-engine-project-magi-35230.html Samsung has considered changing its search partner. Microsoft's Bing is powered by Open AI after Microsoft invested $10 billion in the company back in January. Sundar Pichai "No One In The Field Has Yet Solved Hallucination Problems" - https://uk.investing.com/news/stock-market-news/sundar-pichai-opens-up-about-google-bards-trippy-troubles-no-one-in-the-field-has-yet-solved-hallucination-problems-2987805 Crazy things we saw this week Autogpt (GPT-4 running fully autonomously) - https://twitter.com/mattcduff/status/1642528658693984256 Photographer admits prize-winning image was AI-generated https://www.theguardian.com/technology/2023/apr/17/photographer-admits-prize-winning-image-was-ai-generated https://callsam.ai/ • Call Sam - call and speak to an AI about absolutely anything. Fun thing to try out Some questions from this episode What do you think the future of Google Search will look like in the next 5-10 years? There has been some concern about AI and LLMs creating "filter bubbles" that limit the diversity of information users are exposed to. What are your thoughts on this issue, and how can it be addressed? Next week's episode - AI Ethics --- Send in a voice message: https://podcasters.spotify.com/pod/show/using-ai/message
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Capabilities and alignment of LLM cognitive architectures, published by Seth Herd on April 18, 2023 on LessWrong. Epistemic status: Hoping for help working through all of these new ideas. TLDR: Scaffolded, "agentized" LLMs that combine and extend the approaches in AutoGPT, HuggingGPT, Reflexion, and BabyAGI seem likely to be a focus of near-term AI development. LLMs by themselves are like a human with great automatic language processing, but no goal-directed agency, executive function, episodic memory, or sensory processing. Recent work has added all of these to LLMs, making language model cognitive architectures (LMCAs). These implementations are currently limited but will improve. Cognitive capacities interact synergistically in human cognition. In addition, this new direction of development will allow individuals and small businesses to contribute to progress on AGI. These new factors of compounding progress may speed progress in this direction. LMCAs might well become intelligent enough to create X-risk before other forms of AGI. I expect LMCAs to enhance the effective intelligence of LLMs by performing extensive, iterative, goal-directed "thinking" that incorporates topic-relevant web searches. The possible shortening of timelines-to-AGI is a downside, but the upside may be even larger. LMCAs pursue goals and do much of their “thinking” in natural language, enabling a natural language alignment (NLA) approach. They reason about and balance ethical goals much as humans do. This approach to AGI and alignment has large potential benefits relative to existing approaches to AGI and alignment. Overview I still think it's likely that agentized LLMs will change the alignment landscape for the better, although I've tempered my optimism a bit since writing that. A big piece of the logic for that hypothesis is why I expect this approach to become very useful, and possibly become the de-facto standard for AGI progress. The other piece was the potential positive impacts on alignment work. Both of those pieces of logic were compressed in that post. I expand on them here. Beginning with a caveat may be appropriate since much of the below sounds both speculative and optimistic. I describe many potential improvements and positive-sum synergies between different capabilities. There will surely be difficulties and many things that don't work as well or as easily as they might, for deep reasons that will slow development. It's quite possible that there are enough of those things that this direction will be eclipsed by continued development of large models, and that progress in integrating cognitive capacities will take a different route. In particular, this approach relies heavily on calls to large language models (LLMs). Calling cutting-edge LLMs will continue to have nontrivial costs in both time and money, as they require substantial computing resources. These may hamper this direction, or drive progress in substantially less human-like (e.g., parallel) or interpretable (e.g., a move to non-natural language core processing) directions. With these caveats in mind, I think the potentials for capabilities and alignment are enough to merit serious consideration from the alignment community, even this early in the game. I think AutoGPT, HuggingGPT, and similar script wrappers and tool extensions for LLMs are just the beginning, and there are low-hanging fruit and synergies that will add capability to LLMs, effectively enhancing their intelligence and usefulness. This approach makes an LLM the natural language cognitive engine at the center of a cognitive architecture. Cognitive architectures are computational models of human brain function, including separate cognitive capacities that work synergistically. Cognitive architectures are a longstanding field of research at the conjunctio...
Link to original articleWelcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Capabilities and alignment of LLM cognitive architectures, published by Seth Herd on April 18, 2023 on LessWrong. Epistemic status: Hoping for help working through all of these new ideas. TLDR: Scaffolded, "agentized" LLMs that combine and extend the approaches in AutoGPT, HuggingGPT, Reflexion, and BabyAGI seem likely to be a focus of near-term AI development. LLMs by themselves are like a human with great automatic language processing, but no goal-directed agency, executive function, episodic memory, or sensory processing. Recent work has added all of these to LLMs, making language model cognitive architectures (LMCAs). These implementations are currently limited but will improve. Cognitive capacities interact synergistically in human cognition. In addition, this new direction of development will allow individuals and small businesses to contribute to progress on AGI. These new factors of compounding progress may speed progress in this direction. LMCAs might well become intelligent enough to create X-risk before other forms of AGI. I expect LMCAs to enhance the effective intelligence of LLMs by performing extensive, iterative, goal-directed "thinking" that incorporates topic-relevant web searches. The possible shortening of timelines-to-AGI is a downside, but the upside may be even larger. LMCAs pursue goals and do much of their “thinking” in natural language, enabling a natural language alignment (NLA) approach. They reason about and balance ethical goals much as humans do. This approach to AGI and alignment has large potential benefits relative to existing approaches to AGI and alignment. Overview I still think it's likely that agentized LLMs will change the alignment landscape for the better, although I've tempered my optimism a bit since writing that. A big piece of the logic for that hypothesis is why I expect this approach to become very useful, and possibly become the de-facto standard for AGI progress. The other piece was the potential positive impacts on alignment work. Both of those pieces of logic were compressed in that post. I expand on them here. Beginning with a caveat may be appropriate since much of the below sounds both speculative and optimistic. I describe many potential improvements and positive-sum synergies between different capabilities. There will surely be difficulties and many things that don't work as well or as easily as they might, for deep reasons that will slow development. It's quite possible that there are enough of those things that this direction will be eclipsed by continued development of large models, and that progress in integrating cognitive capacities will take a different route. In particular, this approach relies heavily on calls to large language models (LLMs). Calling cutting-edge LLMs will continue to have nontrivial costs in both time and money, as they require substantial computing resources. These may hamper this direction, or drive progress in substantially less human-like (e.g., parallel) or interpretable (e.g., a move to non-natural language core processing) directions. With these caveats in mind, I think the potentials for capabilities and alignment are enough to merit serious consideration from the alignment community, even this early in the game. I think AutoGPT, HuggingGPT, and similar script wrappers and tool extensions for LLMs are just the beginning, and there are low-hanging fruit and synergies that will add capability to LLMs, effectively enhancing their intelligence and usefulness. This approach makes an LLM the natural language cognitive engine at the center of a cognitive architecture. Cognitive architectures are computational models of human brain function, including separate cognitive capacities that work synergistically. Cognitive architectures are a longstanding field of research at the conjunctio...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Capabilities and alignment of LLM cognitive architectures, published by Seth Herd on April 18, 2023 on The AI Alignment Forum. Epistemic status: Hoping for help working through all of these new ideas. TLDR: Scaffolded, "agentized" LLMs that combine and extend the approaches in AutoGPT, HuggingGPT, Reflexion, and BabyAGI seem likely to be a focus of near-term AI development. LLMs by themselves are like a human with great automatic language processing, but no goal-directed agency, executive function, episodic memory, or sensory processing. Recent work has added all of these to LLMs, making language model cognitive architectures (LMCAs). These implementations are currently limited but will improve. Cognitive capacities interact synergistically in human cognition. In addition, this new direction of development will allow individuals and small businesses to contribute to progress on AGI. These new factors of compounding progress may speed progress in this direction. LMCAs might well become intelligent enough to create X-risk before other forms of AGI. I expect LMCAs to enhance the effective intelligence of LLMs by performing extensive, iterative, goal-directed "thinking" that incorporates topic-relevant web searches. The possible shortening of timelines-to-AGI is a downside, but the upside may be even larger. LMCAs pursue goals and do much of their “thinking” in natural language, enabling a natural language alignment (NLA) approach. They reason about and balance ethical goals much as humans do. This approach to AGI and alignment has large potential benefits relative to existing approaches to AGI and alignment. Overview I still think it's likely that agentized LLMs will change the alignment landscape for the better, although I've tempered my optimism a bit since writing that. A big piece of the logic for that hypothesis is why I expect this approach to become very useful, and possibly become the de-facto standard for AGI progress. The other piece was the potential positive impacts on alignment work. Both of those pieces of logic were compressed in that post. I expand on them here. Beginning with a caveat may be appropriate since much of the below sounds both speculative and optimistic. I describe many potential improvements and positive-sum synergies between different capabilities. There will surely be difficulties and many things that don't work as well or as easily as they might, for deep reasons that will slow development. It's quite possible that there are enough of those things that this direction will be eclipsed by continued development of large models, and that progress in integrating cognitive capacities will take a different route. In particular, this approach relies heavily on calls to large language models (LLMs). Calling cutting-edge LLMs will continue to have nontrivial costs in both time and money, as they require substantial computing resources. These may hamper this direction, or drive progress in substantially less human-like (e.g., parallel) or interpretable (e.g., a move to non-natural language core processing) directions. With these caveats in mind, I think the potentials for capabilities and alignment are enough to merit serious consideration from the alignment community, even this early in the game. I think AutoGPT, HuggingGPT, and similar script wrappers and tool extensions for LLMs are just the beginning, and there are low-hanging fruit and synergies that will add capability to LLMs, effectively enhancing their intelligence and usefulness. This approach makes an LLM the natural language cognitive engine at the center of a cognitive architecture. Cognitive architectures are computational models of human brain function, including separate cognitive capacities that work synergistically. Cognitive architectures are a longstanding field of research at t...
Before becoming a partner at Madrona Venture Group, Jon Turow worked as the head of product for computer vision at Amazon Web Services. He spent nine years at AWS in the product organization. Since becoming a venture capitalist, he's invested in promising AI companies like Runway and Numbers Station, along with the buzzy data company MotherDuck.So when Amazon announced a partnership, called Amazon Bedrock, with Anthropic, Stability AI, and AI21 Labs, I asked Turow to come on the show to help me break down Amazon's effort to bring foundation models closer to its cloud customers. Turow is someone who has helped me think through everything that's happening in artificial intelligence broadly. So we had a fun conversation about open source and the excitement around AI agents — like BabyAGI. This episode of Newcomer is brought to you by VantaSecurity is no longer a cost center — it's a strategic growth engine that sets your business apart. That means it's more important than ever to prove you handle customer data with the utmost integrity.But demonstrating your security and compliance can be time-consuming, tedious, and expensive. Until you use Vanta.Vanta's enterprise-ready Trust Management Platform empowers you to:* Centralize and scale your security program* Automate compliance for the most sought-after frameworks, including SOC 2, ISO 27001, and GDPR* Earn and maintain the trust of customers and vendors alikeWith Vanta, you can save up to 400 hours and 85% of costs. Win more deals and enable growth quickly, easily, and without breaking the bank.For a limited time, Newcomer listeners get $1,000 off Vanta. Go to vanta.com/newcomer to get started.Find the Podcast Get full access to Newcomer at www.newcomer.co/subscribe
Before becoming a partner at Madrona Venture Group, Jon Turow worked as the head of product for computer vision at Amazon Web Services. He spent nine years at AWS in the product organization. Since becoming a venture capitalist, he's invested in promising AI companies like Runway and Numbers Station, along with the buzzy data company MotherDuck.So when Amazon announced a partnership, called Amazon Bedrock, with Anthropic, Stability AI, and AI21 Labs, I asked Turow to come on the show to help me break down Amazon's effort to bring foundation models closer to its cloud customers. Turow is someone who has helped me think through everything that's happening in artificial intelligence broadly. So we had a fun conversation about open source and the excitement around AI agents — like BabyAGI. This episode of Newcomer is brought to you by VantaSecurity is no longer a cost center — it's a strategic growth engine that sets your business apart. That means it's more important than ever to prove you handle customer data with the utmost integrity.But demonstrating your security and compliance can be time-consuming, tedious, and expensive. Until you use Vanta.Vanta's enterprise-ready Trust Management Platform empowers you to:* Centralize and scale your security program* Automate compliance for the most sought-after frameworks, including SOC 2, ISO 27001, and GDPR* Earn and maintain the trust of customers and vendors alikeWith Vanta, you can save up to 400 hours and 85% of costs. Win more deals and enable growth quickly, easily, and without breaking the bank.For a limited time, Newcomer listeners get $1,000 off Vanta. Go to vanta.com/newcomer to get started.Find the Podcast Get full access to Newcomer at www.newcomer.co/subscribe
2023 április 17., hétfő 9-10 óra Mihálovits gazda rovatunkban: Kilátások a mezőgazdasági géppiacon 2023-ban. A mezőgazdasági termeléssel kapcsolatban az elmúlt időszakban főként a tavalyi történelmi aszály következményeiről és a hatékonyság növelésének lehetséges módjairól volt szó. Kevesebb figyelmet kapott a géppiac, amely szintén meghatározó jelentőségű, többek közt a precíziós gazdálkodásra való átállásban. Fazakas Péter, az MKB Bank és Takarékbank Agrár- és Élelmiszeripari Üzletág Elemzési és Kompetencia Központ vezetője. Tőzsdenyitás Varga Zoltánnal, az Equilor Zrt. senior elemzőjével. Heuréka élmény rovat. ChaosGPT - AutoGPT - BabyAGI. Mi is van ezekkel a világvége gyártó AI-okkal? A ChaosGPT (egy uj destruktiv AI) fogja kiírtani az emberiséget? Ráadásul mindezt úgy, hogy ezeket a probálkozásokat még “gyerekcipőben” teszi. Beszélgetünk kicsit az AI automatizálásról vagyis arról, hogy több AI összekapcsolásával hogyan kapunk egy korai AGI-t, vagyis az általános célú mesterséges intelligencia jeleit mutató rendszereket. Ez a BabyAGI. Illetve szó esik arról, hogy mire képes már most egy AI, ha hagyják… Milyen hatással van ez a jelenség a közeljövő biztonságára? Az AI demokratizalodasa es automitizálása lehet, hogy rossz ötlet? A betiltás mire jó? Pl. ha bizonyos országokban nem elérhető a ChatGPT, akkor ettől változik valami? Keleti Arthur, kibertitok jövőkutató, az Informatikai Biztonság Napja alapítója.
Welcome back to "AI Prompts," your go-to podcast for all things artificial intelligence! Today, we have an exciting episode for you as we discuss an incredible week in Generative AI. We'll cover two groundbreaking topics that happened this past week in Artificial Intelligence. First up, let's talk about Auto GPT, a version of GPT-4 that runs completely autonomously. Imagine an AI so advanced that it not only has a voice but can also fix code, set tasks, create new instances, and more. The possibilities with Auto G P T are truly astounding, and the implications for our future are immense. By connecting Auto GPT to virtually any application or system, we can harness the power of GPT-4 and let it work its magic without any human intervention. This breakthrough opens the door to a vast array of world-changing creations. Picture a world where AI agents talk to other AI agents, streamlining processes, and making human intervention almost obsolete. Moving on to our second topic, let's explore BabyAGI, a fascinating new program that has recently been open-sourced. Given a task, BabyAGI creates a task list and then executes the tasks over and over again. Its ability to automate processes has made it one of the top trending repositories on GitHub. If you're interested in checking it out, we've provided a link to the repo in the show notes. To get started with BabyAGI, you can follow the provided instructions to run it locally on your machine. As you tinker with the code, you'll begin to see the potential of this AI in streamlining workflows and automating repetitive tasks, ultimately increasing efficiency and productivity. In another exciting development, someone has created a tool that translates code from one programming language to another. This innovation is not only a time-saver for developers working on cross-platform projects, but it's also an excellent way to learn new programming languages. By leveraging this translation tool, programmers can gain insights into the structure and syntax of various languages, accelerating their learning process. As the world of programming continues to evolve, tools like these will become indispensable for staying up-to-date with the latest languages and techniques. Next up, let's talk about Worldcoin's recently launched Proof of Personhood SDK. This innovative solution aims to verify someone's human status on the internet. By providing a secure and reliable way to confirm human users, this technology has the potential to combat bots and fake accounts that often plague online platforms. The applications of Worldcoin's SDK are vast, ranging from ensuring fair voting processes in online polls to implementing secure authentication methods for online services. As our digital lives become increasingly intertwined, tools like this will play a crucial role in maintaining the integrity of the internet. We also have an interesting tool that lets you scrape a website and then query the data using Langchain. This powerful combination enables users to extract valuable information from web pages and analyze it using natural language queries. By simplifying the process of web scraping and data analysis, this tool empowers users to access and understand information more efficiently. As AI continues to integrate with our daily lives, we can expect to see more tools like this one, making complex tasks more approachable for everyone.