We're announcing AIEWF speakers this week! Take the AI Engineering Survey!

Today's guest, Ethan, first joined us for the LS Paper Club as the lead on NVIDIA Cosmos World Model, but then joined xAI and built Grok Imagine in 3 months.

He comes back on Latent Space with some nuclear hot takes: that Video Models primarily get their intelligence from LLMs, not from training on video data, and that the next frontier for truly interactive, realtime, long-horizon world models is to work on LLMs (perhaps Interaction Models as well…).

Put it this way: in the near term, the next Sora won't be a better video model, but a video agent.

Generative Media may more closely follow the evolution of AI coding, which went from focusing on one-shot output performance and cost to multiturn reasoning and planning models for agents and systems that can plan, edit, test, debug, and submit PRs.

At a certain point, coding models got so good that the only significant next step to improve performance was handling the orchestration of these models.

Now, as the performance of video models increases significantly across realism, consistency, and prompt adherence while becoming more cost efficient, the next evolution of video generation may also be systems that can plan, generate, edit, critique, and iterate across an entire creative task.

In this episode, Ethan joins swyx and Vibhu to unpack what it actually takes to build frontier image and video systems: data, VAEs, diffusion transformers, audio-video alignment, inference speedups, and the hidden cost of storing and moving massive video datasets.
From building NVIDIA's Cosmos world model to joining xAI as Grok Imagine was being built from zero to one, Ethan He has been at the center of some of the most important work in video generation, multimodal models, and real-time world models.

We go deep on Grok Imagine, how a small xAI team shipped its first multimodal video model in three months, why iteration speed matters more than almost anything in model development, and why many of the biggest gains come from fixing tiny bugs in data and training pipelines.

Flipbook: The future of Videomaxxing

Video agents are almost a sure bet to be the trend in the coming year. We end with a glance at what's beyond video agents.

Flipbook caused a minor sensation this year when it was released, but most treat it as a fun demo. Ethan takes it very seriously: with the speed and cost of inference coming down every year, the future of custom video JIT UI is closer than you think.

We talked about why videogen models may become the front end of AI, how generative UI could replace traditional HTML/CSS, why world models need to be real-time, interactive, and long-horizon, and why the future of video generation may depend more on language models and agents than on diffusion alone.

We discuss:
* Why fast iteration mattered more than meetings
* Why small training bugs can drive huge model quality gains
* Why coding models may make compute the bottleneck again
* How image and video models are trained with synthetic captions
* The role of VAEs and latent space in frontier video models
* Why image models are the foundation for video models
* The tradeoff between temporal compression and real-time interactivity
* Flipbook, Neural OS, and the future of generative UI
* Why future interfaces may go from user intent to pixels
* The hidden cost of training video models: storage, egress, and GPU hours
* How step distillation and consistency models (like OpenAI sCM) make video inference orders of magnitude faster
* Grok Imagine 0.9 and large-scale audio-video generation
* Why audio-video alignment is harder than text-video alignment
* Ethan's definition of world models
* Reference-to-video, video extension, and long-context video generation
* Why xAI's research communication undersells Grok Imagine
* How xAI culture shaped the speed of development
* AI watermarking, SynthID, and detecting generated media
* Why prompt rewriting matters for video models
* Grok Imagine Agent and the rise of video agents
* Why language models may unlock better video generation
* Robotics, physical AI, and embodied world models
* Why Ethan left xAI and shifted focus toward LLMs
* Self-managed context, memory, and the next frontier for language models

Ethan He
* LinkedIn: https://www.linkedin.com/in/ethanhe42
* X: https://x.com/EthanHe_42

Timestamps
00:00:00 Introduction
00:01:25 From NVIDIA Cosmos to xAI
00:03:24 Building Grok Imagine from Zero to One
00:10:07 How Image and Video Models Are Trained
00:18:53 Video Compression, VAEs, and Real-Time Tradeoffs
00:22:10 Generative UI, Flipbook, and Neural OS
00:32:10 The Cost of Training Large Video Models
00:37:04 Distillation, GANs, and Fast Video Inference
00:41:21 Audio-Video Generation and Grok Imagine 0.9
00:48:34 What Makes a World Model?
00:55:51 Reference Videos, Long Context, and Video Memory
01:00:11 xAI Culture, Research, and First-Principles Building
01:09:45 AI Safety, Watermarking, and Prompt Rewriting
01:13:10 Video Agents and AI-Assisted Creation
01:27:32 Why Language Models Unlock Better Video
01:31:15 Robotics, Physical AI, and Embodied World Models
01:32:38 Why Ethan Left xAI
01:34:16 Self-Managed Context and the Future of LLMs
01:38:43 Ethan's Career Path and Closing Thoughts

Transcript

Introduction: Ethan He, Latent Space, and the Path to xAI

Swyx [00:00:00]: We're here in the studio with Ethan He, most recently of xAI. Welcome.

Ethan [00:00:10]: Thank you. Glad to be here.

Swyx [00:00:11]: We're also here with Vibhu.
You first joined the Latent Space world because you were working on Cosmos at NVIDIA, and you did a paper. We loved it. You presented it as well, so thank you for doing that.

Ethan [00:00:23]: Actually, I also presented on MoEs twice at Latent Space.

Swyx [00:00:29]: How did you actually hear about us? Did we reach out to you? Is that how it worked?

Ethan [00:00:33]: No, actually, it was the community. I realized there is this online community where people talk about AI and learn from each other through papers every week at the Paper Club. It's very nice. I learned a lot.

Swyx [00:00:49]: I think three years nonstop. We haven't stopped even on Christmas and New Year's. Many weeks I want to stop, but it keeps going.

Vibhu [00:00:58]: No, that was good. I think you had posted that you worked on a paper, and I was like, “Oh, very cool. We have Paper Club. Present then.” But I might have reached out to you after.

Swyx [00:01:05]: Because it's an amateur club, right? So it's very unusual, but sometimes we have paper authors come by and actually explain the paper. Today we just did the Poolside paper, which was apparently very good.

Vibhu [00:01:18]: Came out yesterday. Pretty interesting, right? Fully open. They talk about everything, systems. So it's a good one. We'll recommend people read it.

Swyx [00:01:25]: Bring us up to speed on your transition to xAI, ’cause I actually don't even know when you joined. Just tell the story of the transition.

From NVIDIA Cosmos to xAI: Scaling Video and World Models

Ethan [00:01:34]: Before xAI, I was working on the Cosmos world model at NVIDIA. Cosmos is a giant video foundation model that aims to simulate the world, and it serves as a foundation for all of the roboticists to build on top of.
Once I built the Cosmos one, I realized that this thing also has a scaling law similar to language models, so we need to scale up the video models further. That's why I realized I needed to move somewhere with much more compute resources. That's how I--

Swyx [00:02:13]: Than NVIDIA?

Vibhu [00:02:14]: The GPU rich came themselves. And timeline-wise, when was Cosmos? It was pretty early, right? It was open world model, open paper, everything.

Ethan [00:02:25]: It was the end of 2024.

Vibhu [00:02:28]: End of 2024.

Ethan [00:02:30]: Then in mid-2025, I moved to xAI. I joined about the time when xAI was about to build video models and multimodal models. There was no infra, no data, and no model, and as just a few engineers, we built it in three months and released the first model, Grok Imagine 0.9. Since then, I kept working on video models and moved more from training to post-training of the video models. For example, reference-to-video, kind of like the cameo feature, and video extension. And before I left, I worked on a world model, leading a small team focused on real-time, long-horizon video generation.

Building Grok Imagine From Scratch in Three Months

Swyx [00:03:24]: Can you give a rough roadmap? Okay, you're on a brand-new team. Grok previously was only text, or they partnered with BFL for their image gen stuff. What are the building blocks, right? You have compute, data you can procure somewhere. What is the sequence of things that people should think about when setting up a new team?

Vibhu [00:03:43]: Actually, even deeper, not just data you can procure. You guys had to go through getting the data too, right?
So you shipped it pretty fast, but yeah--

Swyx [00:03:51]: Three months is like--

Vibhu [00:03:52]: From everything--

Swyx [00:03:52]: Actually, like, very surprisingly fast.

Ethan [00:03:55]: One thing, I'd say, is thanks to my experience at NVIDIA, ’cause the first time, when we were building Cosmos together, we built it for about a year. So this is like the second time I've done it. I roughly have an idea what to do. I'd say the most important thing is the talent. Everyone was very strong and clever, and worked very closely with each other towards a common goal. That speeds things up a lot. You reduce the communication bandwidth among people, and everyone can work towards the same goal. Every day there are not that many meetings on the calendar, maybe a sync a day, and after that it's all building. It was pretty fun at that time.

And another thing is that xAI has very strong foundations in data infra and model infra, and the support there can help model development a lot. When I look at training models, the most important thing is how many iterations you can do per day. The more iterations you can do, the faster you can train the model. So if you have very strong infra and a lot of compute, you can train these models in a very short period of time. That gives you a much larger buffer for errors, and it also gives you the opportunity to spot more bugs.

Iteration Speed, Compute, and Debugging Model Pipelines

Swyx [00:05:46]: What is an iteration? Is it like a few hundred steps, or what are you--

Ethan [00:05:50]: Let's say training the model: acquire new data, maybe design new algorithms, and train a new model, maybe at smaller scale or--

Swyx [00:06:01]: So cycle time for any hyperparam that you're searching.

Ethan [00:06:04]: Cycle time, and then to eval this model.
Is this model better than my previous iteration?

Swyx [00:06:11]: So before you, someone had already set this up so that you can iterate very quickly.

Ethan [00:06:15]: I think the foundation there is extremely good for developing and researching models. And often I find, this is kind of boring, but a lot of the improvements do not come from new algorithms. They come from finding small bugs here and there in the data pipeline and in the model training pipeline. Those give the biggest boost to model quality.

Vibhu [00:06:46]: It's interesting, right? You say it's a small team, less communication bandwidth, but also a lot of quality comes from finding little bugs. It seems counterintuitive, right? With a lot of people, you can iron out more of those, but it's interesting to see the other side, right?

Swyx [00:07:00]: I also wonder, do you try using LLMs to look for bugs? I don't know.

Ethan [00:07:05]: I remember at that time it was mid-2025, so the coding model wasn't quite there yet. I remember around December 2025, it was extremely good. I was using it at that time. It's helpful. Sometimes it produces code that is kind of difficult to maintain, even though the first time it built something extremely fast. It gave me spaghetti code, thousands of lines that I couldn't maintain, and the LLM itself couldn't figure out what was wrong and how to improve on top of it. But now I find it much better. Another point I want to bring up here: now that coding models are much more efficient and can help us implement stuff much faster, compute might become a bottleneck again. Previously, if you wanted to train a new model, say you wanted to generate new synthetic data or write a new algorithm, it might take a few weeks.
And during that period of time, you might not have experiments to run. But now you can build that thing within a few hours, and then you can immediately train a model. Now you have to have enough compute to try all of the ideas. So compute might be the bottleneck of iteration speed again.

Swyx [00:08:36]: Yeah, honestly, I think it's kind of a stressful job, because you're like, “Well, I should be trying everything, and if I'm not, then I'm not doing my job well.”

Vibhu [00:08:48]: There's also the stress that you're eating thousands of GPUs per hour, which is very expensive, and that compute can go to other researchers.

Swyx [00:08:56]: You got the daddy Elon to--

Vibhu [00:08:57]: You got daddy Elon.

Ethan [00:08:59]: It was--

Vibhu [00:09:00]: But there's still a finite amount of compute. You want to use it, you want to use it well, you want more of it.

Ethan [00:09:06]: That was quite stressful indeed. I think one thing is, with coding models now, a lot of these jobs can be automated, which is much better. Second, it's a marathon, so you've got to maintain good health and a regular schedule.

Vibhu [00:09:28]: It's hard to hear that when you ship from zero to nothing in two months.

Swyx [00:09:32]: And I think obviously the culture at xAI is very famously that people work very hard. One thing I did want to dive into: in the notes that you sent ahead of time, you had specific comments about the cost of Video Gen training. Presumably this is on the Colossus-1, right? The 200-megawatt cluster. Whatever you want to share on that.

Vibhu [00:09:54]: I think there are three things we're talking about, right? There's Video Gen, and there's also the Image Gen model that you put out. Do you want to complete the, okay, so zero to one, you have a few months.
Just what are the stages of creating an Image Gen model?

Swyx [00:10:06]: Oh, yeah, maybe I got distracted.

How Image and Video Models Are Trained: Synthetic Captions, Tokenizers, and VAEs

Vibhu [00:10:07]: Sorry. And then from there, there's Video Gen, there's Audio Gen. Would love to get into those next. But what are those first few months like? So small team, a lot of bugs, iterations, but what does it look like? Do we take something off the shelf? Do we just get data and compute? What are the few months like? How do you get to a state-of-the-art Image Gen model? How do you just start?

Ethan [00:10:28]: I cannot comment specifically on how xAI did it, but it's quite a standard process. I can draw some examples from Cosmos. Mainly, to build a video model, you actually need to build an image model first. And to build these two models, the data you need is 100% synthetic pairs of language and image, or language and video. Because on the internet, videos don't naturally associate with text. You can say, oh, on YouTube you have the title and the description and the comments--

Swyx [00:11:11]: Title--

Ethan [00:11:11]: of a video, but usually they're not relevant to the video itself. Say the video is a natural scene of mountains or something, and the title is “I'm so happy today.” So they have no correlation at all. So the first step is that you have to generate synthetic pairs of language with the videos. You gather videos from the internet, and you use a VLM to caption the videos. For that part, here's a question: how do you get a VLM to begin with? So if there's no--

Swyx [00:11:55]: You, so you fuse the model, right? Like--

Ethan [00:11:57]: Say no VLM exists. How do you generate the text in the beginning, right?
It's impossible.

Swyx [00:12:04]: I see.

Ethan [00:12:05]: In the beginning, you ask humans to describe the video in as much detail as possible. For example, you ask them to describe everything: all objects, all characters, and all interactions and dialogue in the videos. That's in the protocol of Cosmos labeling. The objective we gave to the labelers was that you have to describe the video in as much detail as possible, such that a blind person who hears the blob of text can reconstruct what the video is like in their head.

Swyx [00:12:43]: Video or image? You're talking about images.

Ethan [00:12:44]: Video or image, either one of them.

Vibhu [00:12:47]: This was pretty common when we went from CLIP and DALL-E, right? It's all training on really detailed captioning of images. So the same is applied to video, but instead--

Ethan [00:12:57]: Same applied.

Vibhu [00:12:57]: of using a multimodal model to pass in video images and write rich descriptions, you can also--

Swyx [00:13:04]: I think there's this traditional perspective of a supervised, very highly human-curated thing. I feel like there's an unlock with unsupervised, right? Where you have enough to bootstrap that you can just throw common corpus on it, or whatever. Like unsupervised vision and language pairing, right? Where you just have interspersed image and text and it just learns. To me, that is the VLM breakthrough that is different from CLIP, different from the LM era.

Ethan [00:13:36]: It's interesting to see that you kind of need both kinds of data. For example, for the--

Swyx [00:13:41]: You need it to bootstrap it up. Yeah.

Ethan [00:13:43]: for generative model training, there's also usually a small percentage of unlabeled data. So the model is instructed to generate a video without any text instruction. That can also help the model generalize.
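Editor's note: the "small percentage of unlabeled data" idea can be sketched as randomly dropping captions when training examples are assembled. This is a minimal illustration under stated assumptions, not the Cosmos or xAI pipeline: the function names, the 10% default drop rate, and the use of an empty string to stand for "no text instruction" are all hypothetical.

```python
import random


def maybe_drop_caption(caption: str, p_drop: float, rng: random.Random) -> str:
    """Return the caption unchanged, or an empty string with probability p_drop."""
    # rng.random() is in [0.0, 1.0), so p_drop=0.0 never drops and p_drop=1.0 always drops.
    return "" if rng.random() < p_drop else caption


def build_batch(pairs, p_drop=0.1, seed=0):
    """Turn (video_id, caption) pairs into training examples, some without text.

    p_drop=0.1 is an assumed value chosen only for illustration.
    """
    rng = random.Random(seed)
    return [(video_id, maybe_drop_caption(caption, p_drop, rng))
            for video_id, caption in pairs]


if __name__ == "__main__":
    examples = [("clip_001", "a mountain lake at sunrise"),
                ("clip_002", "a dog running on a beach")]
    for video_id, caption in build_batch(examples, p_drop=0.5, seed=42):
        print(video_id, repr(caption))
```

Examples whose caption comes back empty play the role of the unlabeled portion: the model still has to produce the video, but with no text to condition on.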
After this stage of generating synthetic pairs, one important common step is to train a compressor, or tokenizer, for the images or videos. Theoretically you can train image or video models on pure pixels, but the problem is that it's a lot of tokens. One image that's a thousand by a thousand is one million pixels, so one million tokens. It's impossible to train a transformer on that. So you need to train a tokenizer, which can go from image to latent space and from latent space back to image.

Swyx [00:14:45]: That's why we named the podcast. But basically, you're talking about vocabulary size.

Ethan [00:14:50]: So vocab.

Swyx [00:14:51]: And so, what is, like a million is impossible?

Ethan [00:14:54]: In generative models, the vocab is continuous. It's a continuous space. You can think of it as mapping an image to a vector. It's a fixed-length vector, sixteen or forty-eight, something like that. And then you map that vector back to the image space. The mapping is patch-based. Say you have a sixteen by sixteen patch, and you map that patch of pixels into this latent space.

Swyx [00:15:29]: We've covered this--

Vibhu [00:15:30]: This is like the vision transformers--

Swyx [00:15:32]: VAEs.

Ethan [00:15:33]: VAEs.

Vibhu [00:15:34]: You basically compress your input, you do your generation, you're reasoning, all that generation, in a smaller dimension, and then you project back out.

Swyx [00:15:43]: A VAE is a form of compression, but for me, the patching thing is from ViT, right?

Ethan [00:15:48]: You can make those.

Swyx [00:15:49]: Literally, yeah, the paper is titled like “sixteen by sixteen is all you need,” something like that.
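Editor's note: the token-count argument above is simple arithmetic, sketched here for the numbers mentioned in the conversation (a 1000 by 1000 image, a 16 by 16 patch). Treating one pixel as one token follows the speaker's framing; rounding partial patches up, as if the image were padded, is an assumption, and the function names are hypothetical.

```python
import math


def pixel_count(height: int, width: int) -> int:
    """Tokens needed if every pixel were its own token."""
    return height * width


def patch_token_count(height: int, width: int, patch: int = 16) -> int:
    """Tokens needed if each patch x patch block of pixels becomes one latent token.

    Partial patches at the border are rounded up (assumed padding).
    """
    return math.ceil(height / patch) * math.ceil(width / patch)


if __name__ == "__main__":
    h = w = 1000
    print(pixel_count(h, w))            # 1000000
    print(patch_token_count(h, w, 16))  # 3969 (63 x 63 patches)
```

Under these assumptions the sequence shrinks by a factor of roughly 250, which is the difference between a context a transformer cannot realistically handle and one it can.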
And then I think also, people make a lot of comparisons between this kind of patching and convolutions. Which is, you're kind of reconstructing the old paradigm with the new.

Ethan [00:16:05]: Actually, in VAEs, there are both convolutional networks and transformers. You can actually do both.

After this VAE, what you've got is latent space tokens and language tokens. So now the training of the diffusion transformer. Usually generative models use diffusion transformers. It is actually quite standard. It's very similar to how you train a language transformer model. There's not that much difference. It's just visual tokens in, visual tokens out. The only difference is there's a denoising process. You add random noise to the visual tokens, and then you train the model to remove that noise to generate the clean tokens. At inference, the model can iteratively remove noise, starting from 100% noise.

Swyx [00:17:12]: And then there's also, to speed things along on the tech tree of diffusion, there's CFG, and then there's also latent diffusion, that's somewhere in there. I think somewhere along the line, obviously, Stability and all these other guys pioneered a lot of this architecture. I don't know if you want to get into that, or do the video side, up to you.

Bootstrapping Video from Image Models and Temporal Compression

Ethan [00:17:37]: After you train such an image model, the reason it's a foundation for video models is that image models are cheaper to train, and they have a much denser connection between language and images. For example, you train on a billion images, and there's a mapping from the text to the image.
And the cost to train the same, a billion texts to a billion videos, is much more expensive, because videos naturally have more tokens than images. Because the diffusion models' understanding of language purely comes from this mapping, if you don't have enough mapping, say you only train on ten million videos or something, you might not see enough language tokens in your training, so your model does not understand human intention well enough. That's why you first train the image diffusion model, and then you bootstrap the video model from there.

Swyx [00:18:53]: One thing I did want to ask, because actually, I think you're the first video model person I've ever talked to, I think. We've talked to Luma and all those folks. There are all these tricks in video compression where basically, frame by frame, there's not that much difference, so actually you don't have to regenerate or save the whole frame, right? I think MP4 compression or something else like that. Is it tempting to use that? Or as far as I can tell, everyone just treats it as, “No, we would just generate every frame.” Is that roughly the state of the art?

Ethan [00:19:27]: There are a few different approaches. Let's say first, you want to just directly use MP4 compression and use that as the tokens for the transformers to train on, right? People actually have tried that, but the main challenge is that the latent space of the MP4 tokens was not very comprehensible for the models. It's extremely hard to train on that. So that's why they created VAEs, which create a more continuous latent space, so the models can understand that latent space and learn from it much more easily. Even within VAEs, there are different difficulties of the latent space.
You can imagine the simplest, most naive VAE: you have an image, and you just shuffle all of the pixels into a vector. So you don't need to train any VAE, right? But that latent space is extremely hard for models to train on top of. That's why there is some debate on how you compress the tokens. As you mentioned, you can compress frame by frame. You can also compress the temporal dimension.

The difference is that if you compress the temporal dimension, you get a much higher compression rate. Because there's temporal redundancy between frames: this frame and the last frame are likely mostly similar, so there's only some small difference. For example, I think in the Wan 2.1 VAE, they have an eight by eight by four compression rate. So four temporal tokens are compressed into one token. That can save a lot of the context length. If you do it frame by frame, you have to do maybe eight by eight by one. Your context length will be four times larger. That being said, the benefit of per-frame compression, and we might come back to this later, is real-timeness and interactivity. ’Cause if you stream the output of the model frame by frame, the model can respond to any user request immediately. So if you have a four-times temporal compression, then--

Swyx [00:22:06]: It might be laggy.

Ethan [00:22:07]: there's a lag there by nature.

Swyx [00:22:10]: So you're very pilled on this. Let's just go ahead and bring it up, ’cause we have the visual prepared anyway. There are some frontier applications of real-time video gen. Flipbook is one of the examples that went viral recently, right? What is Flipbook?

Real-Time Generative UI: Flipbook, Neural OS, and Diffusion Front Ends

Ethan [00:22:23]: Flipbook is kind of like a web browser. You can see it has the web browser UI on top.
The difference is that all of the UIs are generated by a generative image model in real time, and everything here is fake. But you can explore inside this imaginary world. Say, here we have engineering the Great Pyramid. The model generates this for us to understand how it works, and if we want to navigate around and understand further, we can click on some of the descriptions here, and the model will generate a new page, a new subpage describing the details we want to know about.

Swyx [00:23:14]: So it's basically kind of like we're playing a video, but it's pausing for our next interaction, and then it just plays the next thing based on our interaction. Which is kind of cool.

Vibhu [00:23:25]: And you kind of decide your story. So this was, how do you make a pyramid? The levering technique seemed interesting, right? It shows how you take, okay, I want to know what this is--

Swyx [00:23:35]: The demo tweet had more animation between frames.

Vibhu [00:23:38]: I think it's just skipping.

Swyx [00:23:39]: Oh, it's just skipping a lot of frames.

Ethan [00:23:40]: They also have a video mode--

Vibhu [00:23:42]: It takes a lot. There's a lot of people--

Ethan [00:23:42]: but a lot of people are using it. So it's not available.

Vibhu [00:23:46]: There's a live video stream. We can try.

Swyx [00:23:50]: So this is an example of the kind of future that you see at the extreme. We're obviously not in it today. But in a world where inference is completely free, this is better than generating code and text?

Ethan [00:24:02]: So this is a final state of where video will be at for world models, I think. Imagine the internet doesn't exist, and then you type in google.com. What should a model show you? The model can imagine something, and this is what the model imagines. And these web pages, they completely do not exist.
So I think as inference costs come down, we are going to have generative UI for everything. If you think about how coding models work, they write code for a web page, the code might be converted into binary, and the binary renders the pixels on the screen. In machine learning, every time we have some breakthrough, it's more end-to-end. So why don't we go from user instruction to the pixels directly? Generative UI will be user intention to pixels directly. Say, even for email, everyone has the same interface, but I want it slightly different. I want email shown to me like TikTok, so I can swipe left and right through the emails. Or maybe you want something else. We can have completely different things. Or I'm looking at Instagram stories, and I don't like the Like button. I always misclick it. Generative UI resolves that. So it's going to be a revolutionary replacement of the interface. In the future, we might have much more powerful LLMs and coding models running behind the scenes. And in the front end, the diffusion model will actually be the front end that shows stuff to you. That's how I imagine it.

Swyx [00:26:02]: Diffusion front-end, deterministic back-end. Something like that. I find that very expensive, but--

Vibhu [00:26:08]: I find it interesting you called LLMs writing code on the back end deterministic, but okay.

Swyx [00:26:14]: You write it once--

Vibhu [00:26:15]: Compare it to--

Swyx [00:26:16]: And then you execute.

Ethan [00:26:17]: If you think about the cost, let's say an H100 costs $1 per hour, and you use this eight hours a day for thirty days, so every month you're paying $240. You'd actually not want to pay for that. That's even more expensive than Claude Code Max.
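Editor's note: the monthly figure above is straightforward to reproduce. The sketch below uses only the numbers quoted in the conversation ($1 per GPU-hour, eight hours a day, thirty days); the projection helper and its annual decline factor are hypothetical parameters added for illustration, not a forecast.

```python
def monthly_cost(price_per_gpu_hour: float, hours_per_day: float, days_per_month: int) -> float:
    """Monthly spend for one GPU used a fixed number of hours per day."""
    return price_per_gpu_hour * hours_per_day * days_per_month


def projected_cost(cost_today: float, annual_decline_factor: float, years: int) -> float:
    """Cost after `years` if it falls by `annual_decline_factor` each year.

    The decline factor is an assumption supplied by the caller.
    """
    return cost_today / (annual_decline_factor ** years)


if __name__ == "__main__":
    base = monthly_cost(1.0, 8, 30)
    print(base)                          # 240.0
    print(projected_cost(base, 2.0, 3))  # 30.0
```

With a decline factor of 2, the $240 per month falls to $30 after three years; a larger factor shortens that timeline accordingly.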
But if you think about compute costs coming down like two times every year, I think that future will likely arrive within a few years.

Vibhu [00:26:49]: It's everything, right? Compute cost comes down, compute gets faster, model gets smarter--

Ethan [00:26:54]: More efficient.

Vibhu [00:26:54]: model gets smaller.

Swyx [00:26:55]: I don't know why you say two times, ’cause I think it's like 100 times. In language models, it is roughly one hundred to a thousand times every twelve to eighteen months, for the same given level of LMSYS Elo.

Vibhu [00:27:08]: That's a net of everything, right? That's model performance alongside compute. So it's different from just compute costs coming down. But a very interesting future.

Swyx [00:27:19]: So the web designers will have to shout out that accessibility is an issue, right? How do you deal with screen readers or whatever? But yes, this is higher-bandwidth storytelling than anything you can possibly generate with code, right? So I think that's the rough idea.

Ethan [00:27:34]: And I'd like to add a little bit: humans naturally have the maximum input bandwidth when we are looking at things, looking at videos, and we have the maximum output bandwidth when we are talking. So in the future, it might be something like: we talk to AI models, and the AI model responds with a generative UI. That would be the maximum input and output bandwidth for interacting with AI models before Neuralink happens.

Vibhu [00:28:06]: And it's also very custom, right? Some people are very visual, some people are not as visual, right? They prefer text. But the best thing about generative UI, right, is that it can also be text.

Swyx [00:28:17]: There's another project that we wanted to highlight, which is Neural OS. Kind of a similar idea, but here you're literally simulating an operating system with a video model. And you can play Doom, you can do Firefox.
I find this mildly less impressive, obviously, because it's an OS that I can run.

Swyx [00:28:37]: But here everything is imagined.

Vibhu [00:28:40]: I was used to Command+W to close the Firefox tab. It didn't crash. That's why I said

Swyx [00:28:45]: It's too immersive.

Vibhu [00:28:46]: It's too immersive for me.

Swyx [00:28:47]: Too immersive.

Vibhu [00:28:48]: I wanted to close the tab.

Vibhu [00:28:49]: But yes, I can play generated diffusion.

Swyx [00:28:51]: This is shockingly fast.

Swyx [00:28:54]: Because I remember there was a demo maybe one to two years ago. Someone tried to do a first-person shooter with an image model. There was no consistency. It was very slow. But here it looks realistic. This is Doom.

Vibhu [00:29:07]: I think there's two sides to that, right? There's, okay, what is running a game? The heavy part of it is actually the game engine, all the lighting, all that stuff, the graphics. This is just kind of video, right? Like, we've solved consistency. This still looks like image generation from a few years ago. There's some temporal consistency, but it's kind of just images stitched together as frames of video. But it's a good visual representation to picture the future you want to see, right? That's more what I see in these.

Ethan [00:29:38]: This reminds me of how video models get better and better. If you just look at Neural OS, it feels like a crappy version of the Windows we could already have, right? But the difference is that this model is overfitted on the existing operating systems. It can generate nothing different from that. And it's actually similar to video models. When we train these video models and image models, we train them on the internet. There's no imaginary supernatural stuff on the internet. But once we train the model, you can prompt it to generate something supernatural that has never existed in the data set.
So if you train your Neural OS or neural computer on standard screen recordings from the entire internet, the model can imagine a completely new interface for interacting with the computer.

Swyx [00:30:43]: This is one of those things that is magical to me. Usually generalizing out of distribution is bad, but somehow we have learned some kind of internal world model where you say, "this, but it looks like rainbows and butterflies," and it'll do it, and it will kind of make sense.

Swyx [00:31:03]: So yeah, that's kind of cool. I don't know if there's any more comment on that. I did want to touch a little bit more on the model architecture stuff, which I think you were getting to. It's really fascinating. We don't get a chance to talk about this enough. One of the papers that we covered: we've covered every annual Segment Anything release. And I don't know if you follow, you're a computer vision guy, so you

Ethan [00:31:26]: I know

Swyx [00:31:27]: So they did memory attention, which is kind of interesting. And I always think anything where you can keep some consistency across the temporal dimension is very fascinating. Basically, the CV side bleeding into the video gen side is, I think, underexplored, right? We talk about it for labeling, but actually you can borrow the architecture itself.

Ethan [00:31:50]: There's also completely different approaches, right? You brought up the term world model, so we went from video model to world model. There is diffusion, but there's also other approaches that people are doing. So maybe we get into those after as well?

Swyx [00:32:03]: He has a whole definition of world models and stuff. I feel like we threw a lot at you.
Whatever you want to comment on.

Why Video Models Are Expensive: Storage, I/O, and Training Scale

Ethan [00:32:10]: I think one thing that we should actually come back to is, okay, we were talking about the steps to go from training image gen to a video model. One thing we don't see as much of is, okay, you brought up the delta in training data, right? So

Ethan [00:32:24]: you won't have as much, and a video model might not generalize, but what is the cost of training a large video model? We know it roughly for LLMs. Even the Poolside thing that came out today, right? It's a Gemma-level model trained on roughly forty trillion tokens on this many H200s over this much time, right? You can see what the exact cost of that is: how many GPU hours, times how much an H200 costs. So how do we do the back-of-the-envelope math for the same thing with video models and image models? How do you break that down?

I can share some back-of-the-envelope calculation. Surprisingly, the cost of video models is comparable to language models. Obviously the largest scale is still language models, so maybe comparable to a medium-scale language model. Just storing the videos alone costs a lot. You can look it up on AWS or something.

Ethan [00:33:20]: Say you have a billion videos, and let's just say each video is five megabytes. Then you need five petabytes just to store those videos. And remember we talked about using a VAE to compress the videos. Typically you also need to store those continuous features in your storage, and that's a comparable size to the videos themselves. So just storing these videos and the features is tens of petabytes alone. And,

Swyx [00:33:58]: I just looked up the calculation.
Five petabytes on S3 Standard is one hundred K per month.

Ethan [00:34:05]: And

Swyx [00:34:05]: It's comparable

Ethan [00:34:05]: and you need

Swyx [00:34:06]: And

Ethan [00:34:06]: And then for tens of petabytes, two hundred K. And even more expensive is the ingress and egress.

Swyx [00:34:13]: Oh, yeah.

Ethan [00:34:14]: Through the internet. Just to download those videos, I believe it's more expensive on AWS than just storing them.

Swyx [00:34:25]: Storing, yeah.

Ethan [00:34:25]: And for each training run, you probably need to pull them once. If you train multiple times, it's even more than that. So just the storage and the network, those costs would be a few million per month to store everything, not to mention the GPU cost.

Ethan [00:34:45]: And

Swyx [00:34:45]: My side tangent: compute rental, GPU rental, is very efficient. On one side, okay, you can be xAI and build your own data center. Should we not just build our own storage and compute as well? Like

Ethan [00:34:57]: Of course

Swyx [00:34:57]: cloud cost compared to just,

Ethan [00:34:59]: You save so much

Swyx [00:35:00]: store. Yeah, exactly.

Swyx [00:35:01]: Especially with egress and stuff. So.

Ethan [00:35:04]: That's a good idea, but it also comes with some challenges of its own.

Swyx [00:35:09]: Of course, of course.

Ethan [00:35:10]: People who build GPU data centers might not expect this much storage. And people who build storage typically just build it somewhere with only CPUs.

Swyx [00:35:23]: I just looked it up. AWS only charges for egress, not ingress. Tier five for five petabytes is two hundred and thirty K.

Ethan [00:35:32]: Even more expensive than the storage.

Swyx [00:35:34]: But storing is per month, right? You check in, then you cannot check out. So it's so cool. It's okay.
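Editor's sketch: the storage and egress numbers quoted above can be reproduced with a short calculation. The per-gigabyte rates below are assumptions back-solved from the figures mentioned in the conversation (about $100K per month and $230K for five petabytes); they are not taken from a price list, and real cloud pricing is tiered and changes over time. The function names are hypothetical.

```python
BYTES_PER_MB = 10**6
BYTES_PER_GB = 10**9
BYTES_PER_PB = 10**15

# Assumed flat rates, chosen only to match the figures quoted in the episode.
ASSUMED_STORAGE_USD_PER_GB_MONTH = 0.02
ASSUMED_EGRESS_USD_PER_GB = 0.046


def dataset_petabytes(num_videos, mb_per_video):
    """Raw dataset size in petabytes (decimal units)."""
    return num_videos * mb_per_video * BYTES_PER_MB / BYTES_PER_PB


def monthly_storage_cost(petabytes, usd_per_gb_month):
    """Recurring cost of keeping the data at rest."""
    return petabytes * (BYTES_PER_PB // BYTES_PER_GB) * usd_per_gb_month


def egress_cost(petabytes, usd_per_gb, pulls=1):
    """One-off cost of moving the data out, once per training run."""
    return petabytes * (BYTES_PER_PB // BYTES_PER_GB) * usd_per_gb * pulls


raw_pb = dataset_petabytes(1_000_000_000, 5)   # a billion 5 MB videos
with_latents_pb = raw_pb * 2                    # VAE features of comparable size

print(f"raw videos: {raw_pb:.1f} PB")
print(f"storage, raw only: ${monthly_storage_cost(raw_pb, ASSUMED_STORAGE_USD_PER_GB_MONTH):,.0f}/month")
print(f"storage, with latents: ${monthly_storage_cost(with_latents_pb, ASSUMED_STORAGE_USD_PER_GB_MONTH):,.0f}/month")
print(f"egress, one pull of raw videos: ${egress_cost(raw_pb, ASSUMED_EGRESS_USD_PER_GB):,.0f}")
```

Raising `pulls` shows how repeated training runs multiply the transfer bill while the storage bill keeps recurring monthly.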
So there's that side.

Ethan [00:35:41]: So the TLDR, my back-of-the-envelope math

Swyx [00:35:42]: Data is larger than you think. Yes.

Ethan [00:35:44]: my back-of-the-envelope math of GPU hours times GPU cost is also very much missing some storage.

Swyx [00:35:49]: You're basically also more IO bound than normal training.

Swyx [00:35:55]: Yes. 'Cause with data loading, caching everything becomes super important.

Ethan [00:36:00]: In Cosmos, we did a lot of optimizations to make it not IO bound. Speaking of actually training the model, the GPU cost: if you look at how big the open source video models are, I think LTX has nineteen B parameters. That's a dense model. People are also exploring MoEs, so it might be twenty B active and hundreds of B total. That's a similar size to medium-sized LLMs. And if you look at the number of tokens, we disclosed that in Cosmos. It's also tens of trillions of visual tokens. So putting this together, the cost of training these video models is actually comparable with LLMs. Not to mention, the infra is slightly different from LLMs, so it might be less efficient to train these models.

Inference Speedups: Step Distillation, Consistency Models, and GANs

Swyx [00:37:04]: Do you get the benefits of traditional diffusion speed-ups? For images, there's LCM, LoRAs for fine-tuning. There's a lot of stuff that's been

Ethan [00:37:15]: Flow matching.

Swyx [00:37:16]: There's flow matching. There's a lot of stuff that's been done. Is there some overlap that applies to diffusion on the inference side and stuff, or?

Ethan [00:37:23]: The inference side is a completely different story.

Ethan [00:37:28]: I think for the training side, it might be a little bit hard to reduce that cost. For the inference side, the biggest gain is from the distillation of these models.
It's called step distillation, which is slightly different from knowledge distillation in LLMs. Typically, for flow matching models, you need something like 100 steps. A diffusion model needs even more, like 1,000 steps, to generate a good image or video. Step distillation tries to learn to generate in fewer steps from the model itself. You use the full model to generate in 100 steps, and then you take a model that only generates in 10 steps and let that model learn from the 100-step one.

Ethan [00:38:25]: Why this works

Swyx [00:38:27]: Strong to weak, seemingly.

Ethan [00:38:28]: It is. It's kind of

Swyx [00:38:29]: Distillation

Ethan [00:38:29]: kind of like strong to weak. From the modeling perspective, the strong model, the teacher model, is trying to model the images and videos of the internet, and that distribution is extremely complex. But the step-distilled model is just trying to learn from the teacher. The teacher is a model, and its size is fixed, so its distribution is much simpler than the whole internet. That's the intuition I have for why step distillation can work. So usually when these models are served in production, they only run a few steps. In Cosmos, I believe we have four-step and eight-step versions. If you do some simpler task, like image-to-image translation, it can run in even fewer steps, like one step in Cosmos Transfer.

Swyx [00:39:22]: I think this is the same intuition that guides a lot of the consistency model work. I sent you a link for sCM. I don't know if you covered that. To me, that was actually one of the most impressive papers I've ever seen from OpenAI.

Swyx [00:39:34]: That this is the unifying grand concept of consistency models. I don't know if you have any comments on this.

Ethan [00:39:41]: So there are a few different approaches,

Swyx [00:39:46]: Oh, yeah. Here it is.

Swyx [00:39:47]: Two steps versus twenty or 100 steps, whatever.
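Editor's sketch: a toy illustration of the step distillation idea described above, in which a student learns to reproduce in one step what a teacher produces in 100. Everything here is an assumption made for illustration: the "teacher" is a hand-written one-dimensional flow rather than a trained network, the "student" is a two-parameter affine map fit by least squares, and the names `MU`, `K`, `teacher_sample`, and `fit_one_step_student` are hypothetical. Production methods (progressive distillation, consistency models, distribution matching) are considerably more involved.

```python
import random

MU = 2.0  # hypothetical point the toy teacher's flow moves samples toward
K = 3.0   # hypothetical flow strength


def teacher_sample(x0, steps=100):
    """Multi-step sampler: Euler-integrate dx/dt = K * (MU - x) from t=0 to t=1."""
    dt = 1.0 / steps
    x = x0
    for _ in range(steps):
        x += dt * K * (MU - x)
    return x


def fit_one_step_student(noise_samples):
    """Fit student(x0) = a * x0 + b by least squares to the teacher's outputs."""
    targets = [teacher_sample(x0) for x0 in noise_samples]
    n = len(noise_samples)
    mean_x = sum(noise_samples) / n
    mean_y = sum(targets) / n
    var_x = sum((x - mean_x) ** 2 for x in noise_samples)
    cov_xy = sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(noise_samples, targets))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(1000)]
a, b = fit_one_step_student(noise)

x0 = 0.5
print("teacher, 100 steps:", teacher_sample(x0))
print("student, 1 step:   ", a * x0 + b)
```

The student never sees any "real data", only the teacher's input and output pairs, which mirrors the intuition in the conversation: the teacher's mapping is a far simpler target than the original data distribution.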
It's already done.

Ethan [00:39:52]: So there are a few different approaches, for example, consistency models. And actually, we shouldn't forget GANs. GAN, actually, that was the OG of

Swyx [00:40:05]: OG

Ethan [00:40:05]: step distillation, 'cause it trained for just one step to begin with. For example, there's distribution matching distillation, which uses a GAN as one of the losses for distillation. A GAN just tells you, "Hey, generate an image," and then

Ethan [00:40:31]: it has a discriminator to tell whether this image is real or not. So the model just needs to learn part of the distribution, not the full distribution. In regular training, the model is asked to reconstruct the ground truth image from the internet, which is extremely hard. When you're training a GAN, it's just, "Hey, you generate an image. Does this image look as real as the images from the internet?" Which is a much simpler task. And people typically combine a lot of these approaches, like consistency models and distribution matching and GANs, and that's how we get these few-step models.

Audio-Video Generation and Time Alignment

Swyx [00:41:21]: Then there's one step I wanted to add, which is audio and video.

Ethan [00:41:26]: So, Grok Imagine 0.9, I believe, is the first audio-video model deployed at a large scale. So

Swyx [00:41:39]: And that was your first model?

Ethan [00:41:40]: That was Grok Imagine's first model. It's audio-video joint generation. I think the hard part is the modality alignment, 'cause before this model, we had text-to-video alignment. We have this correspondence between text and video. Typically, most of the VLMs understand images. Video understanding is much rarer, and they mostly don't understand audio.
And if you look at audio generation on the LLM side, you can talk to them perfectly fine, but if you ask them to sing a song or something, it typically is not very good. They don't have music either. The hard part is that audio actually has two components: a discrete component and a continuous component. The discrete component is like the language.

Ethan [00:42:44]: So when we speak, it's just some

Swyx [00:42:47]: It's an ASR issue, yeah.

Ethan [00:42:49]: It's text tokens with some characteristics, I would say.

Ethan [00:42:54]: But music

Swyx [00:42:56]: I think the speech guys would disagree with this.

Swyx [00:42:57]: Like disfluencies and then,

Vibhu [00:43:00]: There's tones, you can get angry.

Ethan [00:43:01]: Well, I say largely.

Ethan [00:43:03]: But the music is completely different. It's very continuous, and you cannot model it like discrete tokens in language models. This is the hard part for models, not to mention we have to align text, video, and audio together.

Ethan [00:43:26]: So

Vibhu [00:43:26]: How?

Ethan [00:43:28]: So some significant challenges. First, like we talked about, most of the VLMs cannot understand audio.

Ethan [00:43:39]: So you have to have some way to do synthetic data generation for audio. You have to caption the audio, and that involves a lot of synthetic data and human data effort. And surprisingly, most of the LLMs are very bad at recognizing the beat, the tone, and the details of music. They can give some general prediction of which song this is, but it's very hard for them to describe the details of the music. Like we mentioned in image generation, you have to describe the image in as much detail as possible, so that someone blind can reconstruct it.
So here it's like someone

Vibhu [00:44:32]: Deaf

Ethan [00:44:32]: someone deaf can reconstruct how the music sounds without actually listening to it. Maybe you can think of it as needing to have what they call the script.

Vibhu [00:44:49]: Subtitles, yeah.

Ethan [00:44:49]: You've got to have all the details of the music and the dialogue.

Vibhu [00:44:55]: So is the challenge there typically stuff like music and audio, or is there a baseline? Okay, there's enough data where we can understand narration and conversation, but is it in the nuances of audio that you hit all the data issues? Or is it just that from stage zero, you do it all, right?

Ethan [00:45:15]: So one important thing is the alignment. The model has to have a time-based alignment between the video and the audio: at which time step the video tokens and the audio tokens correspond to each other. But we actually don't have this kind of alignment for most of the other modalities. If you think about text and image, or text and video, they are loosely aligned. You can have a description of what's going on in the video, but you typically don't have an exact description of, oh, at time step one second, what happened?

Vibhu [00:46:02]: It's very

Ethan [00:46:03]: At time step two seconds, what happened

Vibhu [00:46:03]: coarse. Yeah.

Swyx [00:46:05]: So what was the ideal time step? You have to ablate it, and then it's like four seconds or something.

Ethan [00:46:09]: So that comes down to how you design the model to be aware of time as a modality. So the model is time aware. And that's something pretty unique if you think about LLMs. If you ask an LLM to complete a task, they will say, "Oh, this task will probably take twelve hours to complete," and they come back in one hour.
Say, "I've already spent two days on this and I've exhausted everything."

Ethan [00:46:47]: So the LLMs themselves don't have a sense of time there.

Vibhu [00:46:53]: I actually don't think that's just them not having a sense of time. I think it's somewhat based, right?

Vibhu [00:46:58]: Like you tell someone, "Okay, go work on this feature. Go implement this." There's a general understanding you would have of how long that would take without LLMs working at LLM speed, right? So think back two years ago. If I tell you to build me a new front end for Latent Space, have a search bar, have all this, you'll estimate that it'll take a few days, right?

Vibhu [00:47:19]: So you tell an LLM, "Go build this." It'll say, "It'll take me a few days." But I think it's somewhat grounded, as opposed to them not having the best. I'm not saying that they have a great understanding, but I think with that example you can see where it comes from, right? You're trained on all of that text.

Swyx [00:47:35]: They're trying to estimate what a human would say.

Vibhu [00:47:37]: Because that's what the data kind of represents. It's not them

Ethan [00:47:41]: It came from the corpus on the internet. People have an estimate of how much time.

Vibhu [00:47:45]: And not even just in direct training samples, right? Just your world understanding, from tokens, of how long stuff takes, right? Go read a book. It'll take you a while, right?

Vibhu [00:47:56]: Even if you do nothing but read a book, it takes a few days. So yeah, the LLM says, "I read it, it took me a few hours."

Vibhu [00:48:01]: "It'll take me a few hours to go through this research." But this is a tangent.

Swyx [00:48:05]: Somewhat, yeah.

Swyx [00:48:06]: This is a train of thought I haven't really expressed until now, which is basically that a full world model must also be recursive, meaning that the participant in the world model must also be aware that they have a world model,
which is this whole recursive thing down the line. But yes, and that the world model can be wrong and that they need to update it, and blah. Yeah. We've argued this on the newsletter as well, that there need to be sort of recursive or adversarial world models.

World Models: Real-Time, Long-Horizon, Interactive Video

Vibhu [00:48:34]: Just to ask, how do you define world model?

Swyx [00:48:38]: Oh, yeah, let's go there.

Ethan [00:48:40]: So

Vibhu [00:48:40]: So just for context, we talked about video generation, and if you say there's a distinction between that and world models, what's your definition? How do you see the two?

Ethan [00:48:53]: So disclaimer, I'm not going to debate what a world model is. There are many definitions, so I'll just talk about my definition. Since I came from the multimodal domain, I'm mainly talking from video. A world model is real-time, interactive, long-horizon video. So there are three parts. Let's talk about them one by one. First, interaction. We just looked at Flipbook and Neural Computer. A world model allows you to interact with it through keyboard, mouse, and maybe also voice. These are all modalities. You can interact with the model, and the model should respond reasonably. The second part is real time. Say you move your mouse, and say the world model generates a game: how fast can the game respond? If you're a professional CS:GO player, you might say, oh, you have to respond within sub-ten milliseconds, or even less. He's a beginner. Yeah, even less. So that's not most of the. No, sixty FPS. Let's go. Oh, three hundred FPS. Oh, five hundred FPS. Wait. Okay, yeah. I didn't do the math, but yeah, okay. Three hundred FPS, that's about three milliseconds. So you have to respond. Oh, s**t. Okay. Yeah

Ethan [00:50:29]: within a millisecond. Most of the video models cannot do that.
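Editor's sketch: the frame-rate figures tossed around above translate into a per-frame time budget of 1000 / FPS milliseconds. The helper name below is hypothetical; the 200 ms value is included only as a looser conversational-latency budget for comparison.

```python
def frame_budget_ms(fps):
    """Time available to produce one frame at a given frame rate."""
    return 1000.0 / fps


for fps in (60, 300, 500):
    print(f"{fps:>3} FPS -> {frame_budget_ms(fps):.2f} ms per frame")

# For comparison, how many 60 FPS frames fit inside a 200 ms response budget.
voice_budget_ms = 200.0
print(f"{voice_budget_ms / frame_budget_ms(60):.0f} frames at 60 FPS fit in {voice_budget_ms:.0f} ms")
```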
Yeah. But if you have a video model that is, say, a digital human, the response time might be more generous. Typically, for real-time voice interaction, it's around two hundred milliseconds. So that's much more generous. But even two hundred milliseconds is pretty tricky, 'cause remember we mentioned

Ethan [00:51:01]: you have this temporal compression coming from the VAE. If you don't compress the temporal dimension, your sequence length is going to explode. So if you want real-timeness in your model, you have to solve this context problem. And the third part is long horizon. You're not going to play a video game for just a few seconds, and most video models only generate a few seconds. We're going to play for minutes, hours. The model has to be able to generate long-form content.

Ethan [00:51:42]: So putting these three together, it's real-time, long-horizon, interactive video. I think the final state will be, for example, a video version of Flipbook, where you can interact with a neural computer. You move your mouse, you click on the generative interface, and it replies to you through pixels, generated in real time. But it's a very long way to get there. One of the first steps at Grok Imagine, where I led a small world model team, was to build video extension. So, video extension, it's the first step of interactivity. Yeah. It's the first step. Yeah. You have it here, video editing, yeah. Yeah. So it's the first step because this unlocks long-horizon videos. Typically, for most video generation models, you give it a prompt or an image as an initial frame. You generate a video, and that's it. That's just one time, done. And some creators would try to use the last frame as the first frame for the second video.
Sometimes it works, but if you do it a few times, the quality decreases. It doesn't have that context. Yeah, over the full video, so the temporal. Yeah, exactly. Yeah, 'cause you only gave it the last frame, of course, right? Yeah. Exactly. And it's actually a pretty fun hack, if you've seen. Oh, no, he's saying something better. Yeah. And for example, I remember Veo 3 has about a second of context from the last video. It is slightly better than using the last frame, but it has a similar problem: the quality decreases. If you extend a few times, out to one minute, the video quality looks much worse than the first video. Another problem is that the model doesn't have long-range knowledge of what happened before. Say it generates some dialogue, two people speaking. Their voices might change over time, especially if the one-second conditioning does not cover the previous context. So these are the core challenges. The Grok Imagine video extension has historical context of all of the previously generated videos. It has the context of who is speaking, what objects have appeared, and everything, and it uses that to generate the next video. If we do this naively, you can imagine just putting all of the previous history video tokens into the context. The context length will easily explode. Especially for video models, that can be a few million tokens of context length, I would imagine. Yes. Yeah.

Swyx [00:54:58]: Let's run with that.

Ethan [00:54:59]: For example, in Cosmos, I think just five seconds of video is something like fifty K or sixty K tokens. So if you do fifty seconds, that's five hundred K tokens. If you go longer than that, it easily explodes. This long-horizon problem was the first step we tried to solve toward a world model. It turns out people love video extension.
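Editor's sketch: the token counts above scale linearly with video length, which is why naive full-history context grows so quickly. The rate below uses the lower quoted figure (about fifty K tokens per five seconds); the constant and function names are hypothetical, and the longer durations are extrapolations rather than measured numbers.

```python
# Assumed from the conversation: roughly 50K tokens for 5 seconds of video.
TOKENS_PER_5_SECONDS = 50_000
TOKENS_PER_SECOND = TOKENS_PER_5_SECONDS // 5


def video_context_tokens(seconds):
    """Tokens needed if every second of history is kept at full resolution."""
    return seconds * TOKENS_PER_SECOND


for label, seconds in [("5 seconds", 5), ("50 seconds", 50),
                       ("5 minutes", 300), ("1 hour", 3600)]:
    print(f"{label:>10}: {video_context_tokens(seconds):>12,} tokens")
```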
A lot of creators love using video extension to create longer-form videos. This is the part I liked very much: you have an intermediate step toward the final goal instead of just a straight shot to the final version.

Swyx [00:55:48]: But I can see you have a strong vision of where we want to end up.

Long Context, Redundancy, and Efficient Interactive Video

Vibhu [00:55:51]: Does it seem like it's an efficiency issue? Okay, we're at a few million tokens of context. If you draw the parallel to language models, we had very short context, two thousand, eight thousand, and then you scale it up to one million, ten million. Sure, there's effective context, but at the end of the day, it's just, what's it worth? Sure, there's a whole training data side. In video, it might be slightly easier 'cause we have hundred-million-token videos, right? Just take a movie with the full context there. Is this about efficiency from an inference standpoint, where it's expensive but we know how to solve it? Or why is this not the approach? My broader point was on your second point about world models. You say it needs to be interactive and live, right? You should be able to play a game and see the interaction live. One thing I see with research is that a lot of what you actually serve is different from what you build, right? We talked about distillation. You train a big model, you distill it, you do quantization, speculative decoding. We do all this stuff to serve it efficiently. Should we not just have a solution, a world model that can interact well, and then do inference optimization, serve it, and distill it as a secondary step, so you make it real time after you solve it? Another parallel is, say, continual learning, right? What we need is someone to solve it and show it works inefficiently. Give it a few years, and people will make it efficient. Same thing with regular attention, right? It worked.
Over a few years, people have come up with different forms of attention, and we've scaled it to be efficient at long context. So there are kind of two things there, right? One is, it seems like it works. You've scaled it. Can we not just scale it a lot more efficiently over time? Do we need a separate approach if this works? And same thing with interaction, right? If we can find some way that it works, we can solve making it more efficient from an inference standpoint later.

Ethan [00:57:53]: That's actually a very good point. In videos, there's actually a lot of redundancy. We remove a lot of the pixel redundancy with the VAE, but there's more redundancy in long-range, long-horizon videos. Say a character appears in the first clip, then disappears, and only reappears at the end of the video. You probably don't need that context in the middle of the generation. You only need that character where you need it. That's why I helped build another feature. It's reference video.

Vibhu [00:58:36]: Is it here?

Swyx [00:58:36]: Is it the same model release or a different one?

Ethan [00:58:39]: It's a different one.

Ethan [00:58:41]: You probably need to search on

Swyx [00:58:43]: I'll find it

Ethan [00:58:43]: X reference to video.

Ethan [00:58:46]: So reference video allows you to upload up to seven images as conditions and generate the video. They can be characters or objects or even scenes. Say I want to condition on Sean's selfie, and holding a blade

Swyx [00:59:07]: We have a dog

Ethan [00:59:08]: or whatever.

Swyx [00:59:08]: We put the dog in the thing.

Ethan [00:59:09]: You can put them there, and the video model will generate the video from them and copy the context over. That can solve a lot of the problems there, like the long-context problem. It doesn't need to have a very long context. But I feel like it's an intermediate solution.
The model

Swyx [00:59:29]: It's cheating.

Ethan [00:59:30]: the model should be able to selectively know where it should draw the references from. Say I want to generate a movie, and I generate it autoregressively, ten seconds at a time or something. When this character appears now, I can look back to where it first appeared and bring that back. Yeah, this one, I put the references. Yeah, that's Optimus, Einstein, myself, Annie.

Vibhu [01:00:02]: Oddly enough, I used Grok Search to find it, and it pulled your LinkedIn post. But yeah, we found it.

Ethan [01:00:08]: Interesting.

Vibhu [01:00:10]: But

xAI's Underrated Work, Culture, and Watermarking

Swyx [01:00:11]: This is a problem. This is not your fault, but xAI doesn't communicate all this work that you do very well, because they just have the model release and then that's it. But actually, these details are very good.

Swyx [01:00:22]: As far as I understand, everything you just described is state of the art. No one else has done it.

Vibhu [01:00:30]: A lot of, yeah, I have a lot more

Swyx [01:00:32]: And then you just put out this blog post with the cookies. I'm like, this is not enough.

Swyx [01:00:37]: But obviously these are the high-level numbers that people want to know. But no, okay, so

Vibhu [01:00:42]: And I wonder, part of that is also that some labs don't share research into what happens. And if

Swyx [01:00:50]: No, but this is literally bragging about how good they are, right?

Swyx [01:00:54]: Like, why would you not say that you are capable of extending with full context? This is not secret sauce. This is "we did the work." Yeah, I don't know.

Ethan [01:01:02]: Different labs have slightly different communication styles.

Swyx [01:01:07]: Anyway, if anyone from xAI is listening, we are always happy to help you tell your story. Yeah, okay, so you did references, and I think the point you're making is that it is sort of a kludge, right?
You can do seven, but what about 100?

Swyx [01:01:23]: Right? Then you need a completely different thing.

Ethan [01:01:26]: So I think this is a mechanism to select context from the history, and you might not put the entire history into the context. For example, there's a paper called FramePack, which has

Ethan [01:01:41]: a heuristic: for the latest history, the last one second, I put in the entire history, and the history before that, I compress and make the video smaller. They follow this overall pattern so that the maximum sequence length is fixed. The further you are from the current frame, the smaller the image. This is just a heuristic. I think it can be more automatic, where the model is aware of which part of the history can be selected. This part of the research is actually being actively worked on by a lot of people. It's also quite interesting. I feel this part of long context is actually a little bit ahead of the LLM side.

Ethan [01:02:31]: For example, in LLMs, contexts keep growing. Let's say you call a tool and the tool call history is extremely long. That's still in context, and it keeps growing and growing. Even if you switch the topic to something else, the whole context is still there. There are some agentic harnesses that help you, say, prune the tool results. Like when you query a file, only show the top 200 lines or something. Those are very heuristic-driven.

Swyx [01:03:08]: For listeners, we did a write-up on the Claude Code leak, where there are eight different kinds of pruning, including pruning the tool results and all that.
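Editor's sketch: the heuristic described above, keeping recent history at full size and shrinking older history so the total sequence length stays bounded, can be illustrated with a geometric schedule. This is not the FramePack implementation. The halving-per-step schedule, the per-frame token count, and the function name are all assumptions chosen to show why the total stays fixed no matter how long the history gets.

```python
def history_token_budget(num_history_frames, tokens_per_full_frame):
    """Token budget per history frame, most recent first.

    The most recent frame keeps all its tokens; each step further back
    gets half as many (integer division), so the total is bounded by
    twice the full-frame budget regardless of history length.
    """
    return [tokens_per_full_frame // (2 ** distance)
            for distance in range(num_history_frames)]


FULL_FRAME_TOKENS = 1024  # hypothetical tokens for one uncompressed frame

for history_len in (4, 16, 1000):
    budgets = history_token_budget(history_len, FULL_FRAME_TOKENS)
    print(f"{history_len:>4} history frames -> {sum(budgets)} tokens total, "
          f"first few budgets: {budgets[:5]}")
```

With this schedule very old frames eventually get zero tokens, which is one reason the conversation frames it as a heuristic: a learned selection mechanism could instead keep a distant frame at full size when it matters, for example when a character reappears.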
So you can, you can read up on that kind of thing.Ethan [01:03:17]: I think, one breakthrough in continual learning might be like a way to automatically, manage its own context.Swyx [01:03:27]: These are all heuristics, and they will be replaced by machine learning.Ethan [01:03:30]: InterestinglyVibhu [01:03:32]: TheEthan [01:03:32]: the same thing is being researched in both LLMs and video models.Vibhu [01:03:36]: The interesting thing is also like in the paper you showed, it's actually happening at the model level, right? Compared to like language models, sure, we have base attention, but we'll do our own compression, we'll do our own pruning, which is separate from model error.Vibhu [01:03:49]: Eventually, it all just boils in, hopefully.Swyx [01:03:52]: I think this is a form of like attention, but like also know sort of reasoning attention. I feel like that's different than normal attention.Swyx [01:04:03]: Does that, does that make sense?Ethan [01:04:04]: It's, it's different in the sense that attention, not to mention, set sparse attention aside,
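The two heuristics mentioned in this exchange, a fixed total budget in which older frames get fewer tokens and a cap on how many lines of a tool result stay in context, can be sketched in a few lines. This is only an illustration under stated assumptions, not FramePack's actual scheme or any real harness's code: the halving ratio, the 1024-token full frame, the 200-line default, and both function names are hypothetical.

```python
def frame_token_budget(num_frames: int, full_tokens: int = 1024, ratio: int = 2) -> list[int]:
    """Tokens allotted to each history frame, newest first.

    Assumption for illustration: each older frame gets 1/ratio of the tokens
    of the frame after it, and a budget of 0 means the frame is dropped.
    With ratio=2 the total stays below 2 * full_tokens however long the
    history grows, which is the "maximum sequence length is fixed" pattern.
    """
    budgets = []
    tokens = full_tokens
    for _ in range(num_frames):
        budgets.append(tokens)
        tokens //= ratio  # integer halving; eventually reaches 0
    return budgets


def truncate_tool_output(text: str, max_lines: int = 200) -> str:
    """Keep only the first max_lines lines of a tool result (hypothetical helper)."""
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text
    dropped = len(lines) - max_lines
    return "\n".join(lines[:max_lines]) + f"\n... [{dropped} more lines truncated]"
```

With these defaults, `frame_token_budget(4)` gives `[1024, 512, 256, 128]`, and the sum over 100 frames is 2047, still under twice the full-frame size. Both functions are fixed rules of the kind the speakers expect learned context selection to replace.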
EU AI Act 2026: What applies now and what is still coming your way

The EU AI Act has been adopted, but it is already being revised before it is fully in force. Legal expert Philipp Hacker explains on the Koerting Institute podcast what this means in practice for you as a self-employed person or business owner, which rules already apply today, and what you need to prepare for by 2027. Here are the key points at a glance.

Philipp Hacker on LinkedIn: https://www.linkedin.com/in/philipp-hacker-078940257/

What already applies: prohibitions, AI literacy, and GPAI rules

Some parts of the AI Act are already binding, including the ban on emotion recognition in the workplace and the obligation that everyone who works with AI has a basic understanding of the technical and legal context. Anyone who markets an existing AI model under their own name can also be legally classified as a provider.

New timelines: chatbots, watermarks, and high-risk obligations

From 2 August 2026, every chatbot must make clear at the start of an interaction that it is an AI; users must not get the impression that they are talking to a human. The rules on watermarking and labeling follow only from December 2026, and the high-risk obligations on risk management, data governance, and human oversight not until 2027 or 2028. These postponements are the result of the ongoing AI Omnibus revision, with which the EU is already refining the AI Act before it is fully in force.

Deepfakes and recruiting: the two topics where you should act now

Deepfakes are defined more broadly than is often assumed: the term covers not only a faked video of a person but also an AI-generated product image on your website or substantially edited marketing photos. As soon as AI intervenes substantially, a labeling obligation applies. Draft guidelines on deepfakes have been available since 8 May 2026 and use examples to explain what must be labeled and what need not be.

Conclusion

The EU AI Act is no longer an abstract Brussels project; parts of it are already applicable law, and it directly affects the self-employed and smaller companies too. The immediately relevant points are manageable: label chatbots, do not use emotion recognition on employees, build AI literacy, and mark AI-generated content as such. Those who follow these developments regularly are not only compliant but also build real trust with customers and partners.

More from the Koertings

The KI-Café: every Wednesday (>350 participants) from 08:30 to 10:00, online via Zoom, free of charge and well worth it. Every Wednesday at 08:30 the KI-Café opens its online doors. We solve AI use cases live on stage, moderate expert panels on specific topics (e.g. AI in recruiting, AI in quality assurance, AI in project management, and much more), put new developments in the AI world into context and give an outlook, invite experts for specific topics, and sometimes go deep into particular areas, all to help you move forward. Register for free: www.koerting-institute.com/ki-cafe/

Mit jedem Prompt ein WOW! (for the self-employed and business owners): a clear guide for business owners, the self-employed, and decision-makers who want not only to understand artificial intelligence but to use it effectively. This book shows you how to identify relevant AI use cases and use AI as a real sparring partner to make them a reality. It is practical, uses real examples, and is fully geared toward implementation. The book is a gift; only shipping costs of €9.95 apply. It is suited to beginners and advanced users who want to realize their potential with AI. Get the book in your mailbox: https://koerting-institute.com/shop/buch-mit-jedem-prompt-ein-wow/

The KI-Lounge, our community for getting started with AI (>2800 members): the KI-Lounge is a community for everyone who wants to learn about and apply generative AI. Members receive exclusive monthly AI updates, expert interviews, talks from the KI-Speaker-Slam, KI-Café recordings, and a 3-hour ChatGPT course. Exchange ideas with more than 2800 AI enthusiasts, ask questions, and get started. Initiated by Torsten & Birgit Koerting, the KI-Lounge offers orientation and inspiration for entering the AI revolution. This is where the exchange happens: www.koerting-institute.com/ki-lounge/

Start working with us 1:1: if you want to work with us directly and integrate AI into your business, book an appointment for a personal conversation. Together we will answer your questions and work out how we can support you. Book your appointment with us now: www.koerting-institute.com/termin/

More ideas, Netflix style: if you are looking for more inspiration for your self-employment, visit our ideas page and browse the many ideas collected there: www.koerting-institute.com/impulse/

The Koertings in your ears: if you enjoyed this podcast episode, listen to more informative episodes. You will find over 500 episodes here: www.koerting-institute.com/podcast/

We look forward to accompanying you on your journey!
With the exponential development of generative artificial intelligence, it has become increasingly difficult to distinguish content created by machines from content produced by humans. If 5 or 6 years ago we were already worried about the spread of fake news and deepfake videos, today the situation is even more serious. Social media is flooded with "AI slop", while the web is filling up with applications developed entirely with AI, often without adequate supervision. It has therefore become essential to develop systems and technologies that let users and applications recognize whether a piece of content (a video, a text, an image, or a piece of music) was generated wholly or partly by a machine. But is that still possible? And how reliable are the systems emerging for this purpose? In this episode we try to answer these questions.

In the news section we talk about a new CRISPR technique against tumors, the class action against Apple over unkept promises about Siri and Apple Intelligence, and finally the start of production of Tesla's electric truck.

--Index--
00:00 - Introduction
01:12 - A new CRISPR technique against tumors (ANSA.it, Luca Martinelli)
02:58 - Siri in the crosshairs of a class action (Wired.com, Davide Fasoli)
04:10 - Tesla starts production of the Semi truck (DMove.it, Matteo Gallo)
05:37 - Recognizing AI-generated content. Is it still possible? (Luca Martinelli)
18:35 - Conclusion

--Text--
Read the transcript: https://www.dentrolatecnologia.it/S8E19#testo

--Contacts--
• www.dentrolatecnologia.it
• Instagram (@dentrolatecnologia)
• Telegram (@dentrolatecnologia)
• YouTube (@dentrolatecnologia)
• redazione@dentrolatecnologia.it

--Tracks--
• Ecstasy by Rabbit Theft
• Falling For You by SouMix & Bromar
Stegawave, an Irish technology company specialising in forensic watermarking for video content, has announced the launch of its anti-piracy platform for live sports streaming. Using a proprietary watermarking algorithm to embed invisible patterns into live streams, Stegawave identifies piracy sources in real time, shutting down illegal redistribution within minutes of detection. The platform integrates with existing streaming workflows and distribution systems, requiring no changes to a content owner's existing infrastructure. Stegawave customers also have the option to push alternative content or messaging to the illegal stream destination. Live sports piracy costs the global broadcasting industry billions annually, with the U.S. Chamber of Commerce estimating the impact on the US economy alone at more than $29 billion per year. Illegal IPTV services, often referred to as 'dodgy boxes', restream content within minutes of broadcast, and legacy content protection technology cannot identify the source of a leak. For smaller and mid-tier organisations, the financial impact is proportionally even greater, as lost subscribers directly threaten the viability of grassroots and regional sports attendance and coverage. The announcement follows a successful deployment with Clubber TV, a leading live sports platform. Sports broadcasters worldwide are facing persistent piracy from illegal IPTV services that directly impact subscription revenue and the financial viability of the coverage itself. Stegawave was deployed across live broadcasts, detecting pirated streams and identifying the specific subscriber accounts responsible. Stegawave achieved a 100% detection rate across all streams, and because many illegal IPTV services share the same source account, blocking a single compromised account simultaneously disabled multiple pirate streams, amplifying the impact of each enforcement action. 
"Piracy is a massive threat to the sustainability of sports broadcasting at all levels, including grassroots coverage. This technology is potentially game-changing for us to ensure, following significant rights fee investments, that fans are only watching Clubber games on our platform," said Jimmy Doyle, CEO, Clubber. "It's been a pleasure to work alongside Clubber in stopping piracy of their premium matches. The work we have done together has helped us improve the Stegawave product and also resulted in new features. With the increase in illegal streaming not just in Ireland but worldwide, there is real momentum behind tackling this problem, and Stegawave can play a key role in tackling it both here and internationally. We look forward to supporting sports rights holders and broadcasters across the globe in recovering lost revenue and protecting their premium content." said Sean Fahey, CEO, Stegawave. Stegawave is now available for streaming platforms, sports broadcasters and rights holders who are looking to protect their premium content and maximise their Pay-Per-View and Subscription revenues. The Stegawave team will be attending NAB Show in Las Vegas from 18-22 April 2026, showcasing the platform to technology partners, content owners and wider media. See more stories here. More about Irish Tech News Irish Tech News are Ireland's No. 1 Online Tech Publication and often Ireland's No.1 Tech Podcast too. You can find hundreds of fantastic previous episodes and subscribe using whatever platform you like via our Anchor.fm page here: https://anchor.fm/irish-tech-news If you'd like to be featured in an upcoming Podcast email us at Simon@IrishTechNews.ie now to discuss. Irish Tech News have a range of services available to help promote your business. Why not drop us a line at Info@IrishTechNews.ie now to find out more about how we can help you reach our audience. You can also find and follow us on Twitter, LinkedIn, Facebook, Instagram, TikTok and Snapchat.
Is Generative AI moving too fast? From viral deepfake videos to powerful coding assistants, AI is reshaping our world at a breathtaking pace. But with this power comes immense risk: to our privacy, to intellectual property, and even to our ability to tell what's real. How do we navigate this complex new landscape responsibly?

In this episode, Allen sits down with Maggie Engler and Numa Dhamani, authors of "Intro to Gen AI, Second Edition" and veterans in the fields of cybersecurity and trust & safety. They pull back the curtain on how these powerful models are built, the societal impact they're having, and the urgent conversations we need to have about data governance, AI agents, and the looming digital trust crisis.

IN THIS EPISODE
00:00 - Prompt Engineering, AI Agents & More
03:27 - The Guests' Backgrounds in Cybersecurity and Trust & Safety
09:31 - The Hidden Risks of Sharing Your Data with AI
13:02 - Copyright vs. AI
18:52 - The Digital Trust Crisis
24:36 - Watermarking and Digital Verification
30:25 - Using Proprietary Code with AI Assistants
36:56 - AI Agents
42:37 - Who This Book Is For (and Who It's Not For)
We celebrate Thanksgiving with some lighthearted banter and explore the question of whether "High Watermarking" can lead to disappointment (even if it's hypothetical). We hope you're enjoying time with your family and friends this week, but if you want your weekly dose of personal finance content, we're here for you!

Send your questions for upcoming shows to checkyourbalances@outlook.com
@checkyourbalances on Instagram
Show Notes

As artificial intelligence begins generating music from vast datasets of human art, a fundamental question emerges: who truly owns the sound of AI? This episode of Music Evolves brings together law student and former musician Chandler Lawn, music industry executive and professor Drew Thurlow, Michael Sheldrick, Co-Founder of Global Citizen, and intellectual property attorney Puya Partow-Navid, alongside hosts Sean Martin and Marco Ciappelli, to examine how AI is reshaping authorship, licensing, and the meaning of originality.

The panel explores how AI democratizes creation while exposing deep ethical and economic gaps. Lawn raises the issue of whether artists whose works trained AI models deserve compensation, asking if innovation can be ethical when built on uncompensated labor. Thurlow highlights how, despite fears of automation, generative AI music accounts for less than 1% of streaming royalties, suggesting opportunity, not replacement.

Sheldrick connects the conversation to a broader global context, describing how music's economic potential could drive sustainable development if nations modernize copyright frameworks. He views this shift as a rare chance to position creative industries as engines for jobs and growth.

Partow-Navid grounds the discussion in legal precedent, pointing to landmark cases, from Two Live Crew to George R. R. Martin, as markers of how courts may interpret fair use, causality, and global jurisdiction in AI-driven creation.

Together, the guests agree that the debate extends beyond legality. It's about the emotional authenticity that makes music human. 
As Chandler notes, “We connect through imperfection.” Marco adds that live performance may ultimately anchor value in a world saturated by digital replication.This conversation captures the tension—and promise—of a future where music, technology, and law must learn to play in harmony.GuestsChandler Lawn, AI Innovation and Law Fellow at The University of Texas School of Law | On LinkedIn: https://www.linkedin.com/in/chandlerlawn/Drew Thurlow, Adjunct Professor at Berklee College of Music | On LinkedIn: https://www.linkedin.com/in/drewthurlow/Michael Sheldrick, Co-Founder and Chief Policy, Impact and Government Affairs Officer at Global Citizen | On LinkedIn: https://www.linkedin.com/in/michael-sheldrick-30364051/Puya Partow-Navid, Partner at Seyfarth Shaw LLP | On LinkedIn: https://www.linkedin.com/in/puyapartow/Marco Ciappelli, Co-Founder, ITSPmagazine and Studio C60 | Website: https://www.marcociappelli.comHostSean Martin, Co-Founder at ITSPmagazine, Studio C60, and Host of Redefining CyberSecurity Podcast & Music Evolves Podcast | Website: https://www.seanmartin.com/ResourcesLegal Publication: You Can't Alway Get What You Want: A Survey of AI-related Copyright Considerations for the Music Industry published in Vol. 32, No. 
3 of the Texas State Bar Entertainment and Sports Law Journal.BOOK: Machine Music: How AI Is Transforming Music's Next Act by Drew Thurlow: https://www.routledge.com/Machine-Music-How-AI-is-Transforming-Musics-Next-Act/Thurlow/p/book/9781032425242BOOK: From Ideas to Impact: A Playbook for Influencing and Implementing Change in a Divided World by Michael Sheldrick: https://www.fromideastoimpact.com/AI and Copyright Blogs:https://www.gadgetsgigabytesandgoodwill.com/category/ai/https://www.gadgetsgigabytesandgoodwill.com/2025/11/dr-thaler-is-right-in-part/https://www.gadgetsgigabytesandgoodwill.com/2025/07/californias-ai-law-has-set-rules-for-generative-ai-are-you-ready/https://www.gadgetsgigabytesandgoodwill.com/2025/06/copyright-office-firings-spark-constitutional-concerns-amid-ai-policy-tensions/Newsletter (Article, Video, Podcast): The Human Touch in a Synthetic Age: Why AI-Created Music Raises More Than Just Eyebrows: https://www.linkedin.com/pulse/human-touch-synthetic-age-why-ai-created-music-raises-martin-cissp-s9m7e/Article — Universal and Sony Music partner with new platform to detect AI music copyright theft using ‘groundbreaking neural fingerprinting' technology: https://www.musicbusinessworldwide.com/universal-and-sony-music-partner-with-new-platform-to-detect-ai-music-copyright-theft-using-groundbreaking-neural-fingerprinting-technology/Article: When Virtual Reality Is A Commodity, Will True Reality Come At A Premium: https://sean-martin.medium.com/when-virtual-reality-is-a-commodity-will-true-reality-come-at-a-premium-4a97bccb4d72Global Citizen: https://www.globalcitizen.org/Gallo Music (Gallo Records, South Africa): https://www.gallo.co.za/Global Citizen Festival: https://www.globalcitizen.org/en/festival/Andy Warhol Foundation v. Goldsmith (Shepard Fairey / “Hope” poster context): https://supreme.justia.com/cases/federal/us/598/21-869/case.pdfGeorge R. R. Martin / Authors Guild v. 
OpenAI (current AI training lawsuit): https://authorsguild.org/news/ag-and-authors-file-class-action-suit-against-openai/Campbell v. Acuff-Rose Music, Inc. (2 Live Crew “Pretty Woman”): https://supreme.justia.com/cases/federal/us/510/569/Vanilla Ice / “Under Pressure” Sampling Case: https://blogs.law.gwu.edu/mcir/case/queen-david-bowie-v-vanilla-ice/MIDiA Research — AI in Music Reports: https://www.midiaresearch.com/reports/ai-and-the-future-of-music-the-future-is-already-hereMerlin (Global Independent Rights Organization): https://www.merlinnetwork.org/Instagram Reel re: Spotify Terms: https://www.instagram.com/reel/DOrgbUNCYj_/ Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.
Robert Bateman is a Senior Partner at Privacy Partnership, which provides consultancy and training on data protection and AI regulation, as well as legal advice via its associated law firm, Privacy Partnership Law. He also hosts The Privacy Partnership Podcast.

This is Robert's third appearance on the show. We have covered three hot topics:
* How far do we take watermarking of AI-generated content under Article 50 of the AI Act?
* How do pre-defined legitimate interest scenarios work under the UK Data (Use and Access) Act?
* What is the tension between the Online Safety Act and the new data protection framework in the UK?

References:
SIGN UP NOW for the Masters of Privacy NYC LIVE recording and networking event on Nov 6 (if you happen to be in town)
* Robert Bateman on LinkedIn
* Robert Bateman on Bluesky
* The Privacy Partnership Podcast
* AI Act (EU Commission's resources)
* Data (Use and Access) Act 2025: data protection and privacy changes
* The EU approach to age verification (EU Commission)
* EU follows UK with age verification in 2026 (PPC Land)
* Wikipedia loses challenge against Online Safety Act verification rules (BBC)
* Robert Bateman: the EDPB's Opinion on auditing subprocessors and the future of Meta's unskippable ads (Masters of Privacy, Nov 2024)
* Robert Bateman: Consent or Pay (Masters of Privacy, Oct 2023)

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.mastersofprivacy.com/subscribe
In this episode, Anna Rose and Tarun Chitra chat with Miranda Christ, a computer science PhD student at Columbia University, about the intersection of cryptography and AI through watermarking techniques. Miranda shares her research on developing imperceptible ways to prove that content was created by AI models, covering everything from simple red-green word lists to sophisticated pseudorandom error-correcting codes. The discussion explores the cryptographic properties of watermarks, including completeness, soundness, and undetectability, and how these parallel the properties we see in zero-knowledge proof systems. Miranda explains how watermarking differs from other cryptographic approaches like ZKML by only modifying the sampling process rather than the underlying model weights, making it computationally lightweight and practical for deployment.

Related links:
- Episode 206: Distilling DeFi Primitives with Guillermo, Alex and Tarun
- My AI Safety Lecture for UT Effective Altruism
- Google SynthID
- Amazon Public Watermark Detector
- How ChatGPT could embed a 'watermark' in the text it generates - New York Times
- Wall Street Journal on OpenAI not Deploying Watermarks
- A Watermark for Large Language Models
- Undetectable Watermarks for Language Models
- Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models
- Pseudorandom Error-Correcting Codes
- Ideal Pseudorandom Codes

Check out the latest jobs in ZK at the ZK Podcast Jobs Board
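The "red-green word list" idea mentioned in the description can be sketched roughly as follows. This is a simplified illustration in the spirit of the "A Watermark for Large Language Models" paper listed above, not its reference implementation: the SHA-256 seeding, the toy integer vocabulary, and the function names are assumptions made for the example. A generator that prefers green tokens only changes how tokens are sampled, and a detector that knows the seeding rule counts green tokens and computes a z-score against the fraction `gamma` expected by chance.

```python
import hashlib
import math
import random


def green_list(prev_token: int, vocab_size: int, gamma: float = 0.5) -> set[int]:
    """Pseudorandomly pick a 'green' subset of the vocabulary, seeded by the previous token."""
    digest = hashlib.sha256(str(prev_token).encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(gamma * vocab_size)])


def z_score(tokens: list[int], vocab_size: int, gamma: float = 0.5) -> float:
    """How far the green-token count is from what unwatermarked text would show."""
    scored = len(tokens) - 1  # the first token has no predecessor to seed from
    if scored < 1:
        raise ValueError("need at least two tokens")
    hits = sum(
        tokens[i] in green_list(tokens[i - 1], vocab_size, gamma)
        for i in range(1, len(tokens))
    )
    return (hits - gamma * scored) / math.sqrt(scored * gamma * (1 - gamma))


def toy_watermarked_sequence(length: int, vocab_size: int) -> list[int]:
    """Stand-in for a model: always emits a green token (a 'hard' watermark)."""
    tokens = [0]
    for _ in range(length):
        tokens.append(min(green_list(tokens[-1], vocab_size)))
    return tokens
```

With `gamma = 0.5`, a sequence of 100 scored tokens that are all green gives a z-score of 10, far outside what chance would produce, while ordinary text should score near 0. A practical scheme would bias the model's logits toward green tokens rather than force them, trading some detection strength for text quality.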
In this AI research paper reading, we dive into "A Watermark for Large Language Models" with the paper's author John Kirchenbauer. This paper is a timely exploration of techniques for embedding invisible but detectable signals in AI-generated text. These watermarking strategies aim to help mitigate misuse of large language models by making machine-generated content distinguishable from human writing, without sacrificing text quality or requiring access to the model's internals.

Learn more about the A Watermark for Large Language Models paper.

Learn more about agent observability, LLM observability, and AI evaluation, join the Arize AI Slack community, or get the latest on LinkedIn and X.
Sometimes it's easy to tell whether a video is fake; other times, it's not. Watermarking is used to digitally stamp fake videos, whether that stamp is visible to the human eye or is embedded in the video's data. But with new technology that allows the stamp to be removed without anyone noticing, how is regulation enforced? Host Mike Eppel speaks to Andre Kassis, University of Waterloo PhD candidate in computer science, and Angus Lockhart, senior policy analyst at 'The Dais' at Toronto Metropolitan University, to discuss the safeguards in place to ensure AI-produced content is labelled accordingly and who can be held accountable if the rules start to bend.

We love feedback at The Big Story, as well as suggestions for future episodes. You can find us:
Through email at hello@thebigstorypodcast.ca
Or @thebigstoryfpn on Twitter
When I reflect on all the exciting things happening with the podcast, not only am I incredibly grateful, but it's also really fun to look back at how it all started. So in this episode, I'm re-listening to my first handful of episodes and asking myself, “Do they still hold up?!” I thought before listening that I had a pretty good idea of what I said in these early episodes, but I actually kind of surprised myself sometimes! Listen in as I share the details on what has stood the test of time… and what I have a new way of thinking about. Here's a preview:

Stood the test of time:
- 3 important things for your Instagram
- Email signature
- Directing people to a contact form on your website
- Getting a Google Business Profile
- Instagram caption ideas & social media schedulers
- Automation ideas & endorsement for 17hats CRM
- Profit First cash flow system

I have a new way of thinking about:
- Some of the resources I created and mentioned in the early episodes
- Business email accounts
- Importance of styled shoots
- Watermarking photos

I also share my episode plan for the year, in case you're curious! And in the UGlu Hotline, hear a tip for transporting framework in a personal vehicle.

RESOURCES MENTIONED:
Presenting sponsor: 17hats (get 50% off your 1st year)
Other sponsors & resources:
- Havin' A Party Wholesale (save 5% with code BRIGHT)
- Courtney Lynette Creative Co
- 2025 Bright Balloon Business Planner
- UGlu by Pro Tapes (save 5% at Havin' A Party with code BRIGHT)
- Call into the UGlu Hotline to ask a question or leave advice! (262) 221-8514
- Balloon Boss Mastermind & Summit

- - - -

Get bonus episodes
50 Ideas for Email Marketing | Join the Bright Balloon email list
Freeform Freedom | More courses
@thebrightballoon
The Bright Balloon on YouTube
How fast is the AI race really going? What is the current state of Quantum Computing? What actually *is* the P vs NP problem? Former OpenAI researcher and theoretical computer scientist Scott Aaronson joins Liv and Igor to discuss everything quantum, AI and consciousness. We hear about his experience working on OpenAI's "superalignment team", whether quantum computers might break Bitcoin, the state of University Admissions, and even a proposal for a new religion! Strap in for a fascinating conversation that bridges deep theory with pressing real-world concerns about our technological future.

Chapters:
1:30 - Working at OpenAI
4:23 - His Approaches to AI Alignment
6:23 - Watermarking & Detection of AI content
19:15 - P vs. NP
27:11 - The Current State of AI Safety
37:38 - Bad "Just-a-ism" Arguments around LLMs
48:25 - What Sets Human Creativity Apart from AI
55:30 - A Religion for AGI?
1:00:49 - More Moral Philosophy
1:05:24 - The AI Arms Race
1:11:08 - The Government Intervention Dilemma
1:23:28 - The Current State of Quantum Computing
1:36:25 - Will QC destroy Cryptography?
1:48:55 - Politics on College Campuses
2:03:11 - Scott's Childhood & Relationship with Competition
2:23:25 - Rapid-fire Predictions

Links:
♾️ Scott's Blog: https://scottaaronson.blog/
♾️ Scott's Book: https://www.amazon.com/Quantum-Computing-since-Democritus-Aaronson/dp/0521199565
♾️ QIC at UTA: https://www.cs.utexas.edu/~qic/

Credits:
♾️ Hosted by Liv Boeree and Igor Kurganov
♾️ Produced by Liv Boeree
♾️ Post-Production by Ryan Kessler

The Win-Win Podcast: Poker champion Liv Boeree takes to the interview chair to tease apart the complexities of one of the most fundamental parts of human nature: competition. Liv is joined by top philosophers, gamers, artists, technologists, CEOs, scientists, athletes and more to understand how competition manifests in their world, and how to change seemingly win-lose games into Win-Wins.

#WinWinPodcast #QuantumComputing #AISafety #LLM
Agents, agents, and more agents! In Episode 27 of Mixture of Experts, host Tim Hwang is joined by Volkmar Uhlig and Vyoma Gajjar. First, the experts chat about Marc Benioff's spicy tweet and what it means for the future of AI agents. Next, how much energy is needed to power AI models, and should we be concerned? Then, the experts debrief Anthropic's release of computer use. Finally, Google is integrating SynthID-Text into Gemini to help watermark AI-generated text. Do we need this feature? Learn more on today's Mixture of Experts.

The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.
NTD Good Morning (8/26/2024)
1. US Works to Avert Escalation After Israel-Hezbollah Exchange
2. Reuters Safety Adviser Killed in Missile Strike in Ukraine
3. US Targets Russian, Chinese Firms for Aiding Russia
4. Harris, Walz Head to Georgia
5. Watermarking, Validation Needed to Combat AI Risk: Advocate
6. Flash Flooding in Grand Canyon Claims Life of Hiker
7. French Media: Telegram Co-Founder Detained
8. Microsoft Plans Summit After Crowdstrike Outage
9. Italy Begins Manslaughter Probe Into Yacht Sinking
10. SpaceX Will Bring 2 NASA Astronauts Home
11. How Will Kennedy Endorsement Affect Trump Campaign?
12. Dr. Fauci Recovering From West Nile Virus
13. Why Teens are so Drawn to Social Media
14. How Travel Spending is Shifting
15. NJ Transit Offering 'Fare Holiday' Next Week
16. French Mark 80 Years of Paris Liberation From Nazis
17. Paralympic Torch Crosses Channel and Begins French Journey
18. Titanic Article Discovered in Wardrobe Auctioned Off
19. Redheads Gather to Light up Dutch Festival
20. Chef Defies Doctors With Device That Keeps Him Cooking
21. The Resilient Academic: Barbara Gitenstein's Journey
22. What's Next for the Democratic and Republican Parties?
23. Hurricane Sales Tax Holiday Underway in Florida
24. Mountains See Early Dusting of Snow in August
25. 102-Year-Old Woman Becomes UK's Oldest Skydiver
26. Why Inflation is Low But Prices Remain High
27. Realtors Adapt to New Buyer Agent Rules
Join Allen and Linda as they dive into Google's Imagen 3 and Imagen 3 Fast, a powerful new set of image generation models. We explore its capabilities, pricing, features, and limitations, including a deep dive into the API and how to use it with Python code. This episode features an in-depth look at Imagen 3's photorealism and comparison with its predecessor, Imagen 2. We examine the ethical implications of AI image generation, discussing copyright issues, plagiarism concerns, and the impact on artists. Don't miss the stunning visuals and thought-provoking discussion! Resources: * https://console.cloud.google.com/vertex-ai/generative/vision * https://cloud.google.com/vertex-ai/generative-ai/docs/image/overview * https://fc-art.medium.com/seurat-pointillism-e240074a03dc * https://commons.wikimedia.org/wiki/File:La_Libert%C3%A9_guidant_le_peuple_-_Eug%C3%A8ne_Delacroix_-_Mus%C3%A9e_du_Louvre_Peintures_RF_129_-_apr%C3%A8s_restauration_2024.jpg Timestamps: * 00:00:31 : Introducing Imagen 3 & its Photorealistic Power * 00:02:59 : Imagen 3 vs. Imagen 3 Fast: Speed, Quality, & Pricing * 00:05:33 : Copyright & Commercial Use of Imagen-Generated Images * 00:06:14 : Exploring the Imagen API with Python Code Examples * 00:09:08 : Using Gemini to Generate Prompts for Imagen * 00:11:15 : The Importance of Seed Control for Image Consistency * 00:13:24 : Watermarking & Identifying AI-Generated Images * 00:14:51 : Navigating Imagen's Safety Filters & Limitations * 00:18:13 : Live Demo: Generating a Cat Image with Imagen 3 * 00:18:55 : Future Potential: Editing & Outcropping Capabilities * 00:22:26 : Upscaling Images with Imagen: Costs & Possibilities * 00:23:18 : Comparing Image Styles Across Imagen Versions (with Visuals!) 
* 00:28:58 : Confronting the Ethical Concerns of AI Image Generation * 00:29:05 : Real-World Examples: Inappropriate Content & Plagiarism * 00:36:30 : The Impact of AI on Artists & the Definition of Art * 00:37:06 : Transparency & Responsibility: Crediting AI in Creative Work * 00:39:53 : Final Thoughts: Will We Continue Using Imagen 3? Thumbnail created with Imagen 3 using the prompt: Create a compelling thumbnail for a YouTube video podcast about AI image generation, specifically Google's Imagen 3, featuring two hosts. Include two distinct, friendly faces – one male, one female – representing the podcast hosts. The male host should be wearing glasses and a light blue collared shirt. The female host should be wearing glasses, her hair tied back in a pony tail, and wearing a grey t-shirt that says "MakerSuite". They should be facing the viewer with engaging expressions, perhaps a mix of excitement and contemplation. Showcase a visually striking AI-generated image emerging from a laptop screen or a thought bubble. The overall image should be done in the style of George Seurat. (Episode number and title added manually.) #VertexAI #Imagen3 #GenAI
In episode 578, Kathy Berget teaches us what to do when our blog images have been used without our permission and how to get compensated for them. Kathy is the author, photographer, recipe developer, and writer at Beyond the Chicken Coop where she creates delicious home-cooked recipes utilizing what they grow and raise. Kathy is a former elementary school principal and has three grown kids; twin boys and one girl. Kathy and her husband live in the country on their own little farm. In this episode, you'll learn about copyright for images, ways in which you can respond and how to use image theft protection services like Pixsy to get compensated. Key points discussed: - Image theft is a common issue for food bloggers: Food blog images are often stolen or used without permission by various businesses, including restaurants, markets, and online retailers. - Copyright protection is automatic for blog images: Photographers automatically have copyright over their images, even without formal registration. - Services like Pixsy can help track and fight image theft: Pixsy is an online service that helps photographers find and fight unauthorized use of their images. - Responding to image theft requires a balanced approach: Do not obsess over every instance of image theft - selectively pursue cases that are worth the time and effort (especially if images are used for commercial purposes). - Watermarking images can increase their value if stolen: Removing or altering a watermark on an image can actually increase its value if used without permission. - Educating the public about image rights is an ongoing challenge: Many people may be unaware that using images found online without permission is considered theft. - Persistence and documentation are key when pursuing image theft cases: It's important to thoroughly document evidence and follow through with service providers like Pixsy. 
- Maintaining a positive attitude is important when dealing with image theft: Do not let image theft issues negatively impact your overall mindset and productivity. If You Loved This Episode… You'll love Episode 390 with Rob Finkelstein - Legal Issues Every Food Photographer Should Consider Connect with Kathy Berget Website | Instagram
CrowdStrike said Delta's woes aren't its fault after the massive IT outage, OpenAI confirms it's looking into text watermarking for ChatGPT that could expose cheating students, and Nat Geo's first Vision Pro immersive environment takes you to Iceland. It's Tuesday, August 6th and this is Engadget News. Learn more about your ad choices. Visit podcastchoices.com/adchoices
Plus, Apple has finally started sending out payments from its butterfly keyboard settlement.
#SecurityConfidential #DarkRhiinoSecurity Aaron is a Security Confidential Alumni, Entrepreneur, Author, former VP of Microsoft in China, and the CEO of Nametag Inc, the company that invented “Sign in with ID” as a more secure alternative to passwords. 00:00 Intro 00:57 Our Guest 01:46 Social Engineering trends 04:03 Deep fakes: how does it work? 09:18 Watermarking content 11:30 Deepfake Prevention: Injection attack 13:11 Deepfake prevention: Presentation attack 15:00 How do you verify behind a screen? 27:16 Hidden security in your phones 32:08 Social Engineering and MFA in Healthcare 41:18 How to maintain LOYAL Employees 46:15 China: Friend or Foe? 50:13 Connecting with Aaron ------------------------------------------------------------------ Watch our other episode with Aaron: https://youtu.be/m2PLow9cWSE ------------------------------------------------------------------ To learn more about Nametag visit https://getnametag.com/ To learn more about Dark Rhiino Security visit https://www.darkrhiinosecurity.com ----------------------------------------------------------------- SOCIAL MEDIA: Stay connected with us on our social media pages where we'll give you snippets, alerts for new podcasts, and even behind the scenes of our studio! Instagram: @securityconfidential and @Darkrhiinosecurity Facebook: @Dark-Rhiino-Security-Inc Twitter: @darkrhiinosec LinkedIn: @dark-rhiino-security Youtube: @DarkRhiinoSecurity ------------------------------------------------------------------ #darkrhiinosecurity #securityconfidential #cybersecurity #cyberpodcast #ai #artificialintelligence #securitypodcast #cybernews #technews #techsoftware #informationtechnology #infosec #cybersecurityforbeginners #technewstoday
In this episode, we discuss Microsoft's investment in G42 and questions surrounding G42's ties to China (1:12), the latest reporting about the Israeli military's use of AI and the policy implications of advanced technologies in warfare (9:23), and Meta's new watermarking policy (23:01). aipolicypodcast@csis.org Wadhwani Center for AI and Advanced Technologies | CSIS The DARPA Perspective on AI and Autonomy at the DOD | CSIS Events Scaling AI-enabled Capabilities at the DOD: Government and Industry Perspectives: The State of DOD AI and Autonomy Policy:
Dive into the realm of innovation with Microsoft's revolutionary AI features and their strategy for implementing image watermarking. Explore how these advancements are poised to reshape the AI landscape and enhance content protection in the digital era. Get on the AI Box Waitlist: AIBox.ai Join our ChatGPT Community: Facebook Group Follow me on Twitter: Jaeden's Twitter
ChatGPT: OpenAI, Sam Altman, AI, Joe Rogan, Artificial Intelligence, Practical AI
Join us as we uncover Microsoft's next-gen AI features and their plans to introduce image watermarking. Delve into the potential of these advancements to drive innovation and safeguard digital content in an AI-powered world. Get on the AI Box Waitlist: AIBox.ai Join our ChatGPT Community: Facebook Group Follow me on Twitter: Jaeden's Twitter
Explore the forefront of AI innovation with Microsoft's cutting-edge announcements and their strategic approach to image watermarking. Join the conversation as we examine the potential impact of these advancements on digital content creation and protection. Get on the AI Box Waitlist: AIBox.ai Join our ChatGPT Community: Facebook Group Follow me on Twitter: Jaeden's Twitter
Explore the latest from Microsoft as they announce groundbreaking AI features and unveil plans for image watermarking. Delve into the potential of these advancements to revolutionize the AI landscape and enhance content security in the digital age. Get on the AI Box Waitlist: AIBox.ai Join our ChatGPT Community: Facebook Group Follow me on Twitter: Jaeden's Twitter
Join the revolution in visual security as DeepMind and Google Cloud unveil their invisible AI image watermarking solution. Delve into the collaborative efforts reshaping the landscape of image protection and authenticity. Get on the AI Box Waitlist: AIBox.ai Join our ChatGPT Community: Facebook Group Follow me on Twitter: Jaeden's Twitter
In this episode, we analyze Steg.AI's recent $5 million seed round, exploring the significance of their invisible watermarking technology and its potential applications in enhancing the security of images and documents. Invest in AI Box: https://Republic.com/ai-box Get on the AI Box Waitlist: https://AIBox.ai/ AI Facebook Community Learn About ChatGPT Learn About AI at Tesla
Embark on a journey into the realm of invisible safeguards as DeepMind and Google Cloud collaborate on AI image watermarking innovation. Explore the protective measures that redefine the security of visual content. Get on the AI Box Waitlist: AIBox.ai Join our ChatGPT Community: Facebook Group Follow me on Twitter: Jaeden's Twitter
ChatGPT: OpenAI, Sam Altman, AI, Joe Rogan, Artificial Intelligence, Practical AI
Explore the collaboration between DeepMind and Google Cloud as they become guardians of visual integrity through invisible AI image watermarking. Uncover the transformative impact on securing and preserving the authenticity of visual content. Get on the AI Box Waitlist: AIBox.ai Join our ChatGPT Community: Facebook Group Follow me on Twitter: Jaeden's Twitter
Step into the future of image protection with the invisible AI image watermarking partnership between DeepMind and Google Cloud. Explore the innovative collaboration that promises to revolutionize visual content security. Get on the AI Box Waitlist: AIBox.ai Join our ChatGPT Community: Facebook Group Follow me on Twitter: Jaeden's Twitter
In this episode, we explore the world of invisible signatures as I delve into the confluence between DeepMind and Google Cloud in the arena of AI image watermarking. Join me for a solo discussion, where we uncover the techniques, applications, and the transformative potential of this collaboration in safeguarding digital imagery. Invest in AI Box: https://Republic.com/ai-box Get on the AI Box Waitlist: https://AIBox.ai/ AI Facebook Community Learn About ChatGPT Learn About AI at Tesla
Marvel at the technological prowess as Steg.AI raises $5M to perfect invisible watermarking on digital assets. Join this episode to explore the marvels, potential applications, and the transformative impact of Steg.AI's cutting-edge technology on the security of digital content.
ChatGPT: OpenAI, Sam Altman, AI, Joe Rogan, Artificial Intelligence, Practical AI
Immerse yourself in the digital revolution as Steg.AI secures $5M in a seed round, shaping the future of watermarking technology. Join this episode for a deep dive into the financial backing, technological advancements, and the potential applications of Steg.AI's innovative approach to digital content protection.
Embark on a journey into the tech frontier as Steg.AI's $5M seed round sets new standards in watermarking technology. Join this episode to delve into the frontiers of technology, explore the challenges faced, and discuss the potential advancements in securing digital content.
Stay ahead of the curve as Steg.AI secures $5M in funding for its invisible watermarking innovation, setting the stage for the future of digital content protection. Join this episode to explore the financial backing, technological advancements, and the potential applications of Steg.AI's cutting-edge technology.
Join me in this episode as we discuss Microsoft's recent foray into AI advancements, including major features and the introduction of watermarking for AI-generated images. We explore the implications of these developments and how they position Microsoft at the forefront of AI innovation. Invest in AI Box: https://Republic.com/ai-box Get on the AI Box Waitlist: https://AIBox.ai/ AI Facebook Community Learn About ChatGPT Learn About AI at Tesla
In this episode, we uncover the groundbreaking collaboration between DeepMind and Google Cloud, revealing their trailblazing approach utilizing AI for imperceptible image watermarking, potentially revolutionizing digital asset protection. Invest in AI Box: https://Republic.com/ai-box Get on the AI Box Waitlist: https://AIBox.ai/ AI Facebook Community Learn more about AI in Video Learn more about Open AI
In this episode, we explore the strategic alliance formed by DeepMind and Google Cloud, focusing on their joint efforts to develop Invisible AI Image Watermarking and its potential significance in safeguarding digital content. Invest in AI Box: https://Republic.com/ai-box Get on the AI Box Waitlist: https://AIBox.ai/ AI Facebook Community Learn About ChatGPT Learn About AI at Tesla
In this episode, we unravel the hidden brilliance behind Steg.AI's $5 million seed funding, unlocking the potential for groundbreaking advancements in the realm of invisible watermarking for images and documents. Invest in AI Box: https://Republic.com/ai-box Get on the AI Box Waitlist: https://AIBox.ai/ AI Facebook Community Learn About ChatGPT Learn About AI at Tesla
On this episode of The AI Moment, we discuss an emerging Gen AI trend: watermarking and other strategies for licensing AI training data and combating malicious AI-generated content. As we move into year two of generative AI, some themes have emerged in terms of the downsides to the technology. Two of the biggest downsides have been combating malicious or misleading AI-generated content, and copyright/IP rights for both non-AI-generated and AI-generated content. Initiatives by Google, Fox-Polygon, the Content Authenticity Initiative, and academic researchers, focused primarily on digital watermarking, are the latest and most prominent attempts to address these issues. Will this trend enable or stifle generative AI?
In this episode, we explore the latest strides made by Microsoft in the realm of artificial intelligence, covering the announcement of significant features and the introduction of watermarking for AI-generated images. Join me as we delve into the details of these advancements and their potential impact on the AI landscape. Invest in AI Box: https://Republic.com/ai-box Get on the AI Box Waitlist: https://AIBox.ai/ AI Facebook Community Learn About ChatGPT Learn About AI at Tesla
In this episode, we explore the groundbreaking collaboration between DeepMind and Google Cloud, diving into their innovative approach to invisible AI image watermarking and its implications. Invest in AI Box: https://Republic.com/ai-box Get on the AI Box Waitlist: https://AIBox.ai/ AI Facebook Community
Join me as we dissect the synergistic efforts of DeepMind and Google Cloud, delving into their pioneering strides in deploying invisible AI image watermarking technology. Invest in AI Box: https://Republic.com/ai-box Get on the AI Box Waitlist: https://AIBox.ai/ AI Facebook Community
In this episode, I unravel Steg.AI's revolutionary technology securing images and documents through invisible watermarking, diving into the significance of their recent $5M seed funding. Invest in AI Box: https://Republic.com/ai-box Get on the AI Box Waitlist: https://AIBox.ai/ AI Facebook Community
Discover in this episode the key highlights from Microsoft's latest AI announcements and dive deep into the concept of watermarking AI-generated images, exploring its implications and importance. Invest in AI Box: https://Republic.com/ai-box Get on the AI Box Waitlist: https://AIBox.ai/ AI Facebook Community Learn more about AI in Video Learn more about Open AI
In this episode, we delve into the implications of Steg.AI's $5 million investment in developing state-of-the-art invisible watermarking. We discuss the benefits for digital artists and the broader implications for content security. Invest in AI Box: https://Republic.com/ai-box Get on the AI Box Waitlist: https://AIBox.ai/ AI Facebook Community Learn more about AI in Video Learn more about Open AI
In this episode, explore Steg.AI's revolutionary strides with a $5M seed funding dedicated to advancing invisible watermarking technology for images and documents. Join me as I analyze the tech implications, applications, and the potential future of data security in this domain. Invest in AI Box: https://Republic.com/ai-box Get on the AI Box Waitlist: https://AIBox.ai/ AI Facebook Community Learn more about AI in Video Learn more about Open AI
2024 will be the biggest election year in world history. Forty countries will hold national elections, with over two billion voters heading to the polls. In this episode of Your Undivided Attention, two experts give us a situation report on how AI will increase the risks to our elections and our democracies. Correction: Tristan says two billion people from 70 countries will be undergoing democratic elections in 2024. The number expands to 70 when non-national elections are factored in.
RECOMMENDED MEDIA
White House AI Executive Order Takes On Complexity of Content Integrity Issues: Renee DiResta's piece in Tech Policy Press about content integrity within President Biden's AI executive order
The Stanford Internet Observatory: A cross-disciplinary program of research, teaching and policy engagement for the study of abuse in current information technologies, with a focus on social media
Demos: Britain's leading cross-party think tank
Invisible Rulers: The People Who Turn Lies into Reality by Renee DiResta: Pre-order Renee's upcoming book that's landing on shelves June 11, 2024
RECOMMENDED YUA EPISODES
The Spin Doctors Are In with Renee DiResta
From Russia with Likes Part 1 with Renee DiResta
From Russia with Likes Part 2 with Renee DiResta
Esther Perel on Artificial Intimacy
The AI Dilemma
A Conversation with Facebook Whistleblower Frances Haugen
Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_
Content authenticity and enforcing copyright in the age of AI are proving difficult problems to solve.
In this episode, we discuss the AI gold rush and its impact on businesses. According to a study by IDC, companies are reaping 3.5 times returns on their AI investments, with a return on investment within 14 months on average. The report also highlights how generative AI is driving increased interest and investment in the technology. Additionally, the Oxford Internet Institute conducted a study that found AI skills and knowledge can increase a worker's salary by up to 40%. The study examined over 1,000 skills in 25,000 workers, showing the positive impact of AI-related knowledge on potential salaries. Three things to know today: 00:00 The AI Gold Rush Pays: Companies Reap 3.5x Returns & Salaries Surge by 40% 04:50 Federal AI Blueprint Draws Industry Eyeballs as OMB Solicits Public Wisdom 07:58 Watermarking, the AI Apocalypse, & Adult Content Leveraging AI Advertiser: https://movebot.io/ Looking for a link from the stories? The entire script of the show, with links to articles, is posted in each story on https://www.businessof.tech/ Do you want the show on your podcast app or the written versions of the stories? Subscribe to the Business of Tech: https://www.businessof.tech/subscribe/ Support the show on Patreon: https://patreon.com/mspradio/ Want our stuff? Cool Merch? Wear “Why Do We Care?” - Visit https://mspradio.myspreadshop.com Follow us on: LinkedIn: https://www.linkedin.com/company/28908079/ YouTube: https://youtube.com/mspradio/ Facebook: https://www.facebook.com/mspradionews/ Instagram: https://www.instagram.com/mspradio/ TikTok: https://www.tiktok.com/@businessoftech
ChatGPT: News on Open AI, MidJourney, NVIDIA, Anthropic, Open Source LLMs, Machine Learning
Join us for an exciting episode as we delve into Microsoft's significant AI announcements, including the introduction of new features and the implementation of AI image watermarking. Explore the implications of these updates for a variety of applications and industries. Discover how Microsoft's innovations are reshaping the landscape of artificial intelligence in this must-listen podcast. Get on the AI Box Waitlist: https://AIBox.ai/ Join our ChatGPT Community: https://www.facebook.com/groups/739308654562189/ Follow me on Twitter: https://twitter.com/jaeden_ai
ChatGPT: News on Open AI, MidJourney, NVIDIA, Anthropic, Open Source LLMs, Machine Learning
DeepMind and Google Cloud join forces to revolutionize image watermarking with cutting-edge AI technology. Join us as we explore how this partnership is creating invisible watermarks to protect your digital images and artwork. Discover the potential impact on copyright protection and digital content security in this episode. Get on the AI Box Waitlist: https://AIBox.ai/ Join our ChatGPT Community: https://www.facebook.com/groups/739308654562189/ Follow me on Twitter: https://twitter.com/jaeden_ai
AI Applied: Covering AI News, Interviews and Tools - ChatGPT, Midjourney, Runway, Poe, Anthropic
In this episode, we delve into the groundbreaking $5 million seed funding round acquired by Steg.AI, a trailblazing company specializing in invisible watermarking for images and documents. Join us as we explore the implications of this significant investment and how Steg.AI is poised to transform the way we secure and protect digital content. Discover the innovative technology behind invisible watermarking and its potential applications in various industries. Get on the AI Box Waitlist: https://AIBox.ai/ Join our ChatGPT Community: https://www.facebook.com/groups/739308654562189/ Follow me on Twitter: https://twitter.com/jaeden_ai
ChatGPT: News on Open AI, MidJourney, NVIDIA, Anthropic, Open Source LLMs, Machine Learning
In this episode, we explore Steg.AI's impressive achievement of securing a $5 million seed round, propelling their mission to redefine image and document security through invisible watermarking technology. Join us as we uncover the potential applications of this cutting-edge solution across various industries and how it is set to reshape the way we protect and authenticate digital content. Dive into the innovative world of Steg.AI and their ambitious journey to enhance security in the digital age. Get on the AI Box Waitlist: https://AIBox.ai/ Join our ChatGPT Community: https://www.facebook.com/groups/739308654562189/ Follow me on Twitter: https://twitter.com/jaeden_ai
This episode is sponsored by Shopify. Shopify is a commerce platform that allows anyone to set up an online store and sell their products. Whether you're selling online, on social media, or in person, Shopify has you covered on every base. With Shopify you can sell physical and digital products. You can sell services, memberships, ticketed events, rentals and even classes and lessons. Sign up for a $1 per month trial period at http://shopify.com/eyeonai On episode #145 of Eye on AI, Craig Smith sits down with Riley McCormack, President and CEO of Digimarc, pioneers in digital watermarking and cloud-based product data. In this episode, we explore the critical role of digital watermarking in securing our digital assets, especially amidst the surging influence of AI. Riley guides us through the potential risks and benefits this technology brings to the forefront as AI continues to transform our digital world. We then navigate the intricate territories of NFTs and distributed ledger technology, understanding how digital watermarking is reshaping these fields by ensuring trust and authenticity. Our discussion also delves into the Digital Millennium Copyright Act of 1998, highlighting its relevance in upholding copyrights and fostering trust within the digital stratosphere. We conclude with a look at how digital watermarking impacts content creation and its role in shaping a secure and sustainable digital future, from collaborations with central banks to innovative products like Digimarc Recycle. Craig Smith Twitter: https://twitter.com/craigss Eye on A.I. 
Twitter: https://twitter.com/EyeOn_AI 00:00 Preview, Introduction and Shopify 03:05 Introduction to Digital Watermarking 07:01 Evolution of Digital Watermarking 14:21 Digimarc's Role in Digital Watermarking 21:13 Exploring the Protection Of Digital Content 28:56 Key Characteristics of Digital Watermarking 35:12 Application and Implementation of Digital Watermarking 42:46 Watermarking in Blockchain 49:11 What is Digimarc Validate? 01:03:01 Outro and Shopify
Hi folks! Things get a little wild as Cathi Bond looks at Cortical Labs, which is researching how to design computational systems from a mixture of stem cells and a silicon substrate (via New Atlas). We take a speculative look at a potential future of biological computing. Meanwhile, Nora Young looks at this Wired article on research into watermarking AI-generated images, and just how difficult that is turning out to be. How will we deal with the proliferation of deepfakes and similar forms of disinformation?
AI Hustle: News on Open AI, ChatGPT, Midjourney, NVIDIA, Anthropic, Open Source LLMs
Tune in to this episode as we uncover the latest developments from Microsoft, where they introduce major AI features and announce a game-changing move to watermark AI-generated images. Explore the exciting possibilities these advancements hold for various industries and how they'll impact the world of AI and image recognition. Stay informed about Microsoft's cutting-edge contributions to the ever-evolving landscape of artificial intelligence. Get on the AI Box Waitlist: https://AIBox.ai/ Join our ChatGPT Community: https://www.facebook.com/groups/739308654562189/ Follow me on Twitter: https://twitter.com/jaeden_ai
AI Hustle: News on Open AI, ChatGPT, Midjourney, NVIDIA, Anthropic, Open Source LLMs
In this episode, we delve into the groundbreaking collaboration between DeepMind and Google Cloud, unveiling their cutting-edge AI image watermarking technology that promises to revolutionize content protection and ownership. Join us as we explore how this invisible AI watermarking works and its potential implications for the future of digital content security. Tune in for an exclusive conversation with experts from both DeepMind and Google Cloud as they share insights into this game-changing innovation. Get on the AI Box Waitlist: https://AIBox.ai/ Join our ChatGPT Community: https://www.facebook.com/groups/739308654562189/ Follow me on Twitter: https://twitter.com/jaeden_ai
AI Hustle: News on Open AI, ChatGPT, Midjourney, NVIDIA, Anthropic, Open Source LLMs
Join us as we dive into the world of digital security and privacy with Steg.AI, a startup making waves with its innovative invisible watermarking technology. In this episode, we explore how Steg.AI's $5 million seed round is set to transform the way we protect digital assets, from images to documents. Discover the secrets behind their groundbreaking solution and the potential impact it could have on data security. Get on the AI Box Waitlist: https://AIBox.ai/ Join our ChatGPT Community: https://www.facebook.com/groups/739308654562189/ Follow me on Twitter: https://twitter.com/jaeden_ai
AI Chat: ChatGPT & AI News, Artificial Intelligence, OpenAI, Machine Learning
In this episode, we explore DeepMind's recent partnership with Google Cloud to implement watermarking on AI-generated images, aiming to increase transparency and address ethical concerns. We'll dive into the technology behind watermarking, how this move could impact the AI and digital art communities, and why this is a significant step toward responsible AI use. Get on the AI Box Waitlist: https://AIBox.ai/ Facebook Community: https://www.facebook.com/groups/739308654562189/ Discord Community: https://aibox.ai/discord Follow me on X: https://twitter.com/jaeden_ai
#stablediffusion #ai #watermark Watermarking the outputs of generative models is usually done as a post-processing step on the model outputs. Tree-Ring Watermarks are applied in the latent space at the beginning of a diffusion process, which makes them nearly undetectable, robust to strong distortions, and only recoverable by the model author. It is a very promising technique with applications potentially beyond watermarking itself. OUTLINE: 0:00 - Introduction & Overview 1:30 - Why Watermarking? 4:20 - Diffusion Models Recap 13:40 - Inverting Diffusion Models 17:05 - Tree-Ring Watermarking 26:15 - Effects of Tree-Ring Watermarks 30:00 - Experimental Results 32:40 - Limitations 34:40 - Conclusion Paper: https://arxiv.org/abs/2305.20030 Abstract: Watermarking the outputs of generative models is a crucial technique for tracing copyright and preventing potential harm from AI-generated content. In this paper, we introduce a novel technique called Tree-Ring Watermarking that robustly fingerprints diffusion model outputs. Unlike existing methods that perform post-hoc modifications to images after sampling, Tree-Ring Watermarking subtly influences the entire sampling process, resulting in a model fingerprint that is invisible to humans. The watermark embeds a pattern into the initial noise vector used for sampling. These patterns are structured in Fourier space so that they are invariant to convolutions, crops, dilations, flips, and rotations. After image generation, the watermark signal is detected by inverting the diffusion process to retrieve the noise vector, which is then checked for the embedded signal. We demonstrate that this technique can be easily applied to arbitrary diffusion models, including text-conditioned Stable Diffusion, as a plug-in with negligible loss in FID. Our watermark is semantically hidden in the image space and is far more robust than watermarking alternatives that are currently deployed. Code is available at this https URL. 
Authors: Yuxin Wen, John Kirchenbauer, Jonas Geiping, Tom Goldstein Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannickilcher Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
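The core idea in the Tree-Ring abstract above (write a ring-shaped key into the Fourier transform of the initial noise, then check for it later) can be illustrated without a diffusion model. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`ring_mask`, `ring_key`, `embed`, `detection_distance`), the key scaling, and the radius are all hypothetical choices for illustration, and the step the paper relies on for detection (inverting the diffusion process to recover the noise from a generated image) is skipped entirely, so detection here runs directly on the noise array.

```python
import numpy as np


def ring_mask(size: int, radius: int) -> np.ndarray:
    """Boolean disc around the centre of an fftshift-ed spectrum."""
    ys, xs = np.ogrid[:size, :size]
    c = size // 2
    return (ys - c) ** 2 + (xs - c) ** 2 <= radius ** 2


def ring_key(size: int, rng: np.random.Generator) -> np.ndarray:
    """Key that is constant along each integer radius (concentric rings)."""
    ys, xs = np.ogrid[:size, :size]
    c = size // 2
    dist = np.rint(np.sqrt((ys - c) ** 2 + (xs - c) ** 2)).astype(int)
    # Assumption: scale by `size` so key values are comparable in magnitude
    # to the Fourier coefficients of unit-variance noise.
    values = rng.standard_normal(dist.max() + 1) * size
    return values[dist]


def embed(noise: np.ndarray, key: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Overwrite the masked Fourier coefficients of the noise with the key."""
    spectrum = np.fft.fftshift(np.fft.fft2(noise))
    spectrum[mask] = key[mask]
    # The key is real and radially symmetric, so the spectrum stays
    # Hermitian and the inverse transform is real up to rounding error.
    return np.fft.ifft2(np.fft.ifftshift(spectrum)).real


def detection_distance(noise: np.ndarray, key: np.ndarray, mask: np.ndarray) -> float:
    """Mean absolute gap between the masked spectrum and the key."""
    spectrum = np.fft.fftshift(np.fft.fft2(noise))
    return float(np.abs(spectrum[mask] - key[mask]).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    size = 64
    mask = ring_mask(size, radius=10)
    key = ring_key(size, rng)
    clean = rng.standard_normal((size, size))
    marked = embed(clean, key, mask)
    print("distance, watermarked noise:", detection_distance(marked, key, mask))
    print("distance, clean noise:      ", detection_distance(clean, key, mask))
```

A detector would threshold the distance: a value near zero suggests the key is present, while clean noise sits far from the key. In a real pipeline the recovered noise is only approximate, so the threshold would need to be calibrated rather than set near zero.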
AI Chat: ChatGPT & AI News, Artificial Intelligence, OpenAI, Machine Learning
In this episode, we delve into Steg.AI's recently secured $5M seed funding, focusing on their innovative technology for invisible watermarking on images and documents. We discuss the company's mission, the problem they're solving in digital security, and the potential impact of their unique solution in the era of rampant data breaches and intellectual property theft. Get on the AI Box Waitlist: https://AIBox.ai/ Investor Contact Email: jaeden@aibox.ai Facebook Community: https://www.facebook.com/groups/739308654562189/ Discord Community: https://aibox.ai/discord Download Selfpause: https://selfpause.com/Podcast Follow me on Twitter... er... X.com: https://twitter.com/jaeden_ai
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Watermarking considered overrated?, published by DanielFilan on July 31, 2023 on The AI Alignment Forum. Status: a slightly-edited copy-paste of a Twitter X thread I quickly dashed off a week or so ago. Here's a thought I'm playing with that I'd like feedback on: I think watermarking is probably overrated. Most of the time, I think what you want to know is "is this text endorsed by the person who purportedly authored it", which can be checked with digital signatures. Another big concern is that people are able to cheat on essays. This is sad. But what do we give up by having watermarking? Well, as far as I can tell, if you give people access to model internals - certainly weights, certainly logprobs, but maybe even last-layer activations if they have enough - they can bypass the watermarking scheme. This is even sadder - it means you have to strictly limit the set of people who are able to do certain kinds of research that could be pretty useful for safety. In my mind, that makes it not worth the benefit. What could I be missing here? Maybe we can make watermarking compatible with releasing model info, e.g. by baking it into the weights? Maybe the info I want to be available is inherently dangerous, by e.g. allowing people to fine-tune scary models? Maybe I'm missing some important reasons we care about watermarking, that make the cost-benefit analysis look better? E.g. avoiding a situation where AIs become really good at manipulation, so good that you don't want to inadvertently read AI-generated text, but we don't notice until too late? Anyway there's a good shot I don't know what I'm missing, so let me know if you know what it is. Postscript: Someone has pointed me to this paper that purports to bake a watermark into the weights. 
I can't figure out how it works (at least not at twitter-compatible speeds), but if it does, I think that would alleviate my concerns. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
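The post's alternative to watermarking is to check endorsement with digital signatures: the author signs the text, and anyone holding the author's public key can verify it. As a rough illustration of that sign-and-verify flow, here is a toy Lamport one-time signature built only on the Python standard library. This is a sketch for intuition, not something the post proposes: the helper names are hypothetical, each key pair may safely sign only one message, and a real deployment would use a vetted library and a modern scheme instead.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _bits(message: bytes) -> list[int]:
    """The 256 bits of SHA-256(message), most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def keygen():
    """Private key: 256 pairs of random secrets. Public key: their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def sign(private, message: bytes) -> list[bytes]:
    """Reveal one secret per digest bit. The key must never be reused."""
    return [private[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public, message: bytes, signature) -> bool:
    """Check that each revealed secret hashes to the matching public value."""
    if len(signature) != 256:
        return False
    return all(
        _h(secret) == public[i][bit]
        for i, (secret, bit) in enumerate(zip(signature, _bits(message)))
    )


if __name__ == "__main__":
    private_key, public_key = keygen()
    post = b"I endorse this text."
    signature = sign(private_key, post)
    print("original verifies:", verify(public_key, post, signature))
    print("altered verifies: ", verify(public_key, b"I endorse other text.", signature))
```

Unlike a watermark, this check does not depend on keeping model internals private: verification needs only the public key, and any change to the signed text makes verification fail.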
Show Notes:
00:10: The core SEO changes at Google I/O
2:14: Health and finance topics will not be affected (so far)
3:08: Watermarking for generative content creators
4:51: New layouts in Google SERP
6:12: New impact on affiliate website
7:43: Query complexity and how it influences content choice
8:30: Google's incentives for content and websites continues
10:30: RLHF, “Reinforcement Learning from Human Feedback”
13:42: Visual design importance
18:06: How to increase efficiency in SEO with AI
21:02: Visual design importance, continued
21:47: Why complex queries are the best defense
22:22: How to stay on top of optimization as AI evolves
24:57: Final tips, tactics, and tricks
Show Links:
The AI Takeover of Google Search by The Verge
“Watermarking” and Content from Google I/O
“Conversational Mode” from Google I/O
Follow Drew on Twitter
Follow Ross on Twitter
Send Us an Email
How should we scientifically think about the impact of AI on human civilization, and whether or not it will doom us all? In this episode, I speak with Scott Aaronson about his views on how to make progress in AI alignment, as well as his work on watermarking the output of language models, and how he moved from a background in quantum complexity theory to working on AI. Note: this episode was recorded before this story emerged of a man committing suicide after discussions with a language-model-based chatbot, that included discussion of the possibility of him killing himself. Patreon: https://www.patreon.com/axrpodcast Store: https://store.axrp.net/ Ko-fi: https://ko-fi.com/axrpodcast Topics we discuss, and timestamps: 0:00:36 - 'Reform' AI alignment 0:01:52 - Epistemology of AI risk 0:20:08 - Immediate problems and existential risk 0:24:35 - Aligning deceitful AI 0:30:59 - Stories of AI doom 0:34:27 - Language models 0:43:08 - Democratic governance of AI 0:59:35 - What would change Scott's mind 1:14:45 - Watermarking language model outputs 1:41:41 - Watermark key secrecy and backdoor insertion 1:58:05 - Scott's transition to AI research 2:03:48 - Theoretical computer science and AI alignment 2:14:03 - AI alignment and formalizing philosophy 2:22:04 - How Scott finds AI research 2:24:53 - Following Scott's research The transcript Links to Scott's things: Personal website Book, Quantum Computing Since Democritus Blog, Shtetl-Optimized Writings we discuss: Reform AI Alignment Planting Undetectable Backdoors in Machine Learning Models
This Week in Machine Learning & Artificial Intelligence (AI) Podcast
Today we're joined by Tom Goldstein, an associate professor at the University of Maryland. Tom's research sits at the intersection of ML and optimization and has previously been featured in the New Yorker for his work on invisibility cloaks, clothing that can evade object detection. In our conversation, we focus on his more recent research on watermarking LLM output. We explore the motivations behind adding these watermarks, how they work, and different ways a watermark could be deployed, as well as political and economic incentive structures around the adoption of watermarking and future directions for that line of work. We also discuss Tom's research into data leakage, particularly in stable diffusion models, work that is analogous to recent guest Nicholas Carlini's research into LLM data extraction.
ChatGPT raises the problem of its output being passed off as the work of someone else, such as a professional writer or a student. In this episode we discuss the idea of "watermarking" ChatGPT content, including steganography, randomness, entropy, and how to destroy the watermarks.
On this week's episode of the podcast I cover a lot of recent updates on OpenAI, news of a recent out-of-band patch from Microsoft, info on Java pricing changes and much more! Reference Links: https://www.rorymon.com/blog/password-managers-targeted-for-attacks-ai-is-so-hot-right-now-avd-watermarking/
Industry conferences were in full swing last month, and the topics in this title provide a sense of what Chris and I talked about. October travel took me to three continents, so our perspectives are broad, and we don't think you'll mind this being a longer-than-usual episode.
"Harmonic and NAGRA announce 'Watermarking as a Service' for live sports streaming." Harmonic and NAGRA have announced a collaboration to offer a new "Watermarking as a Service" that strengthens the protection of live sports streaming content.
In 2022, broadcasters are close to launching commercial applications of ATSC 3.0. Watermarking provides a technology for ATSC 3.0 TV broadcasters to encode data that can pass through set-top box and HDMI connections. With watermark capabilities on LG NEXTGEN TVs, cable networks and regional sports networks can join local TV stations and national networks in planning to bring interactive capabilities to the living room. The watermarks enable the broadcasters to provide the same interactive experiences for over-the-air and non-over-the-air viewers. As these two-way interactive capabilities of NEXTGEN TV expand to additional households, consumers will enjoy more customized and localized experiences resulting in personalized broadcast television. Rick Ducey, BIA's Managing Director, discusses this valuable tech for broadcasting with Richard Glosser, Head of Business Development at Verance Corporation. Mitch Oscar, Director of Advanced Advertising Strategies at USIM, also joins the discussion.
What's in a brand? Does it reflect your personality? In this episode we talk about how we present ourselves in public, private and on social media. We talk about how YOU are your brand. People will get to know your cookie style, cookie photos and attach an ideal to, or about you. How do you protect what you put 'out there'? Are you good at Watermarking every photo before you publish or share it? How do you decide if you will post about an issue on your social media that has nothing to do with cookies? Tune in and listen to our different perspectives on our brands and how we manage them.
Within the framework of the Holy Grail 2.0 project, major players across the value chain come together to improve packaging waste sorting accuracy through digital watermarking. We hear from Beiersdorf's Sabrina Stiegler and All4Labels' Michael Brocher about how the two companies collaborated to move digital watermarking from innovation to implementation.
If you want to understand why your OTT TV service needs DRM and watermarking implemented in the cloud to protect your content, this podcast is for you. The CEO of BuyDRM breaks down the why, how, and what of the tech you'll need. Listeners to this podcast can sign up for the Video Security Summit for free, where you will hear from BuyDRM and many other security experts.
In this episode of Cyber Security Inside, Tom and Camille dive into how content producers and distributors are keeping content secure in a world of piracy and streaming. What makes it possible for us to safely stream content directly into our homes? How do we know what we’re streaming isn’t pirated in some way? Avi Wachtfogel, Engineering Fellow and Senior Director of Security Strategy at Synamedia, is just the person to cover the evolution of media content security and share the latest threats and best strategies for keeping content secure. The conversation covers: • Macrovision • VHS + DVD • Torrenting and peer-to-peer sharing • BitTorrent • VOD • Over-the-top (OTT) • Hulu, YouTube, HBO Max, Netflix, etc. • Credential fraud • Deep fakes • Content protection and service protection • Watermarking • Take down notices ... and more Don’t miss it! Here are some key take-aways: • With video being distributed more broadly and going straight to streaming, protecting and securing media content has become even more challenging. • Hardware and software technologies have been used on the service protection side to solve the problem of bootleg cable and other content security concerns of the 90s and 2000s. • Over the top (OTT) refers to the distribution of video content over a high-speed Internet connection. This covers streaming services like Netflix, HBO Max, etc. • It’s relatively simple to start an OTT service, so it’s more important than ever to keep media content protected against piracy. If pirates get access to the content, they can re-stream/distribute it. • Pirates also create distribution chains, selling to other pirates who then sell to consumers. You see this often with live events. There are even salesmen who go door-to-door selling these IPTV services. Content consumers are often confused about whether the content they’re getting is legal or not. 
• Licensing agreements that limit when and where content can be consumed can actually drive consumers to seek out pirate streaming sites. • Credential fraud allows people to access content without legally subscribing to a streaming service. This is another way service providers lose money. • Using encryption to keep OTT video content secure is a tricky thing. You need to allow those with the device to access the content, but pirates may also be accessing the content legitimately. Protecting that content using encryption is not as straightforward as it is with protecting personal info against external attacks. • The line between content protection and service protection blurs once the content is distributed. • The phases of protection are Protect, Detect, and Disrupt. Protecting won’t always be possible, Detecting involves figuring out who’s distributing the content and where and how they’re distributing it, and Disrupting is taking action and putting a stop to that distribution. • A major challenge with the streaming industry is that everything is so fragmented, and this fragmentation actually encourages piracy. After all, if we’re already subscribed to dozens of services and then a new service creates content we want, where do we draw the line? When do we start seeking that content elsewhere? At some point, these services will have to work together. • If you find that your own content is being distributed on YouTube, Facebook, or on search engines illegally, you can approach the platform. They’re required to take down those illegal links. • The challenge of countering piracy requires technological means (like watermarking and tracking pirate services), as well as legal means (like Take Down notices), and a group effort to make it easier for legal content to be consumed than it is for illegal content to be consumed. 
Some interesting quotes from today’s episode: “Because the technologies are out there that make it easy to start a [OTT] service, pirates can do the same. And it's just a matter of having access to the content… If you have any device that outputs content -- that can be a set-top box, that can be a PC -- and that content is being output, it can be captured, whether through the HDMI port, or using the screen grabbing software, or even, an extreme case of just taking a camera and having it opposite the monitor. You can capture that content. And once you can capture that content, you can re-stream it.” “One pirate will take a stream, a live stream, say of a sports event, and then they will sell that on to other pirates. So, you have a whole distribution chain and then those pirates will sell it on directly to consumers. And there’ll be resellers who are taking that content and selling it further and further along. There’s a lot of confusion very often among customers actually, as to what they're actually getting -- whether it's legal or not.” “A lot of these services, they call them IPTV services. We've seen in some countries, there will actually be a salesman going door to door. They'll knock on the door and I'll say, you know, ‘For $10 a month, would you like access to these 200 channels? We’ll set it up for you.’ They'll come in, they'll take a box of some sort and go plug it into your TV, and they'll set you up and set up the billing. Some of these guys have got 24/7 support -- pick up the phone and you have support -- and they look really legitimate. And then very often the customers themselves can't tell whether they're signing up for legitimate service or not.” “Very often these kinds of [licensing] arrangements actually drive consumers to use pirate services… We're actually seeing that kind of tendency. People are looking for content. 
There's actually a rise in the amount of content that's being viewed over Torrents these days because of these kinds of limitations.” “You can go on the dark web -- you can buy a set of credentials for a variety of streaming services and pay a lot less for those than you would if you were subscribing legally. And that’s also a major problem for the service providers today. There's a lot of money that they're losing to those kinds of attacks.” “In the case of video, it's a much more difficult problem because you're trying to protect the content on the device from the person who's holding the device. The pirate actually has a legitimate device with the content on it. And obviously you want a legitimate user to be able to view the content.” “There are different ways of capturing that content and then re-encoding. Today, just encrypting the content is really not enough.” “Whether it's Disney+, HBO Max, Netflix -- these new services are appearing every other day. And we all talk about the ‘streaming wars,’ but at some point they're going to have to recognize that they need to sort of get together and solve, what is really going to be a piracy problem. Because people are going looking for the content. People aren't going to sign up for ten different services. And if they don't happen to be subscribed to the particular service where there's content that they want, they're going to go look for it on Torrents. And so, they're going to have to find some way to work together after this fragmentation happens to sort of re-aggregate the content.” “Today, if you sign up for Spotify or Apple Music or Amazon, you're paying one monthly fee… You don't care what the label is behind the music. You’ve got access to all the music you could want. And when the video industry reaches a point where they make it easier to access content legally than it is to access it illegally, they will have largely solved a lot of the problems that they're seeing today.”
101st Airborne Division (https://www.army.mil/101stAirborne) D&G 189: Marlie Moxinspike (https://dgshow.org/189) The battle inside Signal (https://www.platformer.news/p/-the-battle-inside-signal) See also: Can WhatsApp stop spreading misinformation without compromising encryption? (https://qz.com/1978077/can-whatsapp-stop-misinformation-without-compromising-encryption/) What You Should Know Before Leaking a Zoom Meeting (https://theintercept.com/2021/01/18/leak-zoom-meeting/) The ‘Batman Effect’: How having an alter ego empowers you (https://www.bbc.com/worklife/article/20200817-the-batman-effect-how-having-an-alter-ego-empowers-you) Cutting Room Floor * DIY latex glove bagpipes (https://www.youtube.com/watch?v=y95I0rb1JoU) * Bagpipe swing with Gunhild Carling in Central Park NY (https://www.youtube.com/watch?v=8RbVuDuCYMY) * Menu items renamed as expense-able office items (https://www.ubereats.com/ca/toronto/food-delivery/good-fortune-burger-college/SlS7Rn6dQ1SVb59NxiWt5A) * Blue Check Homes: Apply now! (https://bluecheckhomes.com/) * Evidence of Life: Two Photographers Show Us Mysteries In The Mundane – 1977 (https://flashbak.com/evidence-mike-mandel-larry-sultan-1970s-found-photos-435986/) * Faraday Cages for Wi-Fi Routers Are the Latest 5G Conspiracy Grift (https://www.vice.com/en/article/xgzgw4/faraday-cages-for-wi-fi-routers-are-the-latest-5g-conspiracy-grift) We Give Thanks * The D&G Show Slack Clubhouse for the discussion topics!
What are the potential benefits of digital watermarking for recycling and beyond? On the occasion of the launch of the HolyGrail 2.0 project, Elisabeth Skoda speaks to Michelle Gibbons, Director General at AIM, and to Gareth Callan, Sustainability Packaging Manager at PepsiCo and member of the HolyGrail leadership team to find out more. Sponsored by: Smurfit Kappa No.1 company in Europe producing corrugated packaging, container board and ‘bag in box'
Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.09.04.283135v1?rss=1 Authors: Oksuz, A. C., Ayday, E., Gudukbay, U. Abstract: Genome data has been a subject of study for both biology and computer science since the start of the Human Genome Project in 1990. Since then, genome sequencing for medical and social purposes has become more and more available and affordable. Genome data can be shared on public websites or with service providers. However, this sharing compromises the privacy of donors even under partial sharing conditions. We mainly focus on the liability aspect ensuing from unauthorized sharing of these genome data. One of the techniques to address the liability issues in data sharing is the watermarking mechanism. To detect malicious correspondents and service providers (SPs), whose aim is to share genome data without individuals' consent and undetected, we propose a novel watermarking method on sequential genome data using the belief propagation algorithm. In our method, we have two criteria to satisfy: (i) embedding robust watermarks so that malicious adversaries cannot tamper with the watermark by modification and are identified with high probability, and (ii) achieving ε-local differential privacy in all data sharings with SPs. For the preservation of system robustness against single SP and collusion attacks, we consider publicly available genomic information like Minor Allele Frequency, Linkage Disequilibrium, Phenotype Information and Familial Information. Our proposed scheme achieves 100% detection rate against the single SP attacks with only 3% watermark length. For the worst case scenario of collusion attacks (50% of SPs are malicious), 80% detection is achieved with 5% watermark length and 90% detection is achieved with 10% watermark length. For all cases, ε's impact on precision remained negligible and high privacy is ensured. Copyrights belong to original authors. Visit the link for more info
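For listeners unfamiliar with the ε-local differential privacy criterion mentioned in the abstract, here is a minimal sketch of one standard mechanism that satisfies it, k-ary randomized response. This is not the paper's method, which the abstract does not spell out. The genotype coding 0/1/2 (number of minor alleles at a position) and all function names are assumptions made for illustration.

```python
import math
import random


def keep_probability(epsilon: float, k: int) -> float:
    """Probability of reporting the true value among k possible values."""
    return math.exp(epsilon) / (math.exp(epsilon) + k - 1)


def randomized_response(value, domain, epsilon, rng):
    """Report `value` truthfully with keep_probability, else a uniformly
    chosen different value from `domain`. Satisfies epsilon-local DP."""
    if rng.random() < keep_probability(epsilon, len(domain)):
        return value
    others = [v for v in domain if v != value]
    return rng.choice(others)


# Hypothetical genotype sequence, coded as minor-allele counts per position.
GENOTYPES = [0, 1, 2]
rng = random.Random(42)
true_sequence = [0, 2, 1, 1, 0, 2, 0, 1]
shared_sequence = [randomized_response(g, GENOTYPES, 1.0, rng) for g in true_sequence]
print(shared_sequence)

# The privacy guarantee: any reported value is at most e^epsilon times more
# likely under one true value than under another.
p = keep_probability(1.0, len(GENOTYPES))
q = (1 - p) / (len(GENOTYPES) - 1)
print(p / q)  # approximately e = 2.718...
```

A smaller ε pushes the keep probability toward 1/k, so each shared position reveals less about the donor, at the cost of noisier data for the service provider.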
Today George talks about the use of Machine Learning to diagnose Cancer from a blood test. By sampling 'cell-free-DNA' this test is capable of identifying 50 different types of Cancer and the localized tissue of origin with a >90% accuracy. Lan leads a discussion of what robots and researchers in robotics may be able to contribute towards fighting the COVID-19 pandemic. Last but not least, Kyle leads the panel in a discussion about watermarking data!
Aaron Conlon has personally evolved from owning physical assets to accessing game art design templates in the Unity Assets Store. Kevin Kelly suggested this sort of professional practice is inevitable. Andre Louis created the backing track.
00:43 currently accessing https://2kmfromhome.com h/t OpenStreet Map with Dave Bolger
01:33 We're talking about accessing with Aaron Conlon https://twitter.com/Aaronsquid
02:38 What is accessing
03:39 Aaron's memorable Sony Walkman Flip Phone
04:55 Using Spotify to access popular music https://open.spotify.com/show/62fxmUNIJ7mfR6gLsD6dG0
05:38 Spotify and Netflix
07:40 Critical Role
07:51 Insights about Nintendo from Inside Gaming
08:31 Kindle has changed access for books
08:31 Dematerialization https://en.wikipedia.org/wiki/Dematerialization_(economics)
10:47 Accessing by game play level
11:15 Refunds for paid access
12:46 Access to ePortfolio assets on Google Drive
13:51 Collages, pins, and shared assets
16:05 Watermarking the Shadowborne sword, then allowing easy access
17:48 Unity Asset Store and your extended soul https://assetstore.unity.com/
18:35 Accessing Aaron's current work on https://aaronconlon.artstation.com
19:30 Music by https://patreon.com/onjmusic
19:54 Comment on Limor
Leaving off from the last episode, “The Importance of Copywriting, Watermarking, and Camera Technology”, Michael, Mark and Ron catch up, continuing their conversation discussing photographing in the field and the enjoyment of meeting new wildlife photographers, friends and followers of the podcast. The guys also talk about the changes in the rut.
Michael, Mark, and Ron catch up and discuss all the new camera technology and the importance of watermarking and copyrighting your images.
In episode #9, we discuss the top 5 reasons why property investors should be watermarking their photos and how it will drive more buyer leads. Typically this isn't done, but it should be part of preparing your photos, and we discuss the easy and fast tools that take only a few seconds to watermark all your photos.
This week on Another Photography Podcast, we discuss watermarking your work, a killer milky way shot by Steven Magner, we announce our shirt, and talk about finding/funding the time to go shoot. Shirt announcement! The Just Go Shoot is now available. DM us for more info. Shipping is available. We had one of our pictures get posted on another page without credit. Was it a big deal? We think yes, let us know what you think. Steven Magner took an epic shot of the milky way and Temple Crag last week which left us speechless. Check out his work at the link below. On this week's topic we discuss finding/funding the time to go shoot. It comes down to budgeting, and not being lazy. Ray and Art usually go on big trips together and split the bill. Lastly, should you carry your gear everywhere you go? We have some mixed feelings, let us know how you feel. instagram.com/anotherphotographypodcast instagram.com/through_the_eyes_of_reignbird instagram.com/artveraphotography www.instagram.com/stevenjmagner/
Tonight is Jay Cat's own perspective on TOD16 But first .. -"I gotta vent on a certain dirty mother fu**er" -BoneFrog is coming!! --"Really don't wanna be in the fu**in lake with lightning striking" -This Cosby s**t ... -What should CZW do with Shlak? --"That's called a 'draw', DJ" -Future of DeathMatch wrestling -CZW thoughts --Sami's booking --IPPV -Doors or Tables? -Fire in wrestling -Mayweather vs McGregor --"This'll make money because people are fu**in' stupid" -"Stand the f**k up for the anthem" -More CZW ... --"I can't get a broadcast in motherf***in' Jersey ..." -"There was fu**in' trains then, but we ain't got no motherfu**in' trains now!" -"No Nazi's ... Only Shlak!" -"Did you forget someone?" --"Your camera sucks. Your pictures suck. And you're a fu**in' clown" --"Let me tell you personally why I don't like your fu**in' ass, Matt ..." --"you narrowly avoided permanent disabilities ..." --"The point where you crossed the fu**in' line" --Watermarking s**t from the bleachers -In-Depth review of TOD16 --Credit to Emil Jay --"So far, so MF'N good, right?"
Episode 034 Greetings friends and listeners, today is Fanmail Friday and I have a question from Sarita in Meridian, Idaho who asks, "Hi Sonya, I am in the process of getting my website finalized and I have given my images to my web person, she is telling me that I need to remove the watermarks from my images, but I remember from a previous podcast you said that watermarking the images is necessary, can you please explain this to me again in detail?" Greetings Sarita, and thank you so much. I know that so many web people want to make things simplistic and they feel that the watermark is going to compromise the aesthetic of the website..... Ummmm, not true! It's the same as if you were telling them not to take credit by putting their name at the bottom of the website. So, in my opinion, the answer is no, do not remove your watermark. However, I will say that the watermark does need to be thought out and doesn't need to take up the entire span of your image. It also should not be blasted across the center of your image; oftentimes this is a turn-off to the consumer. It should be in a lighter transparent layer, something that isn't overtaking the image but will let the average viewer who might stumble on your images on that big fast Internet Highway know who created the work. I also know that some web experts want to place high-resolution images spanning 1200 x 1200 pixels. Eeeeek!!! If you provide this size image, provide images that will benefit from it: images of open spaces, images of your studio, photos of your exhibits with lots of people in them, a photo of you with your booth, a grouping of your artwork in a setting. But I do not recommend that you put up a single art image at 1200 pixels by 1200 pixels because then it can easily be "borrowed" from your website and utilized unfavorably without your permission. Sad, but true. 
I have provided art images on my site that are no more than 500 pixels wide; this will translate well and be nicely visible on tablet, laptop, smart phone etc. The whole point of putting photos on your site is so that people can see your work, and anything larger than that is going to take up unnecessary space, and you of course want people to purchase prints of your work or other products that you have available. If you keep the watermark modest and on the lower section of the image, you should be safe. Sarita, I am excited that you are close to launching your website and would love it if you would send me your final link so we can check out your amazing art! I hope that you will send in questions that you would like me to answer on the air about any obstacles that you are encountering. For the full summary of these show notes visit me at http://rockstarmentor.com/blog Please visit our website to sign up to be on the front lines of amazing information and free downloads that I have prepared especially for you. http://rockstarmentor.com Visit me on Twitter: http://twitter.com/crushitmentor Visit me on Instagram: http://instagram.com/rockstarmentor To learn more about me, my art and colorful product line, visit Sonya Paz through my artist website, http://SonyaPaz.com Thanks to "The Brush Guys" they can be located at
Published on Aug 27, 2013 John Taylor, Karyn O'Bryant, Katherine Curriden, Anthony Gettig join us for a fun discussion on watermarking, traveling, a recap of the DLF Poker Classic, and much more! George tells us about his recent affiliation with edge studio. Congratulations George! https://www.edgestudio.com/press-rele... Dan tells us about his recent meeting with David, and Stephanie Ciccarelli of voices.com. Hang Out Guests Included: John Taylor Karyn O'Bryant Katherine Curriden Anthony Gettig Rachel Ehrenberg Hang Out Questions and Topics Included: John Taylor's recent live broadcast of his radio show via Luci Live. Dan's progress with his ukulele lessons. Tales from the Don LaFontaine Poker Classic. Don LaFontaine, and June Foray's birthdays. Connie Terwilliger's meet up group. Shotgun mics, and pricing. Watermarking. Serious considerations. WOVO-world voices progress. Anthony's recent video presentation. Deriving work from YouTube productions. Various YouTube tips. Some of John's recent in studio experiences. Closing comments from each of our guests. Special Thanks to our newest sponsor Edge Studios! http://www.edgestudio.com/ Thanks to Harlan Hogan and Voice Over Essentials. voiceoveressentials.com/ Harlan is the exclusive supplier of portable studio booths for TED. Announcements: Sept. 18-20 Voice Over Virtual: An online virtual conference. http://www.voiceoverxtra.com/article.... Be sure to take the EWABS survey found on the front page of: EWABS.com. EWABS E-support membership: Unlimited one on one help with Dan and George. $39.95/mo. Click the support Link at the top of ewabs.com Donations are always welcome.ewabs.com For Studio Suit Orders and Info.: http://www.vostudiosuit.com/ Like our Facebook page. https://www.facebook.com/ewabshop. Subscribe on YouTube https://www.youtube.com/user/ewabs ewabs_show on Twitter WOULD YOU like to be one of our monthly sponsors??? Contact us at ewabshop.com to discuss!!! 
Visit World Voices at the new web address Worldvo.com. Don't forget to get your gear ! shop.ewabs.com Visit George at VOStudioTech.com Or follow George's Twitter: @EWABS_Show Visit Dan at HomeVoiceOverStudio.com/ Or follow Dan's Twitter: @HomStudioMan John Florian, and Voice Over Xtra. The daily resource for voice over success. http://www.voiceoverxtra.com/index.htm Sept 2-------Labor Day - No show this week. Sept 9----Tara Platt and Yuri Lowenthal-Actors, Animators, Authors of "VO Behind the MIC" Join the EWABS Correspondent Contest. Make a VO or home studio related video, and send it to ewabshop@gmail.com The best way to keep track of the shows activities is to visit the EWABS Facebook page. https://www.facebook.com/eastwest.aud... EWABS Twitter: EWABS_Show Thanks again to our sponsors. Harlan Hogan's Voice over Essentials. voiceoveressentials.com Voice Over Xtra http://www.voiceoverxtra.com/index.htm Thanks to Larry Hudson, Silvia McClure, Larry and Elizabeth Davis, and Rosemary Benson for the show's bumpers, promotions and drops. Special thanks to the shows producer, Katherine Curriden, Dave Courvosier for providing the shows weekly intro, Lee Pinney for posting the podcast, and, to Jason Lawson for providing our weekly show notes. Send us questions, and be on our show where Dan and George will solve your home studio problems live! Call 818-47EWABS, that's 818 473-9227, and leave us your question in the voice mail box. Go to ewabs.com for details. Contributions are also welcome at shop.ewabs.com
This part finishes Chapter 5. In this second part watermarking is explained in more detail and DRM standards as well as commercial solutions are presented.
The Internet has become one of the main sources of knowledge acquisition, harboring resources such as online newspapers, web portals for scientific documents, personal blogs, encyclopedias, and advertisements. It has become a part of our daily life to search and access this immense amount of online information, and more recently we have also started to contribute to this pool of information our own creativity in the form of text, images and video. Unfortunately, it is still an open question as to how we, as authors, can control the way that the information we create is distributed or re-used. Rights management problems are serious for text since it is much easier for other people to download and manipulate copyrighted text from the Internet and later re-use it free from control. There is a need for a rights protection system that "travels with the content". Digital watermarking is an information hiding mechanism that embeds the copyright information in the document. Besides traveling with the content of the documents, digital watermarks are also imperceptible (i.e., seamless) to the user, which makes the process of removing them from the document challenging. Using linguistic features for information hiding in natural language text is an exciting and new idea. This talk begins with a short survey of existing technologies in natural language watermarking, and then focuses on a recently developed natural language watermarking system that is practical, easy-to-use and provides resilience to attacks through the use of ambiguity in natural language. The talk is aimed at a general audience, and will be self-contained, covering the necessary background information.
About the speaker: Mercan Topkara is a PhD candidate at the Computer Science Department of Purdue University working with Mikhail J. Atallah and Cristina Nita-Rotaru. She got her Bachelor of Science degree from the Computer Engineering and Information Science Department of Bilkent University in 2000. She started her graduate studies at Purdue University in August 2001. Her PhD thesis is focused on designing, building and evaluating natural language watermarking systems. Her research interests are within the areas of digital watermarking, statistical natural language processing, usable security and machine learning. She has previously worked as a research intern at AT&T Research Labs, IBM T. J. Watson Research, and Google Research. More information can be found at http://www.cs.purdue.edu/homes/mkarahan.
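To give a feel for what hiding information in linguistic choices can look like, here is a toy sketch. The abstract does not describe the system presented in the talk, so the technique below is an assumption: a simple synonym-substitution scheme in which each choice between two interchangeable words carries one bit. The word list and function names are hypothetical, and the scheme makes no attempt at the attack resilience the talk describes.

```python
# Each pair holds two interchangeable words: index 0 encodes bit 0, index 1 encodes bit 1.
SYNONYM_PAIRS = [
    ("big", "large"),
    ("quick", "fast"),
    ("happy", "glad"),
    ("begin", "start"),
]

# Map every word in the list to the pair it belongs to.
PAIR_OF = {word: pair for pair in SYNONYM_PAIRS for word in pair}


def embed(text: str, bits):
    """Rewrite substitutable words so that they spell out `bits` in order."""
    remaining = list(bits)
    output = []
    for word in text.split():
        pair = PAIR_OF.get(word)
        if pair is not None and remaining:
            output.append(pair[remaining.pop(0)])
        else:
            output.append(word)
    if remaining:
        raise ValueError("text has too few substitutable words for the payload")
    return " ".join(output)


def extract(text: str, n_bits: int):
    """Read the first `n_bits` bits back from the substitutable words."""
    bits = []
    for word in text.split():
        pair = PAIR_OF.get(word)
        if pair is not None:
            bits.append(pair.index(word))
    return bits[:n_bits]


cover = "the big dog made a quick start and felt happy"
marked = embed(cover, [1, 0, 1, 1])
print(marked)              # the large dog made a quick start and felt glad
print(extract(marked, 4))  # [1, 0, 1, 1]
```

The marked sentence reads naturally, which is the sense in which such a watermark "travels with the content". A single round of paraphrasing would erase this toy version, which is why practical systems need more robust linguistic features.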
Doctor Who: Podshock Episode 19 For the Week of the 26th of December 2005 Running Time: 1:06:32 In this episode: News - Christmas Invasion News, John Barrowman to Tie the Knot, iPod Video Doctor Who?, Watermarking, NA DVD Package, etc. Features - Christmas Invasion Review (no spoilers), Chris Rattray Australian Report. Announcements - I-CON 25, Web site progress, Blake's 7 Spin-off Podcast?, Ken's Podshock Apparel Challenge. Promos - Cinemaslave. Hosted by James Naughton (UK), Ken Deep (US), and Louis Trapani (US), with Squiffy as a guest in the UK and Chris Rattray in Australia. Do you need the MP3 file format? Get our MP3 version of this episode using our MP3 dedicated feed at http://www.gallifreyanembassy.org/podshock/podshockmp3.xml
Doctor Who: Podshock Episode 19 For the Week of the 26th of December 2005 Running Time: 1:06:32 In this episode: News - Christmas Invasion News, John Barrowman to Tie the Knot, iPod Video Doctor Who?, Watermarking, NA DVD Package, etc. Features - Christmas Invasion Review (no spoilers), Chris Rattray Australian Report. Announcements - I-CON 25, Web site progress, Blake's 7 Spin-off Podcast?, Ken's Podshock Apparel Challenge. Promos - Cinemaslave. Hosted by James Naughton (UK), Ken Deep (US), and Louis Trapani (US), with Squiffy as a guest in the UK and Chris Rattray in Australia. Do you want the Enhanced Podcast AAC file format? Get our Enhanced Podcast version of this episode using our feed at http://www.gallifreyanembassy.org/podshock/podshock.xml
Proving ownership rights on outsourced relational databases is a crucial issue in today's Internet-based application environments and in many content distribution applications. In this talk, we will present mechanisms for proof of ownership based on the secure embedding of a robust, imperceptible watermark in relational data. We will discuss the available watermark embedding and decoding techniques. Furthermore, we will compare these techniques along several dimensions, such as applicability, efficiency, and security. About the speaker: Mohamed Shehab received his BSc from United Arab Emirates University in 2000. He is currently a PhD student in electrical and computer engineering at Purdue University. His main research interests lie in information security, with emphasis on rights protection, data integrity and access control. Recently, he has also been working on various topics in the areas of distributed access control and distributed secure collaboration.
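To make "embedding and decoding" a watermark in relational data concrete, here is a minimal Python sketch of one common family of approaches: a secret key selects a subset of tuples by hashing their primary keys, and the least significant bit of a numeric attribute in each selected tuple is set to a key-dependent value. The abstract does not say which techniques the talk covers, so this is an assumption chosen for illustration; the function names, the `gamma` selection parameter, and the choice of HMAC-SHA256 are all hypothetical.

```python
import hashlib
import hmac
from typing import Dict, Tuple


def _keyed_hash(key: bytes, primary_key: str) -> int:
    """Keyed hash of a tuple's primary key, returned as a large integer."""
    digest = hmac.new(key, primary_key.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big")


def embed(rows: Dict[str, int], key: bytes, gamma: int = 2) -> Dict[str, int]:
    """Mark roughly 1 in `gamma` tuples by forcing the value's lowest bit."""
    marked = {}
    for primary_key, value in rows.items():
        h = _keyed_hash(key, primary_key)
        if h % gamma == 0:                # the key decides which tuples carry the mark
            bit = (h // gamma) & 1        # the key also decides the bit to write
            value = (value & ~1) | bit    # changes the value by at most 1
        marked[primary_key] = value
    return marked


def detect(rows: Dict[str, int], key: bytes, gamma: int = 2) -> Tuple[int, int]:
    """Return (matching tuples, selected tuples) for the given key."""
    selected = matches = 0
    for primary_key, value in rows.items():
        h = _keyed_hash(key, primary_key)
        if h % gamma == 0:
            selected += 1
            matches += int((value & 1) == ((h // gamma) & 1))
    return matches, selected


if __name__ == "__main__":
    table = {f"id{i}": 1000 + 7 * i for i in range(200)}
    marked = embed(table, b"owner-secret")
    print(detect(marked, b"owner-secret"))   # every selected tuple matches
    print(detect(marked, b"someone-else"))   # a wrong key matches only by chance
```

With the correct key every selected tuple matches, while a wrong key is expected to match only about half of the tuples it selects, so a high match ratio serves as statistical evidence of ownership. A robust scheme would also have to tolerate tuple deletion, value rounding, and similar attacks, which this sketch does not address.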
In the past several years there has been explosive growth in digital imaging technology and applications. Digital images and video are now widely distributed on the Internet and via CD-ROM. One problem with a digital image is that an unlimited number of copies of an "original" can be easily distributed and/or forged. This presents problems if the image is copyrighted. The protection and enforcement of intellectual property rights have become an important issue in the "digital world." Many approaches are available for protecting digital images and video; traditional methods include encryption, authentication and time stamping. In this talk we describe algorithms for image authentication and forgery prevention known as digital watermarking. A digital watermark is a signal that is embedded in a digital image or video sequence that allows one to establish ownership, identify a buyer or provide some additional information about the digital content. In this talk we will review the current state of watermarking and describe some of the open research problems. About the speaker: Edward J. Delp was born in Cincinnati, Ohio. He received the B.S.E.E. (cum laude) and M.S. degrees from the University of Cincinnati, and the Ph.D. degree from Purdue University. From 1980 to 1984, Dr. Delp was with the Department of Electrical and Computer Engineering at The University of Michigan, Ann Arbor, Michigan. Since August 1984, he has been with the School of Electrical and Computer Engineering at Purdue University, where he is a Professor of Electrical and Computer Engineering. He is a Fellow of the IEEE, a Fellow of the SPIE, and a Fellow of the Society for Imaging Science and Technology (IS&T). His research interests include image and video compression, multimedia security, medical imaging, multimedia systems, communication and information theory. Dr. Delp has also consulted for various companies and government agencies in the areas of signal and image processing, robot vision, pattern recognition, and secure communications. More information about Professor Delp may be found in his online bio.
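The idea of a watermark as "a signal that is embedded in a digital image" can be illustrated with the simplest possible scheme: writing the bits of a short ownership string into the least significant bits of pixel values. The Python sketch below is a teaching toy under that assumption, not one of the algorithms described in the talk; the function names and the flat list-of-integers image representation are hypothetical. Least-significant-bit marks are imperceptible but fragile, since recompression or mild filtering destroys them.

```python
from typing import List


def text_to_bits(text: str) -> List[int]:
    """UTF-8 encode `text` and return its bits, most significant bit first."""
    return [(byte >> shift) & 1
            for byte in text.encode("utf-8")
            for shift in range(7, -1, -1)]


def bits_to_text(bits: List[int]) -> str:
    """Inverse of text_to_bits for a bit list whose length is a multiple of 8."""
    data = bytearray()
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit
        data.append(byte)
    return data.decode("utf-8")


def embed_bits(pixels: List[int], bits: List[int]) -> List[int]:
    """Return a copy of `pixels` with `bits` stored in the lowest bit of each pixel."""
    if len(bits) > len(pixels):
        raise ValueError("watermark is longer than the image")
    marked = list(pixels)
    for index, bit in enumerate(bits):
        marked[index] = (marked[index] & ~1) | (bit & 1)
    return marked


def extract_bits(pixels: List[int], count: int) -> List[int]:
    """Read `count` watermark bits back from the lowest bit of each pixel."""
    return [pixel & 1 for pixel in pixels[:count]]


if __name__ == "__main__":
    image = list(range(256))             # stand-in for 8-bit grayscale pixel values
    mark = text_to_bits("OWNER")
    marked = embed_bits(image, mark)
    print(bits_to_text(extract_bits(marked, len(mark))))
```

Each pixel changes by at most one gray level, which is why the mark is invisible; establishing ownership in practice calls for schemes that survive compression, cropping, and deliberate removal attempts.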