Podcasts about UID

  • 65 PODCASTS
  • 168 EPISODES
  • 55m AVG DURATION
  • 1 MONTHLY NEW EPISODE
  • LATEST: Apr 14, 2025

POPULARITY (chart, 2017-2024)


Best podcasts about UID

Latest podcast episodes about UID

Next in Marketing
Tariff Brand Paralysis, Retail Media Uncertainty, and Trade Desk Legal Troubles

Apr 14, 2025 · 22:46


Mike and ad consultant Emily Riley are back talking about the big headlines in media and advertising, including the plunge in consumer confidence, how brands are viewing retail media right now, the fate of Yahoo's DSP, and lawsuits against the Trade Desk.

Crypto Banter
The ONLY Way To Make Money In Crypto Right Now!

Feb 12, 2025 · 35:47


CPI results drop today, and the crypto market is stagnant leading up to them! Bitcoin and altcoins are all at major decision points, and this CPI print will likely push the market in a definite direction...

Crypto Banter
Finally, A REAL Reason For The Crypto Bull Run To Continue!

Feb 10, 2025 · 35:47


Today, Ran will show you why there is now a REAL reason for altcoins to go parabolic! It doesn't mean ALL altcoins in the crypto space will benefit, but you can expect massive inflows into certain altcoins that are currently trading at very low prices...

Fortinet Cybersecurity Podcast
Brass Tacks #9: The Anatomy of Effective Cybersecurity Posture: AI, LLMs, and Beyond

Nov 10, 2024 · 21:06


In the latest episode of Brass Tacks - Talking Cybersecurity, #Fortinet's Filippo Cassini delves into the anatomy of an effective cybersecurity posture and discusses how increasing network and security complexity is driving a shift from selecting security components on a best-of-breed basis to an integrated platform approach. Today's SecOps teams not only have to make sense of ever-increasing volumes of IoCs (indicators of compromise) and other event data, but they also need to respond faster and more efficiently to escape the heavy non-compliance penalties of new regulations. These challenges are explored with fascinating insights into the inner workings of modern cybercrime, the potential role of new technologies such as AI and LLMs, and the opportunities for automation they enable. Learn more: https://www.fortinet.com/blog/ciso-collective/talking-to-the-c-suite-about-cybersecurity?utm_source=Social&utm_medium=Other&utm_campaign=BrassTacks-GLOBAL-Global&utm_content=BG-AmplifyGlobal-U&utm_term=Org-Social&lsci=7012H0000021nOIQAY&UID=ftnt-9768-338496 More about Fortinet: https://ftnt.net/60595CcyH Read our blog: https://ftnt.net/60505Ccyj Follow us on LinkedIn: https://ftnt.net/60515Ccyd

Fortinet Cybersecurity Podcast
Brass Tacks #7 - The 'Human Firewall': Building Cybersecurity Into Organizational Culture

Oct 15, 2024 · 17:37


How do you securely network a "Smart City" that has to be rebuilt over a hundred times each year in different locations around the world, and with no more than a few days of annual downtime? In this episode of Brass Tacks - Talking Cybersecurity, host Joe Robertson meets with Michael Cole, Chief Technology Officer for the European Tour Group, to discuss the unique challenges of running over a hundred high-tech, international golf tournaments each year. While not disputing the importance of board-level buy-in and top-down engagement for cybersecurity, Michael also stresses the importance of bottom-up awareness campaigns in which every staff member is valued as a first line of defense - a "human firewall." In this way, he argues, cybersecurity - far from being driven by the relentless march of digital innovation - becomes an enabler of sustainable business innovation and growth. Learn more: https://www.fortinet.com/blog/ciso-collective/employees-are-not-the-weakest-links?utm_source=Social&utm_medium=YouTube&utm_campaign=BrassTacks-GLOBAL-Global&utm_content=BG-YouTubeGlobal-U&utm_term=Org-social&lsci=7012H0000021nOIQAY&UID=ftnt-6452-436895

Fortinet Cybersecurity Podcast
Brass Tacks #6 - Building Cyber Resilience: Aligning People, Processes, Tech, & Compliance

Sep 23, 2024 · 17:35


In this episode of Brass Tacks - Talking Cybersecurity, Daniele Mancini, Field CISO at #Fortinet, explains the three main drivers of change for #cybersecurity and the challenges they present for the #CISO: ➡️ The explosion in data volumes ➡️ The increasing speed of innovation ➡️ The growing interconnection of the digital ecosystem. Tune in for a discussion on the importance of balancing technological innovation and business strategy, and of creating resilience to a broad range of cybersecurity incidents through a joined-up strategy supported by the three essential pillars of people, processes, and technology. Learn more: https://www.fortinet.com/blog/ciso-collective/building-cyber-resilience?utm_source=Social&utm_medium=YouTube&utm_campaign=BrassTacks-GLOBAL-Global&utm_content=BG-YouTubeGlobal-U&utm_term=Org-Social&lsci=7012H0000021nOIQAY&UID=ftnt-5649-736091 More about Fortinet: https://ftnt.net/6056oiHQE Read our blog: https://ftnt.net/60529liW2 Follow us on LinkedIn: https://ftnt.net/60549liW4

Calvary Hanford Audio Podcast
Prophecy Update #796 – Face Finger

Aug 18, 2024 · 5:29


Just when I thought we had covered everything biometric, I discovered something called UID. It stands for Unique Identity Number and is utilized in India. Pastor Gene Pensiero Find audio, video, and text of hundreds of other prophecy updates at: https://calvaryhanford.com/prophecy Follow us on YouTube at https://youtube.com/calvaryhanford

Prophecy Updates // Pastor Gene Pensiero
Prophecy Update #796 – Face Finger

Aug 18, 2024 · 5:29


Just when I thought we had covered everything biometric, I discovered something called UID. It stands for Unique Identity Number and is utilized in India. Pastor Gene Pensiero Find audio, video, and text of hundreds of other prophecy updates at: https://calvaryhanford.com/prophecy Follow us on YouTube at https://youtube.com/calvaryhanford

Calvary Hanford Video Podcast
Prophecy Update #796 – Face Finger

Aug 18, 2024 · 5:40


Just when I thought we had covered everything biometric, I discovered something called UID. It stands for Unique Identity Number and is utilized in India. Pastor Gene Pensiero Find audio, video, and text of hundreds of other prophecy updates at: https://calvaryhanford.com/prophecy Follow us on YouTube at https://youtube.com/calvaryhanford

Permission to Stan Podcast: KPOP Multistans
AESPA Gundam Girls|STRAY KIDS x Deadpool chances rising!|NEWJEANS & LE SSERAFIM fan shows in Japan (The FOMO is real)|LISA Rockstar MV|New JIKOOK (BTS JIMIN x JUNGKOOK ship) show incoming

Jul 4, 2024 · 72:22


@PermissionToStanPodcast on Instagram (DM us here) & TikTok! NEW podcast episodes every THURSDAY! Please support us by 'Following' & 'Subscribing' for more K-POP talk!
Starting ZZZ Zenless Zone Zero by Hoyoverse (Genshin Impact & Honkai Star Rail)
Beginning of July comebacks
Music video recaps: BLACKPINK's LISA, STAYC, BABYMONSTER, KISS OF LIFE, AESPA
ILLIT's WONHEE seen with crutches & IROHA goes blonde
LE SSERAFIM Fearnada fan showcase 2024 Japan
NEWJEANS Bunnies Camp fan showcase and debut in Japan
YOASOBI makes a guest appearance at Bunnies Camp
Rhythm Hive: HYBE's rhythm video game app is a must-download for HYBE stans (add JOCO!: UID# 4984206279545700)
BTS JIMIN solo single "Smeraldo Garden Marching Band" MV
BTS JIN has been keeping busy since returning
JIN will be one of the Olympic torch runners
STRAY KIDS unveil track for "Mountains"
SKZ CODE pool chaos & fun
BANGCHAN & FELIX to host new show on Apple Music
BANGCHAN goes Live after over a 1-year hiatus
RYAN REYNOLDS & HUGH JACKMAN in Korea for Deadpool 3: RYAN says he loves STRAY KIDS!
Advertising inquiries: https://redcircle.com/brands
Privacy & opt-out: https://redcircle.com/privacy

The Political Party
Election 24 Special, Ep 14

Jun 16, 2024 · 69:55


We're going back to the seaside. Today's candidates are...
Anne-Marie Trevelyan, Conservatives, North Northumberland. X: @annietrev W: https://www.teamtrevelyan.co.uk/
Frederick van Mierlo, Lib Dems, Henley and Thame. X: @fivanmierlo
Georgie Callé, Conservatives, Ealing Southall. X: @GeorgieCalle
Ben Bradley, Conservatives, Mansfield. X: @BBradley_Mans W: https://www.benbradley.uk/
Lucy Stephenson, Conservatives, Sheffield Central. W: https://rutlandcounty.moderngov.co.uk/mgUserInfo.aspx?UID=138
Just 594 to go... If you are a candidate or know one who'd like to come on the show, email politicalpartypodcast@gmail.com
SEE Matt at the Soho Theatre in June: Soho Theatre
Or at the Edinburgh Festival in August: Matt Forde The End of an Era Tour
Hosted on Acast. See acast.com/privacy for more information.

COSMO Köln Radyosu
DAVA - the political movement stirring debate in Germany

Apr 8, 2024 · 23:53


One of the parties contesting the June 9 European Parliament (EP) elections from Germany is the DAVA movement. Founded by figures close to UID, Turkey's Directorate of Religious Affairs (Diyanet), and Milli Görüş, the party has been criticized in the German public sphere as an "extension of the AKP." Fatih Zingal, a lawyer and the party's lead candidate in the EP elections, joined the WDR Cosmo Türkçe podcast to explain the program and goals of this new political movement, and to respond to the growing criticism in public opinion and the media. At the microphone: Çelik Akpınar and Elmas Topcu. By Celik Akpinar.

Acxiom Podcast
Will Gardens Grow Their Walls Even Higher?

Mar 12, 2024 · 34:04


Gabe Richman of The Trade Desk joins the podcast to discuss all aspects of cookie deprecation and its impact on the industry – from Google's Privacy Sandbox and at-risk signals to the changing approaches to authentication and measurement. The team jumps into the details of UID 2.0, AI as a tool, and how a CMO should best spend their next marketing dollar.
LinkedIn profile: linkedin.com/in/gaberichman/
Twitter: twitter.com/TheTradeDesk
Company website: thetradedesk.com
Thanks for listening! Follow us on Twitter and Instagram or find us on Facebook.

The Bike Shed
417: Module Docs

Mar 5, 2024 · 39:32


Stephanie shares about her vacation at Disney World, particularly emphasizing the technological advancements in the park's mobile app that made her visit remarkably frictionless. Joël had a conversation about a topic he loves: units of measure, and he got to go deep into the idea of dimensional analysis with someone this week. Together, Joël and Stephanie talk about module documentation within software development. Joël shares his recent experience writing module docs for a Ruby project using the YARD documentation system. He highlights the time-consuming nature of crafting good documentation for each public method in a class, emphasizing that while it's a demanding task, it significantly benefits those who will use the code in the future. They explore the attributes of good documentation, including providing code examples, explaining expected usage, suggesting alternatives, discussing edge cases, linking to external resources, and detailing inputs, outputs, and potential side effects. Multidimensional numbers episode (https://bikeshed.thoughtbot.com/416) YARD docs (https://yardoc.org/) New factory_bot documentation (https://thoughtbot.com/blog/new-docs-for-factory_bot) Dash (https://kapeli.com/dash) Solargraph (https://solargraph.org/) Transcript:  JOËL: Hello and welcome to another episode of The Bike Shed, a weekly podcast from your friends at thoughtbot about developing great software. I'm Joël Quenneville. STEPHANIE: And I'm Stephanie Minn, and together, we're here to share a bit of what we've learned along the way. JOËL: So, Stephanie, what's new in your world? STEPHANIE: So, I recently was on vacation, and I'm excited [chuckles] to tell our listeners all about it. I went to Disney World [laughs]. And honestly, I was especially struck by the tech that they used there. As a person who works in tech, I always kind of have a little bit of a different experience knowing a bit more about software, I suppose, than just your regular person [laughs], citizen. And so, at Disney World, I was really impressed by how seamlessly the like, quote, unquote, "real life experience" integrated with their use of their branded app to pair with, like, your time at the theme park. JOËL: This is, like, an app that runs on your mobile device? STEPHANIE: Yeah, it's a mobile app. I haven't been to Disney in a really long time. I think the last time I went was just as a kid, like, this was, you know, pre-mobile phones. So, I recall when you get into the line at a ride, you can skip the line by getting what's called a fast pass. And so, you kind of take a ticket, and it tells you a designated time to come back so that you could get into the fast line, and you don't have to wait as long. And now all this stuff is on your mobile app, and I basically did not wait in [laughs] a single line for more than, like, five minutes to go on any of the rides I wanted. It just made a lot of sense that all these things that previously had more, like, physical touchstones, were made a bit more convenient. And I hesitate to use the word frictionless, but I would say that accurately describes the experience. JOËL: That's kind of amazing; the idea that you can use tech to make a place that's incredibly busy also feel seamless and where you don't have to wait in line. STEPHANIE: Yeah and, actually, I think the coolest part was it blended both your, like, physical experience really well with your digital one. 
I think that's kind of a gripe I have as a technologist [laughs] when I'm just kind of too immersed in my screen as opposed to the world around me. But I was really impressed by the way that they managed to make it, like, a really good supplement to your experience being there. JOËL: So, you're not hyped for a future world where you can visit Disney in VR? STEPHANIE: I mean, I just don't think it's the same. I rode a ride [laughs] where it was kind of like a mini roller coaster. It was called Expedition Everest. And there's a moment, this is, like, mostly indoors, but there's a moment where the roller coaster is going down outside, and you're getting that freefall, like, drop feeling in your stomach. And it also happened to be, like, drizzling that day that we were out there, and I could feel it, you know, like, pelting my head [laughs]. And until VR can replicate that experience [chuckles], I still think that going to Disney is pretty fun. JOËL: Amazing. STEPHANIE: So, Joël, what's new in your world? JOËL: I'm really excited because I had a conversation about a topic that I like to talk about: units of measure. And I got to go deep into the idea of dimensional analysis with someone this week. This is a technique where you can look at a calculation or a function and sort of spot-check whether it's correct by looking at whether the unit for the measure that would come out match what you would expect. So, you do math on the units and ignore the numbers coming into your formula. And, you know, let's say you're calculating the speed of something, and you get a distance and the amount of time it took you to take to go that distance. And let's say your method implements this as distance times time. Forget about doing the actual math with the numbers here; just look at the units and say, okay, we've got our meters, and we've got our seconds, and we're multiplying them together. The unit that comes out of this method is meters times seconds. You happen to know that speeds are not measured in meters times seconds. They're measured in meters divided by seconds or meters per second. So, immediately, you get a sense of, like, wait a minute, something's wrong here. I must have a bug in my function. STEPHANIE: Interesting. I'm curious how you're representing that data to, like, know if there's a bug or not. In my head, when you were talking about that, I'm like, oh yeah, I definitely recall doing, like, math problems for homework [laughs] where I had, you know, my meters per second. You have your little fractions written out, and then when you multiply or divide, you know how to, like, deal with the units on your piece of paper where you're showing your work. But I'm having a hard time imagining what that looks like as a programmer dealing with that problem. JOËL: You could do it just all in your head based off of maybe some comments that you might have or the name of the variable or something. So, you're like, okay, well, I have a distance in meters and a time in seconds, and I'm multiplying the two. Therefore, what should be coming out is a value that is in meters times seconds. If you want to get fancier, you can do things with value objects of different types. So, you say, okay, I have a distance, and I have a time. And so, now I have sort of a multiplication of a distance and a time, and sort of what is that coming out as? 
That can sometimes help you prevent from having some of these mistakes because you might have some kind of error that gets raised at runtime where it's like, hey, you're trying to multiply two units that shouldn't be multiplied, or whatever it is. You can also, in some languages, do this sort of thing automatically at the type level. So, instead of looking at it yourself and sort of inferring it all on your own based off of the written code, languages like F# have built-in unit-of-measure systems where once you sort of tag numbers as just being of a particular unit of measure, any time you do math with those numbers, it will then tag the result with whatever compound unit comes from that operation. So, you have meters, and you have seconds. You divide one by the other, and now the result gets tagged as meters per second. And then, if you have another calculation that takes the output of the first one and it comes in, you can tell the compiler via type signature, hey, the input for this method needs to be in meters per second. And if the other calculation sort of automatically builds something that's of a different unit, you'll get a compilation error. So, it's really cool what it can do. STEPHANIE: Yeah, that is really neat. I like all of those built-in guardrails, I suppose, to help you, you know, make sure that your answer is correct. Definitely could have used that [chuckles]. Turns out I just needed a calculator to take my math test with [laughs]. JOËL: I think what I find valuable more than sort of the very rigorous approach is the mindset. So, anytime you're dealing with numbers, thinking in your mind, what is the unit of this number? When I do math with it with a different number, is it the same unit? Is it a different unit? What is the unit of the thing that's coming out? Does this operation make sense in the domain of my application? Because it's easy to sometimes think you're doing a math operation that makes sense, and then when you look at the unit, you're like, wait a minute, this does not make sense. And I would go so far as to say that, you know, you might think, oh, I'm not doing a physics app. I don't care about units of measure. Most numbers in your app that are actually numbers are going to have some kind of unit of measure associated to them. Occasionally, you might have something where it's just, like, a straight-up, like, quantity or something like that. It's a dimensionless number. But most things will have some sort of unit. Maybe it's a number of dollars. Maybe it is an amount of time, a duration. It could be a distance. It could be all sorts of things. Typically, there is some sort of unit that should attach to it. STEPHANIE: Yeah. That makes sense that you would want to be careful about making sure that your mathematical operations that you're doing when you're doing objects make sense. And we did talk about this in the last episode about multidimensional numbers a little bit. And I suppose I appreciate you saying that because I think I have mostly benefited from other people having thought in that mindset before and encoding, like I mentioned, those guardrails. So, I can recall an app where I was working with, you know, some kind of currency or money object, and that error was raised when I would try to divide by zero because rather than kind of having to find out later with some, not a number or infinite [laughs] amount of money bug, it just didn't let me do that. 
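To make the guardrail idea concrete, here is a minimal Ruby sketch of the unit-tagged value object Joël and Stephanie describe; the Quantity class and its API are invented for illustration and are not from the episode or any particular library:

    # Quantity tags a number with a hash of unit exponents,
    # e.g. { m: 1, s: -1 } represents meters per second.
    class Quantity
      attr_reader :value, :units

      def initialize(value, units)
        @value = value
        @units = units.reject { |_, exp| exp.zero? } # drop cancelled units
      end

      # Multiplying quantities adds unit exponents: m * s => { m: 1, s: 1 }.
      def *(other)
        Quantity.new(value * other.value,
                     units.merge(other.units) { |_, a, b| a + b })
      end

      # Dividing subtracts exponents: m / s => { m: 1, s: -1 }.
      def /(other)
        self * Quantity.new(1.0 / other.value,
                            other.units.transform_values { |exp| -exp })
      end

      # Adding only makes sense for identical units; fail loudly otherwise.
      def +(other)
        unless units == other.units
          raise ArgumentError, "unit mismatch: #{units} vs #{other.units}"
        end
        Quantity.new(value + other.value, units)
      end
    end

    distance = Quantity.new(100.0, m: 1)
    time     = Quantity.new(9.58,  s: 1)
    distance / time  # => units { m: 1, s: -1 }: a speed, as expected
    distance * time  # => units { m: 1, s: 1 }: a clue the formula is wrong
    distance + time  # raises ArgumentError, like the money object above

Languages like F# enforce this at compile time, as Joël notes; in Ruby the mistake surfaces as a runtime error instead.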
And that wasn't something that I had really thought about, you know, I just hadn't considered that zero value edge case when I was working on whatever feature I was building. JOËL: Yeah, or even just generally the idea of dividing money. What does that even mean? Are you taking an amount of money and splitting it into two equivalent piles to split among multiple people? That kind of makes sense. Are you dividing money by another money value? That's now asking a very different kind of question. You're asking, like, what is the ratio between these two, I guess, piles of money if we want to make it, you know, in the physical world? Is that a thing that makes sense in your application? But also, realize that that ratio that you get back is not itself an amount of money. And so, there are some subtle bugs that can happen around that when you don't keep track of what your quantities are. So, this past week, I've been working on a project where I ended up having to write module docs for the code in question. This is a Ruby project, so I'm writing docs using the YARD documentation system, where you effectively just write code comments at the sort of high level covering the entire class and then, also, individual documentation comments on each of the methods. And that's been really interesting because I have done this in other languages, but I'd never done it in Ruby before. And this is a piece of code that was kind of gnarly and had been tricky for me to figure out. And I figured that a couple of these classes could really benefit from some more in-depth documentation. And I'm curious, in your experience, Stephanie, as someone who's writing code, using code from other people, and who I assume occasionally reads documentation, what are the things that you like to see in good sort of method-level docs? STEPHANIE: Personally, I'm really only reading method-level docs when, you know, at this point, I'm, like, reaching for a method. I want to figure out how to use it in my use case right now [laughs]. So, I'm going to search API documentation for it. And I really am just scanning for inputs, especially, I think, and maybe looking at, you know, some potential various, like, options or, like, variations of how to use the method. But I'm kind of just searching for that at a glance and then moving on [laughs] with my day. That is kind of my main interaction with module docs like that, and especially ones for Ruby and Rails methods. JOËL: And for clarity's sake, I think when we're talking about module docs here, I'm generally thinking of, like, any sort of documentation that sort of comments in code meant to document. It could be the whole modular class. It could be on a per-method level, things like RDoc or YARD docs on Ruby classes. You used the word API docs here. I think that's a pretty similar idea. STEPHANIE: I really haven't given the idea of writing this kind of documentation a lot of thought because I've never had to do too much of it before, but I know, recently, you have been diving deep into it because, you know, like you said, you found these classes that you were working with a bit ambiguous, I suppose, or just confusing. And I'm wondering what kind of came out of that journey. What are some of the most interesting aspects of doing this exercise? JOËL: And one of the big ones, and it's not a fun one, but it is time-consuming. Writing good docs per method for a couple of classes takes a lot of time, and I understand why people don't do it all the time. 
STEPHANIE: What kinds of things were you finding warranted that time? Like, you know, you had to, at some point, decide, like, whether or not you're going to document any particular method. And what were some of the things you were looking out for as good reasons to do it? JOËL: I was making the decisions to document or not document on a class level, and then every public method gets documentation. If there's a big public API, that means every single one of those methods is getting some documentation comments, explaining what they do, how they're meant to be used, things like that. I think my kind of conclusion, having worked with this, is that the sort of sweet spot for this sort of documentation is for anything that is library-like, so a lot of things that maybe would go into a Rails lib directory might make sense. Anything you're turning into a gem that probably makes sense. And sometimes you have things in your Rails codebase that are effectively kind of library-like, and that was the case for the code that I was dealing with. It was almost like a mini ORM style kind of ActiveRecord-inspired series of base classes that had a bunch of metaprogramming to allow you to write models that were backed by not a database but a headless CMS, a content management system. And so, these classes are not extracted to the lib directory or, like, made into a gem, but they feel very library-esque in that way. STEPHANIE: Library-like; I like that descriptor a lot because it immediately made me think of another example of a time when I've used or at least, like, consumed this type of documentation in a, like, SaaS repo. Rather, you know, I'm not really seeing that level of documentation around domain objects, but I noticed that they really did a lot of extending of the application record class because they just had some performance needs that they needed to write some, like, custom code to handle. And so, they ended up kind of writing a lot of their own ORM-like methods for just some, like, custom callbacks on persisting and some just, like, bulk insertion functionality. And those came with a lot of different ways to use them. And I really appreciated that they were heavily documented, kind of like you would expect those ActiveRecord methods to be as well. JOËL: So, I've been having some conversations with other members at thoughtbot about when they like to use the style of module doc. What are some of the alternatives? And one that kept coming up for different people that they would contrast with this is what they would call the big README approach, and this could be for a whole gem, or it could be maybe some directory with a few classes in your application that's got a README in the root of the directory. And instead of documenting each method, you just write a giant README trying to answer sort of all of the questions that you anticipate people will ask. Is that something that you've seen, and how do you feel about that as a tool when you're looking for help? STEPHANIE: Yes. I actually really like that style of documentation. I find that I just want examples to get me started, especially; I guess this is especially true for libraries that I'm not super familiar with but need to just get a working knowledge about kind of immediately. So, I like to see examples, the getting started, the just, like, here's what you need to know. And as I start to use them, that will get me rolling. 
But then, if I find I need more details, then I will try to seek out more specific information that might come in the form of class method documentation. But I'm actually thinking about how FactoryBot has one of the best big README-esque [laughs] style of documentation, and I think they did a really big refresh of the docs not too long ago. It has all that high-level stuff, and then it has more specific information on how to use, you know, the most common methods to construct your factories. But those are very detailed, and yet they do sit, like, separately from inline, like, code documentation in the style of module docs that we're talking about. So, it is kind of an interesting mix of both that I think is helpful for me personally when I want both the “what do I need to know now?” And the, “like, okay, I know where to look for if I need something a little more detailed.” JOËL: Yeah. The two don't need to be mutually exclusive. I thought it was interesting that you mentioned how much examples are valuable to you because...I don't know if this is controversial, but an opinion that I have about sort of per-method documentation is that you should always default to having a code example for every method. I don't care how simple it is or how obvious it is what it does. Show me a code example because, as a developer, examples are really, really helpful. And so, seeing that makes documentation a lot more valuable than just a couple of lines that explain something that was maybe already obvious from the title of the method. I want to see it in action. STEPHANIE: Interesting. Do you want to see it where the method definition is? JOËL: Yes. Because sometimes the method definition, like, the implementation, might be sort of complex. And so, just seeing a couple of examples, like, oh, you call with this input, you get that. Call with this other input; you get this other thing. And we see this in, you know, some of the core docs for things like the enumerable methods where having an example there to be like, oh, so that's how map works. It returns this thing under these circumstances. That sort of thing is really helpful. And then, I'll try to do it at a sort of a bigger level for that class itself. You have a whole paragraph about here's the purpose of the class. Here's how you should use it. And then, here's an example of how you might use it. Particularly, if this is some sort of, like, base class you're meant to inherit from, here's the circumstances you would want to subclass this, and then here's the methods you would likely want to override. And maybe here are the DSLs you might want to have and to kind of package that in, like, a little example of, in this case, if you wanted a model that read from the headless CMS, here's what an example of such a little model might look like. So, it's kind of that putting it all together, which I think is nice in the module docs. It could probably also live in the big README at some level. STEPHANIE: Yeah. As you are saying that, I also thought about how I usually go search for tests to find examples of usage, but I tend to get really overwhelmed when I see inline, like, that much inline documentation. I have to, like, either actively ignore it, choose to ignore it, or be like, okay, I'm reading this now [laughs]. Because it just takes up so much visual space, honestly. And I know you put a lot of work into it, a lot of time, but maybe it's because of the color of my editor theme where comments are just that, like, light gray [laughs]. 
I find them quite easy to just ignore. But I'm sure there will be some time where I'm like, okay, like, if I need them, I know they're there. JOËL: Yeah, that is, I think, a downside, right? It makes it harder to browse the code sometimes because maybe your entire screen is almost taken up by documentation, and then, you know, you have one method up, and you've got to, like, scroll through another page of documentation before you hit the next method, and that makes it harder to browse. And maybe that's something that plays into the idea of that separation between library-esque code versus application code. When you browse library-esque code, when you're actually browsing the source, you're probably doing it for different reasons than you would for code in your application because, at that point, you're effectively source diving, sometimes being like, oh, I know this class probably has a method that will do the thing I want. Where is it? Or you're like, there's an edge case I don't understand on this method. I wonder what it does. Let me look at the implementation. Or even some existing code in the app is using this library method. I don't know what it does, but they call this method, and I can't figure out why they're using it. Let me look at the source of the library and see what it does under the hood. STEPHANIE: Yeah. I like the distinction of it is kind of a different mindset that you're reading the code at, where, like, sometimes my brain is already ready to just read code and try to figure out inputs and outputs that way. And other times, I'm like, oh, like, I actually can't parse this right now [chuckles]. Like, I want to read just English, like, telling me what to expect or, like, what to look out for, especially when, like you said, I'm not really, like, trying to figure out some strange bug that would lead me to diving deep in the source code. It's I'm at the level where I'm just reaching for a method and wanting to use it. We're writing these YARD docs. I think I also heard you mention that you gave some, like, tips or maybe some gotchas about how to use certain methods. I'm curious why that couldn't have been captured in a more, like, self-documenting way. Or was there a way that you could have written the code for that not to have been needed as a comment or documented as that? And was there a way that method names could have been clear to signal, like, the intention that you were trying to convey through your documentation? JOËL: I'm a big fan of using method names as a form of documentation, but they're frequently not good enough. And I think comments, whether they're just regular inline comments or more official documentation, can be really good to help avoid sort of common pitfalls. And one that I was working with was, there were two methods, and one would find by a UID, so it would search up a document by UID. And another one would search by ID. And when I was attempting to use these before I even started documenting, I used the wrong one, and it took me a while to realize, oh wait, these things have both UIDs and IDs, and they're slightly different, and sometimes you want to use one or the other. The method names, you know, said like, "Find by ID" or "Find by UID." I didn't realize there were both at the time because I wasn't browsing the source. I was just seeing a place where someone had used it. And then, when I did find it in the source, I'm like, well, what is the difference? 
And so, something that I did when I wrote the docs was sort of call out on both of those methods; by the way, there is also find by UID. If you're searching by UID, consider using the other one. If you don't know what the difference is, here's a sentence summarizing the difference. And then, here's a link to external documentation if you want to dive into the nitty gritty of why there are two and what the differences are. And I think that's something you can't capture in just a method name. STEPHANIE: Yeah, that's true. I like that a lot. Another use case you can think of is when method names are aliased, and it's like, I don't know how I would have possibly known that until I, you know, go through the journey of realizing [laughs] that these two methods do the same thing or, like, stumbling upon where the aliasing happens. But if that were captured in, like, a little note when I'm in, like, a documentation viewer or something, it's just kind of, like, a little tidbit of knowledge [laughs] that I get to gain along the way that ends up, you know, being useful later because I will have just kind of...I will likely remember having seen something like that. And I can at least start my search with a little bit more context than when you don't know what you don't know. JOËL: I put a lot of those sorts of notes on different methods. A lot of them are probably based on a personal story where I made a mistaken assumption about this method, and then it burned me. But I'm like, okay, nobody else is going to make that mistake. By the way, if you think this is what the method does, it does something slightly different and, you know, here's why you need to know that. STEPHANIE: Yeah, you're just looking out for other devs. JOËL: And, you know, trying to, like, take my maybe negative experience and saying like, "How can I get value out of that?" Maybe it doesn't feel great that I lost an hour to something weird about a method. But now that I have spent that hour, can I get value out of it? Is the sort of perspective I try to have on that. So, you mentioned kind of offhand earlier the idea of a documentation viewer, which would be separate than just reading these, I guess, code comments directly in your code editor. What sort of documentation viewers do you like to use? STEPHANIE: I mostly search in my browser, you know, just the official documentation websites for Rails, at least. And then I know that there are also various options for Ruby as well. And I think I had mentioned it before but using DuckDuckGo as my search engine. I have nice bang commands that will just take me straight to the search for those websites, which is really nice. Though, I have paired with people before who used various, like, macOS applications to do something similar. I think Alfred might have some built-in workflows for that. And then, a former co-worker used to use one called Dash, that I have seen before, too. So, it's another one of those just handy just, like, search productivity extensions. JOËL: You mentioned the Rails documentation, and this is separate from the guides. But the actual Rails docs are generated from comments like this inline in code. So, all the different ActiveRecord methods, when you search on the Rails documentation you're like, oh yeah, how does find_by work? And they've got a whole, like, paragraph explaining how it works with a couple of examples. That's this kind of documentation. If you open up that particular file in the source code, you'll find the comments. 
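For readers who have not seen YARD syntax, here is a sketch of how such a cross-reference callout might look; the Document class, the URL, and the wording are invented for illustration, while the @example, @param, @return, and @see tags are standard YARD:

    class Document
      # Finds a document by its CMS-internal ID.
      #
      # Note that documents also carry a UID, which is a different field.
      # If the identifier you are holding came from the CMS itself, it is
      # probably a UID; consider {Document.find_by_uid} instead.
      #
      # @example Look up a document by ID
      #   Document.find_by_id("abc123") # => #<Document id="abc123">
      #
      # @param id [String] the CMS-internal identifier
      # @return [Document, nil] the matching document, or nil if none exists
      # @see Document.find_by_uid
      # @see https://cms.example.com/docs/ids notes on IDs vs UIDs
      def self.find_by_id(id)
        # ... lookup logic elided ...
      end
    end

Running the YARD gem's local server (yard server --reload) renders comments like these as a browsable documentation site, which is the preview workflow Joël goes on to mention.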
And it makes sense for Rails because Rails is more of, you know, library-esque code. And you and I search these docs pretty frequently, although we don't tend to do it, like, by opening the Rails gem and, like, grepping through the source to find the code comment. We do it through either a documentation site that's been compiled from that source or that documentation that's been extracted into an offline tool, like you'd mentioned, Dash. STEPHANIE: Yeah, I realized how conflicting, I suppose, it is for me to say that I find inline documentation really overwhelming or visually distracting, whereas I recognize that the only reason I can have that nice, you know, viewing experience is because documentation viewers use the code comments in that format to be generated. JOËL: I wonder if there's like a sort of...I don't know what this pattern is called, but a bit of a, like, middle-quality trap where if you're going to source dive, like, you'd rather just look at the code and not have too much clutter from sort of mediocre comments. But if the documentation is really good and you have the tooling to read it, then you don't even need to source dive at all. You can just read the documentation, and that's sufficient. So, both extremes are good, but that sort of middle kind of one foot in each camp is sort of the worst of both worlds experience. Because I assume when you look for Rails documentation, you never open up the actual codebase to search. The documentation is good enough that you don't even need to look at the files with the comments and the code. STEPHANIE: Yeah, and I'm just recalling now there's, like, a UI feature to view the source from the documentation viewer page. JOËL: Yes. STEPHANIE: I use that actually quite a bit if the comments are a little bit sparse and I need just the code to supplement my understanding, and that is really nice. But you're right, like, I very rarely would be source diving, unless it's a last resort [laughs], let's be honest. JOËL: So, we've talked about documentation viewers and how that can make things nice, and you're able to read documentation for things. But a lot of other tooling can benefit from this sort of model documentation as well, and I'm thinking, in particular, Solargraph, which is Ruby's language server protocol. And it has plugins for VS Code, for Vim, for a few different editors, takes advantage of that to provide all sorts of things. So, you can get smart expansion of code and good suggestions. You can get documentation for what's under your cursor. Maybe you're reading somebody else's code that they've written, and you're like, why are they calling this parameterized method here? What does that even do? Like, in VS Code, you could just hover over it, and it will pop up and show you documentation, including the, like, inputs and return types, and things like that. That's pretty nifty. STEPHANIE: Yeah, that is cool. I use VS Code, but I've not seen that too much yet because I don't think I've worked in enough codebases with really comprehensive [laughs] YARD docs. I'm actually wondering, tooling-wise, did you use any helpful tools when you were writing them or were you hand-documenting each? JOËL: I was hand-documenting everything. STEPHANIE: Class. Okay. JOËL: The thing that I did use is the YARD gem, which you don't need to have the gem to write YARD-style documentation. But if you have the gem, you can run a local server and then preview a documentation site that is generated from your comments that has everything in there. 
And that was incredibly helpful for me as I was trying to sort of see an overview of, okay, what would someone who's looking at the docs generated from this see when they're trying to look for what the documentation of a particular method does? STEPHANIE: Yeah, and that's really nice. JOËL: Something that I am curious about that I've not really had a lot of experience with is whether or not having extra documentation like that can help AI tools give us better suggestions. STEPHANIE: Yeah, I don't know the answer to that either, but I would be really curious to know if that is already something that happens with something like Copilot. JOËL: Do better docs help machines, or are they for humans only? STEPHANIE: Whoa, that's a very [laughs] philosophical question, I think. It would make sense, though, that if we already have ways to parse and compile this kind of documentation, then I can see that incorporating them into the types of, like, generative problems that AI quote, unquote "solves" [chuckles] would be really interesting to find out. But anyone listening who kind of knows the answer to that or has experience working with AI tools and various types of code comment documentation would be really curious to know what your experience is like and if it improves your development workflow. So, for people who might be interested in getting better at documenting their code in the style of module docs, what would you say are some really great attributes of good documentation in this form? JOËL: I think, first of all, you have to write from the motivation of, like, if you were confused and wanting to better understand what a method does, what would you like to see? And I think coming from that perspective, and that was, in my case, I had been that person, and then I was like, okay, now that I've figured it out, I'm going to write it so that the next person is not confused. I have five or six things that I think were really valuable to add to the docs, a few of which we've already mentioned. But rapid fire, first of all, code example. I love code examples. I want a code example on every method. An explanation of expected usage. Here's what the method does. Here's how we expect you to use this method in any extra context about sort of intended use. Callouts for suggested alternatives. If there are methods that are similar, or there's maybe a sort of common mistake that you would reach for this method, put some sort of call out to say, "Hey, you probably came here trying to do X. If that's what you were actually trying to do, you should use method Y." Beyond that, a discussion of edge cases, so any sort of weird ways the method behaves. You know, when you pass nil to it, does it behave differently? If you call it in a different context, does it behave differently? I want to know that so that I'm not totally surprised. Links to external resources–really great if I want to, like, dig deeper. Is this method built on some sort of, like, algorithm that's documented elsewhere? Please link to that algorithm. Is this method integrating with some, like, third-party API? You know, they have some documentation that we could link to to go deeper into, like, what these search options do. Link to that. External links are great. I could probably find it by Googling myself, but you are going to make me very happy as a developer if you already give me the link. You'd mentioned capturing inputs and outputs. That's a great thing to scan for. 
Inputs and outputs, though, are more sometimes than just the arguments and return values. Although if we're talking about arguments, any sort of options hash, please document the keys that go in that because that's often not obvious from the code. And I've spent a lot of time source diving and jumping between methods trying to figure out like, what are the options I can pass to this hash? Beyond the explicit inputs and outputs, though, anything that is global state that you rely on. So, do you need to read something from an environment variable or even a global variable or something like that that might make this method behave differently in different situations? Please document that. Any situations where you might raise an error that I might not expect or that I might want to rescue from, let me know what are the potential errors that might get raised. And then, finally, any sorts of side effects. Does this method make a network call? Are you writing to the file system? I'd like to know that, and I'd have to, like, figure it out by trial and error. And sometimes, it will be obvious in just the description of the method, right? Oh, this method pulls data from a third-party API. That's pretty clear. But maybe it does some sort of, like, caching in the background or something to a file that's not really important. But maybe I'm trying to do a unit test that involves this, and now, all of a sudden, I have to do some weird stubbing. I'd like to know that upfront. So, those are kind of all the things I would love to have in my sort of ideal documentation comment that would make my life easier as a developer when trying to use some code. STEPHANIE: Wow. What a passionate plea [laughs]. I was very into listening to you list all of that. You got very animated. And it makes a lot of sense because I feel like these are kind of just the day-to-day developer issues we run into in our work and would be so awesome if, especially as the, you know, author where you have figured all of this stuff out, the author of a, you know, a method or a class, to just kind of tell us these things so we don't have to figure it out ourselves. I guess I also have to respond to that by saying, on one hand, I totally get, like, you want to be saved [chuckles] from those common pitfalls. But I think that part of our work is just going through that and playing around and exploring with the code in front of us, and we learn all of that along the way. And, ultimately, even if that is all provided to you, there is something about, like, going through it yourself that gives you a different perspective on it. And, I don't know, maybe it's just my bias against [laughs] all the inline text, but I've also seen a lot of that type of information captured at different levels of documentation. So, maybe it is a Confluence doc or in a wiki talking about, you know, common gotchas for this particular problem that they were trying to solve. And I think what's really cool is that, you know, everyone can kind of be served and that people have different needs that different styles of documentation can meet. So, for anyone diving deep in the source code, they can see all of those examples inline. But, for me, as a big Googler [laughs], I want to see just a nice, little web app to get me the information that I need to find. I'm happy having that a little bit more, like, extracted from my source code. JOËL: Right. You don't want to have to read the source code with all the comments in it. 
I think that's a fair criticism and, yeah, probably a downside of this. And I'm wondering, there might be some editor tooling that allows you to just collapse all comments and hide them if you wanted to focus on just the code. STEPHANIE: Yeah, someone, please build that for me. That's my passionate plea [laughs]. And on that note, shall we wrap up? JOËL: Let's wrap up. STEPHANIE: Show notes for this episode can be found at bikeshed.fm. JOËL: This show has been produced and edited by Mandy Moore. STEPHANIE: If you enjoyed listening, one really easy way to support the show is to leave us a quick rating or even a review in iTunes. It really helps other folks find the show. JOËL: If you have any feedback for this or any of our other episodes, you can reach us @_bikeshed, or you can reach me @joelquen on Twitter. STEPHANIE: Or reach both of us at hosts@bikeshed.fm via email. JOËL: Thanks so much for listening to The Bike Shed, and we'll see you next week. ALL: Bye. AD: Did you know thoughtbot has a referral program? If you introduce us to someone looking for a design or development partner, we will compensate you if they decide to work with us. More info on our website at: tbot.io/referral. Or you can email us at referrals@thoughtbot.com with any questions.

Trace Evidence
232 - The Murder of Little Christmas Doe

Oct 21, 2023 · 51:59


On Wednesday, December 21st, 1988, a timber truck driver making his way through rural Georgia made a horrifying discovery. Just off US Route 82, down at the end of a long dirt road, in a dusty turnout which doubled as an illegal dumping ground, he found the remains of an unidentified toddler wedged inside of an old television console. Investigators would later reveal that the child, described as a young black girl, was estimated to be between the ages of 3 and 4 and had likely been dead for thirty to sixty days. Whoever had dumped her in that spot had first wrapped her in a blanket, placed her inside of a duffel bag, filled that bag with concrete, and then put it inside of a metal foot locker which was also filled with concrete. It was a scene so bizarre, so disturbing, it continues to haunt the investigators who worked it. Despite their sincerest efforts, the case grew cold and the child's true identity remained elusive. Twenty-one years later, in 2009, an anonymous tipster claimed that the child might have been named Bridget, could have been from the city of Albany, and might have family living in Tifton. All these years later, both little Christmas Doe's identity and that of her killer remain unknown.
Sponsored by: ZocDoc! Visit ZocDoc.com/TRACE to download the ZocDoc app FREE! | Babbel! Visit Babbel.com/TRACE for 55% off your subscription!
Social Media and Subscription Link Tree
Music courtesy of: "Wounded" Kevin MacLeod (incompetech.com) Licensed under Creative Commons: By Attribution 3.0 License http://creativecommons.org/licenses/by/3.0/
#truecrime #truecrimepodcast #realcrimes #disappearance #disappeared #missing #unsolved #unsolvedmysteries #evidence #investigation #missingperson #traceevidence #JaneDoe #ChristmasDoe #LittleChristmasDoe #unidentified #UID #WareCounty #GeorgiaTrueCrime #GBI #Millwood

Open4Business from NLive Radio
Matt Golby - Councillor

Oct 11, 2023 · 15:51


Born and bred in Northampton and from a well-known local business family, Councillor Matt Golby talks about his career and work as the Councillor with Cabinet responsibility for Adult Social Care and Public Health, some of the key initiatives in his area as well as his aspirations in office - and how he keeps his ear to the ground! See https://westnorthants.moderngov.co.uk/mgUserInfo.aspx?UID=164

Open4Business from NLive Radio
Dan Lister - Councillor - West Northamptonshire

Sep 27, 2023 · 47:35


In the first of a new series interviewing all the West Northamptonshire Councillors with Cabinet responsibility, Adrian talks with Cllr Dan Lister about his career, his Cabinet responsibilities for economic development, the controversial Market Square development and market displacement, and his hopes and aspirations for the future of Northampton town and county. See https://westnorthants.moderngov.co.uk/mgUserInfo.aspx?UID=365

Pour de vrai
[Déguiller le patriarcat BONUS] Interview with historian Pauline Milani

Aug 25, 2023 · 37:31


Pauline Milani is a historian, teacher, and researcher with a particular interest in topics relating to women, gender, and the writing of women's history. These themes perfectly complement the "Déguiller le patriarcat" series and the questions that making that series raised for me, which is why I wanted to speak with her. I thank her for this conversation.
References cited:
The journal L'Exploitée on the e-periodica platform: https://www.e-periodica.ch/digbib/volumes?UID=exp-001
"Dictionnaire sur l'histoire des femmes en Suisse", edited by Raphaëlle Ruppen Coutat and Pauline Milani: femmesensuisse.ch (the dictionary will be presented in November; this is a provisional online version)
Dupuis-Déri, Francis. La crise de la masculinité : autopsie d'un mythe tenace. Les Éditions du remue-ménage, 2018.
Pour de vrai episode on the Polytechnique massacre: https://spotifyanchor-web.app.link/e/RvT1ZjepqBb
Podcast La Méthode by Rebecca Amsellem, on Louie Media: https://louiemedia.com/la-methode
Article in Fémina, April 2023, "Les luttes féministes vont-elles trop loin", interview with Pauline Milani: https://www.femina.ch/societe/actu-societe/les-luttes-feministes-vont-elles-trop-loin-aujourdhui

仙境之桥|欢乐的动漫聊天室
Honkai: Star Rail | vol.1 A carnival of sci-fi references and ACG easter eggs - the galactic baseball-bat hero is calling you aboard!

Jul 15, 2023 · 85:55


If you play the game and have friend slots free, add me (Weiyang, UID 101856274) and let me borrow your support characters!!
Hosts: Weiyang, Suannai. Guest: Master Mao. Post-production: Yao Ahua, Master Mao.
Join the group on WeChat: weiyangyang27 (Weiyang), bai5jinji_ (Baima).
Day-one player Weiyang has been hooked on Honkai: Star Rail lately; turn-based combat is perfect for a mom whose free time only comes in fragments. She invited the cool-faced little dragon Master Mao to share first impressions of the game. Before the big version 1.2 story update, we look back at the prologue on the Herta Space Station and the story of the first world, Jarilo-VI. Next episode we'll continue with the Xianzhou Luofu arc. Which easter eggs, story beats, or characters left an impression on you? Come tell us in the comments!

Pour de vrai
[Déguiller le patriarcat #3] Emilie Gourd

Jun 30, 2023 · 11:32


This third episode of Déguiller le patriarcat is devoted to the Genevan Emilie Gourd, a pioneer of Swiss and international feminism. She devoted her life to fighting for the improvement of women's living conditions and for their emancipation, leaving aside none of the causes of her era. She was part of every battle: women's suffrage, health insurance, maternity insurance, girls' education, equal pay, women's access to every office, and more.
Archives of the journal Le Mouvement féministe: https://www.e-periodica.ch/digbib/volumes?UID=emi-001
Sources and references:
http://www.emiliegourd.ch/qui-etait-emilie-gourd-
Martine Chaponnière: "Mouvement féministe, Le (revue)", in: Dictionnaire historique de la Suisse (DHS), version of 24.02.2021. Online: https://hls-dhs-dss.ch/fr/articles/047098/2021-02-24/, accessed 12.05.2023.
Martine Chaponnière: "Gourd, Emilie", in: Dictionnaire historique de la Suisse (DHS), version of 17.07.2007. Online: https://hls-dhs-dss.ch/fr/articles/009308/2007-07-17/, accessed 12.05.2023.
https://fr.wikipedia.org/wiki/%C3%89milie_Gourd
https://www.saffa.ch/fr/histoire
https://www.letemps.ch/suisse/emilie-gourd-une-passionnaria-feministe
https://fr.wikipedia.org/wiki/Association_suisse_pour_le_suffrage_f%C3%A9minin
Crettaz, Dorothée; Deléamont, Patricia; Détraz, Patricia. L'idée marche: Emilie Gourd, rédactrice du "Mouvement Féministe". Geneva: unidentified publisher, 2001.
Pionnières et créatrices en Suisse romande, XIXe et XXe siècles. Service pour la promotion de l'égalité entre homme et femme, 2004.
Chaponnière, Martine. Devenir ou redevenir femme : l'éducation des femmes et le mouvement féministe en Suisse, du début du siècle à nos jours. Société d'histoire et d'archéologie, 1992. Chapter 6: Femmes suisses et le Mouvement féministe, pp. 175-192.
Gourd, Emilie. A travail égal, salaire égal : d'après une enquête faite par l'Association suisse pour le suffrage féminin (1917-1918). Association nationale suisse pour le suffrage féminin, 1919.

Masters of Privacy
Adam Klee: combining media addressability, privacy compliance and customer empowerment

Masters of Privacy

Play Episode Listen Later May 28, 2023 36:00


Adam Klee has an impressive resume in the AdTech world, having worked at Disney, Google, NBC, Twitter, Polar, and Spotify. He is the founder of Licorice, a platform that "gives consumers the privacy they want and publishers the data they need". Adam's passion for solving this problem comes both from his years developing new ways to drive better yield for publishers and from his experience as a consumer, where he thinks privacy should come standard. We are covering: why email-based identity solutions (as an alternative to cookies) are flawed; what consumers expect in the media monetization trade-off (ad blockers!); different degrees of control and convenience, and how consent banners are the opposite of both; and a formula for relying on other legal bases (such as the GDPR's legitimate interest) when no individual deduplication is involved. References: Adam Klee on LinkedIn; Licorice; Licorice featured on AdExchanger: "Programmatic Vets Are Behind A Wave Of New Startups Built For A Privacy-First Web"; Topics API (Chrome Privacy Sandbox)

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

We're trying a new format, inspired by Acquired.fm! No guests, no news, just highly prepared, in-depth conversation on one topic that will level up your understanding. We aren't experts, we are learning in public. Please let us know what we got wrong and what you think of this new format!

When you ask someone to break down the basic ingredients of a Large Language Model, you'll often hear a few things: You need lots of data. You need lots of compute. You need models with billions of parameters. Trust the Bitter Lesson, more more more, scale is all you need. Right?

Nobody ever mentions the subtle influence of great benchmarking.

LLM benchmarks mark our progress in building artificial intelligences, progressing from:

* knowing what words go with others (1985 WordNet)
* recognizing names and entities (2004 Enron Emails)
* and images of numbers, letters, and clothes (1998-2017 MNIST)
* language translation (2002 BLEU → 2020 XTREME)
* more and more images (2009 ImageNet, CIFAR)
* reasoning in sentences (2016 LAMBADA) and paragraphs (2019 AI2RC, DROP)
* stringing together whole sentences (2018 GLUE and SuperGLUE)
* question answering (2019 CoQA)
* having common sense (2018 Swag and HellaSwag, 2019 WinoGrande)
* knowledge of all human tasks and professional exams (2021 MMLU)
* knowing everything (2022 BIG-Bench)

People who make benchmarks are the unsung heroes of LLM research, because they dream up ever harder tests that last ever shorter periods of time.

In our first AI Fundamentals episode, we take a trek through history to try to explain what we have learned about LLM benchmarking, and what issues we have discovered with them. There are way, way too many links and references to include in this email. You can follow along the work we did for our show prep in this podcast's accompanying repo, with all papers and selected tests pulled out.

Enjoy and please let us know what other fundamentals topics you'd like us to cover!

Timestamps

* [00:00:21] Benchmarking Questions
* [00:03:08] Why AI Benchmarks matter
* [00:06:02] Introducing Benchmark Metrics
* [00:08:14] Benchmarking Methodology
* [00:09:45] 1985-1989: WordNet and Entailment
* [00:12:44] 1998-2004: Enron Emails and MNIST
* [00:14:35] 2009-14: ImageNet, CIFAR and the AlexNet Moment for Deep Learning
* [00:17:42] 2018-19: GLUE and SuperGLUE - Single Sentence, Similarity and Paraphrase, Inference
* [00:23:21] 2018-19: Swag and HellaSwag - Common Sense Inference
* [00:26:07] Aside: How to Design Benchmarks
* [00:26:51] 2021: MMLU - Human level Professional Knowledge
* [00:29:39] 2021: HumanEval - Code Generation
* [00:31:51] 2020: XTREME - Multilingual Benchmarks
* [00:35:14] 2022: BIG-Bench - The Biggest of the Benches
* [00:37:40] EDIT: Why BIG-Bench is missing from GPT4 Results
* [00:38:25] Issue: GPT4 vs the mystery of the AMC10/12
* [00:40:28] Issue: Data Contamination
* [00:42:13] Other Issues: Benchmark Data Quality and the Iris data set
* [00:45:44] Tradeoffs of Latency, Inference Cost, Throughput
* [00:49:45] Conclusion

Transcript

[00:00:00] Hey everyone. Welcome to the Latent Space Podcast. This is Alessio, partner and CTO-in-residence at Decibel Partners, and I'm joined by my co-host, swyx, writer and editor of Latent Space.

[00:00:21] Benchmarking Questions

[00:00:21] Up until today, we never verified that we're actually humans to you guys. So one good thing to do today would be to run ourselves through some AI benchmarks and see if we are humans.

[00:00:31] Indeed.
So, since I got you here, Sean, I'll start with one of the classic benchmark questions, which is: what movie does this emoji set describe? The emoji set is little kid bluefish, yellow bluefish, orange puffer fish. I think if you added an octopus, it would be slightly easier. But I prepped this question, so I know it's Finding Nemo.

[00:00:57] You are so far a human. The second one of these emoji questions instead depicts a superhero man, a superwoman, and three little kids, one of them a toddler. So you got this one too? Yeah. It's one of my favorite movies ever. It's The Incredibles. The second one was kind of a letdown, but the first is a...

[00:01:17] Awesome. Okay, I'm gonna ramp it up a little bit. So let's ask something that involves a little bit of world knowledge. When you drop a ball from rest, it accelerates downward at 9.8 meters per second squared. If you throw it downward instead, assuming no air resistance, its acceleration immediately after leaving your hand is: A, 9.8 meters per second squared. B, more than 9.8 meters per second squared. C, less than 9.8 meters per second squared. D, cannot say unless the speed of the throw is given. I would say B. You know, I started as a physics major and then I changed, but I think I got enough from my first year. That is B? Yeah. You've even proven that you're human, cuz you got it wrong.

[00:01:56] Whereas the AI got it right: it's A, 9.8 meters per second squared, because once the ball leaves your hand, gravity is the only thing acting on it, no matter how hard you threw it. I thought you said you were a physics major. That's why I changed. So I'm a human. I'm a human. You're human. You're human. But you got them all right, so I can't ramp it up. I can't ramp it up. So, assuming the AI got all of that right, you would think that the AI would get this one wrong. Mm-hmm. Because it's just predicting the next token, right?

[00:02:31] Right. In the complex z plane, the set of points satisfying the equation z² = |z|² is: A, a pair of points; B, a circle; C, a half-line; D, a line. The processing, this is going on in your head... A line. This is hard. Yes, that is a line. Okay. What's funny is that I think if an AI was doing this, it would take the same exact amount of time to answer this as it would every single other word.

[00:03:05] Cuz it's computationally the same to them. Right.

[00:03:08] Why AI Benchmarks matter

[00:03:08] Um, so anyway, if you haven't caught on, today we're doing our first AI Fundamentals episode, which is just the two of us, no guest, because we wanted to go deep on one topic, and the topic is AI benchmarks. So why are we focusing on AI benchmarks? GPT-4 just came out last week, and every time a new model comes out, all we hear about is how much better it is than the previous model on benchmark X, on benchmark Y.

[00:03:33] It performs better on this, better on that. But most people don't actually know what goes on under these benchmarks. So we thought it would be helpful to put these things in context. And also, benchmarks evolve: the more the models improve, the harder the benchmarks get. Like, I couldn't even get one of the questions right.

[00:03:52] So obviously they're working, and you'll see that.
From the 1990s, when some of the first ones came out, to today, the difficulty of them has truly skyrocketed. So we wanna give a brief history of that and leave you with a mental model on, okay, what does it really mean to do well at benchmark X versus benchmark Y?

[00:04:13] Um, so excited to add that in. I would also say, when you ask people what are the ingredients going into a large language model, they'll talk to you about the data. They'll talk to you about the neural nets, they'll talk to you about the amount of compute, you know, how many GPUs are getting burned based on this.

[00:04:30] They never talk to you about the benchmarks. And it's actually a shame, because they're so influential. Like, that is the entirety of how we judge whether one language model is better than another. Cuz a language model can do anything, out of potentially infinite capabilities. How do you judge one model versus another?

[00:04:48] How do you know you're getting better? And so I think it's an area of intense specialization. Also, I think when individuals like us, you know, sort of play with the language models, we are basically doing benchmarks. We're saying, look, it's doing this awesome thing that I found. Guess what? There have been academics studying this for 20 years who have developed a science to this, and we can actually benefit from studying what they have done.

[00:05:10] Yep. And obviously the benchmarks also drive research, you know, in a way. Whenever you're working on a new model, the benchmark kind of constrains what you're optimizing for, because if you've written a paper and it performs worse than all the other models, you're not gonna publish it.

[00:05:27] Yeah. So in a way, there's bias in the benchmark itself. Yeah. We'll talk a little bit about that. Are we optimizing for the right things when we over-optimize for a single benchmark over some others? And also, curiously, when GPT-4 was released, they omitted some very commonplace industry benchmarks.

[00:05:44] So the way that you present yourself is a form of marketing. It is a form of trying to say you're better than something else, and trying to explain where you think you do better. But it's very hard to verify as well, because there are certain problems with reproducing benchmarks, especially when you come to large language models.

[00:06:02] Introducing Benchmark Metrics

[00:06:02] So where do we go from here? Should we go over the major concepts? Yeah. When it comes to benchmark metrics, we get three main measures: accuracy, precision, and recall. Accuracy is just looking at how many successful predictions the model makes. Precision is the ratio of true positives, meaning how many of the predicted positives are actually good, compared to the overall number of predictions made. Recall is what proportion of the actual positives were identified.

[00:06:31] So think of Spotify playlists, to maybe make it a little more approachable: precision is looking at how many songs in a Spotify playlist you liked, while recall is looking at, of all the Spotify songs that you like in the world, how many of them were put in the playlist. So recall is more about how many of the true positives you can actually bring in, while precision focuses more on just being right.
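(A quick aside for the show notes: here is what those three metrics look like in code. A minimal sketch in plain Python, using the playlist example as toy data; the function name is ours, not from any benchmark library.)

```python
# Precision, recall, and F1 from binary labels.
# "liked" = songs you actually like; "playlist" = songs the model picked.

def precision_recall_f1(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0  # how many picks were good
    recall = tp / (tp + fn) if tp + fn else 0.0     # how many good songs got picked
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

liked    = [1, 1, 1, 0, 0, 1]   # ground truth: 1 = a song you like
playlist = [1, 0, 1, 1, 0, 0]   # predictions: 1 = the playlist included it
print(precision_recall_f1(liked, playlist))  # (0.667, 0.5, 0.571), rounded
```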
And the two things, precision and recall, are usually in tension. If you're pushing for higher precision, you want a higher percentage of correct results, but you're usually bringing recall down, because you end up with smaller response sets. So there's always a trade-off, and this is a big part of benchmarking too.

[00:07:20] You know, what do you wanna optimize for? And most benchmarks use the F1 score, which is the harmonic mean of precision and recall: two times precision times recall, divided by their sum. We'll put it in the show notes. So that's one. And then you get the Stanford HELM metrics.

[00:07:38] Um, yeah, so ultimately I think we have advanced a lot in the past few decades on how we measure language models. And the most interesting one came out in January of this year, from Percy Liang's research lab at Stanford, and it's got a few metrics: accuracy, calibration, robustness, fairness, efficiency, general information, bias, and toxicity. And caring that your language models are not toxic and not biased

[00:08:03] is, mm-hmm, kind of a new thing, because we have solved the other stuff, therefore we get to care about the toxicity of the language models yelling at us.

[00:08:14] Benchmarking Methodology

[00:08:14] But yeah, I mean, maybe we can also talk about the other forms of how they benchmark. Yeah, there's three main modes. You can benchmark a model in a zero-shot fashion, few-shot, or fine-tuned. Zero-shot: you do not provide any examples, and you're just testing how good the model is at generalizing. Few-shot: you provide a couple of examples, and then you see from there how good the model is. The number of examples is usually represented with a K, so if you see few-shot, K equals five, it means five examples were passed. And fine-tuned is: you actually take a bunch of data and fine-tune the model for that specific task, and then you test it.

[00:08:55] These go from the least amount of work required to the most amount of work required. If you're doing zero-shot benchmarking, you do not need to have any data, so you can just take the model out and test it. If you're fine-tuning, you actually need a lot of data and a lot of compute time, and you're expecting to see much better results from there.

[00:09:14] Yeah. And sometimes the number of shots can go up to like a hundred, which is pretty surprising for me to see, that people are willing to test these language models that far. But why not? You just run the computer a little bit longer. Yeah. Uh, what's next? Should we go into history and then benchmarks? Yeah.

[00:09:29] History of Benchmarking since 1985

[00:09:29] Okay, so I was up all night yesterday. I was like, this is a fascinating topic. And I was like, all right, I'll just do whatever's in the GPT-3 paper. And then I read those papers, and they all cited previous papers, and I went back and back and back, all the way to 1985, the very first benchmark that I can find.

[00:09:45] 1985-1989: WordNet and Entailment

[00:09:45] Which is WordNet, which is an English benchmark created at Princeton University by George Miller and Christiane Fellbaum. Uh, so fun fact, George Miller also authored the paper "The Magical Number Seven, Plus or Minus Two," which is the observation that people have a short-term memory of about seven things.

[00:10:04] Plus or minus two of seven, that's about all you can sort of remember in the short term. And I just wanna say, like, this was before computers, right? 1985. This was before any of these personal computers were around.
[00:10:22] I just wanna give people a sense of how much manual work was being done by these people. The WordNet database contains 155,000 words organized into 175,000 synsets. These synsets are basically pairings of nouns and verbs and adjectives and adverbs that go together, plus relations between words. For example, hypernyms: every X is a kind of Y, so 'canine' is a hypernym of 'dog'. Holonyms: X has Y as a part, so 'building' is a holonym of 'window'. The most interesting one in terms of formal linguistic logic is entailment, which captures the relationship between two verbs where Y is entailed by X: if by doing X, you must be doing Y.

[00:11:02] So in other words, 'to sleep' is entailed by 'to snore', because you cannot snore without also sleeping. And manually mapping 155,000 words like that, the relationships between all of them, in a nested tree, is just incredible to me. Mm-hmm. And people just did that on faith. They were like, this will be useful somehow.

[00:11:21] Right. They were interested in psycholinguistics, like understanding how humans thought, but then it turned out that this was a very good dataset for understanding semantic similarity, right? Mm-hmm. Like, if you measure the distance between two words by traversing up and down the graph, you can find how similar two words are, and therefore train a model to predict that. Or sentiment analysis: you can see how far something is from something that is considered a good sentiment or a bad sentiment. Or machine translation from one language to another; and there are now WordNets in a couple hundred other languages, which is just amazing. Like, people had to do this without computers. Penn Treebank was 1989. I went to Penn, so I always give a shout-out to my university.

[00:12:01] This one expanded to 4.5 million words of text, which is every Wall Street Journal for three years, hand-collected and hand-labeled by grad students. Your tuition dollars at work. So I'm gonna skip forward from the eighties to the nineties. MNIST was the most famous dataset that came out of this era. It is a dataset of 60,000

[00:12:25] training images of numbers, and it was the first visual dataset where people were tracking handwritten numbers, mapping them to digital numbers, and seeing what the error rate was. These days, MNIST is the hello world of machine learning; it can be trained in like four lines of code.

[00:12:44] 1998-2004: Enron Emails and MNIST

[00:12:44] Then we have the Enron email dataset. Enron failed in 2001, the emails were released in 2004, and they've been upgraded every few years since then. That is 600,000 emails by 150 senior employees of Enron, which is really interesting because these are people emailing each other back and forth in a very natural context, not knowing they're about to be observed. So you can do things like email classification, email summarization, entity recognition, and language modeling, which is super cool. Any thoughts about that before we go into the two thousands?
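(Show-notes aside: the "four lines of code" quip is barely an exaggeration. Here's a sketch using scikit-learn's small bundled 8x8 digits set as a stand-in for the real MNIST; swap in a proper MNIST loader for the full 28x28 images.)

```python
# MNIST-style digit classification as a machine learning hello world,
# on scikit-learn's bundled miniature digits dataset.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)                 # 1,797 8x8 grayscale digits
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(model.score(X_te, y_te))                      # accuracy, roughly 0.96
```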
I think, in a way, that kind of brings you back to the bias, you know, in some of these benchmarks, in some of these datasets.

[00:13:21] You know, like, if your main corpus for benchmarking entity recognition is a public energy company, mm-hmm, and you're building something completely different and you're building a model for that, maybe it'll be worse, you know? You start to see how we started with, kind of like, WordNet is just human linguistics, you know?

[00:13:43] Yes. It's not domain-related. But now we're starting to get into more and more domain-specific benchmarks, and you'll see this increase over time. Yeah. MNIST itself was very biased towards training on handwritten numbers. So in 2017 they actually extended it to EMNIST, which is an extension to handwritten letters; that seems very natural.

[00:14:08] And in 2017 they also had Fashion-MNIST, which is a very popular dataset of images of clothing items pulled from Zalando. So you can see the capabilities of computer vision growing from single digits, 0 through 9, to all the letters of the alphabet, to recognizing images of fashion and clothing items.

[00:14:28] So it's pretty cool. And the big one for deep learning, cuz all of that was just the appetizers, just getting started:

[00:14:35] 2009-14: ImageNet, CIFAR and the AlexNet Moment for Deep Learning

[00:14:35] The big one for deep learning was ImageNet, which is where Fei-Fei Li came into the picture, and that's why she's super well known. She started working on it in 2006 and released it in 2009. Fun fact, she actually met with Christiane Fellbaum, one of the co-creators of WordNet,

[00:14:51] to create ImageNet. So there's a direct lineage from words to images. Yeah. And they used Amazon Mechanical Turk to help with classifying images. No longer grad students. But again, I think this goes back to your observation about bias: when I am a Mechanical Turk worker and I'm being paid by the image to classify an image,

[00:15:10] do you think I'll be very careful at my job? Right? Yeah. Whereas when I'm an Enron employee, emailing my fellow coworker, trying to just communicate something in natural language, that is a different type of environment. Mm-hmm. So it's a pretty interesting benchmark. It was released in 2009-ish, and people were competing to recognize and classify it properly.

[00:15:33] The magic moment for ImageNet came in 2012, which is called the AlexNet moment, cuz the grad student that created this recognition model was named Alex, I forget his last name, and he achieved an error rate of 15%, which is more than 10 points lower than the runner-up. It was just so much better than second place that everyone else was like, what are you doing?

[00:15:54] And it turned out that he was the first to use deep learning, a CNN. 10 percentage points, so like 15, and the other one was 25. Yeah, exactly. It was just so much better than the others. It was just unbelievable; no other approach was even coming close.

[00:16:09] Therefore, everyone from there on out, until today, we're just learning the lessons of deep learning, because it is so much superior to the other approaches.
And this was a big images-and-vision moment, because then you had CIFAR-10, another dataset that is mostly images-focused. Mm-hmm.

[00:16:27] So it took a little bit before we got back to text. And nowadays it feels like text models are kind of eating the world; we're making the text ones multi-modal. Yeah. So we're bringing the images to GPT-4 instead of the opposite. But yeah, in 2009 we had another 60,000-image dataset:

[00:16:46] 32 by 32 color images with airplanes, automobiles, animals, all kinds of stuff. Like, first we had the numbers, then we had the handwritten letters, then we finally had... clothing items came after. Oh, clothing items. 2009? Yeah, this is 2009; I skipped around in time a little bit.

[00:17:08] Yeah, yeah. But yeah, CIFAR-10 and CIFAR-100. CIFAR-10 was for 10 classes, and that was chosen, and then obviously people optimized for it and they were like, all right, we need a new problem now. So in 2014, five years later, they introduced CIFAR-100, which was a hundred classes of other items. And I think this is a very general pattern: you create a dataset for a specific benchmark because you think it's too hard for machines. Mm-hmm. It lasts for five years before it's no longer too hard for machines, and you have to find a new dataset and extend it again. Similarly, we're gonna find that in GLUE, which is one of the more modern datasets.

[00:17:42] 2018-19: GLUE and SuperGLUE - Single Sentence, Similarity and Paraphrase, Inference

[00:17:42] This one came out in 2018. GLUE stands for General Language Understanding Evaluation. This is one of the most influential early language model benchmarks, and it has nine tasks: single-sentence tasks, similarity and paraphrase tasks, and inference tasks. A single-sentence task would be something like the Stanford Sentiment Treebank, which is

[00:18:05] sentences from movie reviews with human annotations of the sentiment, whether it's positive or negative, on a sort of four-point scale, and your job is to predict the sentiment of a single sentence. A similarity task would involve corpora like the Microsoft Research Paraphrase Corpus: a corpus of sentence pairs automatically extracted from online news sources, with human annotations for whether the sentences in the pair are semantically equivalent.

[00:18:28] So you just predict true or false. And again, just to call back to the math that we did earlier in this episode, the classes here are imbalanced; this dataset, for example, is 68% positive, so we report both accuracy and F1 scores. F1 is a more balanced approach because it adjusts for imbalanced datasets.

[00:18:48] Mm-hmm. Yeah. And then finally, inference. Inference is the one where we really start to have some kind of logic. So for example, the MNLI. Um, actually, I'm gonna focus on SQuAD, the Stanford Question Answering Dataset. It's another dataset of pairs: question-paragraph pairs,

[00:19:04] where one of the sentences of the paragraph, drawn from Wikipedia, contains the answer to the corresponding question. We convert the task into sentence-pair classification by forming a pair between each question and each sentence in the corresponding context, and filtering out pairs of low overlap.
So, basically annotating whether or not the answer to the question is inside of the paragraph that was pulled. Can you identify that? And again, entailment is kind of included inside each of these inference tasks, because it starts to force the language model to understand whether or not one thing implies the other. Mm-hmm. Yeah.

[00:19:37] And the models kept evolving. This came out in 2018 and lasted exactly one year. One year later, people were like, that's too easy. So in 2019, they came out with SuperGLUE. I love how, you'll see later with SWAG and HellaSwag, they come up with very good names for these things.

[00:19:55] Basically, what SuperGLUE did is take GLUE and try to move outside of single-sentence evaluation. Most of the tasks that Sean was talking about focus on one sentence. Yeah, one sentence, one question; it's pretty straightforward in that way. SuperGLUE went from single sentences to multi-sentence, context-driven tasks.

[00:20:21] So you might have questions where the answer is not in the last paragraph that you've read. So it starts to test the context window of the model. In some of them, in order to know the answer, you need to know what's not in the question, kind of thing. So you may say, hey, this drink is owned by the Coca-Cola company: is this a Pepsi product? You know, so you need to make the connection: false. Exactly, yeah. Then you also have embedded clauses, things that are not exactly said and have to be inferred, and a lot of this set is very conversational. Some of the examples contain a lot of, um... you know, this question is very hard to read out.

[00:21:07] Yeah, I know. It's like, it sounds like: 'You are saying, um, but no, you're actually... and yet I hope to see employer-based, you know, helping out childcare centers at the place of employment, things like that, that will help out.' It's kind of hard to even read. And then comes the hypothesis. They're setting a trend:

[00:21:27] it's going from something very simple, like a big PDF extract, to something that is more similar to how humans communicate. Transcripts, like audio transcripts, exactly, of how people talk. Yeah. And some of them are also about plausibility. You know, most of these models had started to get good at understanding a clear cause-and-effect,

[00:21:48] but some of the plausibility ones are like, for example, this one is from COPA, the Choice of Plausible Alternatives. The premise is: my body cast a shadow over the grass. What's the cause? Alternative one: the sun was rising. Alternative two: the grass was cut.

[00:22:07] Obviously, it's that the sun was rising. But nowhere in the question are we actually mentioning the sun; we are mentioning the grass. So some models, some of the older models, might see the grass and make the connection that the grass is part of the reason, but the models get better and better and go from simply looking at the single-sentence context to more of a world knowledge.

[00:22:27] It's just really impressive. Like, the fact that we can expect that out of a model still blows my mind. I think we should not take it for granted that when we're evaluating models, we're asking questions like this, where the answer is not obvious from just the given text itself. Mm-hmm.
So it is coming with a memorized view of the world, or world knowledge, and it understands the premise in some form. It is not just random noise. Yeah, I know. It's really impressive. There's one, MultiRC, that I actually wanted to spring on you as a test, but it's just too long to read. It's a very long logic question,

[00:23:03] and then it asks you to do comprehension. But, uh, yeah, we'll just kinda skip that. We'll put it in the show notes, and then you have to prove to us that you're a human. Send us the answer, exactly. Exactly, and subscribe to the podcast. So SuperGLUE was a lot harder, and it was also superseded pretty soon.

[00:23:21] 2018-2019: Swag and HellaSwag - Common Sense Inference

[00:23:21] And yeah, then we start coming to the more recent cohort of tests. I don't know how to introduce the rest; there are just so many tests here that I struggle a little bit picking from them. But perhaps we can talk about SWAG and HellaSwag, since you mentioned them. Yeah. So SWAG stands for Situations With Adversarial Generations.

[00:23:39] It also came out in 2018, but this guy Zellers et al. likes to name his datasets and his benchmarks in a very memorable way. If you look at the PDF of the paper, he also has a little image icon for SWAG. He doesn't just go by regular language, so he definitely has a little bit of branding to this.

[00:24:00] So I'll give you an example of the kind of problems that SWAG poses. It is focused on common-sense inference. What's common-sense inference? For example, given a partial description like 'she opened the hood of the car,' humans can reason about the situation and anticipate what might come next:

[00:24:16] 'then she examined the engine.' So you're supposed to pick, based on what happened in the first part, what is most likely to happen in the second part, in a multiple-choice question, right? Another example would be: on stage, a woman takes a seat at the piano. She: A, sits on a bench as her sister plays with the doll;

[00:24:33] B, smiles with someone as the music plays; C, is in the crowd, watching the dancers; D, nervously sets her fingers on the keys. So A, B, C, or D. Not all of them are plausible. We're not even checking for whether the model produces or predicts grammatical English;

[00:24:54] we're checking for whether the language model can correctly pick what is most likely given the context. The only information that you're given is: on stage, a woman takes a seat at the piano. What is she most likely to do next? And D makes sense. It's arguable, obviously; sometimes it could be A. But in common sense, it's D.

[00:25:11] Mm-hmm. So we're training these models to have common sense. Yeah, which most humans don't have. So it's already a step up. Obviously, that only lasted a year, and SWAG was no longer challenging in 2019, when they extended it with a lot more questions.

[00:25:33] Um, so SWAG was a dataset of a hundred thousand multiple-choice questions, and part of the innovation of SWAG was really that you're generating these questions rather than manually coming up with them. Mm-hmm.
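(Show-notes aside: mechanically, SWAG/HellaSwag-style items are usually scored by asking the model for the log-likelihood of each candidate ending and taking the argmax. A schematic sketch; `toy_scorer` below is our stand-in for a real model call, not part of the benchmark itself, and real harnesses also usually length-normalize the scores so longer endings aren't penalized.)

```python
# Schematic scoring for a SWAG/HellaSwag-style multiple-choice item.
# A real harness would replace toy_scorer with the model's log probability
# of the ending given the context.

def pick_ending(context, endings, score_fn):
    scores = [score_fn(context, e) for e in endings]      # higher = more plausible
    return max(range(len(endings)), key=lambda i: scores[i])

item = {
    "context": "On stage, a woman takes a seat at the piano. She",
    "endings": [
        "sits on a bench as her sister plays with the doll.",
        "smiles with someone as the music plays.",
        "is in the crowd, watching the dancers.",
        "nervously sets her fingers on the keys.",
    ],
    "label": 3,
}

def accuracy(items, score_fn):
    hits = sum(pick_ending(i["context"], i["endings"], score_fn) == i["label"]
               for i in items)
    return hits / len(items)          # random baseline: 0.25 with four endings

toy_scorer = lambda context, ending: -len(ending)  # toy stand-in: prefers short endings
print(accuracy([item], toy_scorer))                # 0.0: the toy scorer has no common sense
```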
And we're starting to get into not just big data, but big questions and big benchmarks of questions. That's where the adversarial generations come in: SWAG starts pulling in from real-world questions and datasets like wikiHow and ActivityNet, and HellaSwag is really an extension of that. I couldn't even add examples, just cuz there are so many. But just to give you an idea of the progress over time:

[00:26:07] Aside: How to Design Benchmarks

[00:26:07] Most of these benchmarks, when they're released, are set at a level where if you just randomly guessed on all of the questions, you'd get 25%. That's sort of the baseline. And then you can run each of the language models on them, and you can run human evaluations on them: you can have median human evaluations, and then expert human evaluations.

[00:26:28] So the random baseline for HellaSwag was 25. GPT-1 got a 41 on the HellaSwag score. BERT, from Google, got 47. Grover, also from Google, got 57 to 75. RoBERTa, from Facebook, got 85. GPT-3.5 got 85. And then GPT-4 got 95, essentially solving HellaSwag. So this is useless too;

[00:26:51] 2021: MMLU - Human level Professional Knowledge

[00:26:51] we need SuperHellaSwag now. I think the most challenging one came from 2021. 2021 was a very, very good year in benchmarking: we had two major benchmarks come out, HumanEval and MMLU. We'll talk about MMLU first, cuz that's probably the more relevant one.

[00:27:08] So MMLU stands for Massive Multitask Language Understanding: by far the biggest, most comprehensive, and most human-like benchmark that we had until 2021. We had a better one in 2022, but we'll talk about that. It is a test that covers 57 tasks, including elementary math, US history, computer science, law, and more.

[00:27:29] So to attain high accuracy on this test, models must possess extensive world knowledge and problem-solving ability. It includes practice questions for the GRE and for the USMLE, the United States medical licensing exam, as well as questions from undergrad courses, all the way from elementary and high school to college and professional level.

[00:27:49] So actually, the opening questions that I gave you for this podcast came from MMLU: the math question about when you drop a ball from rest, and the question about the complex z plane. But it equally is also asking professional medicine questions, asking a question about thyroid cancer and asking you to diagnose

[00:28:10] which of these four options is most likely, and asking a question about microeconomics: again, giving you a situation about regulation and monopolies and asking you to choose from a list of four answers. Mm-hmm. Again, the random baseline is 25 out of 100. GPT-2 scores 32, which is actually pretty impressive.

[00:28:26] GPT-3 scores between 43 and 60, depending on the size. Gopher scores 60. Chinchilla scores 67.5. GPT-3.5 scores 70. GPT-4 jumps 16 points to 86.4. The author of MMLU, Dan Hendrycks, was commenting on GPT-4, saying this is essentially solved.
He basically says, like, GPT-4.5, the next incremental improvement on GPT-4, should be able to reach expert-level human performance,

[00:28:53] at which point it is simultaneously passing all the law exams, all the medical exams, all the graduate student exams, every single test from AP history to computer science to math to physics to economics. It's very impressive. Yeah. And now you're seeing, I mean, it's probably unrelated, but Ivy League universities starting to drop the SAT as a requirement for getting in.

[00:29:16] That might be unrelated as well, because there's a little bit of a culture war there with regards to the inherent bias of the SATs. Yeah. But I mean, that's kinda like what we were talking about before, right? If a model can solve all of these, then how good is it really

[00:29:33] at telling us if a person should get in? It captures just the beginning. Yeah. Right.

[00:29:39] 2021: HumanEval - Code Generation

[00:29:39] Well, so I think another significant benchmark in 2021 was HumanEval, which is the first really notable benchmark for code generation. Obviously there's a bunch of research preceding this, but this was the one that really caught my eye, because it was introduced simultaneously with OpenAI's Codex, which is the code generation model, the version of GPT that was fine-tuned for generating code,

[00:30:02] and that is the origin of the language model powering GitHub Copilot. And yeah, now we can write code with language models, just with that benchmark. And it's good, too. That's the other thing: I think this is one where the jump from GPT-3.5 to GPT-4 was probably the biggest. GPT-3.5 is like 48% on this benchmark; GPT-4 is 67%. So it's pretty big. Yeah. I think coders can rest a little bit. You know, it's not 90-something, it's still at 67. But just wait two years. You know, if you're a lawyer, you're done. If you're a software engineer, you got a couple more years, so save your money.

[00:30:41] Yeah. But the way they test it is also super creative, right? I think maybe people don't understand that all of the tests that are given here are actually very intuitive. Like, you give 90% of a function, and then you ask the language model to complete it. And if it completes it like any software engineer would, then you give it a win.

[00:31:00] If not, you give it a loss. Run that over 164 problems, and that is HumanEval. Yeah. Yeah. And since a lot of our listeners are engineers too, I think the big thing here is, and there was a link that we had that I missed, but for example, some of the coding test questions: it can answer older ones very, very well, but it does not answer recent ones at all. So you see some of the data leakage from the training: since it's been trained on this massive data, some of it leaks. So if you're a software engineer, you don't have to worry too much. And hopefully, especially if you're in the JavaScript world, a lot of these frameworks are brand new every year,

[00:31:41] you get a lot of new technologies. So there's, oh, there's job security. Yes, exactly. Of course. Yeah. You have a new framework every year, so you have job security. Yeah, exactly.
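(Show-notes aside: the "give it a win" step in HumanEval is literal code execution against unit tests. Here's a stripped-down sketch of that check; the problem, completion, and tests are made-up stand-ins, and real harnesses run the exec in a sandbox, so don't do this on untrusted output.)

```python
# Stripped-down HumanEval-style functional correctness check.
# WARNING: exec() on model output is unsafe outside a sandbox.

prompt = (
    "def add(a, b):\n"
    '    """Return the sum of a and b."""\n'
)
completion = "    return a + b\n"       # the part the model writes

tests = (
    "def check(candidate):\n"
    "    assert candidate(1, 2) == 3\n"
    "    assert candidate(-1, 1) == 0\n"
)

def passes(prompt, completion, tests, entry_point):
    ns = {}
    try:
        exec(prompt + completion, ns)    # define the candidate function
        exec(tests, ns)                  # define check()
        ns["check"](ns[entry_point])     # raises AssertionError on failure
        return True
    except Exception:
        return False

print(passes(prompt, completion, tests, "add"))  # True; over 164 problems, the
                                                 # passing fraction is the pass@1 score
```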
I'll cover a sample of other datasets.

[00:31:51] 2020: XTREME - Multilingual Benchmarks

[00:31:51] So before we get to BIG-Bench, I'll mention a couple more things, which is basically multilingual benchmarks.

[00:31:57] Those are basically extensions of monolingual benchmarks: if the model can accurately translate one word or one part of a sentence into another language, it gets a score. I think it's fairly intuitive. The main benchmark to know is XTREME, the Cross-lingual TRansfer Evaluation of Multilingual Encoders.

[00:32:26] I know, right. Honestly, I think they just wanted the acronym and then they worked backwards. And there are other multilingual ones I can't find in my notes, but I just think it's interesting to always keep in mind what the other language capabilities are like. One language is basically completely equivalent to another, and I think a lot of AI ethicists, or armchair AI ethicists, are very angry that most of the time we optimize for English, because obviously that has the most training corpora. I really like XTREME and the work that's being done here, because they took a huge amount of effort to make sure they cover sparse languages, the less popular ones.

[00:33:06] So they had a lot of the world's top languages, obviously, but then they also selected to maximize language diversity across the complete diversity of human languages: Tamil, Telugu, Malayalam, plus Swahili and Yoruba from Africa. Mm-hmm. So I just thought that kind of effort is really commendable, cuz that means the rest of the world can keep up in this AI race.

[00:33:28] Right. And especially on a lot of the more human-centered things. So I think we talked about this before, where a lot of Italian movies are more focused on culture and history, and are set in the past, versus, did we talk about this on the podcast? No, not on the podcast. We talked, and some of the American ones are more focused on the future and kind of what's to come.

[00:33:48] So I feel like some of the benchmarks that we mentioned before, you know, have movie reviews as one of the testing sets. Yeah. But there's obviously a big cultural difference that is not always captured when you're just looking at English data. Yeah. So if you ask the model, you know, are people gonna like this movie that I'm writing about the future, maybe it's gonna say, yeah, that's a really good idea. Or if I wanna do a movie about the past, it's gonna be like, maybe people want to hear about robots. But that wouldn't be the case in every country. Well, since you and I speak different languages, I speak Chinese, you speak Italian, I'm sure you've tested the Italian capabilities.

[00:34:29] What do you think? I think Italy is so much more dialect-driven, so it can be really hard. So what kind of Italian does GPT-3 speak? Actual standard Italian. But the reality is most people have their own dialect, so it would be really hard for a model to fool an Italian into thinking it's somebody from where they are, you know?

[00:34:49] Yeah.
Like, you can actually tell if you're speaking to an AI bot in Chinese, because it would not use any of the expressions that humans use with other humans; Chinese speakers use all sorts of replacements for regular Chinese words. Also, I tried one of those language tutor things, mm-hmm,

[00:35:06] that people are making, and they're just not good Chinese. Not colloquial Chinese, not anything that anyone would actually say. They would understand you, but they would know you're not from around there. Right, right.

[00:35:14] 2022: BIG-Bench - The Biggest of the Benches

[00:35:14] So, 2022: BIG-Bench. This was the biggest of the biggest of the biggest benchmarks. I think the main pattern is really just bigger benchmarks rising in opposition to bigger and bigger models.

[00:35:27] In order to evaluate these things, we just need to combine more and more and way more tasks, right? GLUE had nine tasks, SuperGLUE had nine more tasks, and then you're just adding and adding and adding, and running a battery of tasks over every single model and trying to evaluate how good they are at each of them.

[00:35:43] BIG-Bench was 204 tasks, contributed by 442 authors across 132 institutions. The task topics are diverse, drawing from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. I also like the fact that these authors selected tasks that are not solved by current language models, but also not solvable by memorizing the internet, which is, mm-hmm,

[00:36:07] tracking back a little bit to the issues that we're gonna cover later. Right. Yeah. I think that's super interesting. Some of the examples would include: in the following chess position, find a checkmate, which some humans cannot do. Or: what is the name of the element with an atomic number of six?

[00:36:22] That one you can look up, right? By consulting a periodic table. We just expect language models to memorize that. I really like this one, cuz it's inherently something that you can solve:

[00:36:32] identify whether this sentence has an anachronism. So, option one: during the Allied bombardment of the beaches of Iwo Jima, Ralph spoke loudly into his radio.

[00:36:41] And option two: during the Allied bombardment of the beaches of Iwo Jima, Ralph spoke loudly into his iPhone. And you have to use the context of when the iPhone existed and when the Allied bombardment happened, mm-hmm, and then sort of do the math to compare one versus the other and realize that, okay, this one is the one that's out of place.

[00:36:57] And that's asking more and more of the language model to do implicitly, which is actually modeling what we do when we listen to language. It's such a big advancement from 1985, when we were comparing synonyms. Mm-hmm. Yeah, I know. And it's not that long in the grand scheme of humanity, you know; it's 40 years.

[00:37:17] It's crazy. It's crazy. So this is a big missing gap in terms of research. BIG-Bench seems like the most comprehensive set of benchmarks that we have, but it is curiously missing from GPT-4's results. Mm-hmm. In the paper, I only see Gopher 280B on it. Yeah. It could be a curious omission, because maybe it looks

[00:37:39] like it didn't do so well.

[00:37:40] EDIT: Why BIG-Bench is missing from GPT4 Results

[00:37:40] Hello, this is swyx from the editing room, sometime in the future. I just wanted to interject that
Uh, we now know why the GPT for benchmark results did not include the big bench. Benchmark, even though that was the state-of-the-art benchmark at the time. And that's because the. Uh, GPC four new the Canary G U I D of the big bench.[00:38:02] Benchmark. Uh, so Canary UID is a random string, two, six[00:38:08] eight six B eight, uh, blah, blah, blah. It's a UID. UID, and it should not be knowable by the language model. And in this case it was therefore they had to exclude big bench and that's. And the issue of data contamination, which we're about to go into right now.[00:38:25] Issue: GPT4 vs the mystery of the AMC10/12[00:38:25] And there's some interesting, if you dive into details of GPT4, there's some interesting results in GPT4, which starts to get into the results with benchmarking, right? Like so for example, there was a test that GPT4 published that is very, very bizarre to everyone who is even somewhat knowledgeable.[00:38:41] And this concerns the Ammc 10 and AMC 12. So the mc. Is a measure of the American math 10th grade student and the AMC12 is a, uh, is a measure of the American 12th grade student. So 12 is supposed to be harder than 10. Because the students are supposed to be older, it's, it's covering topics in algebra, geometry number, theory and combinatorics.[00:39:04] GPT4 scored a 30 on AMC10 and scored a 60 on AMC12. So the harder test, it got twice as good, and 30 was really, really bad. So the scoring format of AMC10. It is 25 questions. Each correct answer is worth six points. Each incorrect answer is worth 1.5 points and unanswered questions receive zero points.[00:39:25] So if you answer every single question wrong, you will get more than GPT4 got on AMC10. You just got everything wrong. Yeah, it's definitely better in art medics, you know, but it's clearly still a, a long way from, uh, from being even a high school student. Yeah. There's a little bit of volatility in these results and it, it shows that we, it's not quite like machine intelligence is not the same, or not linearly scaling and not intuitive as human intelligence.[00:39:54] And it's something that I think we should be. Aware of. And when it freaks out in certain ways, we should not be that surprised because Yeah, we're seeing that. Yeah. I feel like part of it is also human learning is so structured, you know, like you learn the new test, you learn the new test, you learn the new test.[00:40:10] But these models, we kind of throw everything at them all at once, you know, when we train them. So when, when the model is strained, are you excusing the model? No, no, no. I'm just saying like, you know, and you see it in everything. It's like some stuff. I wonder what the percentage of. AMC 10 versus AMC 12.[00:40:28] Issue: Data Contamination[00:40:28] Content online is, yes. This comes in a topic of contamination and memorization. Right. Which we can get into if we, if we, if we want. Yeah. Yeah, yeah. So, uh, we're getting into benchmarking issues, right? Like there's all this advancements in benchmarks, uh, language models. Very good. Awesome. Awesome, awesome. Uh, what are the problems?[00:40:44] Uh, the problem is that in order to train these language models, we are scraping the vast majority of the internet. And as time passes, the. 
So, in classic machine learning parlance, this would be overfitting, mm-hmm, to the test, rather than generalizing to the results that we really want. And there's an example of this with Codeforces, also discovered on GPT-4. Codeforces has annual vintages, and there was this guy on Twitter, cHHillee, who ran GPT-4 on pre-2021 problems and it solved all of them, then ran it on 2022-and-later problems and it solved zero of them.

[00:41:31] And we know that the training cutoff for GPT-4 was 2021. Mm-hmm. So it just memorized the Codeforces problems, as far as we can tell. And it's just really bad at math, cuz it also failed the AMC 10 stuff. Mm-hmm. So this is actually true for some subset of its capabilities. I bet if you tested it with GPT-3, it might do better, right?

[00:41:50] Yeah. I mean, this is the thing: when you think about models and benchmarks, you can never take the benchmark number at face value, you know. Because, say you're focusing on code: the benchmark might only include the pre-2021 problems, and the model scores great, but it's actually bad at generalizing and coming up with new solutions.

[00:42:10] So yeah, that's a big problem.

[00:42:13] Other Issues: Benchmark Data Quality and the Iris data set

[00:42:13] Yeah. So: bias, data quality, task specificity, reproducibility, resource requirements, and then calibrating confidence. Bias is what you might think it is: there's inherent bias in the data. So for example, when you think about 'doctor,' do you think about a male doctor or a female doctor? And specifically in ImageNet,

[00:42:31] white businessmen will be labeled 'businessman,' whereas Asian businessmen will be labeled 'Asian businessman,' and that can reinforce harmful stereotypes. That's the bias issue. Then the data quality issue. I really love this one. Okay, so there's a famous dataset we haven't talked about, the petals one: the Iris

[00:42:47] dataset. Mm-hmm. It contains measurements, sepal length and width, petal length and petal width, of three different species of iris flowers, and it has labeling issues. So there's a minimum, lowest possible error rate, because the error rate exists in the data itself, and if you have a machine learning model that comes out with a better error rate than the data, you have a problem, cuz your machine learning model is lying to you.

[00:43:12] Mm-hmm. Specifically, we know this for a fact because, for iris flowers, the petal length should be longer than the petal width. But there are a number of instances in the dataset where the length was recorded as shorter than the width, and that's obviously impossible. So somebody made an error in the recording process,

[00:43:27] and therefore, if your machine learning model fits that, then it's doing something wrong, cuz it's fitting something biologically impossible. Mm-hmm. Task specificity: basically, you're overfitting to one type of task, for example answering questions based on a single sentence, and not facing anything real-world. Reproducibility:

[00:43:43] this one is, I guess, the fine details of machine learning, which people don't really like to talk about. There's a lot of pre-processing and post-processing done in IPython notebooks
that is completely unversioned, untested, ad hoc, sticky, yucky, and everyone does it differently. Therefore, your test results might not be the same as my test results, and therefore we don't agree that your scores are the right scores for your benchmark, even though you self-report them every single time you publish a paper.

[00:44:04] Then resource requirements. These have more to do with GPTs: the larger these models get, the more expensive it is to run some of them. And some of them are not open models, in other words not readily available, so you cannot test them unless the lab runs your benchmark for you. For example, you can't run GPT-3 yourself; you have to run it through the API. And if you don't have access to the API, like GPT-4's, then you can't run it at all.

[00:44:39] The last one is a new one, from GPT-4's paper itself. You can actually ask language models to expose their log probabilities and show you how confident they think they are in their answer, which is very important for calibrating whether the language model has the right amount of confidence in itself. And in the GPT-4 paper, they were actually very responsible in disclosing that the base model used to have a roughly linear correspondence between its stated confidence and how often it was right, but adding RLHF onto GPT-4 actually skewed this prediction, such that the model became more confident than it should be. Confidently incorrect, as people say.

[00:45:18] In other words, hallucinating. And that is a problem. So yeah, those are the main issues with benchmarking that we have to deal with. Mm-hmm.
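(Show-notes aside: the calibration point is easy to check yourself on any model that exposes probabilities. A toy sketch: bucket predictions by stated confidence and compare each bucket's average confidence to its actual accuracy; for a well-calibrated model the two match, while an overconfident one shows confidence above accuracy.)

```python
# Toy calibration check: bucket (confidence, was_correct) pairs and compare
# average stated confidence to actual accuracy per bucket.

def calibration_table(preds, n_bins=5):
    bins = [[] for _ in range(n_bins)]
    for conf, correct in preds:
        bins[min(int(conf * n_bins), n_bins - 1)].append((conf, correct))
    table = []
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            acc = sum(1 for _, ok in b if ok) / len(b)
            table.append((round(avg_conf, 2), round(acc, 2), len(b)))
    return table   # well calibrated when avg_conf is close to acc in each row

# toy data for an overconfident model: high stated confidence, mediocre accuracy
preds = [(0.95, True), (0.92, False), (0.9, False), (0.6, True), (0.55, False)]
print(calibration_table(preds))
```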
Yeah. And a lot of our friends are founders; we work with a lot of founders. And if you look at all these benchmarks, all of them just focus on how good a score they can get. They don't focus on what's actually feasible to use for my product, you know? So I think

[00:45:44] Tradeoffs of Latency, Inference Cost, Throughput

[00:45:44] production benchmarking is something that doesn't really exist today, but I think we'll see the rise of it. And I think the main three drivers are: one, latency, how quickly I can infer the answer; two, cost, how much each call costs me if I'm using this model, and whether that's in line with my business model; and three, throughput, how far I can scale these models across a lot of requests at once. If you do a benchmark run along these lines, you kind of come up with four quadrants. On one axis you have model size, going from smallest to biggest, and on the other axis you have latency tolerance, which goes from 'I do not want any delay' to 'I'll wait as long as I can to get the right answer.'

[00:46:27] You start to see different types of use cases. For example, I might wanna use a small model that can get me an answer very quickly, even if the answer is narrower, because me as a human, maybe I'm in a very iterative flow. And we had Varun on the podcast before, and we were talking about acceleration versus iteration use cases.

[00:46:50] This is more for acceleration: if I'm using Copilot, you know, the code doesn't have to be a hundred percent correct, but it needs to arrive in my flow of writing. So that's where a model like that would fit. But other times, like if I'm asking it to create a whole application, I'm willing to wait an hour, you know, for the model to get me a response.

[00:47:11] But you don't have a way to choose that today with most models; they kind of do just one type of work. So I think we're gonna see more and more of these benchmarks focus not only on the research side, which is what they really are today, when you're developing a new model and checking whether it meets the usual standard research benchmarks, but also on performance benchmarks for production use cases.

[00:47:36] And I wonder who's gonna be the first company that comes up with something like this. We're seeing more and more of these models go from a research thing to a production thing, and going from companies like Google and Facebook, which have kind of unlimited budget for a lot of these things, to startups integrating them into products.

[00:48:00] And when you're on a tight budget, paying, you know, 1 cent per thousand tokens or 0.1 cents per thousand tokens, it really matters. So I think that's what's missing to get a lot of these things to production. But hopefully we see them.

[00:48:16] Yeah. The software development lifecycle I'm thinking about really is that most people will start with large models and prototype with those, because they are the most capable ones.

[00:48:25] But then, as they put more and more of those things in production, people always want them to run faster and faster and cheaper, so you will distill towards a more domain-specific model, and every single company that puts this into production will want something like that. But I think it's a reasonable bet, because

[00:48:41] there's another branch of AI builders that I see out there who are banking on large models only, mm-hmm, and seeing how far they can stretch them, right, building AI agents that can take arbitrarily long amounts of time, because they're saving you lots and lots of time searching the web for you and doing research for you.

[00:48:59] And I'm happy to wait for Bing for like 10 seconds if it does a bunch of searches for me, mm-hmm, and ends with the right result. You know, I was tweeting the other day that I wanted an AI-enabled browser, because I was seeing this table, there was an image, and I just needed to screenshot the image and say, plot this on a chart for me.

[00:49:17] And I just wanted to do that, but it would have taken so many steps, and I would be willing to wait for a large model to do that for me. Mm-hmm. Yeah. I mean, web development so far has been reduce, reduce, reduce the loading times. I don't know about that; there are people who disagree. Oh, but I think, if you think about the CDN, and you think about deploying things at the edge, the focus recently has been on lowering latency, not increasing it.

[00:49:45] Conclusion

[00:49:45] Yeah. So, well, that's Benchmarks 101. Um, let us know how you think we did. This is something we're trying for the first time.

[00:49:52] We're very inspired by other podcasts that we like, where we do a bunch of upfront prep, but then it becomes a single topical episode that is hopefully a little bit more timeless.
We don't have to keep up with the news; I think there's a lot of history that we can go back on and deepen our understanding of the context of all these evolutions in language models. [00:50:12] Yeah. And if you have ideas for the next 101 fundamentals episode, let us know in the comments and we'll see you all soon. Bye. Get full access to Latent Space at www.latent.space/subscribe

Kvartal
Narrated: The Grey Wolves' leader celebrates with Botkyrka's municipal commissioners

Kvartal

Play Episode Listen Later Mar 24, 2023 14:27


A video shows Botkyrka's controversial former municipal commissioner Ebba Östlin (S), together with the Moderate Party's Stina Lundgren, paying tribute to a Turkish cultural association in Botkyrka. At the table of honor sits the leader of the Grey Wolves' Swedish branch, a well-known local leader of UID, the lobbying organization of the Turkish governing party AKP. He has previously been singled out for threatening dissidents in Sweden, writes Lars Åberg. "Irresponsible and unacceptable," says public commentator Kurdo Baksi of the politicians' conduct. Hosted on Acast. See acast.com/privacy for more information.

Oxide and Friends
Does a GPT future need software engineers?

Oxide and Friends

Play Episode Listen Later Mar 21, 2023 99:18


Bryan and Adam and the Oxide Friends take on GPT and its implications for software engineering. Many aspiring programmers are concerned that the future of the profession is in jeopardy. Spoiler: the Oxide Friends see a bright future for human/GPT collaboration in software engineering.We've been hosting a live show weekly on Mondays at 5p for about an hour, and recording them all; here is the recording from March 20th, 2023.In addition to Bryan Cantrill and Adam Leventhal, speakers on MM DD included Josh Clulow, Keith Adams, Ashley Williams, and others. (Did we miss your name and/or get it wrong? Drop a PR!)Live chat from the show (lightly edited): ahl: John Carmack's tweet ahl: ...and the discussion Wizord: https://twitter.com/balajis/status/1636797265317867520 (the $1M bet on BTC, I take) dataphract: "prompt engineering" as in "social engineering" rather than "civil engineering" Grevian: I was surprised at how challenging getting good prompts could be, even if I wouldn't quite label it engineering TronDD: https://www.aiweirdness.com/search-or-fabrication/ MattCampbell: I tested ChatGPT in an area where I have domain expertise, and it got it very wrong. TronDD: Also interesting https://www.youtube.com/watch?v=jPhJbKBuNnA Wizord: the question is, when will it be in competition with people? Wizord: copilot also can review code and find bugs if you ask it in a right way ag_dubs: i suspect that a new job will be building tools that help make training sets better and i strongly suspect that will be a programming job. ai will need tools and data and content and there's just a whole bunch of jobs to build tools for AI instead of people Wizord: re "reading manual and writing DTrace scripts" I think it's possible, if done with a large enough token window. Wizord: (there are already examples of GPT debugging code, although trivial ones) flaviusb: The chat here is really interesting to me, as it seems to miss the point of the thing. ChatGPT does not and can not ever 'actually work' - and whether it works is kind of irrelevant. Like, the Jaquard Looms and Numerical Control for machining did not 'work', but that didn't stop the roll out. Columbus: Maybe it has read the dtrace manual

Day[0] - Zero Days for Day Zero
[binary] An OpenBSD overflow and TPM bugs

Day[0] - Zero Days for Day Zero

Play Episode Listen Later Mar 16, 2023 41:14


Some simple, but interesting vulnerabilities. A use-after-free because of wrong operation ordering, an interesting type confusion, an integer underflow and some OOB access in TPM 2.0 reference code. Links and vulnerability summaries for this episode are available at: https://dayzerosec.com/podcast/196.html [00:00:00] Introduction [00:00:27] Spot the Vuln - Just be Positive [00:03:42] oss-sec: Linux kernel: CVE-2023-1118: UAF vulnerabilities in "drivers/media/rc" directory [00:07:56] oss-sec: CVE-2023-1076: Linux Kernel: Type Confusion hardcodes tuntap socket UID to root [00:11:21] GitHub - fuzzingrf/openbsd_tcpip_overflow: OpenBSD remote overflow [00:14:36] Chat Question: What Language is Most Effective for Writing These Types of Exploits [00:18:22] Vulnerabilities in the TPM 2.0 reference implementation code [00:28:19] Chat Question: Skillset for Exploit Dev as part of a Red Team [00:33:40] Espressif ESP32: Glitching The OTP Data Transfer The DAY[0] Podcast episodes are streamed live on Twitch twice a week: -- Mondays at 3:00pm Eastern (Boston) we focus on web and more bug bounty style vulnerabilities -- Tuesdays at 7:00pm Eastern (Boston) we focus on lower-level vulnerabilities and exploits. We are also available on the usual podcast platforms: -- Apple Podcasts: https://podcasts.apple.com/us/podcast/id1484046063 -- Spotify: https://open.spotify.com/show/4NKCxk8aPEuEFuHsEQ9Tdt -- Google Podcasts: https://www.google.com/podcasts?feed=aHR0cHM6Ly9hbmNob3IuZm0vcy9hMTIxYTI0L3BvZGNhc3QvcnNz -- Other audio platforms can be found at https://anchor.fm/dayzerosec You can also join our discord: https://discord.gg/daTxTK9

Marketecture: Get Smart. Fast.
Episode 7: Paul Bannister on publisher strategy, the sandbox, UID2 and OpenPath. Plus new YouTube leadership.

Marketecture: Get Smart. Fast.

Play Episode Listen Later Feb 23, 2023 39:48


Our latest episode of the Marketecture podcast with Ari Paparo, Eric Franchi from AperiumVentures, and Paul Bannister, the CSO at CafeMedia. CafeMedia is always on the cutting edge of monetization for digital publishers, so we ask Paul to give his perspective on the Chrome sandbox, the Trade Desk's UID and OpenPath initiatives, and everything else you need to worry about. Also, ad tech veteran Neal Mohan takes the helm at YouTube. And BlueKai founder Omar Tawakol has a new start-up for digital product placements. Visit Marketecture.tv to join our community and get access to full-length in-depth interviews. Marketecture is a new way to get smart about technology. Our team of real industry practitioners helps you understand the complex world of technology and make better vendor decisions through in-depth interviews with CEOs and product leaders at dozens of platforms. We are launching with extensive coverage of the marketing and advertising verticals with plans to expand into many other technology sectors. Copyright (C) 2023 Marketecture Media, Inc.

The Transcript
The Transcript Podcast Episode 93

The Transcript

Play Episode Listen Later Feb 21, 2023 9:43


In this episode, we examine the state of the consumer, the impact of UID2 on advertising, and how things are going with ChatGPT. Show Notes: 00:00:00 Introduction; 00:00:13 Consumer is Still Partying; 00:01:31 Service Inflation Still Sticky; 00:02:45 No Hurricane Yet; 00:03:22 UID 2.0; 00:05:44 ChatGPT Attracting Criticism; 00:08:15 Back to Office at Amazon; 00:09:27 Conclusion

Privacy Files
How to protect your privacy with MySudo

Privacy Files

Play Episode Listen Later Feb 10, 2023 33:55


In this episode of Privacy Files, Rich and Sarah dive deep into the world of MySudo, the world's only all-in-one privacy app. And Bundy, Anonyome Labs' Head of User Support, joins the discussion from his office on the Gold Coast of Australia. As one of the original employees at Anonyome, Bundy brings an immense wealth of knowledge about the inner workings of MySudo. Before breaking down how MySudo protects personal data, Rich and Sarah discuss the recent news about the Federal Trade Commission (FTC) slapping a $1.5 million fine on GoodRx, an online price comparison site for prescription medications. GoodRx is accused of sharing the health information of consumers with third parties like Facebook, Google and Criteo for advertising purposes. As the episode moves into MySudo talk, Bundy explains why the app is so popular with privacy advocates. From not requiring personally identifiable information (PII) and having no capability to decrypt user data, to robust functionality and the core concept of compartmentalization of online living, MySudo is a privacy champion's dream come true. Rich explains the concept of digital exhaust, why it's a bad thing, and how MySudo limits it to better protect your personal data. He explains why just any email address is not good enough, especially if you're using Gmail. Then Rich introduces UID 2.0, an emerging advertising framework designed to use email addresses as a primary means for tracking people online. Sarah then explains the different MySudo plans and what each subscription includes. From encrypted communications to advanced messaging features, MySudo is the most robust privacy app in the world. As the discussion continues, Bundy and Sarah address private contact matching and porting a number into MySudo. Then Rich covers private browsing, virtual cards and pairing MySudo with a laptop browser. Sarah and Bundy wrap things up by going over frequently asked questions, how to deal with spam calls, and what new features are coming in 2023. Links Referenced: https://anonyome.com/2020/04/why-compartmentalization-is-the-most-powerful-data-privacy-strategy/ https://www.seattletimes.com/business/technology/everyone-wants-your-email-address-think-twice-before-sharing-it/ OUR SPONSORS: Anonyome Labs - Makers of MySudo and Sudo Platform. Take back control of your personal data. www.anonyome.com MySudo - The world's only all-in-one privacy app. Communicate and transact securely and privately. Talk, text, email, browse, shop and pay, all from one app. Stay private. www.mysudo.com Sudo Platform - The cloud-based platform companies turn to for seamlessly integrating privacy solutions into their software. Easy-to-use SDKs and APIs for building out your own branded customer apps like password managers, virtual cards, private browsing, identity wallets (decentralized identity), and secure, encrypted communications (e.g., encrypted voice, video, email and messaging). www.sudoplatform.com

Founders Unfiltered
AJVC Behind the Scenes 58: Can HealthifyMe Transform Health from India to the World?

Founders Unfiltered

Play Episode Listen Later Feb 5, 2023 11:49


Last fortnight, HealthifyMe released its campaign with Mandira Bedi, hot on the heels of its campaigns with Farhan Akhtar and Sara Ali Khan as it doubled down on celebrity marketing. Tushar Vashisht is an Ivy League alumnus, and a Wall Street banker. When he got a chance to work with Nandan Nilekani on the UID project, he decided to come back to India. He and his roommate Matthew Cherian wanted to build a business that had a social impact, and made money too! They decided to run an experiment - subsist for a month on what the average Indian does - just 100 rupees ($2.04) a day. The outcome - both of them lost a significant amount of weight. People wanted a copy of the tracking journal they had maintained during this experiment. Tushar and Matthew created a business out of it. HealthifyMe was born in 2012. By 2014, it had 100K users on its app, grew 5X by 2015, and had a Series A round of $6M. It was well poised to gain traction in the $6B Indian wellness industry. By 2017, it had crossed 2M downloads. By 2020, it was making INR 100 Cr in ARR. How did it leverage conversational AI (even before ChatGPT), and how will it disrupt the $5Tn global wellness market? Read full article here: https://ajuniorvc.com/healthifyme-business-case-study-startup-tech-india-explained-weight-loss/

Les Cast Codeurs Podcast
LCC 290 - Put your glasses in your database

Les Cast Codeurs Podcast

Play Episode Listen Later Jan 14, 2023 75:48


Guillaume and Arnaud discuss tech in this new year of 2023: GraalVM in OpenJDK, Rust, WebAssembly, containers, Postgres, ChatGPT, the role of the architect, and a whole string of 2022 retrospectives. Recorded January 13, 2023. Episode download: LesCastCodeurs-Episode–290.mp3 News Languages OpenJDK proposes Project Galahad: to merge some parts of GraalVM Community Edition into OpenJDK https://www.infoq.com/news/2022/12/openjdk-galahad-Dec22/ https://www.infoq.com/articles/graalvm-java-compilers-openjdk/ Alex Snaps shares an article on Rust for the Java developer https://wcgw.dev/posts/2023/rusty-java-intro/ Google released its internal Rust training for free https://google.github.io/comprehensive-rcust/ Paul King of the Apache Groovy project shares his 2022 retrospective https://blogs.apache.org/groovy/entry/apache-groovy–2022-year-in WebAssembly for the Java geek https://www.javaadvent.com/2022/12/webassembly-for-the-java-geek.html A fairly critical article on TypeScript https://dev.to/wiseai/17-compelling-reasons-to-start-ditching-typescript-now–249b We often see rather positive articles on TypeScript, but is everything rosy all the time? Not necessarily! The article cites 17 problems with TypeScript, including the learning curve, the drop in productivity, the verbosity of the types, the lack of flexibility, the fact that it is not really a superset of JavaScript, slow compile times… based on the author's talk on the same theme, already presented at Devoxx Morocco and Devoxx Belgium. Alex also wrote a second part following up on his article, in which he talks a bit more about ownership, borrowing, the Drop trait, etc. (i.e., memory management) https://wcgw.dev/posts/2023/rusty-java–2/ Libraries Micronaut 3.8 is out https://micronaut.io/2022/12/27/micronaut-framework–3–8–0-released/ support for GraalVM 22.3.0; records can now be annotated with @RequestBean (to bind request parameters and the like to the controller method's parameters); improved CorsFilter to prevent certain attacks; also improvements to CRaC (Coordinated Restore at Checkpoint) support; and plenty of other version upgrades, new plugins, and minor improvements Swing is not dead!
A new open source Java DSL for Swing named Sierra, to make building Swing graphical interfaces easier https://github.com/HTTP-RPC/Sierra Infrastructure Understanding root inside and outside of containers https://www.redhat.com/en/blog/understanding-root-inside-and-outside-container not a recent article but a useful one: what is a rootless container; you can be root and launch the container engine; you can be root inside the container itself; when you run the engine as root, the outside and inside users are mapped to each other (same UID), whereas when non-root, the container user's UID is mapped onto a new UID; that is great because the users inside and outside are not mapped to each other, so there is less risk in case of a sandbox escape; that is the case for Podman, but for Docker there is a twist: Docker has a daemon (root or not) and a CLI that calls that daemon (root or not), and what matters for the security risks is the daemon; the ideal is to run the engine non-root and be non-root inside the container too (even though plenty of images still crazily expect to be root) Cloud Kubernetes 1.26, notably with the deprecation of the Google-hosted registry https://www.infoq.com/news/2022/12/kubernetes–1–26/?utm_campaign=infoq_content&utm_source=twitter&utm_medium=feed&utm_term=Devops Web Evan You, the creator of Vue.js, looks back at 2022 https://blog.vuejs.org/posts/2022-year-in-review.html It's the great migration from Vue 2 to Vue 3; migration from Vue 2's Options API to Vue 3's Composition API (the Options API is still supported in 3); Vue's documentation has defaulted to Vue 3 since February; during the transition phase, a big focus on tooling and developer experience; the ecosystem has adopted Vue 3 well and continues to do so; for 2023, he hopes to ship more regular minor releases, plus work on the "vapor mode," which offers a faster compilation strategy Data An article by Stephan Schmidt suggesting you use PostgreSQL… for everything!
https://www.amazingcto.com/postgres-for-everything/ for caching instead of Redis; as a message queue; for storing JSON documents instead of MongoDB; for geospatial queries; for full-text search instead of Elasticsearch; for generating JSON directly in the database; as storage / adapter for GraphQL; or for Timescale (a time-series database) Tooling ChatGPT in action on the design of a brand new programming language https://judehunter.dev/blog/chatgpt-helped-me-design-a-brand-new-programming-language ChatGPT gets credited with more magic than it actually has https://arxiv.org/pdf/2212.03551.pdf GitHub adds secret scanning for your public repos too https://github.blog/2022–12–15-leaked-a-secret-check-your-github-alerts-for-free/ no longer just for enterprise organizations; also available for public repos; helps avoid leaking API keys and the like What's new for Java in Visual Studio Code https://foojay.io/today/java-on-visual-studio-code-update-december–2022/ visual improvements for the Spring Boot extensions and for visualizing memory usage; "postfix" completion as in IntelliJ; more shortcuts to generate code; built-in Lombok support; support for Gradle annotation processing; better visualization of build errors; 2 million developers use Visual Studio Code for Java Yet another guide to escaping Vi https://thevaluable.dev/vim-advanced/ IntelliJ's HTTP client can now be used on the command line and in a continuous integration environment https://blog.jetbrains.com/idea/2022/12/http-client-cli-run-requests-and-tests-on-ci/ Architecture The evolution of the architect's role https://www.infoq.com/articles/architecture-architecting-role/ The (very long) 2023 trends report by Didier Girard and Olivier Rafal https://www.linkedin.com/pulse/rapport-tendances–2023-didier-girard/?trackingId=wu9pJ4wNQAOKjh11R2UyjA%3D%3D a tech/org/culture lens to prepare the company for the challenges ahead; a products/platforms/data lens to structure our approach to a modern IT landscape; covers tons of topics: artificial intelligence, data, cloud, web 1/2/3, but also team organization, roles, etc. Law, society and organization Twitter doesn't much care for Mastodon, and throttles tweets with links to Mastodon. Free speech, Elon Musk style!
https://twitter.com/bluxte/status/1603656787097534464 Mastodon's statement on Twitter banning links to Mastodon https://blog.joinmastodon.org/2022/12/twitter-suspends-mastodon-account-prevents-sharing-links/ And in the end Twitter walked back its change to the terms of use In the "developers have awesome hobbies" category, may I present Cédric Champeau, who gives us a magnificent retrospective of his astrophotography shots https://melix.github.io/blog//2022/12/astrophoto–2022.html Conferences The list of conferences, from the Developers Conferences Agenda/List by Aurélie Vache and contributors: January 19, 2023: Archilocus - Bordeaux (France) January 19–20, 2023: Touraine Tech - Tours (France) January 25–28, 2023: SnowCamp - Grenoble (France) January 31, 2023: Duck Conf - Paris (France) February 2, 2023: Very Tech Trip - Paris (France) February 2, 2023: AgiLeMans - Le Mans (France) February 9–11, 2023: World AI Cannes Festival - Cannes (France) February 16–19, 2023: PyConFR - Bordeaux (France) March 7, 2023: Kubernetes Community Days France - Paris (France) March 23–24, 2023: SymfonyLive Paris - Paris (France) March 23–24, 2023: Agile Niort - Niort (France) April 1–2, 2023: JdLL - Lyon 3e (France) April 5–7, 2023: FIC - Lille Grand Palais (France) April 12–14, 2023: Devoxx France - Paris (France) April 20–21, 2023: Toulouse Hacking Convention 2023 - Toulouse (France) May 4–6, 2023: Devoxx Greece - Athens (Greece) May 10–12, 2023: Devoxx UK - London (UK) May 12, 2023: AFUP Day - Lille & Lyon (France) May 25–26, 2023: Newcrafts Paris - Paris (France) May 26, 2023: Devfest Lille - Lille (France) May 27, 2023: Polycloud - Montpellier (France) June 7, 2023: Serverless Days Paris - Paris (France) June 15–16, 2023: Le Camping des Speakers - Baden (France) June 29–30, 2023: Sunny Tech - Montpellier (France) September 19, 2023: Salon de la Data Nantes - Nantes (France) & Online September 21–22, 2023: API Platform Conference - Lille (France) & Online October 2–6, 2023: Devoxx Belgium - Antwerp (Belgium) October 12, 2023: Cloud Nord - Lille (France) October 12–13, 2023: Volcamp 2023 - Clermont-Ferrand (France) December 6–7, 2023: Open Source Experience - Paris (France) Contact us To react to this episode, come discuss on the Google group https://groups.google.com/group/lescastcodeurs Reach us on Twitter https://twitter.com/lescastcodeurs Submit a crowdcast or a crowdquestion Support Les Cast Codeurs on Patreon https://www.patreon.com/LesCastCodeurs All episodes and all the info at https://lescastcodeurs.com/

Nerd Poker
S5E23 Getting So Thorny

Nerd Poker

Play Episode Listen Later Jan 3, 2023 65:59


The royal hall of the lich king is buried in thorns, brambles, and branches. Luckily two of our crew are experienced at dealing with flora, Dr. Uid of course, but also newcomer (and huge fan of Dr. Drew Hugh Uid) Winifred Wintergem. As long as the plant isn't terrifying this should be no problem at all!

Overdrive Radio
FMCSA's electronic-ID qs, Level 8 wireless inspections, autonomous trucks: Connecting the dots

Overdrive Radio

Play Episode Listen Later Oct 7, 2022 30:04


Former CVSA president Steve Vaughn, currently vice president of field operations for PrePass, stressed asking the tough questions about technology and all the infrastructure put in place to support it when it comes to roadside inspections and over-the-air communication from the truck to law enforcement. Getting answers to those questions, some of which he details in today's edition of Overdrive Radio, will be absolutely key to making any possible rulemakings around electronic ID something industry and enforcement can agree on and benefit from where the rubber meets the road. And hopefully without the unintended consequences that often arise in the rush toward federal implementation of new programs. Overdrive's Alex Lockie wrote about the electronic-ID comment period currently open in this story from Thursday, October 6, 2022: https://www.overdriveonline.com/regulations/article/15301028/electronic-ids-for-trucks-fact-vs-fiction PrePass' Vaughn was speaking long before the FMCSA's current request for comment on the topic, but drew connecting lines between electronic ID (sometimes referred to as "UID") and the agency's long pursuit of so-called "Wireless Roadside Inspections," or WRI, a program to automate both vehicle and driver inspections with communications technology. That's now become a more limited version of what the old WRI program envisioned, in the form of CVSA's Level 8 electronic inspection standard. Study and technology development around WRI go back to at least 2006, when the agency was provided funding for a four-year study that morphed into at least nine years of funded research. "Congress in 2018 told them you're no longer to spend money on it," Vaughn said, speaking in March at the Truckload 2022 conference in Las Vegas. "You've been looking at it for 10 years, that's enough." Around that same time, though, he added, "we saw it move over to CVSA in the form of a Level 8 inspection." Though the new standard for electronic inspections has been official in CVSA's inspection program since June 2017, "not a single Level 8 inspection [has been] conducted to date," Vaughn said in March. FMCSA's September ANPRM around electronic IDs did note that the agency has been testing Level 8 inspections as a means essentially to do what CVSA Executive Director Collin Mooney described in Alex Lockie's story yesterday, assess basic safety at highway speeds and thus target only trucks, operators and carriers that truly need it. The Level 8 electronic inspection standard itself, as Vaughn noted in his March talk, requires the capture of where the truck is via "GPS coordinates, information about the driver. ... Are they licensed to drive the class of vehicle they're in, do they have their medical? Hours of service information -- are they current, is it up to date? Skill Performance Evaluation Certificate -- does it meet the requirements? On the vehicle -- DOT number, registration. UCR: is it current? What is the operating authority, and does [the carrier] currently have federal out of service violations?" Vaughn noted an established electronic ID/UID system he viewed as a stepping stone to get there, and the Level 8 as another step toward what CVSA adopted at its most recent meeting, reported on just two days ago, a standard for inspections of automated vehicles. 
Citing his past with the California Highway Patrol and with CVSA, Vaughn noted that during his 40 years of experience around trucking and government, he's seen the tendency particularly for governments to want to "get a program forward so quickly that they don't properly vet it all the way through." **Where to comment on FMCSA's electronic-ID ANPRM: https://www.regulations.gov/document/FMCSA-2022-0062-0008 **A tutorial around challenging preventable crashes in FMCSA's fairly new program conducted through the DataQs system: https://www.overdriveonline.com/channel-19/article/14897753/how-to-dataq-a-crash-in-new-fmcsa-preventability-program

Decorating Tips and Tricks
Tips from a London Townhouse

Decorating Tips and Tricks

Play Episode Listen Later Sep 14, 2022 29:43


Today we are doing something different. We're highlighting a gorgeous London townhouse full of ideas for color and pattern that you can use to liven up your own home. You don't have to see photos of the townhouse to get the ideas, but it's a great idea to check it out to get the full benefit of the episode. Find it HERE (https://www.houseandgarden.co.uk/gallery/alice-palmer-house?uID=86aecfa57a9754bc0c9937d25be5018041424a9e0bbb83db0a9df1dbb9d9ba31) This home is owned by the talented Alice Palmer, who runs AlicePalmer.co HERE (https://www.alicepalmer.co/) She creates beautiful lampshades, pillows, and other things for her online shop, and her house was featured by House and Garden magazine online. DTT defines the chesterfield sofa. Anita's crush is the Defiant Health Radio podcast with Dr. William Davis. He's telling the truth about cholesterol, about fat in your diet, and how to maximize your health. Kelly's crush is the 99% Invisible podcast. Need help with your home? We'd love to help! We do personalized consults, and we'll offer advice specific to your room that typically includes room layout ideas, suggestions for what the room needs, and how to pull the room together. We'll also help you to decide what isn't working for you. We work with any budget, large or small. Find out more HERE (https://www.decoratingtipsandtricks.com/consult) Hang out with us between episodes at our blogs, IG and Kelly's YouTube channels. Links are below to all those places to catch up on the other 6 days of the week! Kelly's IG HERE (https://www.instagram.com/mysoulfulhome/) Kelly's Youtube HERE (https://www.youtube.com/mysoulfulhome) Kelly's blog HERE (https://www.mysoulfulhome.com/) Anita's IG HERE (https://www.instagram.com/cedarhillfarmhouse/) Anita's blog HERE (https://cedarhillfarmhouse.com/) Are you subscribed to the podcast? No need to search for us each Wednesday; let us come right to your door ...er... device. Subscribe wherever you listen to your podcasts. Just hit the SUBSCRIBE button & we'll show up! If you have a moment we would so appreciate it if you left a review for DTT on iTunes. Just go HERE (https://podcasts.apple.com/us/podcast/decorating-tips-and-tricks/id1199677372?ls=1&mt=2) and click listen in apple podcasts. XX, Anita & Kelly

Nerd Poker
S5E7 Deadly Stink Cloud

Nerd Poker

Play Episode Listen Later Aug 23, 2022 57:38


Our heroes pull out all the stops in their first battle: silver flames are summoned and magical songs are sung, while arm blades slash, axes fly, and Dr. Uid's forehead vein pulses. But the gravity of the situation is suddenly clear as a filth cloud threatens to take down everyone, even though they've just met. For merch, social media, and more be sure to head to nerdpokerpod.com. And for 3 bonus episodes a month and more, subscribe to our Patreon at patreon.com/nerdpoker.

Masters of Privacy (ES)
Toni Andújar: new players, fragmented fiefdoms, Data Clean Rooms, and the scourge of the Long Tail

Masters of Privacy (ES)

Play Episode Listen Later Jun 10, 2022 33:39


Toni Andújar has spent more than 15 years working on projects in Data, Digital, technology, Customer Experience (CX), MadTech (MarTech + AdTech), Online Marketing, Ecommerce, and Innovation. His newsletter (MadTech Soul) is an excellent source of information and reflection for professionals in the sector. In addition to being Customer Data Technology & Ops Lead at the Publicis group, Toni is very active at leading industry events and educational venues such as Gen/D, The Valley, and The Cookie Afterwork. References: MadTech Soul Gen/D The Valley Digital Business School Toni Andújar on LinkedIn

Masters of Privacy (ES)
Mikel Lekaroz: identity, standards, and the future of programmatic advertising

Masters of Privacy (ES)

Play Episode Listen Later Jun 3, 2022 27:47


Mikel Lekaroz is CEO of Adbibo by Next14, president of IAB Spain, and a board member of Editora de Tecnología Publicitaria. He specializes in digital business, strategy, and advertising technology, having held senior roles on both the sell side and the buy side. He is also a regular contributor at Programmatic Spain and a teacher and speaker at various courses and forums in the digital industry. Mikel holds a business degree from the Universidad de Deusto and a Master's in marketing from Strathclyde University. He began his career at PwC as an auditor. References: IAB Spain Programmatic Spain Unified ID Privacy Sandbox (Google Chrome) The future ePrivacy Regulation (draft) ATS Madrid 2022 recap (ExchangeWire)

Software Sessions
Ant Wilson on Supabase

Software Sessions

Play Episode Listen Later May 11, 2022 59:08


This episode originally aired on Software Engineering Radio. A few topics covered: Building on top of open source; Forking their GoTrue dependency; Relying on Postgres features like row level security; Adding realtime support based on Postgres's write-ahead log; Generating an API layer based on the database schema with PostgREST; Creating separate EC2 instances for each customer's database; How Postgres could scale in the future; Monitoring Postgres; Common support tickets; Permissive open source licenses. Related Links: @antwilson Supabase Supabase GitHub Firebase Airtable PostgREST GoTrue Elixir Prometheus VictoriaMetrics Logflare BigQuery Netlify Y Combinator Postgres PostgreSQL Write-Ahead Logging Row Security Policies pg_stat_statements pgAdmin PostGIS Amazon Aurora Transcript You can help edit this transcript on GitHub. [00:00:00] Jeremy: Today I'm talking to Ant Wilson. He's the co-founder and CTO of Supabase. Ant, welcome to Software Engineering Radio. [00:00:07] Ant: Thanks so much. Great to be here. [00:00:09] Jeremy: When I hear about Supabase, I always hear about it in relation to two other products. The first is Postgres, which is an open source relational database. And second is Firebase, which is a backend-as-a-service product from Google Cloud that provides a NoSQL data store. It provides authentication and authorization, and it has a functions-as-a-service component. It's really meant to be a replacement for needing to have your own server and create your own backend; you can have all of that be done from Firebase. I think a good place for us to start would be walking us through what Supabase is and how it relates to those two products. [00:00:55] Ant: Yeah. So we brand ourselves as the open source Firebase alternative. That came primarily from the fact that we ourselves built it as the alternative to Firebase. My co-founder Paul, in his previous startup, was using Firestore, and as they started to scale, they hit certain technical scaling limitations, and he'd always been a huge Postgres fan. So he swapped it out for Postgres and then just started plugging in the bits that were missing, like the real-time streams. He used the tool called PostgREST, with a T, for the CRUD APIs. So he just built the open source Firebase alternative on Postgres, and that's kind of where the tagline came from. But the main difference obviously is that it's a relational database and not a NoSQL database, which means that it's not actually a drop-in replacement. But it does mean that it kind of opens the door to a lot more functionality, actually, which is hopefully an advantage for us. [00:02:03] Jeremy: It's a hosted form of Postgres. So you mentioned that Firebase is different; it's NoSQL, people are putting in their JSON objects and things like that. So when people are working with Supabase, is the experience just: I'm connecting to a Postgres database, I'm writing SQL, and in that regard it's not really similar to Firebase at all? Is that kind of right? [00:02:31] Ant: Yeah, I mean, the other important thing to note is that you can communicate with Supabase directly from the client, which is what people love about Firebase. You just put the credentials on the client, you write some security rules, and then you just start sending your data.
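As a rough sketch of that client-direct model: the project URL, anon key, and the "messages" table below are hypothetical placeholders, and the calls follow the supabase-js v2 client API rather than anything quoted in the interview.

import { createClient } from "@supabase/supabase-js";

// Placeholder credentials: the anon key is safe to ship to the browser
// because row level security, not the key, is what guards the data.
const supabase = createClient(
  "https://your-project.supabase.co",
  "public-anon-key"
);

// "Just start sending your data": insert a row straight from the client.
// Assumes a hypothetical `messages` table already exists in your schema.
const { data, error } = await supabase
  .from("messages")
  .insert({ body: "hello from the browser" })
  .select();

if (error) console.error(error);
else console.log("inserted:", data);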
Obviously with Supabase, you do need to create your schema, because it's relational. But apart from that, the experience of client-side development is very much the same, or very similar; the interface, obviously the API, is a little bit different, but it's similar in that regard. But like I said, we are just a database company, actually, and the tagline just explains really well the concept of what it is: a backend as a service. It has the real-time streams, it has the auth layer, it has the auto-generated APIs. So I don't know how long we'll stick with the tagline; I think we'll probably outgrow it at some point, but it does do a good job of communicating roughly what the service is. [00:03:39] Jeremy: So when we talk about it being similar to Firebase, the part that's similar is that you could be a person building the front end part of the website, and you don't necessarily need a backend application, because all of that could talk to Supabase, and Supabase can handle the authentication, the real-time notifications, all those sorts of things, similar to Firebase, where basically you only need to write the front end part, and then you have to know how to set up Supabase in this case. [00:04:14] Ant: Yeah, exactly. And we love Firebase, by the way. We're not building an alternative to try and destroy it; we're just building the SQL alternative, and we take a lot of inspiration from it. The other thing we love is that you can administer your database from the browser. So you go into Firebase and you can see the object tree, and when you're in development, you can edit some of the documents in real time. And so we took that experience and effectively built a spreadsheet view inside of our dashboard, and we also obviously have a SQL editor in there as well, trying to create a similar developer experience, because that's where Firebase just excels: the DX is incredible. And so we take a lot of inspiration from it in those respects. [00:05:08] Jeremy: And to make it clear to our listeners as well, when you talk about this interface that's kind of like a spreadsheet, I suppose it's similar to somebody opening up pgAdmin and going in and editing the rows, but maybe you've got another layer on top that just makes it a little more user-friendly, a little bit more like something you would get from Firebase, I guess. [00:05:33] Ant: Yeah, and, you know, we take a lot of inspiration from pgAdmin. pgAdmin is also open source, so I think we've contributed a few things, or are trying to upstream a few things, into pgAdmin. The other thing that we took a lot of inspiration from, for the table editor as we call it, is Airtable, because Airtable is effectively a relational database in that you can just come in and, you know, click to add your columns, click to add a new table. And so we just wanted to reproduce that experience, again backed by a full dedicated Postgres database. [00:06:13] Jeremy: So when you're working with a Postgres database, normally you need some kind of layer in front of it, right? A person can't open up their website and connect directly to Postgres from their browser. And you mentioned PostgREST before. I wonder if you could explain a little bit about what that is and how it works. [00:06:34] Ant: Yeah, definitely.
So yeah, PostgREST has been around for a while. It's basically a server that you connect to your Postgres database, and it introspects your schemas and generates an API for you based on the table names and the column names. And then you can basically communicate with your Postgres database via this RESTful API. So you can do most of the filtering operations that you can do in SQL, equality filters, you can even do full-text search over the API. So it just means that whenever you add a new table or a new schema or a new column, the API just updates instantly, so you don't have to worry about writing that middle layer, which was always the drag, right? Whenever you started a new project, it's like, okay, I've got my schema, I've got my client, now I have to do all the connecting code in the middle, which is kind of, yeah, no developer should need to write that layer in 2022. [00:07:46] Jeremy: So this layer you're referring to, when I think of a traditional web application, I think of having to write routes and controllers and create this sort of structure where I know all the tables in my database, but the controllers I create may not map one-to-one with those tables. And so you mentioned a little bit about how PostgREST looks at the schema and starts to build an API automatically. I wonder if you could explain a little bit about how it does those mappings, or if you're writing those yourself. [00:08:21] Ant: Yeah, it basically does them automatically. By default it will map every table, every column. When you want to start restricting things, well, there are two parts to this. There's one part, which I'm sure we'll get into, which is how this is secure, since you are communicating direct from the client. But the other part is what you mentioned: giving a reduced view of a particular bit of data. And for that, we just use Postgres views. So you define a view, which might have joins across a couple of different tables, or it might just be a limited set of columns on one of your tables, and then you can choose to just expose that view. [00:09:05] Jeremy: So it sounds like where you would typically create a controller and create a route, instead you create a view within your Postgres database, and then PostgREST can take that view and create an endpoint for it, map it to that. [00:09:21] Ant: Yeah, exactly (laughs). [00:09:24] Jeremy: And PostgREST is an open source project, right? I wonder if you could talk a little bit about what its history was and how you came to choose it. [00:09:37] Ant: Yeah, I think Paul probably read about it on Hacker News at some point. Anytime it appears on Hacker News, it just gets voted to the front page because it's so awesome. And we got connected to the maintainer, Steve Chavez, at some point. I think he just took an interest in us, or we took an interest in Postgres, and we kind of got acquainted. And then we found out that, you know, Steve was open to work, and this kind of shaped a lot of the way we think about building out Supabase as a project and as a company, in that we then decided to employ Steve full time, but just to work on PostgREST, because it's obviously a huge benefit for us. We're very reliant on it. We want it to succeed because it helps our business.
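As a rough illustration of the kind of endpoint PostgREST generates, here is a minimal sketch. The "todos" table, its columns, and the project URL are hypothetical placeholders; the "/rest/v1/" prefix and the "column=eq.value" filter syntax follow Supabase's PostgREST-backed REST conventions.

// Querying an auto-generated PostgREST endpoint directly over HTTP.
// SUPABASE_URL, SUPABASE_ANON_KEY, and the `todos` table are placeholders.
const SUPABASE_URL = "https://your-project.supabase.co";
const SUPABASE_ANON_KEY = "public-anon-key";

async function fetchDoneTodos(): Promise<unknown[]> {
  // PostgREST maps tables to routes and encodes filters as query params:
  // `status=eq.done` is roughly `WHERE status = 'done'` in SQL.
  const url =
    `${SUPABASE_URL}/rest/v1/todos` +
    `?select=id,title,status&status=eq.done`;
  const res = await fetch(url, {
    headers: {
      apikey: SUPABASE_ANON_KEY,
      Authorization: `Bearer ${SUPABASE_ANON_KEY}`,
    },
  });
  if (!res.ok) throw new Error(`PostgREST request failed: ${res.status}`);
  return res.json();
}

Add a column to the table and the same endpoint exposes it immediately; no controller code changes hands, which is exactly the middle layer being removed above.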
And then as we started to add the other components, we decided that we would always look for existing tools, existing open source projects, that existed before we decided to build something from scratch. So as we were starting to replicate the features of Firebase, and auth is a great example, we did a full audit of all the authentication and authorization open source tools that are out there, and which one, if any, would fit best. And we found that Netlify had built a library called GoTrue, written in Go, which did pretty much exactly what we needed. So we just adopted that. And now, obviously, you know, we just have a lot of people on the team contributing to GoTrue as well. [00:11:17] Jeremy: You touched on this a little bit earlier. Normally when you connect to a Postgres database, your user has permission to basically everything, I guess, by default, anyways. So how does that work? When you want to restrict people's permissions, make sure they only get to see records they're allowed to see, how is that all configured in PostgREST, and what's happening behind the scenes? [00:11:44] Ant: Yeah, the great thing about Postgres is it's got this concept of row level security, which actually I don't think I had ever really looked at until we were building out this auth feature, where the security rules live in your database as SQL. So you do like a create policy query, and you say: anytime someone tries to select or insert or update, apply this policy. And then how it all fits together is: with our auth server, GoTrue, someone will basically make a request to sign in or sign up with email and password, and we create that user inside the database. They get issued a UUID, and they get issued a JSON web token, a JWT, which, when they have it on the client side, proves that they are this UUID and that they have access to this data. Then when they make a request via PostgREST, they send the JWT in the authorization header. Then Postgres will pull out that JWT, check the sub claim, which is the UUID, and compare it to any rows in the database, according to the policy that you wrote. So the most basic one is: you say, in order to access this row, it must have a UUID column, and it must match whatever is in the JWT. So we basically push the authorization down into the database, which actually has a lot of other benefits, in that as you write new clients, you don't need to have it live on an API layer on the client; it's kind of just, everything is managed from the database. [00:13:33] Jeremy: So the UUID, you mentioned that represents the user, correct? [00:13:39] Ant: Yeah. [00:13:41] Jeremy: Does that map to a user in Postgres, or is there some other way that you're mapping those permissions? [00:13:50] Ant: Yeah, when you connect GoTrue, which is the auth server, to your Postgres database for the first time, it installs its own schema. So you'll have an auth schema, and inside will be auth.users with a list of the users. It'll have auth.tokens, which will store all the access tokens that it's issued. And one of the columns on the auth.users table will be the UUID, and then whenever you write application-specific schemas, you can just do a foreign key relation to the auth.users table.
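Under stated assumptions, a minimal sketch of the pattern just described: the policy, kept here as a SQL string inside a TypeScript module, ties rows to the JWT's sub claim via Supabase's auth.uid() helper, and the client function shows GoTrue sign-in followed by a PostgREST query that row level security filters down to the caller's rows. The "todos" table, its "user_id" column, and the credentials are hypothetical, and the sign-in call follows the newer supabase-js v2 API.

import { createClient } from "@supabase/supabase-js";

// Hypothetical migration: the "UUID column must match the JWT sub claim"
// policy described above, expressed as SQL. auth.uid() is Supabase's
// helper that reads the sub claim out of the request's JWT.
export const ownRowsPolicy = `
  alter table todos enable row level security;
  create policy "own rows only" on todos
    for select using (auth.uid() = user_id);
`;

const supabase = createClient(
  "https://your-project.supabase.co", // placeholder
  "public-anon-key"                   // placeholder
);

export async function listMyTodos(email: string, password: string) {
  // GoTrue issues the JWT; supabase-js stores it and attaches it as the
  // Authorization header on every request that follows.
  const { error: authError } = await supabase.auth.signInWithPassword({
    email,
    password,
  });
  if (authError) throw authError;

  // PostgREST forwards the JWT to Postgres; row level security returns
  // only rows where user_id matches this user's UUID.
  const { data, error } = await supabase.from("todos").select("*");
  if (error) throw error;
  return data;
}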
So it all gets into schema design, and hopefully we do a good job of having some good education content in the docs as well, because one of the things we struggled with from the start was: how much do we abstract away from SQL, away from Postgres, and how much do we educate? And we actually landed on the educate side, because, I mean, once you start learning about Postgres, it becomes kind of a superpower for you as a developer. So we'd much rather have people discover us because we're a Firebase alternative for frontend devs, and then we help them with things like schema design and learning about row level security, because ultimately, if you try and abstract that stuff, it gets kind of crappy, and maybe not such a great experience. [00:15:20] Jeremy: To make sure I understand correctly: so you have GoTrue, which is a Netlify open source project; that GoTrue project creates some tables in your database, which has, like you've mentioned, the tokens and the different users. Somebody makes a request to GoTrue, like, here's my username, my password, and GoTrue gives them back a JWT. And then from your front end, you send that JWT to the PostgREST endpoint, and from that JWT it's able to know which user you are, and it then uses Postgres's built-in row level security to figure out which rows you're allowed to bring back. Did I get that right? [00:16:07] Ant: That is pretty much exactly how it works, and it's impressive that you garnered that without looking at a single diagram (laughs). But yeah, and obviously we provide a client library, supabase-js, which actually does a lot of this work for you, so you don't need to manually attach the JWT in a header. If you've authenticated with supabase-js, then every request sent to PostgREST after that point will have the header attached automatically, and you'll be in a session as that user. [00:16:43] Jeremy: And the users that we're talking about, when we talk about Postgres's row level security, are those actual users in PostgreSQL? Like, if I was to log in with psql, I could actually log in with those users? [00:17:00] Ant: They're not. You could potentially structure it that way, but it would be more advanced. It's basically just users in the auth.users table, the way it's currently done.
If you're writing a node application, you write JavaScript; but you're saying, in a lot of cases with PostgREST, you're actually able to do what you want to do, whether that's serialization or mapping objects, all through SQL. [00:18:44] Ant: Yeah, exactly. And then obviously there's a lot of other awesome stuff that Postgres has, like PostGIS: if you've got a geo application, it'll load it up with geo types for you, which you can just use. If you're doing encryption and decryption, we just added pgsodium, which is a new and awesome cryptography extension. And so you can use all of these; they all add functions, like SQL functions, which you can use in any part of the logic or in the row level policies. Yeah. [00:19:22] Jeremy: And something I thought was a little unique about PostgREST is that I believe it's written in Haskell. Is that right? [00:19:29] Ant: Yeah, exactly. And it makes it fairly inaccessible to me as a result. But the good thing is it's got a thriving community of its own, and, you know, there are people who contribute probably because it's written in Haskell. It's just a really awesome project, and it's an excuse to contribute to it. But yeah, I think I did probably the intro course, like many people, and beyond that it's just, yeah, kind of inaccessible to me. [00:19:59] Jeremy: Yeah, I suppose that's the trade-off, right? You have a really passionate community of people who really want to use Haskell, and then you've got the, I guess, the group like yourselves that looks at it and goes, oh, I don't know about this. [00:20:13] Ant: I would love to have the time to invest in it, but it's not practical right now. [00:20:21] Jeremy: You talked a little bit about the GoTrue project from Netlify. I think I saw on one of your blog posts that you actually forked it. Can you explain the reasoning behind doing that? [00:20:34] Ant: Yeah, initially it was because we were trying to move extremely fast. So we did Y Combinator in 2020, and when you do Y Combinator, you get a group partner, they call it, one of the partners from YC, and they add a huge amount of external pressure to move very quickly. And our biggest feature that we were working on in that period was auth, and we just kept getting the question of, when are you going to ship auth? You know, and every single week we'd be like, we're working on it, we're working on it. And one of the ways we could do it was we just had to iterate extremely quickly, and we didn't really have the time to upstream things correctly. And actually, the way we use it in our stack is slightly different: they connected to MySQL, we connected to Postgres, so we had to make some structural changes to do that. And the dream would be that we now spend some time upstreaming a lot of the changes, and hopefully we do get around to that. But the pace at which we've had to move over the last year and a half has been kind of scary, and that's the main reason. But, you know, hopefully now that we're a little bit more established, we can hire some more people to just focus on GoTrue and bringing the two forks back together. [00:22:01] Jeremy: It's just a matter of, like you said, speed, I suppose, because with PostgREST, you chose to continue working off of the existing open source project, right? [00:22:15] Ant: Yeah, exactly.
Exactly. And I think the other thing is, it's not a major part of Netlify's business, as I understand it. I think if it was, and if both companies had more resource behind it, it would make sense to focus on a single codebase, but both companies don't contribute as much resource as we would like to. But for me, it's one of my favorite parts of the stack to work on, because it's written in Go, and I kind of enjoy how it all fits together. So yeah, I like to dive in there. [00:22:55] Jeremy: What about Go, or what about how it's structured, do you particularly enjoy about that part of the project? [00:23:02] Ant: So I actually learned Go through GoTrue, and I have like a Python and C++ background, and I hate the fact that I rarely get to use Python and C++ in my day-to-day job. It's obviously a lot of TypeScript. And then when we inherited this codebase, as I was picking it up, it just reminded me a lot of the things I loved about Python and C++, and I found the tooling around it to be exceptional, too. So, you know, you just do a small amount of config, and it makes it very difficult to write bad code, if that makes sense. The compiler will just boot you back if you try and do something silly, which isn't necessarily the case with JavaScript. I think TypeScript is a little bit better now, but yeah, it just reminded me a lot of my Python and C++ days. [00:24:01] Jeremy: Yeah, I'm not too familiar with Go, but my understanding is that there's a formatter that's a part of the language, so there's kind of a consistency there, and then the language itself tries to get people to build things in the same way, or maybe have simpler ways of building things. I don't know; maybe that's part of the appeal. [00:24:25] Ant: Yeah, exactly. And the package manager as well is great; it just does a lot of the importing automatically and makes sure all the declarations at the top are formatted correctly and are definitely there. So yeah, just all of that toolchain is really easy to pick up. [00:24:46] Jeremy: Yeah, and I think compiled languages as well, when you have the static type checking by the compiler, you know, not having things blow up at runtime, that's just such a big relief, at least for me, in a lot of cases. [00:25:00] Ant: And I just love the dopamine hit of when you compile something and it actually compiles. I lose that when working with JavaScript. [00:25:11] Jeremy: For sure. One of the topics you mentioned earlier was how Supabase provides real-time database updates, which is something that, as far as I know, is not natively a part of Postgres. So I wonder if you could explain a little bit about how that works and how it came about. [00:25:31] Ant: Yeah. So Postgres, when you add replication databases, the way it does it is it writes everything to this thing called the write-ahead log, which is basically all the changes that are going to be applied to the database. And when you connect a replication database, it basically streams that log across, and that's how the replica knows what changes to apply. So we wrote a server which basically pretends to be a Postgres replica, receives the write-ahead log, encodes it into JSON, and then you can subscribe to that server over WebSockets.
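For a sense of what subscribing to that decoded stream looks like from a browser client, here is a minimal sketch; the channel name and the "todos" table are hypothetical placeholders, and the calls follow the supabase-js v2 realtime API, which postdates this interview.

import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  "https://your-project.supabase.co", // placeholder project URL
  "public-anon-key"                   // placeholder anon key
);

// Listen for INSERTs on a hypothetical `todos` table. Under the hood, the
// realtime server decodes the write-ahead log into JSON and pushes matching
// changes over a WebSocket, applying row level security for this user.
const channel = supabase
  .channel("todos-inserts")
  .on(
    "postgres_changes",
    { event: "INSERT", schema: "public", table: "todos" },
    (payload) => console.log("new row:", payload.new)
  )
  .subscribe();

// Later, to stop listening:
// supabase.removeChannel(channel);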
And so you can choose whether to subscribe to changes on a particular schema, or a particular table, or particular columns, and even do equality matches on rows and things like this. And then we recently added the row level security policies to the real-time stream as well. That was something that took us a while, because it was probably one of the largest technical challenges we've faced. But now the real-time stream is fully secure, and you can apply the same policies that you apply over the CRUD API as well. [00:26:48] Jeremy: So for that part, did you have to look into the internals of Postgres and how it did its row level security, and try to duplicate that in your own code? [00:26:59] Ant: Yeah, pretty much. I mean, it's fairly complex, and there's a guy on our team who, well, for him it didn't seem as complex, let's say (laughs). But yeah, that's pretty much it. It's effectively a Postgres extension itself, which interprets those policies and applies them to the write-ahead log. [00:27:26] Jeremy: And this piece that you wrote that's listening to the write-ahead log, what was it written in, and how did you choose that language or that stack? [00:27:36] Ant: Yeah, that's written in Elixir, which is built on Erlang and very horizontally scalable. Any application that you write in Elixir can kind of just scale horizontally; the message passing can go into the billions and it's no problem. So it just seemed like a sensible choice for this type of application, where you don't know how large the WAL is going to be. It could just be a few changes per second, or it could be a million changes per second, and then you need to be able to scale out. And I think Paul, my co-founder, originally wrote the first version of it, and I think he wrote it as an excuse to learn Elixir, which is probably a lot like how PostgREST ended up being Haskell, I imagine. But it's meant that, while the Elixir community is still relatively small, it's a group of very passionate and highly skilled developers, so when we hire from that pool, everyone who comes on board is just really good and really enjoys working with Elixir. So it's been a good source of hires as well, just using those tools. [00:28:53] Jeremy: With a feature like this, I'm assuming it's where somebody goes to their website, they make a WebSocket connection to your application, and they receive the updates that way. How far have you been able to push that in terms of connections, in terms of throughput, things like that? [00:29:12] Ant: Yeah, I don't actually have the numbers at hand. We have a team focused on maximizing that, obviously, but I don't have those numbers right now. [00:29:24] Jeremy: One of the last things you've got on your website is a storage project, or a storage product, I should say, and I believe it's written in TypeScript. So I was curious: we've got PostgREST, which is in Haskell; we've got GoTrue, in Go; we've got the real-time database part in Elixir. With storage, how did we finally get to TypeScript? [00:29:50] Ant: (Laughs) Well, the policy we kind of landed on was: best tool for the job. Again, the good thing about being open source is we're not resource-constrained by the number of people who are on our team.
And so for that, I think one of the guys just went through a few different options that we could have gone with. Go was one, just to keep it in line with a couple of the other APIs. But we decided that for everyone on the team, TypeScript is kind of just a given. And again, it came down to speed: what's the fastest way we can get this up and running? TypeScript was the best solution there. But we just always go with whatever is best. We don't worry too much about the resources we have, because the open source community has been so great in helping us build Supabase. And building Supabase is like building five companies at the same time, actually, because each of these vertical stacks could be its own startup, like the auth stack and the storage layer and all of this stuff. And each does have its own dedicated team. So yeah, we're not too worried about the variation in languages.

[00:31:13] Jeremy: And the storage layer, is this basically a wrapper around S3, or what is that product doing?

[00:31:21] Ant: Yeah, exactly, it's a wrapper around S3. It would also work with all of the S3-compatible storage systems; there are a few, Backblaze and some others. So if you wanted to self-host and use one of those alternatives, you could. We just have everything in our own S3 buckets inside of AWS. And the other awesome thing about the storage system is that we store the metadata inside of Postgres, basically the object tree of what buckets and folders and files are there. So you can write your row level policies against the object tree. You can say this user should only access this folder and its children, which was kind of an accident; we just landed on it. But it's one of my favorite things now about writing applications on Supabase: the row level policies kind of work everywhere. (There's a sketch of this a little further below.)

[00:32:21] Jeremy: Yeah, it's interesting. It sounds like everything, whether it's the storage or the authentication, all comes back to Postgres, right? It's using the row level security, it's using everything that you put into the tables there, and everything's just kind of digging into that to get what it needs.

[00:32:42] Ant: Yeah. And that's why I say we are a database company. We are a Postgres company; we're all in on Postgres. We got asked in the early days, oh, would you also make it MySQL compatible, compatible with something else? But with the amount of features Postgres has, if we just continue to leverage them, it makes the stack way more powerful than if we try to go thin across multiple different databases.
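Here is a rough sketch of the "row level policies against the object tree" idea mentioned above, run over a direct Postgres connection. The storage.objects table, the storage.foldername() helper, and auth.uid() are drawn from Supabase's storage and auth schemas, but the policy name, the folder layout, and the exact column usage are illustrative assumptions rather than the project's actual policies, and the schema may differ between versions.

```typescript
import { Client } from 'pg'

async function main() {
  // Hypothetical connection string; substitute your own project's credentials.
  const db = new Client({ connectionString: process.env.DATABASE_URL })
  await db.connect()

  // Only let an authenticated user read objects inside their own top-level
  // folder, e.g. 'a1b2c3/avatar.png'. The same policy then governs the
  // storage API and anything else that reads storage.objects.
  await db.query(`
    create policy "users read own folder"
      on storage.objects for select
      using (auth.uid()::text = (storage.foldername(name))[1]);
  `)

  await db.end()
}

main().catch(console.error)
```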
[00:33:16] Jeremy: And that kind of brings me to, you mentioned how you're a Postgres company, so when somebody signs up for Supabase and creates their first instance, what's happening behind the scenes? Are you creating a Postgres instance for them in a container, for example? How do you size it? That sort of thing.

[00:33:37] Ant: Yeah. So it's basically just EC2 under the hood for us. We have plans eventually to be multi-cloud, but again, coming down to speed of execution, the fastest way was to just spin up a dedicated instance, a dedicated Postgres instance per user, on EC2. We also package all of the APIs together in a second EC2 instance. But we're starting to break those out into clustered services. So, for example, not every user will use the storage API, so it doesn't make sense to run it for every user regardless. We've made that application code multi-tenant, and now we just run a huge global cluster which people connect through to access the S3 buckets, basically. And we have plans to do that for the other services as well. So right now you get two EC2 instances, but over time it will be just the Postgres instance. And we wanted to give everyone a dedicated instance, because there's nothing worse than sharing database resources with all the users, especially when you don't know how heavily they're going to use it, whether they're going to be bursty. So one of the things we said from the start is everyone gets a Postgres instance, and you get access to it as well. You can use your Postgres connection string to log in from the command line and kind of do whatever you want. It's yours. (A sketch of this follows after this exchange.)

[00:35:12] Jeremy: So did I get it right, that when I sign up and create a Supabase account, you're actually creating an EC2 instance for me specifically? So every customer gets their own isolated instance, their own CPU, their own RAM, that sort of thing?

[00:35:29] Ant: Yeah, exactly, exactly. And the way we've set up the monitoring as well is that we can expose basically all of that to you in the dashboard, so you have some control over the resources you want to use. If you want a more powerful instance, we can do that. A lot of that stuff is automated, so if someone scales beyond the allocated disk size, the disk will automatically scale up by 50% each time. And we're working on automating a bunch of these other things as well.

[00:36:03] Jeremy: So is it where, when you first create the account, you might create, for example, a micro instance, and then you have internal monitoring tools that see, oh, the CPU is getting hit pretty hard, so we need to migrate this person to a bigger instance, that kind of thing?

[00:36:22] Ant: Yeah, pretty much exactly.

[00:36:25] Jeremy: And is that something that the user would even see, or is it the case where you send them an email and go, hey, we noticed you're hitting the limits here, here's what's going to happen?

[00:36:37] Ant: Yeah, in most cases it's handled automatically. There are people who come in and from day one say, here are my requirements: I'm going to have this much traffic, and I'm going to have a hundred thousand users hitting this every hour. In those cases we will over-provision from the start. But in the self-service case, it will start on a smaller instance and upgrade over time. And this is one of our biggest challenges over the next five years: we want to move to a more scalable, cloud-native Postgres. The cool thing is that there are a lot of different companies and individuals working on this and upstreaming into Postgres itself. So for us, we don't need to, and would never want to, fork Postgres and try to separate the storage and the compute. Instead we're going to fund people who are already working on this, so that it gets upstreamed into Postgres itself and it's more cloud native.
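The "it's yours" point, that every project is a plain Postgres instance reachable by its connection string, is easy to demonstrate. A minimal sketch with the node-postgres (pg) client; the hostname shape shown is an assumption, so substitute whatever connection string your dashboard gives you.

```typescript
import { Client } from 'pg'

async function main() {
  // Hypothetical connection string of the usual Supabase shape.
  const client = new Client({
    connectionString:
      'postgresql://postgres:your-password@db.your-project-ref.supabase.co:5432/postgres',
  })
  await client.connect()

  // It's a normal Postgres instance, so any SQL works.
  const { rows } = await client.query('select version()')
  console.log(rows[0].version)

  await client.end()
}

main().catch(console.error)
```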
[00:37:46] Jeremy: Yeah. So, like we talked about a little bit, Firebase was the original inspiration, and when you work with Firebase you don't think about an instance at all, right? You just put data in and you get data out. And it sounds like in this case, you're working from the standpoint of: we're going to give you this single Postgres instance, and as you hit the limits, we'll give you a bigger one. But at some point you will hit a limit where just that one instance is not enough, and I wonder if you have any plans for that, or if you're doing anything currently to handle it.

[00:38:28] Ant: Yeah. So the medium-term goal is replication, horizontal scaling. We do that for some users already, but we set it up manually. We do want to bring that to the self-serve model as well, where you can just choose from the start: I want replicas in these zones and in these different data centers. But then, like I said, the long-term goal is that it's not based on horizontally scaling a number of instances; it's that Postgres itself can scale out. And honestly, at the rate the Postgres community is working, I think we'll be there in two years. If we can contribute resources towards that goal, we'd love to do that. But for now we're working on the intermediate solution of what people already do with Postgres, which is, you know, having replicas to make it highly available.

[00:39:30] Jeremy: And with that, I suppose, at least in the short term, the goal is that your monitoring software and your team handle scaling up the instance or creating the read replicas, so to the user it, for the most part, feels like a managed service. And then the next step would be to get something more similar to maybe Amazon's Aurora, I suppose, where you just kind of pay per use.

[00:40:01] Ant: Yeah, exactly. Aurora was kind of the goal from the start. It's just a shame that it's proprietary, obviously.

[00:40:08] Jeremy: Right.

[00:40:10] Ant: The world would be a better place if Aurora was open source.

[00:40:15] Jeremy: Yeah. And it sounds like, as you said, there are people in the open source community that are trying to get there; it'll just take time. Related to all this, about making it feel seamless, making it feel like a serverless experience even though internally it really isn't, I'm guessing you must have a fair amount of monitoring, or ways that you're making these decisions. I wonder if you can talk a little bit about what metrics you're looking at and what applications you have to help you make these decisions.

[00:40:48] Ant: Yeah, definitely. So we started with Prometheus, which is a metrics-gathering tool, and then we moved to VictoriaMetrics, which was just easier for us to scale out. I think soon a hundred thousand Postgres databases will have been deployed on Supabase, so definitely some scale, and this kind of tooling needs to scale to that as well. And then we have agents kind of everywhere: on each application, and on the database itself, where we listen for things like the CPU, the RAM, and the network I/O. We also poll Postgres itself; there's an extension called pg_stat_statements which gives us information about the intensive queries that are running on that box.
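The pg_stat_statements polling just described is easy to reproduce against any Postgres instance that has the extension enabled. A sketch, with the caveat that the timing columns were renamed in Postgres 13 (total_time and mean_time became total_exec_time and mean_exec_time), so older servers need the old names:

```typescript
import { Client } from 'pg'

async function main() {
  const db = new Client({ connectionString: process.env.DATABASE_URL })
  await db.connect()

  // Ten most expensive statements by cumulative execution time
  // (Postgres 13+ column names).
  const { rows } = await db.query(`
    select query, calls, total_exec_time, mean_exec_time
    from pg_stat_statements
    order by total_exec_time desc
    limit 10
  `)
  for (const r of rows) {
    console.log(`${r.calls} calls, ${Math.round(r.total_exec_time)} ms total: ${r.query}`)
  }

  await db.end()
}

main().catch(console.error)
```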
So we just collect as much of this as possible, which we then obviously use internally. We set alerts to know when we need to upgrade in a certain direction, but we also have an endpoint that the dashboard subscribes to, so the user can see a lot of this information themselves. At the moment we expose the RAM, the CPU, that kind of stuff, but we're working on adding more and more of these observability metrics. Because it also helps with, let's say, you might be lacking an index on a particular table and not know about it. If we can expose that to you and give you alerts about that kind of thing, it obviously helps with the developer experience as well.

[00:42:29] Jeremy: Yeah. And that brings me to something I hear from platform-as-a-service companies: if a user has a problem, whether that's a crash or a performance problem, sometimes it can be difficult to distinguish between a problem in their application and a problem in Supabase. I wonder how your support team approaches that.

[00:42:52] Ant: Yeah, it's a great question, and it's definitely something we deal with every day. I think because of where we're at as a company, we actually have a huge advantage in that we can provide really good support. Anytime an engineer joins Supabase, we tell them your primary job is actually frontline support; everything you do afterwards is secondary. And so everyone does a four-hour shift per week working directly with the customers to help determine this kind of thing. And where we are at the moment, we are happy to dive in and help people with their application code, because it helps our engineers learn how it's being used, where the pitfalls are, where we need better documentation, where we need education. So that is all part of the product at the moment, actually. And like I said, because we're not a 10,000-person company, it's an advantage we have, that we can deliver that level of support.

[00:44:01] Jeremy: What are some of the most common things you see happening? I would expect the indexing problems you mentioned, but I'm wondering if there are any specific things that just come up again and again.

[00:44:15] Ant: I think the most common is people not batching their requests. So they'll write an application which needs to pull 10,000 rows, and they send 10,000 requests (laughs). That's a typical one for people just getting started, maybe. (There's a sketch of the difference below.) And then the other thing we faced in the early days was people storing blobs in the database, which we obviously solved by introducing file storage. People would be trying to store 50-megabyte, 100-megabyte files in Postgres itself, and then asking why the performance was so bad. So I think we've mitigated that one by introducing the blob storage.
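To illustrate the batching point, a small sketch with supabase-js; the todos table and its title column are invented for the example:

```typescript
import { createClient } from '@supabase/supabase-js'

const supabase = createClient('https://your-project.supabase.co', 'your-anon-key')

// Anti-pattern: one network round trip per row.
async function insertOneByOne(titles: string[]) {
  for (const title of titles) {
    // 10,000 titles means 10,000 requests.
    await supabase.from('todos').insert({ title })
  }
}

// Better: one request carrying all the rows.
async function insertBatched(titles: string[]) {
  const rows = titles.map((title) => ({ title }))
  const { error } = await supabase.from('todos').insert(rows)
  if (error) console.error(error)
}
```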
[00:45:03] Jeremy: And you mentioned you have over a hundred thousand instances running. I imagine there have to be cases where an incident occurs, where something doesn't go quite right. I wonder if you could give an example of one and how it was resolved.

[00:45:24] Ant: Yeah, it's a good question. We've improved the systems since then, but there was a period where our real-time server wasn't able to handle really large write-ahead logs. There was a period where people would just make tons and tons of requests and updates to Postgres, and the real-time subscriptions were failing. But like I said, we have some really great Elixir devs on the team, so they were able to jump on that fairly quickly, and now the application is way more scalable as a result. And that's kind of how the support model works: you have a period where everything is breaking, and then you can just tackle these things one by one.

[00:46:15] Jeremy: Yeah, I think anybody at an early startup is going to run into that, right? You put it out there, then you find out what's broken, you fix it, and you just get better and better as it goes along.

[00:46:28] Ant: Yeah. And the funny thing was, this model of deploying EC2 instances, we had that in like the first week of starting Supabase, just me and Paul. It was never intended to be the final solution; we just did it quickly to get something up and running for our first handful of users. But it's scaled surprisingly well. And actually, the things that broke as we started to get a lot of traffic and a lot of attention were just silly things. We give everyone their own domain when they start a new project, so you'll have project-ref.supabase.in or .co. And what was breaking was, you know, we'd run out of subdomains with our DNS provider. And those things always happen in periods of intense traffic, so we'd be on the front page of Hacker News, or we'd have a TechCrunch article, and then you discover that you've run out of subdomains and the last thousand people couldn't deploy their projects. That's always a fun challenge, because you're then dependent on the external providers and their support systems as well. So yeah, I think we did a surprisingly good job of putting in good infrastructure from the start, but all of these crazy things just break when you get a lot of traffic.

[00:48:00] Jeremy: Yeah, I find it interesting that you mentioned how you started by creating the EC2 instances and it turned out that just worked. I wonder if you could walk me through a little bit of how it worked in the beginning. Was it the two of you going in and creating instances as people signed up, and then how did it go from there to where it is today?

[00:48:20] Ant: Yeah. So there's a good story about our first user, actually. Me and Paul used to contract for a company in Singapore, an NFT company, so we knew the lead developer very well, and we also still had the Postgres credentials on our own machines. The other funny thing is, when we first started, we didn't intend to host the database; we thought we were just going to host the applications that would connect to your existing Postgres instance. So what we did was hook up the applications to the Postgres instance of this startup that we knew very well. And then we took the bus to their office, sat with the lead developer, and said: look, we've already set this thing up for you. What do you think?
You know when you think, ah, we've got the best thing ever, but it's not until you put it in front of someone and you see them contemplating it that you think, oh, maybe it's not so good, maybe we don't have anything? We had that moment of panic, of, maybe this isn't great. And then what happened was, he didn't become a Supabase user. He asked to join the team.

[00:49:45] Jeremy: Nice, nice.

[00:49:46] Ant: That was a good kind of moment, where we thought, okay, maybe we have got something, maybe this isn't terrible. So yeah, he became our first employee.

[00:49:59] Jeremy: And so in that case, at the very beginning, you set everything up from scratch. Now that you have people signing up, and I don't know how many signups you get a day, did you write custom infrastructure or applications to do the provisioning, or is there an open source project that you're using to handle that?

[00:50:21] Ant: Yeah, it's actually mostly custom. And you know, AWS does a lot of the heavy lifting for you; they just provide you with a bunch of API endpoints. So a lot of that is just written in TypeScript, fairly straightforward, and, like I said, it was never intended to be the thing that lasts two years into the business. But it's scaled surprisingly well. I'm sure at some point we'll swap it out for some orchestration tooling like Pulumi or something like this, but actually what we've got just works really well.

[00:50:59] Ant: Because we're so into Postgres, our queuing system is a Postgres-backed queue called pg-boss. And then we have a fleet of workers, which we manage on ECS. So it's just a bunch of VMs, basically, which subscribe to the queue that lives inside the database and perform everything, whether it be project creation, deletion, or modification, a whole suite of these things. (A sketch of this queue pattern follows below.)

[00:51:29] Jeremy: Very cool. So even your provisioning is based on Postgres.

[00:51:33] Ant: Yeah, exactly, exactly (laughs).
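For reference, pg-boss is a Node.js job-queue library that keeps its queue in Postgres tables rather than in a separate broker. A minimal sketch of the pattern described, with a hypothetical create-project job; the send/work API shown is from recent pg-boss releases (older versions used publish/subscribe, and the handler signature has changed across major versions), so treat the details as assumptions:

```typescript
import PgBoss from 'pg-boss'

async function main() {
  // The queue lives in Postgres itself; there is no separate broker to run.
  const boss = new PgBoss(process.env.DATABASE_URL!)
  await boss.start()

  // A worker (e.g. on an ECS VM) subscribes to the queue...
  await boss.work('create-project', async (job) => {
    console.log('provisioning project', job.data)
    // ...call the cloud provider APIs here...
  })

  // ...and the control plane enqueues work for it.
  await boss.send('create-project', { projectRef: 'abcd1234', region: 'us-east-1' })
}

main().catch(console.error)
```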
[00:51:36] Jeremy: I guess in that case, did you say you're using the write-ahead log there to get notifications?

[00:51:44] Ant: We do use Realtime, and this is the fun thing about building Supabase: we use Supabase to build Supabase, and a lot of the features start as things we build for ourselves. On the observability side, we have a huge logging division. We were very early users of a tool called Logflare, which is also written in Elixir. It's basically a log sink backed by BigQuery. We loved it so much, and became such Logflare power users, that we eventually decided to acquire the company. And now we can offer Logflare to all of our customers as part of using Supabase, so you can query your logs and get really good business intelligence on what your users are consuming from your database.

[00:52:35] Jeremy: The Logflare you're mentioning, though, you said that's a log sink, and that's actually not going to Postgres, right? That's going to a different type of store.

[00:52:43] Ant: Yeah, that is going to BigQuery, actually.

[00:52:46] Jeremy: Oh, BigQuery. Okay.

[00:52:47] Ant: Yeah. And maybe eventually. This is the cool thing about watching the Postgres progression: it's bringing transactional and analytical databases together. It's traditionally been a great transactional database, but if you look at a lot of the changes that have been made in recent versions, it's becoming closer and closer to an analytical database as well. So maybe at some point we will use it, but BigQuery works just great.

[00:53:18] Jeremy: Yeah, it's interesting to see. I know we've had episodes on different extensions to Postgres where, I believe, they change out how the storage works. So it's really interesting how it's this one database, but it seems like it can take so many different forms.

[00:53:36] Ant: It's just so extensible, and that's why we're so bullish on it. Okay, maybe it wasn't always the best database, but now it seems like it is becoming the best database, and at the rate it's moving, where's it going to be in five years? We're just very bullish on Postgres, as you can tell from the number of mentions it's had in this episode.

[00:54:01] Jeremy: Yeah, we'll have to count how many times it's been said; I'm sure it's up there. Is there anything else we missed, or that you think you should have mentioned?

[00:54:12] Ant: Some of the things we're excited about are cloud functions. It's the thing we get asked for the most; anytime we post anything on Twitter, you're guaranteed to get a reply which is like, "when functions?" And we're very pleased to say that it's almost there, so that will hopefully be a really good developer experience. We also launched a GraphQL Postgres extension where the resolver lives inside of Postgres. That's still in early alpha, but I'm quite excited for when we can start offering it on the hosted platform as well. People will have the option to use GraphQL instead of, or as well as, the RESTful API.
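The in-database resolver he's describing became the pg_graphql extension. A rough sketch of the idea, calling the resolver from plain SQL; graphql.resolve is the extension's public entry point, but since the project was in early alpha at the time of this conversation, the exact signature and the example schema here are assumptions:

```typescript
import { Client } from 'pg'

async function main() {
  const db = new Client({ connectionString: process.env.DATABASE_URL })
  await db.connect()

  // The resolver runs inside Postgres: a GraphQL query goes in as text,
  // and a jsonb result comes back, all in one SQL call.
  const { rows } = await db.query(
    `select graphql.resolve($1) as result`,
    [`{ accountCollection(first: 2) { edges { node { id email } } } }`]
  )
  console.log(JSON.stringify(rows[0].result, null, 2))

  await db.end()
}

main().catch(console.error)
```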
[00:55:02] Jeremy: The common thread here is that you're able to take PostgreSQL really, really far, right? In terms of scale-up, eventually you'll have the read replicas; hopefully you'll have some kind of, I don't know what you would call it, Aurora-like, almost self-provisioning setup, maybe not sharding, however you'd describe it. But I wonder, as a company, like we talked about with BigQuery: are there any use cases you've come across, either from customers or in your own work, where you're like, I just can't get it to fit into Postgres?

[00:55:38] Ant: Not very often, I think. But sometimes we will respond to support requests and recommend that people use Firebase. If they really do have large amounts of unstructured data, which document storage is kind of perfect for, we'll just say, you know, maybe you should just use Firebase. So we definitely come across things like that. And like I said, we love Firebase, so we're definitely not trying to destroy it as a tool. It has its use cases where it's an incredible tool, and it provides a lot of inspiration for what we're building as well.

[00:56:28] Jeremy: All right. Well, I think that's a good place to wrap it up, but where can people hear more about you, and hear more about Supabase?

[00:56:38] Ant: Yeah, so Supabase is at supabase.com. I'm on Twitter at Ant Wilson; Supabase is on Twitter at supabase. Just hit us up, we're quite active there. And then definitely check out the repos at github.com/supabase. There's lots of great stuff to dig into, as we discussed. There's a lot of different languages, so whatever you're into, you'll probably find something you can contribute to.

[00:57:04] Jeremy: Yeah, and we sort of touched on this, but I think everything we've talked about, with the exception of the provisioning part and the monitoring part, is all open source. Is that correct?

[00:57:16] Ant: Yeah, exactly. And hopefully everything we build moving forward, including functions and GraphQL, will continue to be open source.

[00:57:31] Jeremy: And then I suppose the one thing I did mean to touch on is, what is the license for all the components you're using that are open source?

[00:57:41] Ant: It's mostly Apache 2.0 or MIT. And then obviously Postgres has its own Postgres license. So as long as it's one of those, we're not too precious. As I said, we inherit a fair amount of projects, and we contribute to and adopt projects, so as long as it's very permissive, we don't care too much.

[00:58:05] Jeremy: As far as the projects your team has worked on, I've noticed that over the years we've seen a lot of companies move to things like the Business Source License; there are all these different licenses that are not quite so permissive. I wonder what your thoughts are on that for the future of your company, and why you think you'll be able to stay permissive.

[00:58:32] Ant: Yeah, I really, really hope that we can stay permissive forever. It's a philosophical thing for us. When we started the business, we were just, as individuals, very into the idea of open source. And if AWS comes along at some point and offers hosted Supabase on AWS, it will be a signal that we're doing something right. At that point, I think we just need to be the best team to continue to move Supabase forward, and if we are, then hopefully we will never have to tackle this licensing issue.

[00:59:19] Jeremy: All right. Well, I wish you luck.

[00:59:23] Ant: Thanks. Thanks for having me.

[00:59:25] Jeremy: This has been Jeremy Jung for Software Engineering Radio. Thanks for listening.

Marketers Morgen podcast
Creativity with API and UID – the access to data is vast

Marketers Morgen podcast

Play Episode Listen Later Apr 5, 2022 11:50


For the creative affiliate who would like a little more insight into how the business is performing, the possibilities are many. By using APIs or UID (through Partner-ads), you can gain insight into what works and what doesn't, and begin serving more targeted content to the visitors on your site.

网事头条|听见新鲜事
Douban exposed for embedding blind watermarks in screenshots, containing user UIDs and other information

网事头条|听见新鲜事

Play Episode Listen Later Feb 21, 2022 0:37


Recently, many social media users reported that the Douban app embeds hard-to-detect watermarks in its pages. The watermark is rendered in the same color as the page background, making it difficult to see, but it becomes visible when night mode is enabled. Selecting the area with the mouse reveals the watermark against the highlighted background, and it can also be seen by adjusting the screenshot's colors in image-editing software. Reportedly, if the user is logged in, the watermark contains the user's UID; if not logged in, it contains a TID and the full timestamp with time zone. Notably, Douban has since modified this blind watermark to hide the user ID.

Into the Bytecode
Mike Sall & Blake West: Goldfinch, uncollateralized loans in emerging markets.

Into the Bytecode

Play Episode Listen Later Feb 8, 2022 88:51


Mike Sall and Blake West are the founders of Goldfinch, a decentralized protocol facilitating uncollateralized credit. "One of the borrowers is a company based in Uganda. They provide rent-to-own loans for motorcycle taxis to thousands of customers. They've borrowed $5m to expand their operations."
Thousands of people in countries like Uganda, India, and Brazil have been financed by Goldfinch loans through local lenders, largely without realizing crypto is the source of funds. These local lenders are largely innovative fintechs in the global south, and have historically fallen into an uncanny valley: they need too much capital for what is available in their local financial markets, and too little capital to navigate foreign institutional markets.
3:09 - The 'lightbulb' moment
8:20 - The financing gap for emerging-market borrowers
13:04 - Borrower profiles; Tugende, DiviBank, and Greenway
15:43 - Interfacing with Goldfinch
20:37 - Crypto-native KYC and how UID works
23:18 - Bottlenecks for the global adoption of crypto
34:40 - Compliance requirements for Goldfinch in the United States
45:25 - Compliance requirements for borrowers in emerging markets
50:56 - Demographics of 'Backers'
52:43 - Incentive alignment and fraud-prevention
1:03:53 - Learnings from shipping a production smart contract system
1:15:01 - Launching GFI token and governance of the protocol
1:26:04 - The macro point of view
Check out our website for other episodes: intothebytecode.xyz
Subscribe to our newsletter for updates: bytecode.substack.com
Twitter: twitter.com/sinahab

Marijuana Tomorrow
Episode 93 - Will Delivery Kill the Brick and Mortar Store?

Marijuana Tomorrow

Play Episode Listen Later Feb 4, 2022 91:14


This week we look at the latest news from Senate Majority Leader Chuck Schumer and his Cannabis Administration and Opportunity Act, as he met with advocates about social equity issues. Then we turn our attention to a new bill in California that would make it a felony to grow more than six plants without a permit. And finally, we'll look at why Americans appear to prefer cannabis delivery services to visiting a brick and mortar store. We'll be discussing those stories and more on the BEST cannabis podcast in the business... As we like to say around here, "Everyone knows what happened in marijuana today, but you need to know what's happening in Marijuana Tomorrow!"
Segment 1 - Sen. Schumer Meets With Advocates and Gives an Update: https://www.marijuanamoment.net/schumer-gives-update-on-federal-marijuana-legalization-and-banking-in-meeting-with-equity-advocates/
Segment 2 - Could Growing More than 6 Plants be a Felony in California? https://mjbizdaily.com/california-bill-introduced-to-make-unlicensed-cannabis-cultivation-a-felony/
Segment 3 - Will Delivery Kill the Brick and Mortar Store? https://www.businesswire.com/news/home/20220127005269/en/55-of-America%E2%80%99s-Generation-Zs-and-60-of-Millennials-Have-More-Delivery-Apps-Than-Streaming-Services/
Big Finish link - California Recall Notice: https://cannabis.ca.gov/2022/01/dcc-orders-recall-of-packaged-cannabis-flower-due-to-mold-contamination/
Brand name: Claybourne Co.
Strain: Head Banger
Track-and-trace UID number: 1A406030000326B000094476
Batch number: 28090621HB
This episode of Marijuana Tomorrow is brought to you by Cannabeta Realty.

The Transcript
The Transcript Podcast Episode 40

The Transcript

Play Episode Listen Later Nov 16, 2021 12:30


In this episode, we cover Unified ID 2.0, the challenging labor market, and AMC becoming a crypto company.
Show Notes:
00:00:00 Introduction
00:00:11 Strong global economy despite supply chain issues and inflation
00:00:45 Lots of job offers, very few takers
00:05:15 Crypto is growing at the same rate the internet did
00:07:19 Very rapid adoption of web 3.0
00:09:39 Shift from cookies to UID 2.0

3' Grezzi di Cristina Marras
3' grezzi Ep. 315 Sexist transport.mp3

3' Grezzi di Cristina Marras

Play Episode Listen Later Nov 14, 2021 3:01


Even apparently neutral and 'objective' objects and services hide a sexist, discriminatory core that works against every category other than men. Take the field of transport, for example.
LINK: An interesting Wired article on sexism in transport:
https://www.wired.it/article/donne-trasporti-cop26/?uID=555265807147984a9f0e1ea76363a98b363bef55e0684287e50b8ebf7f6dd592&uID=c5f1739191adf71f17dc5071e75e9c8cda37d0488e13cce5ad2bb58f43c57dcc&utm_brand=wi&utm_campaign=daily&utm_mailing=WI_NEWS_Daily%202021-11-14&utm_medium=email&utm_source=news&utm_term=WI_NEWS_Daily

Kinda Funny Games Daily: Video Games News Podcast
Deathloop Reviews: A Game of the Year Contender? - Kinda Funny Games Daily 09.13.21

Kinda Funny Games Daily: Video Games News Podcast

Play Episode Listen Later Sep 13, 2021 59:21


Blessing and Tim talk about all these Deathloop reviews, Fortnite's new season being cube shaped, and more!
Time Stamps -
00:00:00 - Start
00:02:00 - Housekeeping: Our Deathloop review is up right now! It's a Kinda Funny Gamescast featuring Gamespot's Tamoor Hussain and PS I Love You XOXO's Janet Garcia. It's in depth, it's spoiler free, and it's up right now on youtube.com/KindaFunnyGames and podcast services around the globe. Also a reminder: it's SUB-TEMBER on Twitch! Viewers across the platform throughout the month can take advantage of 20% off subscriptions for first-time subscribers and gifted subs. Your support means the world to us here at Kinda Funny, and right now you can take advantage of this deal and receive benefits like ad-free viewing, sub emotes, and more. Thank you to our Patreon Producers: The Kinda Funny Destiny 2 PC Clan & Black Jack
The Roper Report -
00:03:42 - Deathloop review round up
00:17:00 - Do you think we should wait until games created wholly under Microsoft's ownership are released before making that judgement? - Best Friends Q: Grezick
00:27:00 - Fortnite season 8 is all about the cubes - Andrew Webster @ The Verge
00:37:20 - Ad
00:39:13 - Nintendo drops base model Switch price in Europe - Danielle Partis @ GiBiz
00:47:28 - Multiple Little Big Planet servers are shutting down - Taylor Lyles @ IGN
00:49:55 - Hideo Kojima wants to make games that change in real time - Tom Ivan @ VGC
00:52:20 - Out today
Reader mail -
00:53:30 - TVs and monitors, and wanting to take full advantage of the consoles. - Parker Begale
00:55:00 - Squad Up: geekreate (Genshin Impact (PC, Mobile, PlayStation 4/5 cross play)) UID: 612040573
00:56:05 - You're Wrong
Tomorrow's Hosts: Greg and Whitta

Masters of Privacy
Monographic: A legal approach to "cookieless" marketing

Masters of Privacy

Play Episode Listen Later May 12, 2021 23:41


As an answer to the obvious legal challenges of ID-based, cross-media deduplication (currently greater than those faced by third-party cookies), Google Chrome's Privacy Sandbox and its related W3C Working Group provide a framework for advertisers and publishers to leverage a browser-level interest graph while preserving anonymity, through the use of aggregate data and minimum audience thresholds. As key drawbacks, there is little control on the consumer side, and local storage could result in data leaks when coexisting with shared-identity solutions, third-party cookies, or platform-specific IDs and walled gardens. We will address these and other issues from a legal perspective (ePrivacy + GDPR, mostly), and your humble host (Sergio Maldonado) will be on his own for this particular mission.
References: The State of Cookieless (on Medium)

Identity Revolution
Discussing How UID 2.0 Helps Consumers, Companies with Bill Michels

Identity Revolution

Play Episode Listen Later Mar 23, 2021 22:12


Data is an essential factor in decision-making. Companies leverage third-party software to access vital information and use it to improve their business practices. But what about identity? How do companies understand its role?
Bill Michels, a GM at The Trade Desk, introduces Unified ID 2.0, a "cookie sync" solution that helps different parties understand the identity of who's on their webpage. Cory and Bill discuss the vital importance of identity and how companies should leverage it to offer their content and improve their advertising practices.
In this episode of the Identity Revolution podcast, Bill Michels reveals exciting features of the UID 2.0 project and the importance of industry collaboration in its realization. Furthermore, Bill shares predictions for the future of companies in the MarTech, AdTech, and data space. Do they have a bright future ahead of them?

The Transcript
The Transcript Podcast Episode 7

The Transcript

Play Episode Listen Later Feb 23, 2021 13:32


In this episode, we cover the rising optimism that the second half of 2021 will be better, the high e-commerce ambitions of Walmart, and the new form of internet identification that is UID 2.0. You can read this week's newsletter here.
Show Notes
00:00 - Introduction
00:20 - An accelerated return to normalization
01:01 - Business travel likely to normalize in 2022
02:33 - Prices in oil and gas could go up
04:51 - Software is the new language of business
05:45 - Cyberattacks are getting more sophisticated
06:27 - UID 2.0
07:49 - Walmart's ambitious e-commerce plan
12:12 - Farewell to Marriott CEO Arne Sorenson

KickerRadio 管不住嘴滑板广播
KickerTalk104 - How an Art Director Watches Skate Videos

KickerRadio 管不住嘴滑板广播

Play Episode Listen Later Nov 2, 2020 39:23


On the back of the three CREW'D UP skate videos that premiered at the recently concluded PUSHFEST, we invited UID art director Zhang Yilei to chat with us about how he watches skate videos. This episode also marks our return to the video format; the video version includes full replays of all three films with live commentary, which you can watch by searching for KICKERCLUB on Bilibili.
00:50 Zhang Yilei (Roerf Gerbang), art director at the ad agency UID, is our guest on KickerRadio!
05:20 The best film in the first segment of the PUSHFEST skate film festival, about the pandemic in China and skateboarding: "Diplomatic Immunity".
08:10 Dizzle only finished in the middle of the night? Lovespot delivered their film the afternoon of the premiere? Is Sweet Street the Wutiaoren of skateboarding?
10:50 The art director's review begins: Sweet Street has a strong documentary feel and a very local flavor.
14:00 Dizzle: stylish tricks, fisheye plus music, the vibe is just right.
16:10 Lovespot: the music sets the mood, and the editing has real style. Search KICKERCLUB on Bilibili for the full video version.
19:50 Li Lun's most ruthless touch: the fish-head mask, which gives the skate video a narrative feel.
22:45 After binge-watching skate videos comes visual fatigue, and then it's back to the classics: required viewing. "A film with thoughtful transitions suits my taste best." BUT "a skater's reading of terrain and tricks is what matters most!"
33:06 Independent skate films, and the skate styles of different brands such as Nike and adidas.
KickerRadio, produced by KickerClub.com, is China's first skateboarding internet radio show: tracing the history of Chinese skateboarding, focusing on the stories of core skaters, and following the skate community as it keeps growing. Skateboarding is not about changing the world; it's about not being changed by it.

BSD Now
194: Daemonic plans

BSD Now

Play Episode Listen Later May 17, 2017 93:35


This week on BSD Now we cover the latest FreeBSD Status Report, a plan for Open Source software development, centrally managing bhyve with Ansible, libvirt, and pkg-ssh, and a whole lot more.
Headlines
FreeBSD Project Status Report (January to March 2017) (https://www.freebsd.org/news/status/report-2017-01-2017-03.html)
While a few of these projects indicate they are a "plan B" or an "attempt III", many are still hewing to their original plans, and all have produced impressive results. Please enjoy this vibrant collection of reports, covering the first quarter of 2017.
The quarterly report opens with notes from Core, The FreeBSD Foundation, the Ports team, and Release Engineering.
On the project front, the Ceph on FreeBSD project has made considerable advances, and is now usable as the net/ceph-devel port via the ceph-fuse module. Eventually they hope to have a kernel RADOS block device driver, so FUSE is not required.
CloudABI update, including news that the Bitcoin reference implementation is working on a port to CloudABI.
eMMC flash and SD card updates, allowing higher speeds (max speed changes from ~40 to ~80 MB/sec). As well, the MMC stack can now also be backed by the CAM framework.
Improvements to the Linuxulator.
More detail on the pNFS server plan B that we discussed in a previous week.
Snow B.V. is sponsoring a Dutch translation of the FreeBSD Handbook using the new .po system.
A plan for open source software maintainers (http://www.daemonology.net/blog/2017-05-11-plan-for-foss-maintainers.html)
Colin Percival describes in his blog "a plan for open source software maintainers":
I've been writing open source software for about 15 years now; while I'm still wet behind the ears compared to FreeBSD greybeards like Kirk McKusick and Poul-Henning Kamp, I've been around for long enough to start noticing some patterns. In particular:
Free software is expensive. Software is expensive to begin with; but good quality open source software tends to be written by people who are recognized as experts in their fields (partly thanks to that very software) and can demand commensurate salaries.
While that expensive developer time is donated (either by the developers themselves or by their employers), this influences what their time is used for: Individual developers like doing things which are fun or high-status, while companies usually pay developers to work specifically on the features those companies need. Maintaining existing code is important, but it is neither fun nor high-status; and it tends to get underweighted by companies as well, since maintenance is inherently unlikely to be the most urgent issue at any given time.
Open source software is largely a "throw code over the fence and walk away" exercise. Over the past 15 years I've written freebsd-update, bsdiff, portsnap, scrypt, spiped, and kivaloo, and done a lot of work on the FreeBSD/EC2 platform. Of these, I know bsdiff and scrypt are very widely used and I suspect that kivaloo is not; but beyond that I have very little knowledge of how widely or where my work is being used. Anecdotally it seems that other developers are in similar positions: At conferences I've heard variations on "you're using my code? Wow, that's awesome; I had no idea" many times.
I have even less knowledge of what people are doing with my work or what problems or limitations they're running into. Occasionally I get bug reports or feature requests; but I know I only hear from a very small proportion of the users of my work.
I have a long list of feature ideas which are sitting in limbo simply because I don't know if anyone would ever use them — I suspect the answer is yes, but I'm not going to spend time implementing these until I have some confirmation of that.
A lot of mid-size companies would like to be able to pay for support for the software they're using, but can't find anyone to provide it. For larger companies, it's often easier — they can simply hire the author of the software (and many developers who do ongoing maintenance work on open source software were in fact hired for this sort of "in-house expertise" role) — but there's very little available for a company which needs a few minutes per month of expertise. In many cases, the best support they can find is sending an email to the developer of the software they're using and not paying anything at all — we've all received "can you help me figure out how to use this" emails, and most of us are happy to help when we have time — but relying on developer generosity is not a good long-term solution.
Every few months, I receive email from people asking if there's any way for them to support my open source software contributions. (Usually I encourage them to donate to the FreeBSD Foundation.) Conversely, there are developers whose work I would like to support (e.g., people working on FreeBSD wifi and video drivers), but there isn't any straightforward way to do this. Patreon has demonstrated that there are a lot of people willing to pay to support what they see as worthwhile work, even if they don't get anything directly in exchange for their patronage.
It seems to me that this is a case where problems are in fact solutions to other problems. To wit:
Users of open source software want to be able to get help with their use cases; developers of open source software want to know how people are using their code.
Users of open source software want to support the work they use; developers of open source software want to know which projects users care about.
Users of open source software want specific improvements; developers of open source software may be interested in making those specific changes, but don't want to spend the time until they know someone would use them.
Users of open source software have money; developers of open source software get day jobs writing other code because nobody is paying them to maintain their open source software.
I'd like to see this situation get fixed. As I envision it, a solution would look something like a cross between Patreon and Bugzilla: Users would be able to sign up to "support" projects of their choosing, with a number of dollars per month (possibly arbitrary amounts, possibly specified tiers; maybe including $0/month), and would be able to open issues. These could be private (e.g., for "technical support" requests) or public (e.g., for bugs and feature requests); users would be able to indicate their interest in public issues created by other users.
Developers would get to see the open issues, along with a nominal "value" computed based on allocating the incoming dollars of "support contracts" across the issues each user has expressed an interest in, allowing them to focus on issues with higher impact.
He poses three questions to readers: whether people (users and software developers alike) would be interested in this, and whether payment (giving and receiving, respectively) is interesting.
Check out the comments (and those on https://news.ycombinator.com/item?id=14313804) as well for some suggestions and discussion on the topic.
OpenBSD vmm hypervisor: Part 2 (http://www.h-i-r.net/2017/04/openbsd-vmm-hypervisor-part-2.html)
We asked for people to write up their experience using OpenBSD's vmm. This blog post is just that.
This is going to be a (likely long-running, infrequently-appended) series of posts as I poke around in vmm. A few months ago, I demonstrated some basic use of the vmm hypervisor as it existed in OpenBSD 6.0-CURRENT around late October, 2016. We'll call that video Part 1. Quite a bit of development was done on vmm before 6.1-RELEASE, and it's worth noting that some new features made their way in. Work continues, of course, and I can only imagine the hypervisor technology will mature plenty for the next release. As it stands, this is the first release of OpenBSD with a native hypervisor shipped in the base install, and that's exciting news in and of itself.
To get our virtual machines onto the network, we have to spend some time setting up a virtual ethernet interface. We'll run a DHCP server on that, and it'll be the default route for our virtual machines. We'll keep all the VMs on a private network segment, and use NAT to allow them to get to the network. There is a way to directly bridge VMs to the network in some situations, but I won't be covering that today.
Create an empty disk image for your new VM. I'd recommend 1.5GB to play with at first. You can do this without doas or root if you want your user account to be able to start the VM later. I made a "vmm" directory inside my home directory to store VM disk images in. You might have a different partition you wish to store these large files in.
Boot up a brand new VM instance. You'll have to do this as root or with doas. You can download a -CURRENT install kernel/ramdisk (bsd.rd) from an OpenBSD mirror, or you can simply use the one that's on your existing system (/bsd.rd) like I'll do here.
The command will start a VM named "test.vm", display the console at startup, use /bsd.rd (from our host environment) as the boot image, allocate 256MB of memory, attach the first network interface to the switch called "local" we defined earlier in /etc/vm.conf, and use the test image we just created as the first disk drive.
Now that the VM disk image file has a full installation of OpenBSD on it, build a VM configuration around it by adding the below block of configuration (with modifications as needed for owner, path and lladdr) to /etc/vm.conf.
I've noticed that VMs with much less than 256MB of RAM allocated tend to be a little unstable for me. You'll also note that in the "interface" clause, I hard-coded the lladdr that was generated for it earlier. By specifying "disable" in vm.conf, the VM will show up in a stopped state that the owner of the VM (that's you!) can manually start without root access.
Let us know how vmm works for you.
News Roundup
openbsd changes of note 621 (http://www.tedunangst.com/flak/post/openbsd-changes-of-note-621)
More stuff, more fun.
Fix script to not perform tty operations on things that aren't ttys. Detected by pledge.
Merge libdrm 2.4.79.
After a forced unmount, also unmount any filesystems below that mount point.
Flip previously warm pages in the buffer cache to memory above the DMA region if uvm tells us it is available. Pages are not automatically promoted to upper memory; instead it's used as additional memory only for what the cache considers long-term buffers. I/O still requires DMA memory, so writing to a buffer will pull it back down.
Makefile support for systems with both gcc and clang. Make i386 and amd64 so.
Take a more radical approach to disabling colours in clang.
When the data buffered for write in tmux exceeds a limit, discard it and redraw. Helps when a fast process is running inside tmux running inside a slow terminal.
Add a port of the witness(4) lock validation tool from FreeBSD. Use it with mplock, rwlock, and mutex in the kernel.
Properly save and restore FPU context in vmm.
Remove KGDB. It neither compiles nor works.
Add a constant time AES implementation, from BearSSL.
Remove SSHv1 from ssh.
and more...
Digging into BSD's choice of Unix group for new directories and files (https://utcc.utoronto.ca/~cks/space/blog/unix/BSDDirectoryGroupChoice)
I have to eat some humble pie here. In comments on my entry on an interesting chmod failure, Greg A. Woods pointed out that FreeBSD's behavior of creating everything inside a directory with the group of the directory is actually traditional BSD behavior (it dates all the way back to the 1980s), not some odd new invention by FreeBSD. As traditional behavior it makes sense that it's explicitly allowed by the standards, but I've also come to think that it makes sense in context and in general. To see this, we need some background about the problem facing BSD.
In the beginning, two things were true in Unix: there was no mkdir() system call, and processes could only be in one group at a time. With processes being in only one group, the choice of the group for a newly created filesystem object was easy; it was your current group. This was felt to be sufficiently obvious behavior that the V7 creat(2) manpage doesn't even mention it.
Now things get interesting. 4.1c BSD seems to be where mkdir(2) is introduced and where creat() stops being a system call and becomes an option to open(2). It's also where processes can be in multiple groups for the first time. The 4.1c BSD open(2) manpage is silent about the group of newly created files, while the mkdir(2) manpage specifically claims that new directories will have your effective group (ie, the V7 behavior). This is actually wrong. In both mkdir() in sys_directory.c and maknode() in ufs_syscalls.c, the group of the newly created object is set to the group of the parent directory. Then finally in the 4.2 BSD mkdir(2) manpage the group of the new directory is correctly documented (the 4.2 BSD open(2) manpage continues to say nothing about this).
So BSD's traditional behavior was introduced at the same time as processes being in multiple groups, and we can guess that it was introduced as part of that change. When your process can only be in a single group, as in V7, it makes perfect sense to create new filesystem objects with that as their group. It's basically the same case as making new filesystem objects be owned by you; just as they get your UID, they also get your GID. When your process can be in multiple groups, things get less clear. A filesystem object can only be in one group, so which of your several groups should a new filesystem object be owned by, and how can you most conveniently change that choice?
One option is to have some notion of a 'primary group' and then provide ways to shuffle around which of your groups is the primary group. Another option is the BSD choice of inheriting the group from context. By far the most common case is that you want your new files and directories to be created in the 'context', ie the group, of the surrounding directory. If you fully embrace the idea of Unix processes being in multiple groups, not just having one primary group and then some number of secondary groups, then the BSD choice makes a lot of sense. And for all of its faults, BSD tended to relatively fully embrace its changes. While it leads to some odd issues, such as the one I ran into, pretty much any choice here is going to have some oddities.
Centrally managed Bhyve infrastructure with Ansible, libvirt and pkg-ssh (http://www.shellguardians.com/2017/05/centrally-managed-bhyve-infrastructure.html)
At work we've been using bhyve for a while to run non-critical systems. It is a really nice and stable hypervisor, even though we are using an earlier version available on FreeBSD 10.3. This means we lack Windows and VNC support among other things, but it is not a big deal.
After some iterations in our internal tools, we realised that the installation process was too slow and we always repeated the same steps. Of course, any good sysadmin will scream "AUTOMATION!" and so did we. Therefore, we started looking for different ways to improve our deployments.
We had a look at existing frameworks that manage bhyve, but none of them had a feature that we find really important: having a centralized repository of VM images. For instance, SmartOS applies this method successfully by having a backend server that stores a catalog of VMs and Zones, meaning that new instances can be deployed in a minute at most. This is a game changer if you are really busy in your day-to-day operations.
The following building blocks are used:
The ZFS snapshot of an existing VM. This will be our VM template.
A modified version of oneoff-pkg-create to package the ZFS snapshots.
pkg-ssh and pkg-repo to host a local FreeBSD repo in a FreeBSD jail.
libvirt to manage our bhyve VMs.
The Ansible modules virt, virt_net and virt_pool.
Once automated, the installation process needs 2 minutes at most, compared with the 30 minutes needed to install a VM manually, and it allows us to deploy many guests in parallel.
NetBSD maintainer in the QEMU project (https://blog.netbsd.org/tnf/entry/netbsd_maintainer_in_the_qemu)
QEMU - the FAST! processor emulator - is a generic, Open Source, machine emulator and virtualizer. It defines the state of the art in modern virtualization. This software has been developed for multiplatform environments with support for NetBSD since virtually forever. It's the primary tool used by the NetBSD developers and release engineering team. It is run with continuous integration tests for daily commits and executes regression tests through the Automatic Test Framework (ATF).
The QEMU developers warned the Open Source community - with version 2.9 of the emulator - that they will eventually drop support for suboptimally supported hosts if nobody steps in and takes over maintainership to refresh the support. This warning was directed at the major BSDs, Solaris, AIX and Haiku. Thankfully the NetBSD position has been filled, restoring official NetBSD maintenance.
Beastie Bits
OpenBSD Community Goes Gold (http://undeadly.org/cgi?action=article&sid=20170510012526&mode=flat&count=0)
CharmBUG's Tor Hack-a-thon has been pushed back to July due to scheduling difficulties (https://www.meetup.com/CharmBUG/events/238218840/)
Direct Rendering Manager (DRM) driver for i915, from the Linux kernel to Haiku with the help of DragonflyBSD's Linux compatibility layer (https://www.haiku-os.org/blog/vivek/2017-05-05_[gsoc_2017]_3d_hardware_acceleration_in_haiku/)
TomTom lists OpenBSD in license (https://twitter.com/bsdlme/status/863488045449977864)
London NetBSD Meetup on May 22nd (https://mail-index.netbsd.org/regional-london/2017/05/02/msg000571.html)
KnoxBUG meeting May 30th, 2017 - Introduction to FreeNAS (http://knoxbug.org/2017-05-30)
Feedback/Questions
Felix - Home Firewall (http://dpaste.com/35EWVGZ#wrap)
David - Docker Recipes for Jails (http://dpaste.com/0H51NX2#wrap)
Don - GoLang & Rust (http://dpaste.com/2VZ7S8K#wrap)
George - OGG feed (http://dpaste.com/2A1FZF3#wrap)
Roller - BSDCan Tips (http://dpaste.com/3D2B6J3#wrap)