Podcasts about computing machinery acm

  • 31 PODCASTS
  • 36 EPISODES
  • 43m AVG DURATION
  • INFREQUENT EPISODES
  • Apr 9, 2025 LATEST

POPULARITY (2017–2024)



Latest podcast episodes about computing machinery acm

CERIAS Security Seminar Podcast
Josiah Dykstra, Lessons for Cybersecurity from the American Public Health System

Apr 9, 2025 · 50:16


This talk explores how the principles and practices of the American public health system can inform and enhance modern cybersecurity strategies. Drawing on insights from our recent CRA Quad Paper, we examine the parallels between public health methodologies and the challenges faced in today's digital landscape. By analyzing historical responses to public health crises, we identify strategies for improving situational awareness, inter-organizational collaboration, and adaptive risk management in cybersecurity. The discussion highlights how lessons from public health can bridge the gap between technical cybersecurity teams and policymakers, fostering a more holistic and effective defense against emerging cyber threats.

About the speaker: Josiah Dykstra is the Director of Strategic Initiatives at Trail of Bits. He previously served for 19 years as a senior technical leader at the National Security Agency (NSA). Dr. Dykstra is an experienced cyber practitioner and researcher whose focus has included the psychology and economics of cybersecurity. He received the CyberCorps® Scholarship for Service (SFS) fellowship and is one of ten people in the SFS Hall of Fame. In 2017, he received the Presidential Early Career Award for Scientists and Engineers (PECASE) from then-President Barack Obama. Dr. Dykstra is a Fellow of the American Academy of Forensic Sciences (AAFS) and a Distinguished Member of the Association for Computing Machinery (ACM). He is the author of numerous research papers, the book Essential Cybersecurity Science (O'Reilly Media, 2016), and co-author of Cybersecurity Myths and Misconceptions (Pearson, 2023). Dr. Dykstra holds a Ph.D. in computer science from the University of Maryland, Baltimore County.

What is The Future for Cities?
256I_Marcus Foth, Professor of Urban Informatics at Queensland University of Technology

Sep 18, 2024 · 51:01


"Science has already fixed climate change. We know what's causing it, we know what to do about it. The fact that there is paralysis is not the problem of science." Are you interested in listening to scientists? What do you think about the urgency of actions after scientific proof? How can we use the planetary indicators for better urban futures? Interview with Marcus Foth, Professor of Urban Informatics at Queensland University of Technology. We talk about his vision for the future of cities, urban visioning, declining cities, opportunities in health-arts-social sciences, doughnut economics, and much more.

Marcus Foth is a Professor of Urban Informatics in the School of Design and a Chief Investigator in the QUT Digital Media Research Centre (DMRC), Faculty of Creative Industries, Education, and Social Justice, Queensland University of Technology, Brisbane, Australia. For more than two decades, Marcus has led ubiquitous computing and interaction design research into interactive digital media, screen, mobile and smart city applications. Marcus founded the Urban Informatics Research Lab in 2006 and the QUT Design Lab in 2016. He is a member of the More-than-Human Futures research group. Marcus has published more than 270 peer-reviewed publications. He is a Fellow of the Australian Computer Society and the Queensland Academy of Arts and Sciences, a Distinguished Member of the international Association for Computing Machinery (ACM), and currently serves on Australia's national College of Experts.
Find out more about Marcus through these links: Marcus Foth on LinkedIn; @sunday9pm as Marcus Foth on X; QUT Design Lab website; Marcus Foth at QUT; Marcus Foth website; Marcus Foth on Google Scholar.

Connecting episodes you might be interested in: No.159 - Interview with Michael Browne about Aboriginal ideas in urban planning; No.186 - Interview with Tom Bosschaert about stages of grief with sustainability; No.216 - Interview with Sara Stace about doughnut economics; No.255R - Participation, co-creation, and public space.

What was the most interesting part for you? What questions did arise for you? Let me know on Twitter @WTF4Cities or on the wtf4cities.com website where the shownotes are also available. I hope this was an interesting episode for you and thanks for tuning in. Music by Lesfm from Pixabay.

The Machine: A computer science education podcast
Nuria Oliver - Big Data, Artificial Intelligence and Addressing Imbalances

Mar 7, 2024 · 55:44


To help celebrate International Women's Day and Women's History Month, Rob spoke with Spanish computer scientist Nuria Oliver about her work to date, such as using big data systems to help unbanked people access credit in developing nations and combating bias in AI systems. Nuria recounted how she first became interested in computing and turned that interest into a career. They also discussed the gender imbalance in computing today, and Nuria offered some thought-provoking suggestions as to how these issues might be addressed. Nuria is also a Fellow of the Association for Computing Machinery, so thanks to the ACM for setting up the interview.

Here are links to projects mentioned during the podcast: ELLIS – European Laboratory for Learning and Intelligent Systems https://ellis.eu; Data-Pop Alliance https://datapopalliance.org; Nuria Oliver's Personal Website https://www.nuriaoliver.com; Association for Computing Machinery (ACM) https://www.acm.org

To keep up to date with The Machine, you can find the podcast on X/Twitter @machine_podcast or you can connect with Rob O'Connor via LinkedIn.

The Brand Called You
The Confluence of Data, Storytelling, and the AI Frontier | Yannis Ioannidis | PhD Professor, Department of Informatics & Telecom, National & Kapodistrian University of Athens

Mar 1, 2024 · 58:34


Step into the dynamic world of Yannis Ioannidis, a luminary in the realm of data management and storytelling innovation. As the President of the ACM (Association for Computing Machinery) and a trailblazer in the field of data science, Ioannidis shares insights into his groundbreaking work. Join us as we explore the crossroads where large language models, education, and the pursuit of artificial general intelligence intersect, as articulated by one of the industry's thought leaders.

[00:39] - About Yannis Ioannidis
Yannis is a Professor of Informatics and Telecommunications at the National and Kapodistrian University of Athens. He is the current President of the Association for Computing Machinery (ACM).

--- Support this podcast: https://podcasters.spotify.com/pod/show/tbcy/support

The New Stack Podcast
2023 Top Episodes - The End of Programming is Nigh

Dec 27, 2023 · 31:59


Is the end of programming nigh? That's the big question posed in this episode, recorded earlier in 2023. It was very popular among listeners, and with the topic as relevant as ever, we wanted to wrap up the year by highlighting this conversation again.

If you ask Matt Welsh, he'd say yes, the end of programming is upon us. As Richard McManus wrote on The New Stack, Welsh is a former professor of computer science at Harvard who spoke at a virtual meetup of the Chicago Association for Computing Machinery (ACM), explaining his thesis that ChatGPT and GitHub Copilot represent the beginning of the end of programming.

Welsh joined us on The New Stack Makers to discuss his perspectives about the end of programming and answer questions about the future of computer science, distributed computing, and more.

Welsh is now the founder of fixie.ai, a platform they are building to let companies develop applications on top of large language models and extend them with different capabilities.

For 40 to 50 years, programming language design has had one goal: make it easier to write programs, Welsh said in the interview. Still, programming languages are complex, Welsh said. And no amount of work is going to make it simple.

Learn more from The New Stack about AI and the future of software development:
  • Top 5 Large Language Models and How to Use Them Effectively
  • 30 Non-Trivial Ways for Developers to Use GPT-4
  • Developer Tips in AI Prompt Engineering

CERIAS Security Seminar Podcast
Stuart Shapiro, MITRE PANOPTIC™ Privacy Threat Model

Sep 13, 2023 · 53:23


As privacy moves from a predominantly compliance-oriented approach to one that is risk-based, privacy risk modeling has taken on increased importance. While a variety of innovative pre-existing options are available for privacy consequences and a few for vulnerabilities, privacy threat models, particularly ones focused on attacks (as opposed to threat actors), remain relatively scarce. To address this gap and facilitate more sophisticated privacy risk management of increasingly complex systems, MITRE has developed the Pattern and Action Nomenclature Of Privacy Threats In Context (PANOPTIC™). By providing an empirically-driven taxonomy of privacy threat activities and actions – as well as contextual elements – to support environmental and system-specific threat modeling, PANOPTIC is intended to do for privacy practitioners what MITRE ATT&CK® has done for security practitioners. This presentation discusses the underpinnings and provides an overview of PANOPTIC and its use.

About the speaker: Stuart S. Shapiro is a Principal Cyber Security and Privacy Engineer and a co-leader of the Privacy Capability in the MITRE Labs Cyber Solutions Innovation Center at the MITRE Corporation. At MITRE he has led multiple research and operational efforts in the areas of privacy engineering, privacy risk management, and privacy enhancing technologies (PETs), including projects focused on connected vehicles and on de-identification. He has also held academic positions and has taught courses on the history, politics, and ethics of information and communication technologies. His professional affiliations include the International Association of Privacy Professionals (IAPP) and the Association for Computing Machinery (ACM).
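To make the idea of an attack-focused privacy threat taxonomy concrete, here is a purely hypothetical sketch of how a catalog entry and a context-specific lookup might be structured. The class, field, and category names are invented for illustration and are not drawn from PANOPTIC itself:

```python
# Hypothetical sketch of a privacy threat catalog in the spirit described
# above: activities, concrete actions, and the contexts they apply to.
# All names and values here are invented, not PANOPTIC content.
from dataclasses import dataclass, field


@dataclass
class ThreatEntry:
    """One catalog entry: a threat activity plus its actions and contexts."""
    activity: str                                  # high-level threat activity
    actions: list = field(default_factory=list)    # concrete threat actions
    contexts: list = field(default_factory=list)   # environments it applies to


catalog = [
    ThreatEntry(
        activity="Surveillance",
        actions=["location tracking", "behavioral profiling"],
        contexts=["mobile apps", "connected vehicles"],
    ),
]


def threats_for(context: str, entries):
    """System-specific threat modeling as a filter over the shared catalog."""
    return [e for e in entries if context in e.contexts]
```

The design point is the one the talk emphasizes: the taxonomy is shared and empirical, while threat modeling for a particular system reduces to selecting the entries whose contextual elements match that system.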

Practical Significance
Practical Significance | Episode 32: Getting the ‘Data’ on CSAB with Andrew (Andy) Phillips

Jul 31, 2023 · 26:52


On April 21, 2021, the American Statistical Association became a full member of CSAB, joining the world's two largest professional and technical societies for computing—the Association for Computing Machinery (ACM) and the IEEE Computer Society (IEEE-CS). This month, Practical Significance co-hosts Donna LaLonde and Ron Wasserstein welcome to the show newly appointed CSAB Executive Director Andrew (Andy) Phillips to get an update on data science accreditation. In addition to data science programs, CSAB is the lead ABET member society for accreditation of degree programs in computer science, cybersecurity, information systems, ... The post Practical Significance | Episode 32: Getting the ‘Data’ on CSAB with Andrew (Andy) Phillips first appeared on Amstat News.

UC Berkeley (Audio)
AI Meets Copyright

Jun 30, 2023 · 48:30


This series on artificial intelligence explores recent breakthroughs of AI, its broader societal implications and its future potential. In this presentation, Pamela Samuelson, professor of Law and Information at UC Berkeley, discusses whether computer-generated texts and images fall under the copyright law. She says that early on, the consensus was that AI was just a tool, like a camera, so humans could claim copyright in machine-generated outputs to which they made contributions. Now the consensus is that AI-generated texts and images are not copyrightable for the lack of a human author. The urgent questions today focus on whether ingesting in-copyright works as training data is copyright infringement and whether the outputs of AI programs are infringing derivative works of the ingested images. Four recent lawsuits, one involving GitHub's Copilot and three involving Stable Diffusion, will address these issues. Samuelson has been a member of the UC Berkeley School of Law faculty since 1996. She has written and spoken extensively about the challenges that new information technologies pose for traditional legal regimes, especially for intellectual property law. She is a member of the American Academy of Arts & Sciences, a fellow of the Association for Computing Machinery (ACM), a contributing editor of Communications of the ACM, a past fellow of the John D. & Catherine T. MacArthur Foundation, a member of the American Law Institute, and an honorary professor of the University of Amsterdam. Series: "The Future of AI" [Science] [Business] [Show ID: 38859]


The New Stack Podcast
The End of Programming is Nigh

Mar 29, 2023 · 31:42


Is the end of programming nigh? If you ask Matt Welsh, he'd say yes. As Richard McManus wrote on The New Stack, Welsh is a former professor of computer science at Harvard who spoke at a virtual meetup of the Chicago Association for Computing Machinery (ACM), explaining his thesis that ChatGPT and GitHub Copilot represent the beginning of the end of programming.

Welsh joined us on The New Stack Makers to discuss his perspectives about the end of programming and answer questions about the future of computer science, distributed computing, and more.

Welsh is now the founder of fixie.ai, a platform they are building to let companies develop applications on top of large language models and extend them with different capabilities.

For 40 to 50 years, programming language design has had one goal: make it easier to write programs, Welsh said in the interview. Still, programming languages are complex, Welsh said. And no amount of work is going to make it simple.

WERU 89.9 FM Blue Hill, Maine Local News and Public Affairs Archives
Notes from the Electronic Cottage 10/20/22: Election Voting Security

Oct 20, 2022 · 9:02


Producer/Host: Jim Campbell

It’s election season and we’re hearing a lot of stuff that is pretty unbelievable. But we aren’t hearing much about voting machine companies doctoring votes at this point in time. Why is that? Hmmm. But even if we aren’t hearing a lot about alleged voting machine fraud now, it’s still important to think about how we might guarantee that vote counting is accurate. The Association for Computing Machinery (ACM) thinks so, and in October of 2022 released a TechBrief entitled “Election Security: Risk Limiting Audits.” It’s brief, and offers Risk Limiting Audits (RLAs) as a tool to ensure that electronic vote counting is accurate and transparent. Only four pages and definitely worth reading.

About the host: Jim Campbell has a longstanding interest in the intersection of digital technology, law, and public policy and how they affect our daily lives in our increasingly digital world. He has banged around non-commercial radio for decades and, in the little-known-facts department (that should probably stay that way), he was one of the readers voicing Richard Nixon's words when NPR broadcast the entire transcript of the Watergate tapes. Like several other current WERU volunteers, he was at the station's sign-on party on May 1, 1988 and has been a volunteer ever since, doing an early stint as a Morning Maine host and later producing WERU program series including Northern Lights, Conversations on Science and Society, Sound Portrait of the Artist, Selections from the Camden Conference, others that will probably come to him after this is posted, and, of course, Notes from the Electronic Cottage.

The post Notes from the Electronic Cottage 10/20/22: Election Voting Security first appeared on WERU 89.9 FM Blue Hill, Maine Local News and Public Affairs Archives.
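For readers curious what a risk-limiting audit actually computes, here is a minimal, hypothetical sketch of a ballot-polling audit in the style of the widely used BRAVO method (a sequential test for a simplified two-candidate contest). It is an illustration of the general technique, not code from the ACM TechBrief, and all numbers are made up:

```python
# Minimal sketch of a BRAVO-style ballot-polling risk-limiting audit.
# Ballots are drawn at random; each draw updates a likelihood ratio of
# "the reported winner really won with share s" against "the race was a tie".
# If the ratio exceeds 1/risk_limit, the reported outcome is confirmed.

def bravo_audit(sampled_votes, reported_winner_share, risk_limit=0.05):
    """Sequentially test whether the reported winner really won.

    sampled_votes: iterable of True (ballot for reported winner) / False.
    reported_winner_share: reported fraction for the winner (must be > 0.5).
    Returns True if the audit confirms the outcome at the given risk limit;
    False means keep sampling (or escalate to a full hand count).
    """
    s = reported_winner_share
    ratio = 1.0
    for for_winner in sampled_votes:
        if for_winner:
            ratio *= s / 0.5        # evidence toward the reported outcome
        else:
            ratio *= (1 - s) / 0.5  # evidence against it
        if ratio >= 1.0 / risk_limit:
            return True
    return False
```

With a reported 60% winner share and a 5% risk limit, a run of a few dozen winner ballots in the sample is already enough to confirm the outcome; a sample dominated by loser ballots never confirms it, which is exactly the "escalate to a hand count" branch that makes the audit risk-limiting.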


Screaming in the Cloud
ChaosSearch and the Evolving World of Data Analytics with Thomas Hazel

Oct 4, 2022 · 35:21


About Thomas
Thomas Hazel is Founder, CTO, and Chief Scientist of ChaosSearch. He is a serial entrepreneur at the forefront of communication, virtualization, and database technology and the inventor of ChaosSearch's patented IP. Thomas has also patented several other technologies in the areas of distributed algorithms, virtualization, and database science. He holds a Bachelor of Science in Computer Science from the University of New Hampshire, where he is a Hall of Fame Alumni Inductee, and he founded both student and professional chapters of the Association for Computing Machinery (ACM).

Links Referenced:
ChaosSearch: https://www.chaossearch.io/
Twitter: https://twitter.com/ChaosSearch
Facebook: https://www.facebook.com/CHAOSSEARCH/

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is sponsored in part by our friends at AWS AppConfig. Engineers love to solve, and occasionally create, problems. But not when it's an on-call fire drill at 4 in the morning. Software problems should drive innovation and collaboration, NOT stress, and sleeplessness, and threats of violence. That's why so many developers are realizing the value of AWS AppConfig Feature Flags. Feature flags let developers push code to production but hide that feature from customers, so that the developers can release their feature when it's ready. This practice allows for safe, fast, and convenient software development. You can seamlessly incorporate AppConfig Feature Flags into your AWS or cloud environment and ship your features with excitement, not trepidation and fear. To get started, go to snark.cloud/appconfig.
That's snark.cloud/appconfig.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted episode is brought to us by our returning sponsor and friend, ChaosSearch. And once again, the fine folks at ChaosSearch have seen fit to basically subject their CTO and Founder, Thomas Hazel, to my slings and arrows. Thomas, thank you for joining me. It feels like it's been a hot minute since we last caught up.

Thomas: Yeah, Corey. Great to be on the program again. I think it's been almost a year. So, I look forward to these. They're fun, they're interesting, and, you know, always a good time.

Corey: It's always fun to just take a look at companies' web pages in the Wayback Machine, archive.org, where you can see snapshots of them at various points in time. Usually, it feels like this is either used for long-gone things people want to remember from the internet of yesteryear, or alternately to deliver sick burns by retorting a “This you?” when someone winds up making an unpopular statement. One of the approaches I like to use it for, which is significantly less nefarious—usually—is looking back in time at companies' websites, just to see how the positioning of the product evolves over time.

And ChaosSearch has had an interesting evolution in that direction. But before we get into that, assuming that there might actually be people listening who do not know the intimate details of exactly what it is you folks do: what is ChaosSearch, and what is it you folks do?

Thomas: Yeah, well said, and I look forward to [laugh] doing the Wayback Time because some of our ideas, way back when, seemed crazy, but now they make a lot of sense. So, what ChaosSearch is all about is transforming customers' cloud object stores like Amazon S3 into an analytical database that supports search and SQL-type use cases. Now, where does that apply?
In log analytics, observability, security, security data lakes, operational data, particularly at scale, where you just stream your data into your data lake, connect our service, our SaaS service, to that lake, and automagically we index it and provide well-known APIs like Elasticsearch (integrating with Kibana or Grafana) and SQL APIs (something like, say, a Superset or Tableau or Looker) into your data. So, you stream it in and you get analytics out. And the key thing is the time-cost complexity that we all know operational data at scale, like terabytes a day and up, causes, and we all know how much it costs.

Corey: They certainly do. One of the things that I found interesting is that, as I've mentioned before, when I do consulting work at The Duckbill Group, we have absolutely no partners in the entire space. That includes AWS, incidentally. But it was easy in the beginning because I was well aware of what you folks were up to, and it was great when there was a use case that matched: you're spending an awful lot of money on Elasticsearch; consider perhaps migrating some of that—if it makes sense—to ChaosSearch. Ironically, when you started sponsoring some of my nonsense, that conversation got slightly trickier, where I had to disclose: yes, our media arm does have sponsorships going on with them, but that has no bearing on what I'm saying.

And if they take their sponsorships away—please don't—then we would still be recommending them because it's the right answer, and it's what we would use if we were in your position. We receive no kickbacks or partner deal or any sort of reseller arrangement, because that just clouds the whole conflict-of-interest perception. But you folks have been fantastic for a long time in a bunch of different ways.

Thomas: Well, you know, I would say that what you thought made a lot of sense made a lot of sense to us as well. So, the ChaosSearch idea just makes sense.
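The access pattern Hazel describes, log data resting in object storage and queried through an Elasticsearch-compatible endpoint, maps onto the standard Elasticsearch query DSL. The sketch below is a hypothetical illustration of that pattern: the index, service, and field names are invented, and nothing here is ChaosSearch-specific:

```python
# Hypothetical sketch: logs live in a data lake, and the indexing service
# exposes an Elasticsearch-compatible _search API over them. The field and
# index names below are invented for illustration.
import json


def build_error_query(service: str, minutes: int = 15) -> dict:
    """Build a standard Elasticsearch query DSL body: recent ERROR logs."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"service.keyword": service}},
                    {"term": {"level.keyword": "ERROR"}},
                    {"range": {"@timestamp": {"gte": f"now-{minutes}m"}}},
                ]
            }
        },
        "size": 100,
    }


# The same JSON body can be POSTed to <endpoint>/<index>/_search with any
# HTTP client, whether the backend is Elasticsearch, OpenSearch, or an
# API-compatible layer over object storage.
body = json.dumps(build_error_query("checkout"))
```

The point of the compatibility claim in the interview is exactly this: because the request body is the ordinary query DSL, existing tooling such as Kibana or Grafana can sit in front of a different storage engine without changes.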
Now, you had to crack some code, solve some problems, invent some technology, and create some new architecture, but the idea that Elasticsearch is a useful solution, with all the tooling, the visualization, the wonderful community around it, was a good place to start. But here's the problem: setting it up, scaling it out, keeping it up. When things are happening, things go bump in the night. All those are real challenges, and one of them was just the storing of the data. Well, what if you could make S3 the back-end store? One hundred percent; no SSDs or HDDs. Makes a lot of sense.

And then support the APIs that your tooling uses. So, it just made a lot of sense, what we were trying to do; just no one thought of it. Now, if you think about the North Star you were talking about, you know, five, six years ago, when I said "transforming cloud storage into an analytical database for search and SQL," people thought that was crazy and mad. Well, now everyone's using cloud storage, everyone's using S3 as a data lake. That's not in question anymore.

But it was a question five, six years ago. So, when we met up, you were like, "Well, that makes sense." It always made sense, but people either didn't think it was possible, or worried, "I'll just try to set up an Elastic cluster and deal with it." Because that's what happens when you particularly deal with large-scale implementations. So, you know, to us, we love the Elastic API and the tooling around it, but what we all know is the cost, the time, the complexity to manage it, to scale it out; you just about want to pull your hair out. And so, that's where we come in: don't change what you do, just change how you do it.

Corey: Every once in a while, I'll talk to a client who's running an Amazon Elasticsearch cluster, and they have nothing but good things to say about it. Which, awesome.
On the one hand, part of me wishes that I had some of their secrets, but often what's happened is that they have this down to a science, they have a data lifecycle that's clearly defined and implemented, the cluster is relatively static, so resizes aren't really a thing, and it just works for their use cases. And in those scenarios, like, “Do you care about the bill?” “Not overly. We don't have to think about it.”

Great. Then why change? If there's no pain, you're not going to sell someone something, especially when we're talking, this tends to be relatively smaller-scale as well. It's okay, great, they're spending $5,000 a month on it. It doesn't necessarily justify the engineering effort to move off.

Now, when you start looking at this, and, “Huh, that's a quarter million bucks a month we're spending on this nonsense, and it goes down all the time,” yeah, that's when it starts to be one of those logical areas to start picking apart and diving into. What's also muddied the waters since the last time we really went in-depth on any of this was it used to be we would be talking about it exactly like we are right now, about how it's Elasticsearch-compatible. Technically, these days, we probably should be saying it is OpenSearch-compatible because of the trademark issues between Elastic and AWS and the schism of the OpenSearch fork of the Elasticsearch project. And now it feels like when you start putting random words in front of the word search, ChaosSearch fits right in. It feels like your star is rising.

Thomas: Yeah, no, well said. I appreciate that. You know, it's funny, when Elastic changed their license, we all didn't know what was going to happen. We knew something was going to happen, but we didn't know what was going to happen. And Amazon, I say ironically, or, more importantly, decided they'll take up the open mantle of keeping an open, free solution.

Now, obviously, they recommend running that in their cloud. Fair enough.
But I would say we don't hear as much Elastic replacement as much as OpenSearch replacement with our solution, because of all the benefits that we talked about. Because the trigger points for when folks have an issue with the OpenSearch or Elastic stack is it got too expensive, or it was changing so much and it was falling over, or the complexity of the schema changing, or all of the above. The pipelines were complex, particularly at scale.

That's both for Elasticsearch as well as OpenSearch. And so, to us, we want either to win, but we want to be the replacement because, you know, at scale is where we shine. But we have seen a real trend where we see less Elasticsearch and more OpenSearch because the community is worried about the rules that were changed, right? You see it day in, day out, where you have a community that was built around open and fair and free, and because of business models not working or the big bad so-and-so is taking advantage of it better, there's a license change. And that's a trust change.

And to us, we're following the OpenSearch path because it's still open. The 600-pound gorilla or 900-pound gorilla of Amazon. But they really held the mantle, saying, “We're going to stay open, we assume for as long as we know, and we'll follow that path. But again, at that scale, the time, the costs, we're here to help solve those problems.” Again, whether it's on Amazon or, you know, Google, et cetera.

Corey: I want to go back to what I mentioned at the start of this with the Wayback Machine and looking at how things wound up unfolding in the fullness of time. The first time that it snapshotted your site was way back in the year 2018, which—

Thomas: Nice.
[laugh].

Corey: Some of us may remember, and at that point, like, I wasn't doing any work with you, and later in time I would make fun of you folks for this, but back then your brand name was in all caps, so I would periodically say things like this episode is sponsored by our friends at [loudly] CHAOSSEARCH.

Thomas: [laugh].

Corey: And once you stopped capitalizing it and that had faded from the common awareness, it just started to look like I had the inability to control the volume of my own voice. Which, fair, but generally not mid-sentence. So, I remember those early days, but the positioning of it was, “The future of log management and analytics,” back in 2018. Skipping forward a year later, you changed this because apparently in 2019, the future was already here. And you were talking about, “Log search analytics, purpose-built for Amazon S3. Store everything, ask anything, all on your Amazon S3.”

Which is awesome. You were still—unfortunately—going by the all-caps thing, but by 2020, that wound up changing somewhat significantly. You were at that point talking about it as, “The data platform for scalable log analytics.” Okay, it's clearly heading in a log direction, and that made a whole bunch of sense. And now today, you are, “The data lake platform for analytics at scale.” So, good for you, first off. You found a voice?

Thomas: [laugh]. Well, you know, it's funny, as a product-minded person—I'll take my marketing hat off—we've been building the same solution with the same value points and benefits as we mentioned earlier, but the market resonates with different terminology. When we said something like, “Transforming your cloud object storage like S3 into an analytical database,” people were just, like, blown away. Is that even possible? Right? And so, that got some eyes.

Corey: Oh, anything is a database if you hold it wrong. Absolutely.

Thomas: [laugh]. Yeah, yeah. And then you're saying log analytics really resonated for a few years.
Data platform, you know, is broader because we do broader things. And now we see over the last few years, observability, right? How do you fit in the observability viewpoint, the stack, where log analytics is one aspect of it?

Some of our customers use Grafana on us for that lens, and then for the analysis, alerting, dashboarding. You can say that's Kibana in the hunting aspect, the log aspects. So, you know, to us, we're going to put a message out there that resonates with what we're hearing from our customers. For instance, we hear things like, “I need a security data lake. I need that. I need to stream all my data. I need to have all the data, because what happens today, I need to know about a week, two weeks, 90 days from now.”

We constantly hear, “I need at least 90 days of forensics on that data.” And it happens time and time again. We hear in the observability stack, “Hey, I love Datadog, but I can't afford it for more than a week or two.” Well, that's where we come in. And we either replace Datadog for the use cases that we support, or we're auxiliary to it.

Sometimes they have an existing Grafana implementation, and then they store data in us for the long tail. That could be the scenario. So, to us, the message is around what resonates with our customers, but in the end, it's operational data. Whether you want to call it observability, log analytics, security analytics, or the data lake, to us, it's just access to your data, all your data, all the time, and supporting the APIs and the tooling that you're using. And so, to me, it's the same product, but the market changes with messaging and requirements. And this is why we always felt that having a search and SQL platform is so key, because what you'll see in Elastic or OpenSearch is, “Well, I only support the Elastic API. I can't do correlations. I can't do this. I can't do that. I'm going to move it over to, say, maybe Athena, but not so much.
Maybe a Snowflake or something else.”

Corey: “Well, Thomas, it's very simple. Once you learn our own purpose-built, domain-specific language, specifically for our product, well, why are you still sitting here? Go learn that thing.” People aren't going to do that.

Thomas: And that's what we hear. It was funny, I won't say what the company was, a big banking company that we're talking to, and we hear time and time again, “I only want to do it via the Elastic tooling,” or, “I only want to do it via the BI tooling.” I hear it time and time again. Both of these people are in the same company.

Corey: And that's legitimate as well because there's a bunch of pre-existing processes pointing at things, and we're not going to change 200 different applications and their data model just because you want to replace a back-end system. I also want to correct myself. I was one tab behind. This year's branding is slightly different: “Search and analyze unlimited log data in your cloud object storage.” Which is, I really like the evolution on this.

Thomas: Yeah, yeah. And I love it. And what was interesting is the moving, the setting up, the doubling of your costs. Let's say you have—I mean, we deal with some big customers that have petabytes of data; doubling your petabytes, that means if your Elastic environment is costing you tens of millions and then you put it into Snowflake, that's also going to be tens of millions. And with a solution like ours, you have really cost-effective storage, right? Your cloud storage: it's secure, it's reliable, it's elastic, and you attach Chaos to get the well-known APIs that your well-known tooling can analyze.

So, to us, our evolution has been really being the end viewpoint where we started early, where the search and SQL is here today—and you know, in the future, we'll be coming out with more ML-type tooling—but we have two sides: we have the operational, security, observability. And a lot of the business side wants access to that data as well.
Maybe it's app data that they need to do analysis on for their shopping cart website, for instance.

Corey: The thing that I find curious is, the entire space has been iterating forward on trying to define observability, generally, as whatever people are already trying to sell, in many cases. And that has seemed to be a bit of a stumbling block for a lot of folks. I figured this out somewhat recently because I've built the—free for everyone to use—lasttweetinaws.com Twitter threading client.

That's deployed to 20 different AWS regions because the idea is that it should be snappy for people, no matter where they happen to be on the planet, and I use it for conferences when I travel, so great, let's get ahead of it. But that also means I've got 20 different sources of logs. And given that it's an omnibus Lambda function, it's very hard to correlate that to users, or user sessions, or even figure out where it's going. The problem I've had is, “Oh, well, this seems like something I could instrument to spray logs somewhere pretty easily, but I don't want to instrument it for 15 different observability vendors. Why don't I just use otel—or OpenTelemetry—and then tell that to throw whatever I care about to various vendors and do a bit of a bake-off?” The problem, of course, is that OpenTelemetry and Lambda seem to be going in just about opposite directions. A lot.

Thomas: So, we see the same trend of otel coming out, and you know, this is another API that I'm sure we're going to go all-in on because it's getting more and more talked about. I won't say it's the standard, but I think it's trending that way, to all your points about I need to normalize a process. But as you mentioned, we also need to correlate across the data. And this is where, you know, there are times where search and hunting and alerting is awesome and wonderful and solves all your needs, and sometimes correlation.
Imagine trying to denormalize all those logs, set up a pipeline, put it into some database, or just do a SELECT *, you know, join this to that to that, and get your answers.

And so, I think both OpenTelemetry and SQL and search all need to be played into one solution, or at least one capability, because if you're not doing that, you're creating some hodgepodge pipeline to move it around and ultimately get your questions answered. And if it takes weeks—maybe even months, depending on the scale—you may sometimes not choose to do it.

Corey: One other aspect that has always annoyed me about more or less every analytics company out there—and you folks are no exception to this—is the idea of charging per gigabyte ingested, because that inherently sets up a weird dichotomy of, well, this is costing a lot, so I should strive to log less. And that is sort of the exact opposite, not just of the direction you folks want customers to go in, but also of where customers themselves should be going. Where you diverge from an awful lot of those other companies, because of the nature of how you work, is that you don't charge them again for retention. And the fact that anything stored in ChaosSearch lives in your own S3 buckets, so you can set your own lifecycle policies and do whatever you want to do with that, is a phenomenal benefit, just because I've always had a dim view of short-lived retention periods around logs, especially around things like audit logs. And these days, getting rid of audit logging data and application logging data—especially if there's a correlation story—any sooner than three years feels like borderline malpractice.

Thomas: [laugh]. We—how many times—I mean, we've heard it time and time again: “I don't have access to that data because it was too costly.” No one says they don't want the data. They just can't afford the data.
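The correlation query Thomas describes, joining log events to session data in SQL rather than building a denormalization pipeline, can be sketched with SQLite as a stand-in backend. The tables and column names are invented for the example.

```python
# Toy illustration of log correlation via a SQL join. SQLite stands in for
# whatever SQL-capable backend holds the data; schema is hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE logs (request_id TEXT, level TEXT, message TEXT);
    CREATE TABLE sessions (request_id TEXT, user_id TEXT);
    INSERT INTO logs VALUES ('r1', 'ERROR', 'timeout'), ('r2', 'INFO', 'ok');
    INSERT INTO sessions VALUES ('r1', 'alice'), ('r2', 'bob');
""")

# "Which users saw errors?" is one join away; no reindexing, no pipeline.
rows = conn.execute("""
    SELECT s.user_id, l.message
    FROM logs l JOIN sessions s ON l.request_id = s.request_id
    WHERE l.level = 'ERROR'
""").fetchall()
print(rows)  # [('alice', 'timeout')]
```

The point of having search and SQL over the same data is exactly this: the hunting workflow uses the search API, and the correlation workflow uses a join, without moving the data between systems.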
And one of the key premises is that if you don't have all the data, you're at risk, particularly in security—I mean, even audits. I mean, so many times our customers ask us, you know, “Hey, what was this going on? What was that going on?” And because we can so cost-effectively monitor our own service, we can provide that information for them. And we hear this time and time again.

And retention is not a very sexy aspect, but it's so crucial. Anytime you look at problems with X solution or Y solution, it's the cost of the data. And this is something that we wanted to address directly. And why we make it so cost-effective and free after you ingest it is because we were using cloud storage. And it was just a great place to land the data, cost-effectively, securely.

Now, with that said, there are two types of companies I've seen. Everybody needs at least 90 days. I see it time and time again. Sure, maybe daily or weekly they do a lot of their operations, but 90 days is where it lands. But there's also a bunch of companies that need it for years, for compliance, for audit reasons.

And imagine trying to rehydrate, trying to rebuild—we have one customer—again, I won't say who—that has two petabytes of data that they rehydrate when they need it. And they say it's a nightmare. And it's growing. What if you just had it always alive, always accessible? Now, as we move from search to SQL, there are use cases where in the log world they just want to pay upfront, a fixed fee, this many dollars per terabyte, but as we get into the more ad hoc side of it, more and more folks are asking, “Can I pay per query?”

And so, you'll see coming out soon scenarios where we have a different pricing model. For logs, typically you want to pay a very consistent, you know, predetermined cost structure, but in the case of more security data lakes, where you want to go into the past and not really pay for something until you use it, that's going to be an option as well, coming out soon.
So, I would say you need both in the pricing models, but you need the data either way, right?

Corey: This episode is sponsored in part by our friends at ChaosSearch. You could run Elasticsearch or Elastic Cloud—or OpenSearch as they're calling it now—or a self-hosted ELK stack. But why? ChaosSearch gives you the same API you've come to know and tolerate, along with unlimited data retention and no data movement. Just throw your data into S3 and proceed from there as you would expect. This is great for IT operations folks, for app performance monitoring, cybersecurity. If you're using Elasticsearch, consider not running Elasticsearch. They're also available now in the AWS Marketplace if you'd prefer not to go direct and have half of whatever you pay them count towards your EDP commitment. Discover what companies like Equifax, Armor Security, and Blackboard already have. To learn more, visit chaossearch.io and tell them I sent you just so you can see them facepalm, yet again.

Corey: You'd like to hope. I mean, you could always theoretically wind up just pulling what Ubiquiti apparently did—where this came out in an indictment that was unsealed against an insider—but apparently one of their employees wound up attempting to extort them—which again, that's not their fault, to be clear—but what came out was that this person then wound up setting the CloudTrail audit log retention to one day, so there were no logs available. And then as a customer, I got an email from them saying there was no evidence that any customer data had been accessed. I mean, yeah, if you want, like, the world's most horrifyingly devilish best practice, go ahead and set your log retention to nothing, and then you too can confidently state that you have no evidence of anything untoward happening.

Contrast this with what AWS did when there was a vulnerability reported in AWS Glue.
Their analysis of it stated explicitly, “We have looked at our audit logs going back to the launch of the service and have conclusively proven that the only time this has ever happened was with the security researcher who reported the vulnerability to us, in their own account.” Yeah, one of those statements breeds an awful lot of confidence. The other one makes me think that you're basically being run by clowns.

Thomas: You know what? CloudTrail is such a crucial—particularly on Amazon, right—crucial service, because of that. We see it time and time again. And the challenge of CloudTrail is that storing it for a long period of time is costly, and the messiness, the JSON complexity—every company struggles with it. And this is how uniquely we represent information: we can model it in all its permutations. But the key thing is we can store it forever, or you can store it forever. And time and time again, CloudTrail is a key aspect to correlate—to your question—correlate this happened to that. Or do an audit on what happened two years ago.

And I got to tell you, to all our listeners out there, please store your CloudTrail data—ideally in ChaosSearch—because you're going to need it. Everyone always needs that. And I know it's hard. CloudTrail data is messy, nested JSON data that can explode; I get it. You know, there's tricks to do it manually, although quite painful. But every one of our customers is indexing CloudTrail with us because of stories like that, as well as the correlation across what maybe their application log data is saying.

Corey: I really have never regretted having extra logs lying around, especially with, to be very direct, the almost ridiculously inexpensive storage classes that S3 offers, especially since you can wind up having some of the offline retrieval stuff as part of a lifecycle policy now with intelligent tiering.
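The messy, nested CloudTrail JSON Thomas mentions is usually tamed by flattening events into dotted field names before indexing or querying. A minimal sketch, with an abbreviated, hypothetical event:

```python
# Sketch: flatten a nested CloudTrail-style event into {"a.b.c": value}
# form so it can be queried with flat field names. The sample event below
# is abbreviated and invented for the example.

def flatten(obj, prefix=""):
    """Recursively flatten nested dicts into dotted-key form."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat

event = {
    "eventName": "PutBucketLifecycle",
    "userIdentity": {"type": "IAMUser", "userName": "deploy-bot"},
    "requestParameters": {"bucketName": "example-logs"},
}
print(flatten(event)["userIdentity.userName"])  # deploy-bot
```

Real CloudTrail events are far larger and can nest arrays as well, which is exactly the "can explode" problem described above; this only shows the shape of the transformation.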
I'm a big believer in just—again—Glacier Deep Archive, at the cost of $1,000 a month per petabyte, with admittedly up to 12 hours of retrieval latency. But that's still, for audit logs and stuff like that, why would I ever want to delete things ever again?

Thomas: You're exactly right. And we have a bunch of customers that do exactly that. And we automate the entire process with you. Obviously, it's your S3 account, but we can manage across those tiers. And it's just to a point where, why wouldn't you? It's so cost-effective.

And in the moments where you don't have that information, you're at risk, whether it's internal audits or you're providing a service for somebody. It's critical data. With CloudTrail, it's critical data. And if you're not storing it, and if you're not making it accessible through some tool like an Elastic API or Chaos, it's not worth it. I think, to your point about your story, it's epically not worth it.

Corey: It's really not. It's one of those areas where that is not a place to overly cost optimize. This is—I mean, we talked earlier about my business and perceptions of conflict of interest. There's a reason that I only ever charge fixed-fee and not percentage of savings or whatnot, because at some point, I'll be placed in a position of having to say nonsense like, “Do you really need all of these backups?” That doesn't make sense at that point.

I do point out things like, you have hourly disk snapshots of your entire web fleet, which have no irreplaceable data on them, dating back five years. Maybe cleaning some of that up might be the right answer. The happy answer is somewhere in between those two, and it's a business decision around exactly where that line lies. But I'm a believer in never regretting having kept logs almost into perpetuity. Until and unless I start getting more or less pillaged by some particularly rapacious vendor that's, oh yeah, we're going to charge you not just for ingest, but also for retention.
And for how long you want to keep it, we're going to treat it like we're carving it into platinum tablets. No. Stop that.

Thomas: [laugh]. Well, you know, it's funny, when we first came out, we were hearing stories that vendors were telling customers why they didn't need their data, to your point, like, “Oh, you don't need that,” or, “Don't worry about that.” And time and time again, they said, “Well, turns out we did need that.” You know, “Oh, don't index all your data because you just know what you know.” And the problem is that life doesn't work out that way; business doesn't work out that way.

And now what I see in the market is everyone's got tiering scenarios, but the accessibility of that data takes some time to get access to. And these are all workarounds and bandaids to what fundamentally is, if you design an architecture and a solution in such a way, maybe it's just always hot; maybe it's just always available. Now, we talked about tiering off to something very, very cheap, then it's virtually free. But you know, whether it's ultra warm or this tiering that takes hours to rehydrate—hours—no one wants to live in that world, right? They just want to say, “Hey, on this date in this year, what was happening? And let me go look, and I want to do it now.”

And it has to be part of the exact same system that I was using already. I didn't have to call up IT to say, “Hey, can you rehydrate this?” Or, “Can I go back to the archive and look at it?” Although I guess we're talking about archiving with your website, viewing from days of old; I think that's kind of funny. I should do that more often myself.

Corey: I really wish that more companies would put themselves in the customers' shoes. And for what it's worth, periodically, I've spoken to a number of very happy ChaosSearch customers. I haven't spoken to any angry ones yet, which tells me you're either terrific at crisis comms, or the product itself functions as intended.
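The retention setup discussed a little earlier, keeping log objects in S3 and transitioning them to Glacier Deep Archive for long-term audit storage, can be expressed as an S3 lifecycle rule. This is a sketch: the bucket name and prefix are hypothetical, and the boto3 call at the end is shown but commented out so the example stays self-contained.

```python
# Sketch of a lifecycle rule that parks objects under logs/ in the
# cheapest S3 tier after 90 days of hot access. Retrieval from Deep
# Archive can take up to 12 hours, as noted in the conversation.

lifecycle = {
    "Rules": [
        {
            "ID": "archive-audit-logs",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [
                {"Days": 90, "StorageClass": "DEEP_ARCHIVE"}
            ],
        }
    ]
}

# Applying it would look like this (requires credentials and a real bucket):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-audit-logs",
#     LifecycleConfiguration=lifecycle,
# )
print(lifecycle["Rules"][0]["Transitions"][0])
```

Because the rule never includes an expiration action, nothing is deleted: objects just get cheaper to keep, which is the "why would I ever delete things again" posture Corey advocates.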
So, either way, excellent job. Now, which team of yours is doing that excellent job, of course, is going to depend on which one of those outcomes it is. But I'm pretty good at ferreting out stories on those things.

Thomas: Well, you know, it's funny, being a company that's driven by customer ask, it's so easy to build what the customer wants. And so, we really take every input of what the customer needs and wants—now, there are cases where we replace Splunk. They're the Cadillac, they have all the bells and whistles, and there's times where we'll say, “Listen, that's not what we're going to do. We're going to solve these problems in this vector.” But they always keep on asking, right? You know, “I want this, I want that.”

But most of the feedback we get is exactly what we should be building. People need their answers and how they get them. It's really helped us grow as a company, grow as a product. And I will say, ever since we went live now many, many years ago, all of our roadmap—other than our North Star of transforming cloud storage into a search and SQL big data analytics database—has been customer-driven, market- and customer-driven, like what our customers are asking for, whether it's observability and integrating with Grafana and Kibana or, you know, security data lakes. It's just a huge theme that we're going to make sure that we provide a solution that meets those needs.

So, I love when customers ask for stuff because the product just gets better. I mean, yeah, sometimes you have to have a thick skin, like, “Why don't you have this?” Or, “Why don't you have that?” Or we have customers—and not to complain about customers; I love our customers—but they sometimes do crazy things that we have to help them un-crazy-ify. [laugh]. I'll leave it at that. But customers do silly things and you have to help them out. I hope they remember that, so when they ask for a feature that maybe takes a month to make available, they're patient with us.

Corey: We sure can hope.
I really want to thank you for taking so much time to once again suffer all of my criticisms, slings and arrows, blithe market observations, et cetera, et cetera. If people want to learn more, where's the best place to find you?

Thomas: Well, of course, chaossearch.io. There's tons of material about what we do, use cases, case studies; we just published a big case study with Equifax recently. We're in Gartner and a whole bunch of Hype Cycles that you can pull down to see how we fit in the market.

Reach out to us. You can set up a trial, kick the tires, again, on your cloud storage like S3. And ChaosSearch is on Twitter, we have a Facebook, we have all the classic social media. But our website is really where all the good content is, whether you want to learn about the architecture and how we've done it, and use cases; people who want to say, “Hey, I have a problem. How do you solve it? How do I learn more?”

Corey: And we will, of course, put links to that in the show notes. For my own purposes, you could also just search for the term ChaosSearch in your email inbox and find one of their sponsored ads in my newsletter and click that link, but that's a little self-serving as we do it. I'm kidding. I'm kidding. There's no need to do that. That is not how we ever evaluate these things. But it is funny to tell that story. Thomas, thank you so much for your time. As always, it's appreciated.

Thomas: Corey Quinn, I truly enjoyed this time. And I look forward to the upcoming re:Invent. I'm assuming it's going to be live like last year, and this is where we have a lot of fun with the community.

Corey: Oh, I have no doubt that we're about to go through that particular path very soon. Thank you. It's been an absolute pleasure.

Thomas: Thank you.

Corey: Thomas Hazel, CTO and Founder of ChaosSearch. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud.
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that I will then set to have a retention period of one day, and then go on to claim that I have received no negative feedback.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.

Spielsinn Podcast
#34) Podpüree - Teil 3: CHI 2022


Jun 1, 2022 · 41:23


The Conference on Human Factors in Computing Systems (CHI) is the international flagship conference in the field of human-computer interaction, organized by the corresponding Special Interest Group of the Association for Computing Machinery (ACM). In 2022 it took place from April 30 to May 5 in New Orleans, as well as online, and offered more than 400 sessions. These included over 600 scientific publications, each conveniently summarized in roughly 8-minute videos that are publicly accessible to everyone. Philip picked out the ones that fit the podcast's topics and gives an insight into papers on virtual reality, game elements and mechanics, the learnability of digital games, dark patterns, the influence of the pandemic, and other intersections between human-computer interaction and games. Among other things, we also discuss the transfer of digital artifacts or experiences from the game world into real life, how digital location-based games can be interesting for companies, why music and sounds enrich the experience of board games, and how feedback from digital games can be made even more immersive. Finally, we look to the future and give playful event tips for the current year: Game Starter UI & UX Design (June 8, online), TwitchCon (July 16-17, Amsterdam), StartPlay (August 5-6, Koblenz), devcom (August 22-26, Cologne + online), Mensch und Computer (September 4-7, Darmstadt), Clash of Realities (September 28-30, Cologne/hybrid?), NordiCHI (October 8-12, Aarhus), CHI Play (November 2-5, Bremen + online). More Spielsinn on: Ben and Guild Wars (#0), BoardGameArena & online board games (Bonus #1), Playing in the City (#13), Onboarding (#14), Animal Crossing (#15), Twitch (#17, #32), HCI research in Siegen: Gender & Diversity (#20), Playful Human-Food-Interaction (#22), Technology Use in Prison (#25).
EPISODE #34) What could be better than a carefully harvested batch of hand-picked potatoes, blended into a tasty purée and served in neat portions? Plenty of things, surely! But as everyone knows, you eat what's put on the table. We have traded the potatoes for field reports and recommendations and present you this month with a podcast that is, voilà, a pod purée. -- All links to the podcast, incl. Discord, socials & e-mail contact, can be found at linktr.ee/spielsinn.podcast

Intel on AI
Machine Learning and Molecular Simulation – Intel on AI Season 3, Episode 10


May 4, 2022 · 59:34


In this episode of Intel on AI, host Amir Khosrowshahi talks with Ron Dror about breakthroughs in computational biology and molecular simulation. Ron is an Associate Professor of Computer Science in the Stanford Artificial Intelligence Lab, leading a research group in cellular physiology and structural biology using molecular simulation and machine learning. Previously, Ron worked on the Anton supercomputer at D.E. Shaw Research after earning degrees in the fields of electrical engineering, computer science, and biological sciences from MIT, Cambridge, and Rice. His groundbreaking research has been featured in publications such as Science and Nature, presented at conferences like Neural Information Processing Systems (NeurIPS), and won awards from the Association for Computing Machinery (ACM) and others. In the podcast episode, Ron talks about his work with several important collaborators, his interdisciplinary approach to research, and how molecular modeling has improved over the years. He goes into detail about the gen-over-gen advancements made in the Anton supercomputer, including the Desmond software, and his recent work at Stanford with molecular dynamics simulations. The podcast closes with Amir asking detailed questions about Ron and his team's recent paper concerning RNA structure that was featured on the cover of Science.
Academic research discussed in the podcast episode:
- Statistics of real-world illumination
- The Role of Natural Image Statistics in Biological Motion Estimation
- Surface reflectance recognition and real-world illumination statistics
- Accuracy of velocity estimation by Reichardt correlators
- Principles of Neural Design
- Levinthal's paradox
- Potassium channels
- Structural and Thermodynamic Properties of Selective Ion Binding in a K+ Channel
- Scalable Algorithms for Molecular Dynamics Simulations on Commodity Clusters
- Long-timescale molecular dynamics simulations of protein structure and function
- Parallel random numbers: as easy as 1, 2, 3
- Biomolecular Simulation: A Computational Microscope for Molecular Biology
- Anton 2: Raising the Bar for Performance and Programmability in a Special-Purpose Molecular Dynamics Supercomputer
- Molecular Dynamics Simulation for All
- Structural basis for nucleotide exchange in heterotrimeric G proteins
- How GPCR Phosphorylation Patterns Orchestrate Arrestin-Mediated Signaling
- Highly accurate protein structure prediction with AlphaFold
- ATOM3D: Tasks on Molecules in Three Dimensions
- Geometric deep learning of RNA structure

Screaming in the Cloud
Keeping the Chaos Searchable with Thomas Hazel


Nov 30, 2021 · 44:43


About Thomas
Thomas Hazel is Founder, CTO, and Chief Scientist of ChaosSearch. He is a serial entrepreneur at the forefront of communication, virtualization, and database technology and the inventor of ChaosSearch's patented IP. Thomas has also patented several other technologies in the areas of distributed algorithms, virtualization, and database science. He holds a Bachelor of Science in Computer Science from the University of New Hampshire, where he is a Hall of Fame Alumni Inductee, and he founded both student and professional chapters of the Association for Computing Machinery (ACM).

Links:
ChaosSearch: https://www.chaossearch.io

Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is sponsored in part by my friends at ThinkstCanary. Most companies find out way too late that they've been breached. ThinkstCanary changes this, and I love how they do it. Deploy canaries and canary tokens in minutes and then forget about them. What's great is the attackers tip their hand by touching them, giving you one alert, when it matters. I use it myself, and I only remember this when I get the weekly update with a “we're still here, so you're aware” from them. It's glorious! There is zero admin overhead to this; there are effectively no false positives unless I do something foolish. Canaries are deployed and loved on all seven continents. You can check out what people are saying at canary.love. And their kubeconfig canary token is new and completely free as well. You can do an awful lot without paying them a dime, which is one of the things I love about them. It is useful stuff and not an, “ohh, I wish I had money.” It is spectacular!
Take a look; that's canary.love because it's genuinely rare to find a security product that people talk about in terms of love. It really is a unique thing to see. Canary.love. Thank you to ThinkstCanary for their support of my ridiculous, ridiculous nonsense.

Corey: This episode is sponsored in part by our friends at Vultr. Spelled V-U-L-T-R because they're all about helping save money, including on things like, you know, vowels. So, what they do is they are a cloud provider that provides surprisingly high performance cloud compute at a price that—while sure they claim it's better than AWS pricing—and when they say that they mean it is less money. Sure, I don't dispute that, but what I find interesting is that it's predictable. They tell you in advance on a monthly basis what it's going to cost. They have a bunch of advanced networking features. They have nineteen global locations and scale things elastically. Not to be confused with openly, because apparently elastic and open can mean the same thing sometimes. They have had over a million users. Deployments take less than sixty seconds across twelve pre-selected operating systems. Or, if you're one of those nutters like me, you can bring your own ISO and install basically any operating system you want. Starting with pricing as low as $2.50 a month for Vultr cloud compute, they have plans for developers and businesses of all sizes, except maybe Amazon, who stubbornly insists on having something to scale all on their own. Try Vultr today for free by visiting vultr.com/screaming, and you'll receive $100 in credit. That's v-u-l-t-r.com slash screaming.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn.
This promoted episode is brought to us by our friends at ChaosSearch. We've been working with them for a long time; they've sponsored a bunch of our nonsense, and it turns out that we've been talking about them to our clients since long before they were a sponsor because it actually does what it says on the tin. Here to talk to us about that in a few minutes is Thomas Hazel, ChaosSearch's CTO and founder. First, Thomas, nice to talk to you again, and as always, thanks for humoring me.

Thomas: [laugh]. Hi, Corey. Always great to talk to you. And I enjoy these conversations that sometimes go up and down, left and right, but I look forward to all the fun we're going to have.

Corey: So, my understanding of ChaosSearch is probably a few years old because it turns out, I don't spend a whole lot of time meticulously studying your company's roadmap in the same way that you presumably do. When last we checked in with what the service did-slash-does, you are effectively solving the problem of data movement and querying that data. The idea behind data warehouses is generally something that's shoved onto us by cloud providers where, “Hey, this data is going to be valuable to you someday.” Data science teams are big proponents of this because when you're storing that much data, their salaries look relatively reasonable by comparison. And the ChaosSearch vision was, instead of copying all this data out of an object store and storing it on expensive disks, and replicating it, et cetera, what if we queried it in place in a somewhat intelligent manner?So, you take the data and you store it, in this case, in S3 or equivalent, and then just query it there, rather than having to move it around all over the place, which of course, then incurs data transfer fees, you're storing it multiple times, and it's never in quite the format that you want it. That was the breakthrough revelation, you were Elasticsearch—now OpenSearch—API compatible, which was great.
And that was, sort of, the state of the art a year or two ago. Is that generally correct?

Thomas: No, you nailed our mission statement. No, you're exactly right. You know, the value of cloud object stores, S3, the elasticity, the durability, all these wonderful things, the problem was you couldn't get any value out of it, and you had to move it out to these siloed solutions, as you indicated. So, you know, our mission was exactly that, transformed customers' cloud storage into an analytical database, a multi-model analytical database, where our first use case was search and log analytics, replacing the ELK stack and also replacing the data pipeline, the schema management, et cetera. We automate the entire step, raw data to insights.

Corey: It's funny we're having this conversation today. Earlier today, I was trying to get rid of a relatively paltry 200 gigs or so of small files on an EFS volume—you know, Amazon's version of NFS; it's like an NFS volume except you're paying Amazon for the privilege—great. And it turns out that it's a whole bunch of operations across a network on a whole bunch of tiny files, so I had to spin up other instances that were not subject to spot terminations, and just firing up a whole bunch of threads. So, now the load average on that box is approaching 300, but it's plowing through, getting rid of that data finally. And I'm looking at this saying this is a quarter of a terabyte. Data warehouses are in the petabyte range. Oh, I begin to see aspects of the problem. Even searching that kind of data using traditional tooling starts to break down, which is sort of the revelation that Google had 20-some-odd years ago, and other folks have since solved for, but this is the first time I've had significant data that wasn't just easily searched with a grep. For those of you in the Unix world who understand what that means, condolences. We're having a support group meeting at the bar.

Thomas: Yeah.
And you know, I always thought, what if you could make cloud object storage like S3 high performance and really transform it into a database? And so that warehouse capability, that's great. We like that. However, to manage it, to scale it, to configure it, to get the data into that, was the problem. That was the promise of a data lake, right? This simple in, and then this arbitrary schema on read generic out. The problem next came, it became swampy, it was really hard, and that promise was not delivered. And so what we're trying to do is get all the benefits of the data lake: simple in, so many services naturally stream to cloud storage. Shoot, I would say every one of our customers is putting their data in cloud storage because their data pipeline to their warehousing solution or Elasticsearch may go down and they're worried they'll lose the data. So, what we say is what if you just said activate that data lake and get that ELK use case, get that BI use case without that data movement, as you indicated, without that ETL-ing, without that data pipeline that you're worried is going to fall over. So, that vision has been Chaos. Now, we haven't talked in, you know, a few years, but this idea that we're growing beyond what we are just going after logs, we're going into new use cases, new opportunities, and I'm looking forward to discussing with you.

Corey: It's a great answer that—though I have to call out that I am right there with you as far as inappropriately using things as databases. I know that someone is going to come back and say, “Oh, S3 is a database. You're dancing around it. Isn't that what Athena is?” Which is named, of course, after the Greek Goddess of spending money on AWS? And that is a fair question, but to my understanding, there's a schema story behind that which does not apply to what you're doing.

Thomas: Yeah, and that is so crucial is that we like the relational access.
The time-cost complexity to get it into that, as you mentioned, scaled access, I mean, it could take weeks, months to test it, to configure it, to provision it, and imagine if you got it wrong; you got to redo it again. And so our unique service removes all that data pipeline schema management. And because of our innovation, because of our service, you do all schema definition, on the fly, virtually, what we call views on your index data, that you can publish an elastic index pattern for that consumption, or a relational table for that consumption. And that's kind of leading the witness into things that we're coming out with this quarter into 2022.

Corey: I have to deal with a little bit of, I guess, a shame here because yeah, I'm doing exactly what you just described. I'm using Athena to wind up querying our customers' Cost and Usage Reports, and we spend a couple hundred bucks a month on AWS Glue to wind up massaging those into the way that they expect it to be. And it's great. Ish. We hook it up to Tableau and can make those queries from it, and all right, it's great. It just, burrr goes the money printer, and we somehow get access and insight into a lot of valuable data. But even that is knowing exactly what the format is going to look like. Ish. I mean, Cost and Usage Reports from Amazon are sort of aspirational when it comes to schema sometimes, but here we are. And that's been all well and good. But now the idea of log files, even looking at the base case of sending logs from an application, great. Nginx, or Apache, or [unintelligible 00:07:24], or any of the various web servers out there all tend to use different logging formats just to describe the same exact things, start spreading that across custom in-house applications and getting signal from that is almost impossible.
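Corey's point about every web server describing the same request differently is the crux of schema-on-read. The sketch below is purely illustrative (the regex is a simplified "combined"-style pattern and the JSON field names are invented, not any real application's format): two lines describing the same kind of request, normalized into one view only at read time.

```python
import json
import re

# Simplified "combined"-style access log pattern (Nginx and Apache both emit variants of this).
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<bytes>\d+)'
)

def normalize(line: str) -> dict:
    """Map heterogeneous log lines onto one schema at read time (schema-on-read)."""
    line = line.strip()
    if line.startswith("{"):  # hypothetical structured JSON app log
        rec = json.loads(line)
        return {"ip": rec["client"], "method": rec["verb"],
                "path": rec["uri"], "status": int(rec["code"])}
    m = COMBINED.match(line)  # classic access-log text
    if m is None:
        raise ValueError(f"unrecognized log format: {line!r}")
    return {"ip": m["ip"], "method": m["method"],
            "path": m["path"], "status": int(m["status"])}

logs = [
    '192.0.2.1 - - [30/Nov/2021:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 5123',
    '{"client": "192.0.2.2", "verb": "POST", "uri": "/api/pay", "code": 503}',
]
rows = [normalize(l) for l in logs]
```

The point of the sketch is where the work lands: the raw lines stay raw, and the unifying schema exists only in the `normalize` step, which is roughly what a "view" over indexed raw data buys you.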
“Oh,” people say, “So, we'll use a structured data format.” Now, you're putting log and structuring requirements on application developers who don't care in the first place, and now you have a mess on your hands.

Thomas: And it really is a mess. And that challenge is, it's so problematic. And schemas changing. You know, we have customers and one of the reasons why they go with us is their log data is changing; they didn't expect it. Well, in your data pipeline, and your Athena database, that breaks. That brings the system down. And so our system uniquely detects that and manages that for you and then you can pick and choose how you want to export in these views dynamically. So, you know, it's really not rocket science, but the problem is, a lot of the technology that we're using is designed for static, fixed thinking. And then to scale it is problematic and time-consuming. So, you know, Glue is a great idea, but it has a lot of sharp [pebbles 00:08:26]. Athena is a great idea but also has a lot of problems. And so that data pipeline, you know, it's not for digitally native, active, new use cases, new workloads coming up hourly, daily. You think about this long-term; so a lot of that data prep pipelining is something we address so uniquely, but really where the customer cares is the value of that data, right? And so if you're spending toil trying to get the data into a database, you're not answering the questions, whether it's for security, for performance, for your business needs. That's the problem. And you know, that agility, that time-to-value is where we're very uniquely coming in because we start where your data is raw and we automate the process all the way through.

Corey: So, when I look at the things that I have stuffed into S3, they generally fall into a couple of categories.
There are a bunch of logs for things I never asked for nor particularly wanted, but AWS is aggressive about that, first routing through CloudTrail so you can get charged 50 cents per gigabyte ingested. Awesome. And of course, large static assets, images I have done something to, colloquially now known as shitposts, which is great. Other than logs, what could you possibly be storing in S3 that lends itself to, effectively, the type of analysis that you built around this?

Thomas: Well, our first use case was the classic log use cases, app logs, web service logs. I mean, CloudTrail, it's famous; we had customers that gave up on elastic, and definitely gave up on relational where you can do a couple changes and your permutation of attributes for CloudTrail is going to put you to your knees. And people just say, “I give up.” Same thing with Kubernetes logs. And so it's the classic—whether it's CSV, whether it's JSON, whether it's log types, we auto-discover all that. We also allow you, if you want to override that and change the parsing capabilities through a UI wizard, we do discover what's in your buckets. That term data swamp, and not knowing what's in your bucket, we do a facility that will index that data, actually create a report for you for knowing what's in. Now, if you have text data, if you have log data, if you have BI data, we can bring it all together, but the real pain is at the scale. So classically, app logs, system logs, many devices sending IoT-type streams is where we really come in—Kubernetes—where they're dealing with terabytes of data per day, and managing an ELK cluster at that scale. Particularly on a Black Friday. Shoot, some of our customers like—Klarna is one of them; credit card payment—they're ramping up for Black Friday, and one of the reasons why they chose us is our ability to scale when maybe you're doing a terabyte or two a day and then it goes up to twenty, twenty-five. How do you test that scale? How do you manage that scale?
And so for us, the data streams are, traditionally with our customers, the well-known log types, at least in the log use cases. And the challenge is scaling it, is getting access to it, and that's where we come in.

Corey: I will say the last time you were on the show a couple of years ago, you were talking about the initial logging use case and you were speaking, in many cases aspirationally, about where things were going. What a difference a couple of years has made. Instead of talking about what hypothetical customers might want, or what—might be able to do, you're just able to name-drop them off the top of your head, you have scaled to approximately ten times the number of employees you had back then. You've—

Thomas: Yep. Yep.

Corey: —raised, I think, a total of—what, 50 million?—since then.

Thomas: Uh, 60 now. Yeah.

Corey: Oh, 60? Fantastic.

Thomas: Yeah, yeah.

Corey: Congrats. And of course, how do you do it? By sponsoring Last Week in AWS, as everyone should. I'm taking clear credit for that every time someone announces a round, that's the game. But no, there is validity to it because telling fun stories and sponsoring exciting things like this only carry you so far. At some point, customers have to say, yeah, this is solving a pain that I have; I'm willing to pay you money to solve it. And you've clearly gotten to a point where you are addressing the needs of those customers at a pretty fascinating clip. It's bittersweet from my perspective because it seems like the majority of your customers have not come from my nonsense anymore. They're finding you through word of mouth, they're finding you through more traditional—read as boring—ad campaigns, et cetera, et cetera. But you've built a brand that extends beyond just me. I'm no longer viewed as the de facto ombudsperson for any issue someone might have with ChaosSearch on Twitters. It's kind of, “Aww, the company grew up. What happened there?”

Thomas: No, [laugh] listen, you were great.
We reached out to you to tell our story, and I got to be honest. A lot of people came by, said, “I heard something on Corey Quinn's podcasts,” or et cetera. And it came a long way now. Now, we have, you know, companies like Equifax, multi-cloud—Amazon and Google. They love the data lake philosophy, the centralized, where use cases are now available within days, not weeks and months. Whether it's logs and BI. Correlating across all those data streams, it's huge. We mentioned Klarna, [APM Performance 00:13:19], and, you know, we have Armor for SIEM, and Blackboard for [Observers 00:13:24]. So, it's funny—yeah, it's funny, when I first was talking to you, I was like, “What if? What if we had this customer, that customer?” And we were building the capabilities, but now that we have it, now that we have customers, yeah, I guess, maybe we've grown up a little bit. But hey, listen, you're always near and dear to our heart because we remember, you know, when you stopped by our booth at re:Invent several times. And we're coming to re:Invent this year, and I believe you are as well.

Corey: Oh, yeah. But people listening to this, if they're listening the day it's released, this will be during re:Invent. So, by all means, come by the ChaosSearch booth, and see what they have to say. For once they have people who aren't me who are going to be telling stories about these things. And it's fun. Like, I joke, it's nothing but positive here. It's interesting from where I sit seeing the parallels here. For example, we have both had—how we say—adult supervision come in. You have a CEO, Ed, who came over from IBM Storage. I have Mike Julian, whose first love language is of course spreadsheets. And it's great, on some level, realizing that, wow, this company has eclipsed my ability to manage these things myself and put my hands on everything. And eventually, you have to start letting go. It's a weird growth stage, and it's a heck of a transition. But—

Thomas: No, I love it.
You know, I mean, I think when we were talking, we were maybe 15 employees. Now, we're pushing 100. We brought on Ed Walsh, who's an amazing CEO. It's funny, I told him about this idea, I invented this technology roughly eight years ago, and he's like, “I love it. Let's do it.” And I wasn't ready to do it. So, you know, five, six years ago, I started the company always knowing that, you know, I'd give him a call once we got the plane up in the air. And it's been great to have him here because the next level up, right, of execution and growth and business development and sales and marketing. So, you're exactly right. I mean, we were a young pup several years ago, when we were talking to you and, you know, we're a little bit older, a little bit wiser. But no, it's great to have Ed here. And just the leadership in general; we've grown immensely.

Corey: Now, we are recording this in advance of re:Invent, so there's always the question of, “Wow, are we going to look really silly based upon what is being announced when this airs?” Because it's very hard to predict some things that AWS does. And let's be clear, I always stay away from predictions, just because first, I have a bit of a knack for being right. But also, when I'm right, people will think, “Oh, Corey must have known about that and is leaking,” whereas if I get it wrong, I just look like a fool. There's no win for me if I start doing the predictive dance on stuff like that. But I have to level with you, I have been somewhat surprised that, at least as of this recording, AWS has not moved more in your direction because storing data in S3 is kind of their whole thing, and querying that data through something that isn't Athena has been a bit of a reach for them that they're slowly starting to wrap their heads around.
But their UltraWarm nonsense—which is just, okay, great naming there—what is the point of continually having a model where oh, yeah, we're going to just age it out, the stuff that isn't actively being used into S3, rather than coming up with a way to query it there. Because you've done exactly that, and please don't take this as anything other than a statement of fact, they have better access to what S3 is doing than you do. You're forced to deal with this thing entirely from a public API standpoint, which is fine. They can theoretically change the behavior of aspects of S3 to unlock these use cases if they chose to do so. And they haven't. Why is it that you're the only folks that are doing this?

Thomas: No, it's a great question, and I'll give them props for continuing to push the data lake [unintelligible 00:17:09] to the cloud providers' S3 because it was really where I saw the world. Lakes, I believe in. I love them. They love them. However, they promote moving the data out to get access, and it seems so counterintuitive on why wouldn't you leave it in and put these services, make them more intelligent? So, it's funny, I've trademarked ‘Smart Object Storage,' I actually trademarked—I think you [laugh] were a part of this—‘UltraHot,' right? Because why would you want UltraWarm when you can have UltraHot? And the reason, I feel, is that if you're using Parquet for Athena [unintelligible 00:17:40] store, or Lucene for Elasticsearch, these two index technologies were not designed for cloud storage, for real-time streaming off of cloud storage. So, the trick is, you have to build UltraWarm, get it off of what they consider cold S3 into warmer memory or SSD-type access. What we did, what the invention I created was, that first read is hot. That first read is fast. Snowflake is a good example. They give you a ten terabyte demo example, and if you have a big instance and you do that first query, maybe several orders or groups, it could take an hour to warm up.
The second query is fast. Well, what if the first query is in seconds as well? And that's where we really spent the last five, six years building out the tech and the vision behind this because I like to say you go to a doctor and say, “Hey, Doc, every single time I move my arm, it hurts.” And the doctor says, “Well, don't move your arm.” It's things like that, to your point, it's like, why wouldn't they? I would argue, one, you have to believe it's possible—we're proving that it is—and two, you have to have the technology to do it. Not just the index, but the architecture. So, I believe they will go this direction. You know, little birdies always say that all these companies understand this need. Shoot, Snowflake is trying to be lake-y; Databricks is trying to really bring this warehouse lake concept. But you still do all the pipelining; you still have to do all the data management the way that you don't want to do. It's not a lake. And so my argument is that it's innovation on why. Now, they have money; they have time, but, you know, we have a big head start.

Corey: I remember last year at re:Invent they released a, shall we say, significant change to S3 that enabled read-after-write consistency, which is awesome, for again, those of us in the business of misusing things as databases. But for some folks, the majority of folks I would say, it was a, “I don't know what that means and therefore I don't care.” And that's fine. I have no issue with that. There are other folks, some of my customers for example, who are suddenly, “Wait a minute. This means I can sunset this entire janky sidecar metadata system that is designed to make sure that we are consistent in our use of S3 because it now does it automatically under the hood?” And that's awesome. Does that change mean anything for ChaosSearch?

Thomas: It doesn't because of our architecture. We're an append-only, write-once scenario, so there's not a lot of update-in-place.
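The append-only, write-once point can be made concrete with a toy sketch. This is an illustration only, not ChaosSearch's actual index layout (the key naming and the minimal put/get/list surface here are invented): when every write lands on a brand-new key, consistency of overwrites simply never comes into play.

```python
import itertools

class AppendOnlyStore:
    """Toy object store exposing only the minimal surface: put, get, list.
    Each put writes a brand-new key, so nothing is ever updated in place."""

    def __init__(self):
        self._objects = {}
        self._seq = itertools.count()

    def put(self, prefix: str, data: bytes) -> str:
        key = f"{prefix}/segment-{next(self._seq):08d}"  # fresh key every time
        assert key not in self._objects                  # write-once invariant
        self._objects[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def list(self, prefix: str) -> list:
        return sorted(k for k in self._objects if k.startswith(prefix))

store = AppendOnlyStore()
k1 = store.put("indices/logs", b"index segment 1")
k2 = store.put("indices/logs", b"index segment 2")
```

Because no key is ever rewritten, a reader that lists the prefix and fetches what it finds can never observe a stale overwrite; the design sidesteps the consistency question rather than solving it.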
My viewpoint is that if you're seeing S3 as the database and you need that type of consistency, it makes sense why you'd want it, but because of our distributive fabric, our stateless architecture, our append-only nature, it really doesn't affect us. Now, I talked to the S3 team, I said, “Please if you're coming up with this feature, it better not be slower.” I want S3 to be fast, right? And they said, “No, no. It won't affect performance.” I'm like, “Okay. Let's keep that up.” And so to us, any type of S3 capability, we'll take advantage of it if it benefits us, whether it's consistency as you indicated, performance, functionality. But we really keep the constructs of S3 access to really limited features: list, put, get. [roll-on 00:20:49] policies to give us read-only access to your data, and a location to write our indices into your account, and then our distributed fabric, our service, accesses those indices and queries or searches them to resolve whatever analytics you need. So, we made it pretty simple, and that has allowed us to make it high performance.

Corey: I'll take it a step further because you want to talk about changes since the last time we spoke, it used to be that this was on top of S3, you can store your data anywhere you want, as long as it's S3 in the customer's account. Now, you're also supporting one-click integration with Google Cloud's object storage, which, great. That does mean though, that you're not dependent upon provider-specific implementations of things like a consistency model for how you've built things. It really does use the lowest common denominator—to my understanding—of object stores. Is that something that you're seeing broad adoption of, or is this one of those areas where, well, you have one customer on a different provider, but almost everything lives on the primary? I'm curious what you're seeing for adoption models across multiple providers?

Thomas: It's a great question.
We built an architecture purposely to be cloud-agnostic. I mean, we use compute in a containerized way, we use object storage in a very simple construct—put, get, list—and we went over to Google because that made sense, right? We have customers on both sides. I would say Amazon is the gorilla, but Google's trying to get there and growing. We had a big customer, Equifax, that's on both Amazon and Google, but we offer the same service. To be frank, it looks like the exact same product. And it should, right? Whether it's Amazon Cloud, or Google Cloud, multi-select and I want to choose either one and get the other one. I would say that different business types are using each one, but the bulk of the business is in Amazon, but we just this summer released our SaaS offerings, so it's growing. And you know, it's funny, you never know where it comes from. So, we have one customer—actually DigitalRiver—as one of our customers on Amazon for logs, but we're growing in working together to do a BI on GCP or on Google. And so it's kind of funny; they have two departments on two different clouds with two different use cases. And so do they want unification? I'm not sure, but they definitely have their BI on Google and their operations in Amazon. It's interesting.

Corey: You know it's important to me that people learn how to use the cloud effectively. That's why I'm so glad that Cloud Academy is sponsoring my ridiculous nonsense. They're a great way to build in-demand tech skills the way that, well, personally, I learn best, which is by doing, not by reading. They have live cloud labs that you can run in real environments that aren't going to blow up your own bill—I can't stress how important that is. Visit cloudacademy.com/corey. That's C-O-R-E-Y, don't drop the “E.” Use Corey as a promo code as well. You're going to get a bunch of discounts on it with a lifetime deal—the price will not go up.
It is limited time; they assured me this is not one of those things that is going to wind up being a rug pull scenario, oh no no. Talk to them, tell me what you think. Visit cloudacademy.com/corey, C-O-R-E-Y, and tell them that I sent you!

Corey: I know that I'm going to get letters for this. So, let me just call it out right now. Because I've been a big advocate of pick a provider—I care not which one—and go all-in on it. And I'm sitting here congratulating you on extending to another provider, and people are going to say, “Ah, you're being inconsistent.” No. I'm suggesting that you as a provider have to meet your customers where they are because if someone is sitting in GCP and your entire approach is, “Step one, migrate those four petabytes of data right on over here to AWS,” they're going to call you that jackhole that you would be by making that suggestion and go immediately for option B, which is literally anything that is not ChaosSearch, just based upon that core misunderstanding of their business constraints. That is the way to think about these things. For a vendor position that you are in as an ISV—Independent Software Vendor for those not up on the lingo of this ridiculous industry—you have to meet customers where they are. And it's the right move.

Thomas: Well, you just said it. Imagine moving terabytes and petabytes of data.

Corey: It sounds terrific if I'm a salesperson for one of these companies working on commission, but for the rest of us, it sounds awful.

Thomas: We really are a data fabric across clouds, within clouds. We're going to go where the data is and we're going to provide access to where that data lives. Our whole philosophy is the no-movement movement, right? Don't move your data. Leave it where it is and provide access at scale. And so you may have services in Google that naturally stream to GCS; let's do it there. Imagine moving that amount of data over to Amazon to analyze it, and vice versa. 2022, we're going to be in Azure.
They're a totally different type of business, users, and personas, but you're getting asked, “Can you support Azure?” And the answer is, “Yes,” and, “We will in 2022.” So, to us, if you have cloud storage, if you have compute, and it's a big enough business opportunity in the market, we're there. We're going there. When we first started, we were talking to MinIO—remember that open-source, object storage platform?—we've run on our laptops, we run—this [unintelligible 00:25:04] Dr. Seuss thing—“We run over here; we run over there; we run everywhere.” But the honest truth is, you're going to go with the big cloud providers where the business opportunity is, and offer the same solution because the same solution is valued everywhere: simple in; value out; cost-effective; long retention; flexibility. That sounds so basic, but you mention this all the time with those Rube Goldberg Amazon diagrams we see time and time again. It's like, if you looked at that and you were from an alien planet, you'd be like, “These people don't know what they're doing. Why is it so complicated?” And the simple answer is, I don't know why people think it's complicated. To your point about Amazon, why won't they do it? I don't know, but if they did, things would be different. And being honest, I think people are catching on. We do talk to Amazon and others. They see the need, but they also have to build it; they have to invent technology to address it. And using Parquet and Lucene is not the answer.

Corey: Yeah, it's too much of a demand on the producers of that data rather than the consumer. And yeah, I would love to be able to go upstream to application developers and demand they do things in certain ways. It turns out as a consultant, you have zero authority to do that. As a DevOps team member, you have limited ability to influence it, but it turns out that being the ‘department of no' quickly turns into being the ‘department of unemployment insurance' because no one wants to work with you.
And collaboration—contrary to what people wish to believe—is a key part of working in a modern workplace.

Thomas: Absolutely. And it's funny, the demands on IT are getting harder; actually getting the employees to build out the solutions is getting harder. And so a lot of that time is in the pipeline, is the prep, is the schema, the sharding, and et cetera, et cetera, et cetera. My viewpoint is that should be automated away. More and more databases are being auto-tuned, right? This whole knobs and this and that—to me, Glue is a means to an end. I mean, let's get rid of it. Why can't Athena know what to do? Why can't object storage be Athena and vice versa? I mean, to me, it seems like all this moving through all these services, the classic Amazon viewpoint, even their diagrams of having this centralized repository of S3, move it all out to your services, get results, put it back in, then take it back out again, move it around, it just doesn't make much sense. And so to us, I love S3, love the service. I think it's brilliant—Amazon's first service, right?—but from there, get a little smarter. That's where ChaosSearch comes in.

Corey: I would argue that S3 is, in fact, a modern miracle. And one of those companies saying, “Oh, we have an object store; it's S3 compatible.” It's like, “Yeah. We have S3 at home.” Look at S3 at home, and it's just basically a series of failing Raspberry Pis.

But you have this whole ecosystem of things that have built up and sprung up around S3. It is wildly understated just how scalable and massive it is. There was an academic paper recently that won an award on how they use automated reasoning to validate what is going on in the S3 environment, and they talked about hundreds of petabytes in some cases. And folks are saying, ah, S3 is hundreds of petabytes. Yeah, I have clients storing hundreds of petabytes. There are larger companies out there.
Steve Schmidt, Amazon's CISO, was recently at a Splunk keynote where he mentioned that in security info alone, AWS itself generates 500 petabytes a day that then gets reduced down to a bunch of stuff, and some of it gets loaded into Splunk. I think. I couldn't really hear the second half of that sentence over the sound of all of the Splunk salespeople in that room becoming excited so quickly you could hear it.

Thomas: [laugh]. I love it. If I could be so bold, that S3 team, they're gods. They are amazing. They created such an amazing service, and when I started playing with S3—now, I guess, 2006 or '7—we were using it as a repository, URL access to get images; I was doing a virtualization [unintelligible 00:29:05] at the time—

Corey: Oh, the first time I played with it: “This seems ridiculous and kind of dumb. Why would anyone use this?” Yeah, yeah. It turns out I'm really bad at predicting the future. Another reason I don't do the prediction thing.

Thomas: Yeah. And when I started this company officially, five, six years ago, I was thinking about S3 and I was thinking about HDFS not being a good answer. And I said, “I think S3 will actually achieve the goals and performance we need.” It's a distributed file system. You can run parallel puts and parallel gets. And the performance that I was seeing when the data was a certain way, a certain size: “Wait, you can get high performance.”

And you know, when I first turned on the engine, now four or five years ago, I was like, “Wow. This is going to work. We're off to the races.” And now obviously, we're more than just an idea when we first talked to you. We're a service.

We deliver benefits to our customers in logs. And shoot, this quarter alone we're coming out with new features not just in logs, which I'll talk about in a second, but in direct SQL access.
But you know, one thing that you hear time and time again—we talked about it—JSON, CloudTrail, and Kubernetes; this is a real nightmare, and so one thing that we've come out with this quarter is the ability to virtually flatten. Now, you've heard time and time again, “Okay. I'm going to pick and choose my data because my database can't handle it, whether it's Elastic or, say, relational.” And all of a sudden, “Shoot, I don't have that. I've got to reindex that.”

And so what we've done is we've created an index technology that we were always planning to come out with that indexes the raw JSON blob, but in the data refinery, post-index, you can select how to unflatten it. Why is that important? Because all that tooling, whether it's Elastic or SQL, is now available. You don't have to change anything. Why do Snowflake and BigQuery have these proprietary JSON APIs that none of those tools know how to use to get access to the data? Or you pick and choose. And so when you have a CloudTrail, and you need to know what's going on, if you picked wrong, you're in trouble.

So, this new feature we're calling ‘Virtual Flattening'—or I don't know what we're—we have to work with the marketing team on it. And we're also bringing—this is where I get kind of excited—to the Elastic world, the ELK world, we're bringing correlations into Elasticsearch. And like, how do you do that? They don't have the APIs?

Well, our data refinery, again, has the ability to correlate index patterns into one view. A view is an index pattern, so all those same constructs that you had in Kibana, or Grafana, or the Elastic API still work. And so, no more denormalizing, no more trying to hodgepodge a query over here, a query over there. You're actually going to have correlations in Elastic, natively. And we're excited about that.

And one more push on the future, Q4 into 2022: we have been giving early access to S3 SQL access.
And, you know, as I mentioned, correlations in Elastic, but we're going all-in on publishing our [TPCH 00:31:56] report; we're excited about publishing those numbers, as well as not just giving early access, but going GA in the first of the year, next year.

Corey: I look forward to it. This is also—I guess it's impossible to have a conversation with you, even now, where you're not still forward-looking about what comes next. Which is natural; that is how we get excited about the things that we're building. But so much less of our conversations now focus on what's coming, as opposed to the neat stuff you're already doing. I had to double-check when we were talking just now about—oh, yeah, is that Google cloud object store support still something that is roadmapped, or is that out in the real world?

No, it's very much here in the real world, available today. You can use it. Go click the button, have fun. It's neat to see at least some evidence that not all roadmaps are wishes and pixie dust. The things that you were talking to me about years ago are established parts of ChaosSearch now. It hasn't been just, sort of, frozen in amber for years, or months, or these giant periods of time. Because, again—yeah, don't sell me vaporware; I know how this works. The things you have promised have come to fruition. It's nice to see that.

Thomas: No, I appreciate it. We talked a little while ago, now a few years ago, and it was a bit aspirational, right? We had a lot to do, we had more to do. But now we have big customers using our product, solving their problems, whether it's security, performance, operations—again, at scale, right? The real pain is, sure, you have a small ELK cluster or a small Athena use case, but when you're dealing with terabytes to petabytes, trillions of rows—when you're dealing in trillions, billions are now small.
Millions don't even exist, right? And when you're graduating from computer science in college and you say the word “trillion,” they're like, “Nah. No one does that.” And like you were saying, people do petabytes and exabytes. That's the world we're living in, and that's something that we really went hard at because these are challenging data problems, and this is where we feel we uniquely sit. And again, we don't have to break the bank while doing it.

Corey: Oh, yeah. Or at least as of this recording, there's a meme going around, again, from an old internal Google video, of, “I just want to serve five terabytes of traffic,” and it's an internal Google discussion of, “I don't know how to count that low.” And, yeah.

Thomas: [laugh].

Corey: But there's also value in being able to address things at much larger volume. I would love to see better responsiveness options around things like Deep Archive because the idea of being able to query that—even if you can wait a day or two—becomes really interesting just from the perspective of, at that point, the current cost for one petabyte of data in Glacier Deep Archive is 1,000 bucks a month. That is ‘why would I ever delete data again?' pricing.

Thomas: Yeah. You said it. And what's interesting about our technology is, unlike, let's say, Lucene, where an index could be 3, 4, or 5x the raw size, our representation is smaller than gzip. So, it is a full representation, so why don't you store it efficiently long-term in S3? Oh, by the way, Glacier: we support Glacier too.

And so, I mean, it's amazing: the cost of data with cloud storage is dramatic, and if you can make it hot and activated, that's the real promise of a data lake. And, you know, it's funny, we use our own service to run our SaaS—we log our own data, we monitor, we alert, we have dashboards—and I can't tell you how cheap our service is to ourselves, right?
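Corey's per-petabyte Deep Archive figure above is easy to sanity-check; a minimal sketch, assuming the commonly cited ~$0.00099 per GB-month Glacier Deep Archive storage rate (an assumption here, not from the conversation; confirm against current AWS pricing):

```python
# Rough sanity check of the "1,000 bucks a month" per-petabyte figure.
# The rate below is an assumed/illustrative Glacier Deep Archive price
# (~$0.00099 per GB-month); verify against the current AWS pricing page.
RATE_PER_GB_MONTH = 0.00099  # USD, assumed
GB_PER_PB = 1_000_000        # AWS prices storage in decimal units

monthly_cost = RATE_PER_GB_MONTH * GB_PER_PB
print(f"1 PB in Deep Archive: ~${monthly_cost:,.0f}/month")  # ~$990
```

At that assumed rate the arithmetic lands at roughly $990 a month, which matches the round number quoted in the conversation.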
Because it's so cost-effective for the long tail; not just, oh, a few weeks—we store a whole year's worth of our operational data so we can go back in time to debug something or figure something out. And a lot of that's savings. Actually, the huge savings is cloud storage with a distributed elastic compute fabric that is serverless. These are things that seem so obvious now, but if you have SSDs, and you're moving things around, and a team of IT professionals trying to manage it, it's not cheap.

Corey: Oh, yeah, that's the story. It's like, “Step one, start paying for using things in cloud.” “Okay, great. When do I stop paying?” “That's the neat part. You don't.” And it continues to grow and build.

And again, this is the thing I learned running a business that focuses on this: the people working on this, in almost every case, are more expensive than the infrastructure they're working on. And that's fine. I'd rather pay people than technologies. And it does help reaffirm, on some level, that—people don't like this reminder—but you have to generate more value than you cost. So, when you're sitting there spending all your time trying to save money with, “Oh, I've listened to ChaosSearch talk about what they do a few times. I can probably build my own and roll it at home.”

It's, I've seen the kind of work that you folks have put into this—again, you have something like 100 employees now; it is not just you building this—my belief has always been that if you can buy something that gets you 90, 95% of the way there, great. Buy it, and then yell at whoever's selling it to you for the rest of it, and that'll get you a lot further than, “We're going to do this ourselves from first principles.” Which is great for a weekend project, for something that you have a passion for, but in production, mistakes show. I've always been a big proponent of buying wherever you can. It's cheaper, which sounds weird, but it's true.

Thomas: And we do the same thing.
We have single sign-on support; we didn't build that ourselves, we use a service now. Auth0 is one of our providers now that owns that [crosstalk 00:37:12]—

Corey: Oh, you didn't roll your own authentication layer? Why ever not? Next, you're going to tell me that you didn't roll your own payment gateway when you wound up charging people on your website to sign up?

Thomas: You got it. And so, I mean, do what you do well. Focus on what you do well. If you're repeating what everyone seems to do over and over again, time, costs, complexity, and… service, it makes sense. You know, I'm not trying to build storage; I'm using storage. I'm using a great, wonderful service, cloud object storage. Use what works, what works well, and do what you do well. And what we do well is make cloud object storage analytical and fast. So, call us up and we'll take away that 2 a.m. call you have when your cluster falls down, or when you have a new workload and you were going to go to the—I don't know, the beach house, and now the weekend's shot, right? Spin it up, stream it in. We'll take over.

Corey: Yeah. So, if you're listening to this and you happen to be at re:Invent, which is sort of an open question: why would you be at re:Invent while listening to a podcast? And then I remember how long the shuttle lines are likely to be, and yeah. So, if you're at re:Invent, make it on down to the show floor, visit the ChaosSearch booth, tell them I sent you, watch for the wince; that's always worth doing. Thomas, if people have better decision-making capability than the two of us do, where can they find you if they're not in Las Vegas this week?

Thomas: So, you find us online at chaossearch.io. We have so much material: videos, use cases, testimonials. You can reach out to us, get a free trial. We have a self-service experience where you connect to your S3 bucket and you're up and running within five minutes.

So, definitely chaossearch.io. Reach out if you want a hand-held, white-glove POV experience.
If you have those types of needs, we can do that with you as well. But we have a booth at re:Invent, and I don't know the booth number, but I'm sure either we've been assigned it or we'll find it out.

Corey: Don't worry. This year, attendance is low enough that I'm projecting you will not be as hard to find as in recent years. For example, there's only one expo hall this year. What a concept. If only it hadn't taken a deadly pandemic to get us here.

Thomas: Yeah. But you know, we'll have the ability to demonstrate Chaos at the booth, and really, within a few minutes, you'll say, “Wow. How come I never heard of doing it this way?” Because it just makes so much sense why you do it this way versus the merry-go-round of data movement, and transformation, and schema management, let alone all the sharding that I know is a nightmare, more often than not.

Corey: And we'll, of course, put links to that in the [show notes 00:39:40]. Thomas, thank you so much for taking the time to speak with me today. As always, it's appreciated.

Thomas: Corey, thank you. Let's do this again.

Corey: We absolutely will. Thomas Hazel, CTO and Founder of ChaosSearch. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast episode, please leave a five-star review on your podcast platform of choice, whereas if you've hated this episode, please leave a five-star review on your podcast platform of choice along with an angry comment because I have dared to besmirch the honor of your homebrewed object store, running on top of some trusty and reliable Raspberries Pie.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production.
Stay humble.

Onda Universitaria El Podcast
Campus Productivo: Association of Computing Machinery (ACM)


Play Episode Listen Later Oct 18, 2021 16:38


In this fourth episode of Campus Productivo, we present the Association of Computing Machinery (ACM) chapter of the Pontificia Universidad Católica de Puerto Rico. Jayra Rentas of "Se Fue Viral" and Jean Jesús of "ExpresARTE" talk with Ricardo Velázquez, president of ACM, about what ACM is, what you can expect from the association, and its upcoming activities. If you are interested in the Information Systems program, want to know why these programs matter in this new virtual modality, or simply want to learn what ACM is, this episode is for you. What are you waiting for? Hit play!

The Idealcast with Gene Kim by IT Revolution
Open Source Software as a Triumph of Information Hiding, Modularity, and Creating Optionality


Play Episode Listen Later Sep 2, 2021 131:59


In this newest episode of The Idealcast, Gene Kim speaks with Dr. Gail Murphy, Professor of Computer Science and Vice President of Research and Innovation at the University of British Columbia. She is also the co-founder, board member, and former Chief Scientist at Tasktop. Dr. Murphy's research focuses on improving the productivity of software developers and knowledge workers by providing the necessary tools to identify, manage, and coordinate the information that matters most for their work.

During the episode, Kim and Dr. Murphy explore the properties of modularity and information hiding, and how one designs architectures that create them. They also discuss how open source libraries create the incredible software supply chains that developers benefit from every day, and the surprising new risks they can create.

They discuss the ramifications of system design considerations and decisions made by software developers and why defining software developers' productivity remains elusive. They further consider open-source software as a triumph of information hiding and how it has created a massively interdependent set of libraries while also enabling incredible co-evolution, which is only made possible by modularity. Listen as Kim and Dr. Murphy discuss how technologists have both succeeded and fallen short on the dream of software being like building blocks, how software development is a subset of knowledge work, and the implications of that insight.

ABOUT THE GUEST

Gail C. Murphy is a Professor of Computer Science and Vice President of Research and Innovation at the University of British Columbia. She is a Fellow of the Royal Society of Canada and a Fellow of the Association for Computing Machinery (ACM), as well as co-founder, board member, and former Chief Scientist at Tasktop.

After completing her BS at the University of Alberta in 1987, she worked for five years as a software engineer in the Lower Mainland.
She later pursued graduate studies in computer science at the University of Washington, earning first an MS (1994) and then a PhD (1996) before joining the University of British Columbia.

Dr. Murphy's research focuses on improving the productivity of software developers and knowledge workers by providing the necessary tools to identify, manage, and coordinate the information that matters most for their work. She also maintains an active research group with post-doctoral and graduate students.

YOU'LL LEARN ABOUT

Why defining software developers' productivity remains elusive and how developers talk about what factors make them feel productive.
The value of modularity and how one can achieve it.
Ways to decompose software that can have surprising outcomes for even small systems.
How open-source software is a triumph of information hiding, creating a massively interdependent set of libraries that also enable incredible co-evolution, which is only made possible by modularity.
How we have exceeded and fallen short of the 1980s dream of software being like building blocks, where we can quickly create software by assembling modules, and what we have learned from the infamous leftpad and mime-magic incidents in the last two years.
Why and how, in very specific areas, the entire software industry has standardized on a set of modules, versus other areas where we continue to seemingly go in the opposite direction.
A summary of some of the relevant work of Dr. Carliss Baldwin, the William L. White Professor of Business Administration at the Harvard Business School. Dr. Baldwin studies the process of design and the impact of design architecture on firm strategy, platforms, and business ecosystems.
How software development is a subset of knowledge work and the implications of that insight.

RESOURCES

Dr. Mik Kersten on The Idealcast
Project to Product: How to Survive and Thrive in the Age of Digital Disruption with the Flow Framework by Mik Kersten
Tasktop
The Unicorn Project: A Novel about Developers, Digital Disruption, and Thriving in the Age of Data by Gene Kim
Fred Brooks
The Mythical Man-Month
On the Criteria To Be Used in Decomposing Systems into Modules by Dr. D.L. Parnas
Comparison of embedded computer systems on board the Mars rovers
Joshua Bloch
How to design a good API and why it matters by Joshua Bloch
Tricking Sand into Thinking: Deep Learning in Clojure by Dave Liepmann
Gene Kim's reaction on Twitter
Gource
Gource in Bloom
800+ days of Minecraft in 8 minutes
History of Bitcoin
History of Python
Eclipse Mylyn by Dr. Mik Kersten
How one developer just broke Node, Babel and thousands of projects in 11 lines of JavaScript
Laurie Voss' tweet
Rails 5.2.5, 6.0.3.6 and 6.1.3.1 have been released
Have there been any lawsuits involving breach of open source licences?
GNU General Public License
SemanticConflict
Fostering Software Developer Productivity through Awareness Increase and Goal-Setting by André Meyer
Gail Murphy on Mik + One Podcast
On the criteria to be used in decomposing systems into modules
Thoughts on Functional Programming Podcast by Eric Normand
Alistair Cockburn's programming challenge on Twitter
Gene Kim's tweet about BLAS: Basic Linear Algebra Subprograms
Gene Kim's tweet about the Gource visualization on the scores of people making commits to the Python ecosystem repo
Gene Kim's Twitter thread about Dr. Carliss Baldwin's talk: Part 1, Part 2
Academy of Management 2015 TIM Distinguished Scholar Prof Carliss Baldwin
Design Rules, Vol. 1: The Power of Modularity by Carliss Y. Baldwin and Kim B. Clark
Robert C. Merton
Black–Scholes model
Product Design and Development by Karl Ulrich
Design structure matrix
Three design structure matrices
Real Option

TIMESTAMPS

[00:27] Intro
[03:52] Meet Dr. Murphy
[04:32] Determining where design occurs in software development
[10:30] Refactoring
[16:08] Defining developer productivity and why it defies explanation
[20:26] What is modularity, architecture and why they're important
[28:52] An extreme example
[30:51] Information hiding
[36:06] The leftpad and mime-magic incidents and SemanticConflict
[44:13] The work of André Meyer
[47:23] Open source is a triumph of information hiding
[52:56] Architectures give different trade-offs to different problems
[57:25] Relationships between a leader's roles and responsibilities
[1:05:10] BLAS: Basic Linear Algebra Subprograms
[1:09:20] Communication paths within an organization
[1:16:58] The Mylyn project
[1:20:11] Analysis of Dr. David Parnas' 1972 paper
[1:26:23] Falcon missile program and socio-technical congruence
[1:31:10] The work of Dr. Carliss Baldwin
[1:40:01] How Dr. Baldwin defines modularity
[1:47:26] Modularity and open source software
[1:51:31] Defining real options
[1:53:17] 1 billion dollar rearchitecture project
[1:55:29] This work is primarily about making decisions
[2:01:58] Open source systems are Darwinian systems
[2:06:33] Dr. Murphy's ideal of software developer's daily work
[2:09:53] How to contact Dr. Murphy
[2:11:01] Outro

The Prepare.ai Podcast
WashU's Chief Data Scientist Philip Payne on Covid One Year Later


Play Episode Listen Later May 26, 2021 59:20


Fully and Dave talk national synthetic data repositories and what we've learned after collecting and analyzing a year's worth of Covid-19 data. Dr. Payne is the Janet and Bernard Becker Professor and Founding Director of the Institute for Informatics at Washington University in St. Louis. He is also the Associate Dean for the Office of Health Information and Data Science and the Chief Data Scientist for Washington University. He holds appointments as a Professor of General Medical Sciences and Computer Science and Engineering in the Schools of Medicine and Engineering and Applied Sciences, respectively. In this capacity, he is responsible for the creation and oversight of comprehensive biomedical informatics and data science research, training, and support programs aligned with the health and life science enterprise spanning Washington University, BJC Healthcare, and a variety of regional partners. Further, he serves as the director of the Biomedical Informatics components/programs that exist under the auspices of both the CTSA-funded Institute for Clinical and Translational Science (ICTS) and the NCI-funded Siteman Cancer Center at Washington University. He earned both masters and doctoral degrees in Biomedical Informatics at the Columbia University College of Physicians and Surgeons. He is an elected fellow of the American College of Medical Informatics (ACMI), the American Medical Informatics Association (AMIA), and the American Institute for Medical and Biological Engineering (AIMBE), and he also holds leadership appointments on numerous national steering, editorial, and advisory committees, including efforts associated with AMIA, Association for Computing Machinery (ACM), National Cancer Institute (NCI), National Library of Medicine (NLM), and the National Center for Advancing Translational Science (NCATS). 
His research portfolio broadly focuses upon the areas of translational bioinformatics (TBI) and clinical research informatics (CRI) and includes projects focusing on: 1) knowledge-based approaches to high-throughput hypothesis discovery and data-driven decision making; 2) distributed data management and analysis in support of clinical and translational research; and 3) human-factors and workflow analysis.

Conversas Sustentáveis - Transformando Mentes
Smart City - A new concept of the city.


Play Episode Listen Later May 10, 2021 65:54


Do you know what a Smart City is? Augusto Archer walked us through the concepts that make cities smart. Augusto Archer is Co-Founder of Cit4Life, Utilities Manager at Sonda, Professor at FGV in Smart and Sustainable Cities, associate mentor at ABMEN, Startout, and InovAtiva, Vice Chair of the RJ Chapter of the Association for Computing Machinery (ACM), and Director of New Business and Innovation at the Brazil-Chile Chamber of Commerce. He has worked in the technology market for more than 25 years, having served on the boards of large companies on projects related to IoT, Telecom, Utilities, Smart City, and Smart Grid. He holds a bachelor's degree in Business Administration, with graduate studies in Systems Analysis and Management Information Systems Administration, and an MBA in Business Logistics. Follow us on Instagram: @wslopes - @grupowbioenergy - @augusto_archer LinkedIn: Wagner Lopes / Grupo W Bioenergy / Augusto Archer Support: Bravosul Want to support us? Email gwb@gwb.eco.br and help us reach more people.

Academic Dean
Dr. Marie desJardins, Simmons University


Play Episode Listen Later Feb 15, 2021 48:14


Dr. Marie desJardins joined Simmons University as the Inaugural Dean of the College of Organizational, Computational, and Information Sciences in 2018. Previously, she was a member of the computer science faculty at the University of Maryland, Baltimore County, from 2001 to 2018, most recently as the Associate Dean for Academic Affairs in the College of Engineering and Information Technology. Before joining the faculty at UMBC, she was a Senior Computer Scientist at SRI International. She earned her A.B. in Engineering from Harvard University and her Ph.D. in Computer Science from the University of California, Berkeley.  Her research is in artificial intelligence, focusing on the areas of machine learning, multi-agent systems, decision making, and interactive AI. She was named one of the "Ten AI Researchers to Follow on Twitter" by TechRepublic and one of "14 Women in AI You Should Follow on Twitter" by craigconnects. She has published over 135 scientific papers on AI and CS education, and has been PI or co-PI on nearly $12,000,000 of external research funding, including a prestigious NSF CAREER Award. She has mentored 13 Ph.D. students, 27 M.S. students, and nearly 100 undergraduate researchers. She is known on campus and throughout her professional community for her dedication to mentoring, diversity, outreach, and innovative educational practices. Dr. desJardins is a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI) and a Distinguished Member of the Association for Computing Machinery (ACM). She is a recipient of the Distinguished Alumni Award in Computer Science from UC Berkeley; the A. Richard Newton Educator ABIE Award from the Anita Borg Institute; the NCWIT Undergraduate Research Mentoring Award; and the CRA Undergraduate Research Mentoring Award. She was the 2014-2017 UMBC Distinguished Teaching Professor, was an inaugural UMBC Hrabowski Innovation Fellow, and was named one of UMBC's ten "Professors Not to Miss" in 2011.  Dr. 
desJardins is known nationally for her support of and commitment to improving student diversity, access, and quality of computer science courses at the high school level, and received multiple NSF awards to support her efforts in this area. She was the lead PI on the NSF-sponsored "CS Matters in Maryland" project, which created curriculum and trained high school teachers to teach the AP CS Principles course.  She built a statewide coalition in Maryland to increase access to K-12 CS education, with a focus on inclusion and diversity, and cofounded the Maryland Center for Computing Education, which received $5,000,000 in state funding for teacher preparation and advocacy. She was the Maryland team leader for the Exploring Computing Education Pathways (ECEP) Alliance and a founding member of the Maryland chapter of the Computer Science Teachers Association.

STEM EDUCATION LIFELINE (SEL)
Widening Participation in Computing


Play Episode Listen Later Nov 7, 2020 62:11


The opinion of Dr. Ramon Caceres is highly valued. He has been named a Fellow of the Institute of Electrical and Electronics Engineers (IEEE) and a Distinguished Scientist by the Association for Computing Machinery (ACM). He is on the board of directors of the Computing Research Association's Committee on Widening Participation (CRA-WP). Specifically, he is a co-chair of the CRA-WP Grad Cohort Workshop for Inclusion, Diversity, Equity, Accessibility, and Leadership Skills (IDEALS) (https://cra.org/cra-wp/grad-cohort-ideals/), in which he has also served as speaker and mentor. In this episode, we talk about his career and the current situation of minority groups in the Computer Science field. Caceres, a Software Engineer at Google, provides informative recommendations for all ambitious youngsters out there pursuing this career and willing to overcome adversity. We are interested in hearing from you. Please give us your feedback and let us know how we are doing and any topic you might want us to discuss. Our email is: info@stemeducationlifeline.com

Irish Tech News Audio Articles
ACM and IMS to hold first ever Data Science Conference


Play Episode Listen Later Oct 8, 2020 8:45


The Association for Computing Machinery (ACM) together with the Institute of Mathematical Statistics (IMS) will hold the first-ever ACM-IMS Foundations of Data Science (FODS) Conference virtually on October 19-20. This interdisciplinary event will bring together researchers and practitioners to address foundational data science challenges in prediction, inference, fairness, ethics, and the future of data science. “Data science is a new, emerging field, building its foundations from computer science, statistics, and many other quantitative disciplines,” said FODS General Co-chair Jeannette Wing, Columbia University, and Fellow, Association for Computing Machinery. “Big data is not new: through large, one-of-a-kind, expensive instruments, scientists have been collecting and generating massive amounts of data for decades. What has changed is that the internet has become an instrument for anyone, not just scientists, to collect and generate data and that that data is about people. We also have powerful AI, machine learning, and statistical techniques that allow us to interpret and gain value from the data in new ways. And because so much data is about people, we must address upfront questions of ethics and privacy. We are witnessing a new era where every sector, including healthcare and finance, is being transformed by data science. We believe that our interdisciplinary approach to organizing this conference will make it an important research gathering for many years to come.” “FODS is a first-of-its-kind conference in that it is a collaboration between the two leading scientific societies in computing and statistics,” added FODS General Co-chair, David Madigan, Northeastern University, and Fellow, Institute for Mathematical Statistics. “We believe this cross-collaboration between computer scientists and statisticians is the most effective way to foster groundbreaking new research in this field. 
Building on the success of the initial summit ACM and IMS co-organized in 2019, we have put together an exciting program featuring the world’s top researchers and practitioners. We also hope that the virtual nature of this year’s conference will encourage participants from around the world to engage with us.”

ACM-IMS FODS 2020 HIGHLIGHTS:

Keynote Speakers

1. “AutoML and Interpretability: Powering the Machine Learning Revolution in Healthcare”
Speaker: Mihaela van der Schaar, The Alan Turing Institute
AutoML and interpretability are both fundamental to the successful uptake of machine learning by non-expert end-users. This keynote presents state-of-the-art AutoML and interpretability methods for healthcare developed in van der Schaar’s lab and how they have been applied in various clinical settings (including cancer, cardiovascular disease, cystic fibrosis, and recently COVID-19). It then explains how these approaches form part of a broader vision for the future of machine learning in healthcare.

2. “Semantic Scholar, NLP, and the Fight Against COVID-19”
Speaker: Oren Etzioni, Allen Institute for AI (AI2)
Etzioni’s talk will describe the dramatic creation of the COVID-19 Open Research Dataset (CORD-19) at the Allen Institute for AI and the broad range of efforts, both inside and outside of the Semantic Scholar project, to garner insights into COVID-19 and its treatment based on this data. The talk will highlight the difficult problems facing the emerging field of Scientific Language Processing.

FODS 2020 Papers (Partial List)

A full list of accepted papers is available on the conference website.

3. “Incentives Needed for Low-Cost Fair Data Reuse”
Roland Maio, Augustin Chaintreau, Columbia University
One of the central goals in algorithmic fairness is to build systems with fairness properties that compose gracefully. Although the importance of this goal was recognized early, limited progress has been made.
In this work, Maio and Chaintreau propose a fresh approach to building fairly composable data-science pip...

Tech San Diego Presents
15: TSD Spotlight presents Association for Computing Machinery (ACM)

Tech San Diego Presents

Play Episode Listen Later Sep 30, 2020 19:47


Tech San Diego’s University Talent Director Jarett Hartman sat down with Casey Price, ACM President; Stone Tao, ACM AI President; Max Cohen, Side Projects Committee Member; and Adam Lee, Sponsorship Lead.

ACM at UCSD is UC San Diego’s largest computer science organization, with over 1,000 members and more than 100 technical, professional, and social events year-round. We’re not just computer science, though — we bring together anyone and everyone who shares our love of technology, design, and innovation! Every UCSD student, regardless of skill level or background, is welcome to join our game nights, live coding workshops, and interview prep nights, and engage with our ACM Communities to explore various niches of computing.

Contact: contact@acmucsd.org
Sponsor: sponsor@acmucsd.org

Tech San Diego Spotlight is a production of Tech San Diego Presents. The show has a rotation of hosts from Tech San Diego and covers topics that affect the local and broader tech community.
Executive Producer: Kevin Carroll
Producer: Sara Spiva
TSD Spotlight is edited by Andrew Sims of Hypable Impact (https://impact.hypable.com)
Music by ikoliks

The New Stack Podcast
OpenJS Keynote: JavaScript, the First 20 Years of the Web Stack

The New Stack Podcast

Play Episode Listen Later Jul 29, 2020 20:34


The first 20 years of JavaScript marked the dawn of the Web stack and a new generation of Web developers who had to deal with a community of traditional technologists. They also faced the continuous looming threat of Microsoft, explained Allen Wirfs-Brock in a recorded keynote from the OpenJS Foundation’s virtual conference in June. Wirfs-Brock was also project editor of the ECMAScript specification from 2008-2015 and wrote the book-sized journal article for the Association for Computing Machinery (ACM) entitled “JavaScript: The First 20 Years” for the History of Programming Languages (HOPL) conference, with co-author Brendan Eich, JavaScript’s creator. In this episode of The New Stack Makers podcast, hosted by Alex Williams, founder and editor-in-chief of The New Stack, Wirfs-Brock of Wirfs-Brock Associates offers his historical perspective, including JavaScript’s changes during the 25 years after Eich created the language. “What really happened is what people thought was going to be a minor piece of browser technology — particularly over the last 10 years — has really taken over the world of programming languages,” said Wirfs-Brock. “And so it’s quite remarkable.”

Microsoft Research India Podcast
Podcast: Potential and Pitfalls of AI with Dr. Eric Horvitz

Microsoft Research India Podcast

Play Episode Listen Later Mar 2, 2020


Episode 001 | March 06, 2020

Dr. Eric Horvitz is a technical fellow at Microsoft and is director of Microsoft Research Labs, including research centers in Redmond, Washington; Cambridge, Massachusetts; New York, New York; Montreal, Canada; Cambridge, UK; and Bengaluru, India. He is one of the world’s leaders in AI, and a thought leader in the use of AI in the complexity of the real world. On this podcast, we talk to Dr. Horvitz about a wide range of topics, including his thought leadership in AI, his study of AI and its influence on society, the potential and pitfalls of AI, and how useful AI can be in a country like India.

Transcript

Eric Horvitz: Humans will always want to make connections with humans: sociologists, social workers, physicians, teachers. We’re always going to want to make human connections and have human contacts. I think they’ll be amplified in a world of richer automation, so much so that even when machines can generate art and write music, even music with lyrics that might put a tear in someone’s eye if they didn’t know it was a machine, that will lead us to say, “Is that written by a human? I want to hear a song sung by a human who experienced something the way I would experience something, not a machine.” And so I think human touch, human experience, human connection will grow even more important in a world of rising automation, and those kinds of tasks and abilities will be even more compensated than they are today.

(music plays)

Host: Welcome to the Microsoft Research India podcast, where we explore cutting-edge research that’s impacting technology and society. I’m your host, Sridhar Vedantham.

Host: Our guest today is Dr. Eric Horvitz, Technical Fellow and director of the Microsoft Research Labs. It’s tremendously exciting to have him as the first guest on the MSR India podcast because of his stature as a leader in research and his deep understanding of the technical and societal impact of AI.
Among the many honors and recognitions Eric has received over the course of his career are the Feigenbaum Prize and the Allen Newell Prize for contributions to AI, and the CHI Academy honor for his work at the intersection of AI and human-computer interaction. He has been elected fellow of the National Academy of Engineering (NAE), the Association for Computing Machinery (ACM), and the Association for the Advancement of Artificial Intelligence (AAAI), where he also served as president. Eric is also a fellow of the American Association for the Advancement of Science (AAAS), the American Academy of Arts and Sciences, and the American Philosophical Society. He has served on advisory committees for the National Science Foundation, National Institutes of Health, President’s Council of Advisors on Science and Technology, DARPA, and the Allen Institute for AI. Eric has been deeply involved in studying the influences of AI on people and society, including issues around ethics, law, and safety. He chairs Microsoft’s Aether Committee on AI, ethics, and effects in engineering and research. He established the One Hundred Year Study on AI at Stanford University and co-founded the Partnership on AI. Eric received his PhD and MD degrees at Stanford University. On this podcast, we talk to Eric about his journey in Microsoft Research, his own research, the potential and pitfalls he sees in AI, how AI can help in countries like India, and much more.

Host: Eric, welcome to the podcast.

Eric Horvitz: It’s an honor to be here. I just heard I am the first interviewee for this new series.

Host: Yes, you are, and we are really excited about that. I can’t think of anyone better to do the first podcast of the series with! There’s something I’ve been curious about for a long time. Researchers at Microsoft Research come with extremely impressive academic credentials. It’s always intrigued me that you have a medical degree and also a degree in computer science.
What was the thinking behind this, and how does one complement the other in the work that you do?

Eric Horvitz: One of the deep shared attributes of folks at Microsoft Research, and so many of our colleagues doing research in computer science, is deep curiosity, and I’ve always been one of these folks who says “why” to everything. I’m sure my parents were frustrated with my sequence of whys, starting with one question going to another. So I’ve been very curious. As an undergraduate I did deep dives into physics and chemistry, and of course math to support it all, and biology, and by the time I was getting ready to go to grad school I really was exploring so many sciences. But the big “why” for me that I could not figure out was the why of human minds, the why of cognition. I just had no intuition as to how the cells, these tangles of cells that we learn about in biology and neuroscience, could have anything to do with my second-to-second experience of being a human being, and so, you know what, I had to just spend my graduate years diving into the unknowns about this from the scientific side of things. Of course, many people have provided answers over the centuries; some of the answers are the foundations of religious beliefs of various kinds and religious systems. So I decided to go get an MD-PhD: why not understand humans deeply, and human minds, as well as the scientific side of nervous systems? But I was still on an arc of learning as I hit grad school at Stanford, and it was great to be at Stanford because the medical school was right next to the computer science department.
You can literally walk over, and I found myself sitting in computer science classes, philosophy classes, philosophy-of-mind-oriented classes, and cognitive psychology classes. And so that was to the side of that kind of grad school life and MD-PhD program, the anatomy classes and being socialized into the medical school class, but I was delighted by the pursuit of what you might call the philosophical and computational side of mind, and eventually I made the jump, the leap. I said, “You know what, my pursuit is principles. I think that’s the best hope for building insights about what’s going on,” and I turned those principles toward real-world problems, in particular, since I had a foot in the medical school: how do we apply these systems in time-critical settings to help emergency room physicians and trauma surgeons? Time-critical action, where computer systems had to act quickly, but really also had to act precisely when they maybe didn’t have enough time to think all the way, and this led me to what I think is an interesting direction, which is models of bounded rationality, which I think describes us all.

Host: Let’s jump into a topic that seems to be on everybody’s mind today – AI. Everyone seems to have a different idea about what AI actually is and what it means to them. I also constantly keep coming across people who use AI and the term ML or machine learning as synonyms. What does AI mean to you, and do you think there’s a difference between AI and ML?

Eric Horvitz: The scientists and engineers who first used the phrase artificial intelligence did so in a beautiful document that’s so well written, in terms of the questions it asks, that it could be a proposal today to the National Science Foundation, and it would seem modern given that so many of the problems have not been solved. But they laid out the vision, including the pillars of artificial intelligence.
This notion of perception, of building systems that could recognize or sense things in the world. This idea of reasoning, with logic or other methods, to reason about and solve problems. Learning: how can systems become better at what they do with experience and with other kinds of sources of information. And this final notion they focused on as being very much in the realm of human intelligence: language, understanding how to manipulate symbols in streams or sequences to express concepts, and the use of language. So, learning has always been an important part of artificial intelligence; it’s one of several pillars of work. It’s grown in importance of late, so much so that people often write AI/ML to refer to machine learning, but it’s one piece, and it’s always been an important piece, of artificial intelligence.

Host: I think that clarifies the difference between AI and ML. Today, we see AI all around us. What about AI really excites you, and what do you think the potential pitfalls of AI could be?

Eric Horvitz: So let me first say that AI is a constellation of technologies. It’s not a single technology, although these days there’s quite a bit of focus on the ability to learn how to predict or move or solve problems via machine learning, analyzing large amounts of data that has become available over the last several decades, when it used to be scarce. I’m most excited about my initial goals to understand human minds. So, whenever I read a paper on AI or see a talk or see a new theorem being proved, my first reaction is: how does it grow my understanding, how does it help to answer the questions that have been long-standing in my mind about the foundations of human cognition? I don’t often say that to anybody, but that’s what I’m thinking. Secondly, my sense is, what a great endeavor, to be pushing your whole life to better understand and comprehend human minds. It’s been a slow slog.
However, insights have come about, advances and how they relate to those questions, but along the way, what a fabulous opportunity to apply the latest advances to enhancing the lives of people, to empowering people in new ways, and to creating new kinds of automation that can lead to new kinds of value and new kinds of experiences for people. The whole notion of augmenting human intellect with machines has been something that’s fascinated me for many decades. So I love the fact that we can now leverage these technologies and apply them, even though we’re still very early on in how these ideas relate to what’s going on in our minds. Applications include healthcare. There’s so much to do in healthcare with decreasing the cost of medicine while raising the quality of care: this idea of being able to take large amounts of data to build high-quality, high-precision diagnostic systems, systems that can predict outcomes. We just created a system recently, for example, that can detect when a patient in a hospital is going to crash unexpectedly with organ system failures, and that can be used in ways that could alert physicians in advance and ready medical teams to actually save patients’ lives. There are even applications that we’re now seeing in daily life, like cars that drive themselves. I drive a Tesla, and I’ve been enjoying the experience of the semi-automated driving the system can do. Just seeing how far we’ve gotten in a few years with systems that recognize patterns, like the patterns on a road, or that recognize objects in the way for automatic braking: these systems can save thousands of lives. I’m not sure about India, but I know the United States statistics, and there are a little more than 40,000 lives lost on the highways in the United States per year. Looking at the traffic outside here in Bangalore, I’m guessing that India is at least up there, with tens of thousands of deaths per year.
I believe that AI systems can reduce these numbers of deaths by helping people to drive better, even if it’s just in safety-related features.

Host: The number of fatalities on Indian roads is indeed huge, and that’s in fact been one of the motivators for a different research project in the lab, on which I hope to do a podcast in the near future.

Eric Horvitz: I know, it’s the HAMS project.

Host: It is the HAMS project, and I’m hoping that we can do a podcast with the researchers on that sometime soon. Now, going back to AI, what do you think we need to look out for or be wary of? People, including industry leaders, seem to land on various points on a very broad spectrum, ranging from “AI is great for humanity” to “AI is going to overpower and subsume the human race at some point of time.”

Eric Horvitz: So, what’s interesting to me is that over the last three decades we’ve gone from “AI stands for almost implemented; it doesn’t really work very well. Have fun, good luck,” to this idea of just getting things up and running and being so excited that there’s no other concern but to get this thing out the door and have it, for example, help physicians diagnose patients more accurately, to now, “Wait a minute! We are putting these machines in places that historically have always relied upon human intelligence. As these machines for the first time edge into the realm of human intellect, what are the ethical issues coming to the fore? Are there intrinsic biases in the way data is created or collected, some of which might come from the biases of the society that creates the data? What about the safety issues and the harms that can come from these systems when they make a mistake?
When will systems be used in ways that could deny people consequential services, like a loan or an education, because of an unfair decision, or a decision that aligns, mysteriously or obviously, with the way society has worked, amplifying deep biases that have come through our history?” These are all concerns that many of us are bringing to light, asking for more resources and attention to focus on them, while also trying to cool the jets of some enthusiasts who want to just blast ahead and apply these technologies without thinking deeply about the implications, I’d say sometimes the rough edges, of these technologies. Now, I’m very optimistic that we will find pathways to getting incredible amounts of value out of these systems when properly applied, but we need to watch out for all sorts of possible adverse effects when we take our AI and throw it into the complexity of the open world outside of our clean laboratories.

Host: You’ve teed up my next question perfectly. Is it incumbent upon the large tech companies who are leading the charge as far as AI is concerned to be responsible for what AI is doing, and for the ethics and the fairness and all the stuff behind AI which makes it kind of equitable to people at large?

Eric Horvitz: It’s a good question. There are different points of view on that question. We’ve heard some company leaders issue policy statements along the lines of, “We will produce technologies and make them available, and it’s the laws of the country that will help guide how they’re used or regulate what we do. If there are no laws, there’s no reason why we shouldn’t be selling something, with a focus on profit to match our zeal for technology.” Microsoft’s point of view has been different: the technology is created by experts inside its laboratories and by its engineers.
Sometimes it is getting ahead of where legislation and regulation need to be, and therefore we bear a responsibility as a company, both in informing regulatory agencies and the public at large about the potential downsides of the technology and its appropriate uses and misuses, and in looking carefully at what we do when we actually ship our products, make a cloud service available, or build something for a customer.

Host: Eric, I know that you personally are deeply involved in thinking through AI and its impact on society, how to make it fair, how to make it transparent, and so on. Could you talk a little bit about that, especially in the context of what Microsoft is doing to ensure that AI is actually good for everybody?

Eric Horvitz: You know, this is why it’s such a passion for me. I’ve been extremely interested, starting with the technical issues, which I thought, and still think, are really deep and fascinating: when you build a limited system that is, by definition, much simpler than the complex universe it’s going to be immersed in, you take it from the laboratory into the open world. I refer to that as AI in the open world. You learn a lot about the limitations of the AI. You also learn to ask questions and to extend these systems so they’re humble: they understand their limitations, they understand how accurate they are; you give them a level of self-knowledge. This is a whole area of open-world intelligence that I think really bears on some of the early questions for me about what humans are doing, what their minds are doing, and potentially other animals, vertebrates. It started there for me. Back to your question: now we are facing the same kinds of things when we take an AI technology and put it in the hands of a judge who might make decisions about criminal justice, looking at recommendations based on statistics to help him or her take an action. Now we have to realize that the systems we’re building work with people. People want explanations.
They don’t want to look at a black box with an indicator on it. They will say, why is this system telling me this? So at Microsoft we’ve made significant investments, in our research team, in our engineering teams, and in our policy groups, in thinking through the details of the problems and solutions, and I’ll just list a few areas right now: safety and robustness of AI systems; transparency and intelligibility of these systems (can they explain themselves?); bias and fairness (how can we build systems that are fair along certain dimensions?); engineering best practices (what does it mean for a team working with tools to understand how to build a system and maintain it over time so that it’s trustworthy?); and human-AI collaboration (what are the principles by which we can enable people to work in a fluid way with systems that might be trying to augment their intelligence, such that there is a back and forth and an understanding of when a system is not confident, for example?). There are even notions about attention and cognition: are these systems being used in ways that might be favorable to advertisers, grabbing your attention and holding it on an application because they’ve learned how to do that mysteriously? Should we have a point of view about that? So Microsoft Research has stood up teams looking at these questions. We have also stood up an ethics advisory board that we call the Aether Committee to deliberate and provide advice on hard questions that are coming up across the spectrum of these issues, and to provide guidance to our senior leadership team at Microsoft in how we do our business.

Host: I know you were a co-founder of the Partnership on AI. Can you talk a little bit about that and what it sought to achieve?
Eric Horvitz: This vision arose literally at conferences, and, in fact, one of the key meetings was at a pub in New York City after a meeting at NYU, where several computer scientists got together, all passionate about seeing it go well for artificial intelligence technologies by investing in understanding and addressing some of these rough edges. And we decided we could bring together the large IT companies (Amazon, Apple, Facebook, Google, Microsoft) to think together about what it might mean to build an organization, a nonprofit, that balanced the IT companies with groups in civil society, academic groups, and nonprofit AI research, to think through these challenges and come up with best practices in a way that brought the companies together rather than separating them through a competitive spirit. Actually, this organization was created by the force of the friendships of AI scientists, many of whom go back to being in grad school together across many universities, this invisible college of people united in an interest in understanding how to do AI in the open world.

Host: Do you think there is a role for governments to play where policies governing AI are concerned, or do you think it’s best left to technology companies, individual thinkers, and leaders to figure out what to do with AI?

Eric Horvitz: Well, AI is evolving quickly, and like other technologies, governments have a significant role to play in assuring the safety of these technologies, their fairness, and their appropriate uses. I see regulatory activity being, of course, largely in the hands of governments, advised by leadership in academia and in industry, and by the public, which has a lot to say about these technologies. There’s been quite a bit of interest and activity; some of that is part of the enthusiastic energy, you might say, going into thinking through AI right now.
Some people say there’s a hype cycle that’s leaking everywhere and into all regimes, including governments, right now, but it’s great to see various agencies writing documents, asking for advice, looking for sets of principles, publishing principles, and engaging multi-stakeholder groups across the world.

Host: There’s been a lot of talk and many conversations about the impact that AI can have on the common man. One of the areas of concern with AI spreading is the loss of jobs at a large scale. What’s your opinion on how AI is going to impact jobs?

Eric Horvitz: My sense is there’s a lot of uncertainty about this: what kinds of jobs will be created, what kinds of jobs will go away. If you take a segment like driving cars, I was surprised at how large a percentage of the US population makes their living driving trucks. Now, what if the long-haul parts of truck driving, the long highway stretches, go away when they become automated? It’s unclear what the ripples of that effect will be on society, on the economy. It’s interesting; there are various studies underway. I was involved in the international academy study looking at the potential effects of new kinds of automation coming via computer science and other related technologies, and the result of that analysis was that we’re flying in the dark. We don’t have enough data to make these decisions yet, or to make these recommendations, or to have understandings about how things are going to go. So, we see people saying things on all sides right now. My own sense is that there’ll be some significant influences of AI on our daily lives and how we make our livings. But I’ll say one thing.
One of my expectations, and it’s maybe also a hope, is that as we see more automation in the world, and as that shifts the nature of what we do daily and what we’re paid or compensated to do, what we call work, there’ll be certain aspects of human discourse that we simply will learn, for a variety of reasons, that we cannot automate, we aren’t able to automate, or we shouldn’t automate. The way I refer to this is that amidst the rise of new kinds of automation, some of which bears on tasks and abilities we would in the past have assumed were the realm of human intellect, we’ll see a concurrent rise of an economy around human caring. Think about this: humans will always want to make connections with humans. Sociologists, social workers, physicians, teachers: we’re always going to want to make human connections and have human contacts. I think they’ll be amplified in a world of richer automation, so much so that even when machines can generate art and write music, even music with lyrics that might put a tear in someone’s eye if they didn’t know it was a machine, that will lead us to say, “Is that written by a human? I want to hear a song sung by a human who experienced something the way I would experience something, not a machine.” And so I think human touch, human experience, human connection will grow even more important in a world of rising automation, and those kinds of tasks and abilities will be even more compensated than they are today. So, we’ll see even more jobs in this realm of human caring.

Host: Now, switching gears a bit, you’ve been in Microsoft Research for a long time. How have you seen MSR evolve over time, and, as a leader of the organization, what’s your vision for MSR over the next few years?

Eric Horvitz: It’s been such an interesting journey. When I came to Microsoft Research it was 1992, and Rick Rashid and Nathan Myhrvold convinced me, along with two colleagues, to stay.
We had just come out of Stanford grad school and had ideas about going into academia. We came up to Microsoft to visit; we thought we were just here for a day to check things out. There were maybe seven or eight people in what was then called Microsoft Research, and we said, “Oh, come on, please”; we didn’t really see a big future. But somehow we took a risk, and we loved this mission statement that starts with “Expand the state of the art.” Period. The second part of the mission statement: “Transfer those technologies as fast as possible into real products and services.” The third part of the statement was, “Contribute to the vibrancy of this organization.” I remember seeing in my mind, as we committed to doing this, to trying it out, a vision of a lever with the fulcrum at the mountaintop on the horizon. And I thought, how can we make this company ours, our platform, to take our ideas, which then were bubbling (we had so many ideas about what we could do with AI from my graduate work), and move the world? And that’s always been my sense of what Microsoft Research has been about. It’s a place where the top intellectual talent in the world, top scholars, often with entrepreneurial bents, who want to get something done, can make Microsoft their platform for expressing their creativity and having real influence in enhancing the lives of millions of people.

Host: Something I’ve heard for many years at Microsoft Research is that finding the right answer is not the biggest thing; what’s important is to ask the right, tough questions. And also that if you succeed in everything you do, you are probably not taking enough risks. Does MSR continue to follow these philosophies?

Eric Horvitz: Well, I’d say three things about that. First of all, why should a large company have an organization like Microsoft Research? It’s unique. We don’t see that even in competitors. Most competitors are taking experts, if they can attract them, and embedding them in product teams.
Microsoft has had the foresight, and we’re reaching 30 years now since we kicked off Microsoft Research, to say: if we attract this top talent into the company, give these people time, and familiarize them with many of our problems and aspirations, they can not only come up with new ideas and out-of-the-box directions, they can also provide new kinds of leadership to the company as a whole, setting its direction, providing a weathervane, looking out at the late-breaking changes on the frontiers of computer science and other sciences, and helping to shape Microsoft and the world, versus, for example, helping a specific product team do better with an existing conception of what a product should be.

Host: Do you see this role of Microsoft Research changing over the next few years?

Eric Horvitz: Microsoft has changed over its history, and one of my interests and reflections, which I shared in an all-hands meeting just last night with MSR India, where in fact we tried out some new ideas coming out of a retreat that the leadership team from Microsoft Research had in December, just a few months ago, is how we might continue to think and reflect about being the best we can, given who we are. I’ve called it polishing the gem: not breaking it, but polishing it, buffing it out, thinking about what we can do with it to make ourselves even more effective in the world. One trend we’ve seen at Microsoft is that over the years we’ve gone from Microsoft Research as a separate tower of intellectual depth, reaching out into the company in a variety of ways (forming teams, advising, working with outside agencies, with students in the world, with universities), to a larger ecosystem of research at Microsoft, where we have pockets, or advanced technology groups, around the company doing great work, and in some ways doing the kinds of things that Microsoft Research used to be doing, or solely doing, at Microsoft.
So we see that upping the game as to what a center of excellence should be doing. I'm just asking the question right now: what are our deep strengths, this notion of deep scholarship and deep ability? How can we best leverage that for the world and for the company, and how can we work with other teams in the larger R&D ecosystem that has come to be at Microsoft? Host: You've been at the India lab for a couple of days now. How has the trip been, and what do you think of the work that the lab in India is doing? Eric Horvitz: You know, we just hit 15 here; the lab is 15 years old, so it's just getting out of adolescence. That's a teenager. It seems like just yesterday that I was sitting with Anandan, the first director of this lab, looking at a one-pager he had written about standing up a lab in India. I was sitting in Redmond having coffee, and I tell you, that was a fast 15 years. But it's been great to see what this lab became and what it does. Each of our labs is unique in so many ways, typically based on the culture it's immersed in. The India lab is famous for its deep theoretical chops, with fabulous theorists here, the best in the world, and for an interdisciplinary spirit of taking theory and melding it with real-world challenges to create incredible new kinds of services and software. One of the marquee areas of this lab has been this notion of taking a hard look, an insightful gaze, at emerging markets and Indian culture all up, and thinking about how computing, computing platforms, and communications can be harnessed in a variety of ways to enhance the lives of people: how they can be better educated, how we can make farms and agriculture more efficient and productive, how we can think about new economic models and new kinds of jobs, how we can leverage new notions of what it means to do freelance or gig work. 
So the lab has its own feel, its own texture, and when I immerse myself in it for a few days I just love getting familiar with the latest new hires, the new research fellows, the young folks coming out of undergrad who are bright-eyed and inject energy into this place. I find Microsoft Research India to have a unique combination of talented researchers and engineers who bring to the table some of the world's deepest theoretical understanding of hard computer science, including challenges with understanding the foundations of AI systems. There's a lot of work going on right now in machine learning, as we discussed earlier, but we don't have a deep understanding, for example, of how these neural network systems work and why they're working so well. I just came out of a meeting where folks in this lab have come up with some of the first insights into why some of these procedures are working so well. The ability to understand that, to understand their limitations, which ways to go, and how to guide and navigate these problems is rare, and it takes a deep focus and an ability to understand the complexity arising in these representations and methods. At the same time, we have the same kind of focus and intensity with a gaze at culture and at emerging markets. There are some grand challenges in understanding the role of technology in society when it comes to a complex civilization, or I should say a set of civilizations, like we see in India today: this mix of futuristic, out-of-the-box advanced technology with rural farms and classical ways of doing things, meshing the old and the new, with so many differences as you move from province to province, state to state. The sociologists and practitioners who are looking carefully at ethnography, epidemiology, and sociology, coupled with computer science, are doing fabulous things here at the Microsoft Research India lab. 
They're even coming up with new thinking about how we can mesh opportunistic Wi-Fi with "sneakers": Sneakernet, with people walking around to share large amounts of data. I don't think that project would have arisen anywhere but at this lab. Host: Right. So you've again teed up my next question perfectly. As you said, India's a very complex place in terms of societal inequities and wealth inequalities. Eric Horvitz: And technical inequality; it's amazing how different things are from place to place. Host: That's right. So, what do you think India can do to utilize AI better, and do you think India is a place that can generate new, innovative kinds of AI? Eric Horvitz: Well, absolutely, the latter is going to be true, because some of the best computer science talent in the world is being educated and is working in this country, so of course we will see fabulous things, fabulous innovations, originating in India, both in the universities and in research labs, including Microsoft Research. As to how to harness these technologies, you know, it takes a special skill to look at the currently available capabilities in a constellation of technologies and to think deeply about how to take them into the open world, the real world, the complex, messy world. It often takes insight, as well as a very caring team of people, to stick with an idea, to try things out, to watch and nurture it, and to involve multiple stakeholders in observing over time, for example, how a deployment works, gathering data about it, and so on. So, I think some very promising areas include healthcare. 
There are some sets of illnesses that are low-hanging fruit for early detection and diagnosis: understanding where we could intervene early on by looking at pre-diabetes states, for example, and guiding patients early to care so they don't progress into more serious pathophysiologies; understanding when someone needs to be hospitalized and how long they should be hospitalized. In a resource-limited realm, we have to selectively allocate resources, and doing so more optimally can lead to great effects. There is also this idea of understanding education: how to educate people, how to engage them over time, diagnosing which students might drop out early on and alerting teachers to invest more effort, understanding when students don't understand something and automatically helping them get through a hard concept. We're seeing interesting breakthroughs now in tutoring systems that can detect these states. Transportation: I mean, it's funny, we built systems in the United States, and this is what I was doing, to predict traffic and to route cars ideally. Then we come to India, we look at the streets here, and we say, "I don't think so, we need a different approach." But it just raises the stakes on how we can apply AI in new ways. So, the big pillars are education, healthcare, transportation, and even understanding how to guide resources and allocations in the economy. I think we'll see big effects of insightful applications in this country. Host: This has been a very interesting conversation. Before we finish, do you want to leave us with some final thoughts? Eric Horvitz: Maybe I'll make a call-out to young folks who are thinking about their careers and what they might want to do, and assure them that it's worth it. It's worth investing in taking your classes seriously, in asking lots of questions, in having your curiosities addressed by your teachers and your colleagues and family. 
There’s so much excitement and fun in doing research and development, in being able to build things and feel them and see how they work in the world, and maybe mostly being able to take ideas into reality in ways that you can see the output of your efforts and ideas really delivering value to people in the world. Host: That was a great conversation, Eric. Thank you! Eric Horvitz: Thank you, it’s been fun.

Berkeley Talks
Barbara Simons on election hacking and how to avoid it in 2020

Berkeley Talks

Play Episode Listen Later Oct 11, 2019 43:17


"There are a number of myths about elections that we've been hearing, saying that they are secure. And I want to shoot down two of those key myths," says Barbara Simons, board chair of Verified Voting, in a talk called "Can we recover from an attack on our election?" that she gave for the annual Minner Distinguished Lecture in Engineering Ethics on Sept. 18. The first myth, says Simons, is that because voting machines are never connected to the internet, they can't be hacked. The second is that there are so many types of voting systems that it's impossible to rig an election. She explains why both are untrue. She goes on to discuss how, in 2002, computers were introduced in U.S. elections without an analysis of the risks, how this led to states adopting paperless voting, and what we need to do to avoid hacking in our 2020 presidential election. "We have a solution, so that's the good news," says Simons. "We have a solution. You need voter-marked paper ballots. You need a strong chain of custody. And you need statistically sound, manual post-election ballot audits called risk-limiting audits." She says it's too late to have any laws passed in time for the 2020 election. Instead, we need the cooperation of local election officials and a national campaign. And, she says, it's up to volunteers and staff to help election officials do risk-limiting audits. "If we can do that, there's a good chance we can avoid hacking of the 2020 election. But that's a big 'if.'" Simons is the former president of the Association for Computing Machinery (ACM), the nation's largest educational and scientific computing society. An expert on electronic voting, she is the co-author of Broken Ballots: Will Your Vote Count? and has been on the board of advisers of the U.S. 
Election Assistance Commission since 2008. The Minner Distinguished Lecture in Engineering Ethics is an annual lecture supported by the Minner Endowment, a gift from Berkeley Engineering alumnus Warren Minner and his wife, Marjorie. Listen and read a transcript on Berkeley News. Watch a video of Simons' talk on Berkeley Engineering's website.
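The risk-limiting audits Simons advocates are statistical hand counts that stop as soon as the sample gives strong evidence the reported outcome is right. Here is a minimal sketch of one common variant, a BRAVO-style ballot-polling audit for a two-candidate race; the 60% reported vote share and 5% risk limit are made-up example numbers, not figures from the talk:

```python
def ballot_polling_audit(ballots, reported_winner_share, risk_limit=0.05):
    """BRAVO-style sequential test. `ballots` is the sequence of
    hand-interpreted sampled ballots ('W' = for the reported winner,
    'L' = for the loser). Returns True if the audit confirms the
    reported outcome at the given risk limit, False if the sample is
    exhausted first (in practice, that escalates toward a full hand
    count)."""
    t = 1.0  # likelihood ratio: reported margin vs. a tied election
    for ballot in ballots:
        if ballot == 'W':
            t *= reported_winner_share / 0.5
        else:
            t *= (1 - reported_winner_share) / 0.5
        if t >= 1 / risk_limit:
            return True  # strong evidence; the audit can stop early
    return False

# With a reported 60% winner share, a run of ballots for the winner
# drives the statistic past 1/0.05 = 20 and the audit terminates.
print(ballot_polling_audit('W' * 17, 0.60))   # True: outcome confirmed
print(ballot_polling_audit('WL' * 10, 0.60))  # False: sample too close
```

The appeal of this design, and why it pairs with voter-marked paper ballots, is that evidence comes from physically inspecting a random sample of the paper, so no software has to be trusted.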

Google Cloud Platform Podcast
Google AI with Jeff Dean

Google Cloud Platform Podcast

Play Episode Listen Later Sep 11, 2018 44:15


Jeff Dean, the lead of Google AI, is on the podcast this week to talk with Melanie and Mark about AI and machine learning research, his upcoming talk at Deep Learning Indaba, and how his educational pursuit of parallel processing and computer systems set his career path toward AI. We covered topics from his team's work with TPUs and TensorFlow, to the impact computer vision and speech recognition are having on AI advancements, to how simulations are being used to help advance science in areas like quantum chemistry. We also discussed his passion for the development of AI talent on the continent of Africa and the opening of Google AI Ghana. It's a full episode where we cover a lot of ground. One piece of advice he left us with: "the way to do interesting things is to partner with people who know things you don't." Listen to the end of the podcast, where our colleague, Gabe Weiss, helps us answer the question of the week about how to get data from IoT Core to display in real time on a web front end.
Jeff Dean
Jeff Dean joined Google in 1999 and is currently a Google Senior Fellow, leading Google AI and related research efforts. His teams are working on systems for speech recognition, computer vision, language understanding, and various other machine learning tasks. He has co-designed/implemented many generations of Google's crawling, indexing, and query serving systems, and co-designed/implemented major pieces of Google's initial advertising and AdSense for Content systems. He is also a co-designer and co-implementor of Google's distributed computing infrastructure, including the MapReduce, BigTable and Spanner systems, protocol buffers, the open-source TensorFlow system for machine learning, and a variety of internal and external libraries and developer tools. Jeff received a Ph.D. in Computer Science from the University of Washington in 1996, working with Craig Chambers on whole-program optimization techniques for object-oriented languages. He received a B.S. 
in computer science & economics from the University of Minnesota in 1990. He is a member of the National Academy of Engineering and of the American Academy of Arts and Sciences, a Fellow of the Association for Computing Machinery (ACM), a Fellow of the American Association for the Advancement of Science (AAAS), and a winner of the ACM Prize in Computing.
Cool things of the week
- Google Dataset Search is in beta site
- Expanding our Public Datasets for geospatial and ML-based analytics blog
- Zip Code Tabulation Area (ZCTA) site
- Google AI and Kaggle Inclusive Images Challenge site
- We are rated in the top 100 technology podcasts on iTunes site
- What makes TPUs fine-tuned for deep learning? blog
Interview
- Jeff Dean on Google AI profile
- Deep Learning Indaba site
- Google AI site
- Google AI in Ghana blog
- Google Brain site
- Google Cloud site
- DeepMind site
- Cloud TPU site
- Google I/O Effective ML with Cloud TPUs video
- Liquid cooling system article
- DAWNBench Results site
- Waymo (Alphabet's Autonomous Car) site
- DeepMind AlphaGo site
- Open AI Dota 2 blog
- Moustapha Cisse profile
- Sanjay Ghemawat profile
- Neural Information Processing Systems Conference site
Previous Podcasts
- GCP Podcast Episode 117: Cloud AI with Dr. Fei-Fei Li podcast
- GCP Podcast Episode 136: Robotics, Navigation, and Reinforcement Learning with Raia Hadsell podcast
- TWiML & AI Systems and Software for ML at Scale with Jeff Dean podcast
Additional Resources
- arXiv.org site
- Chris Olah blog
- Distill Journal site
- Google's Machine Learning Crash Course site
- Deep Learning by Ian Goodfellow, Yoshua Bengio and Aaron Courville book and site
- NAE Grand Challenges for Engineering site
- Senior Thesis: Parallel Implementations of Neural Network Training: Two Back-Propagation Approaches by Jeff Dean paper and tweet
- Machine Learning for Systems and Systems for Machine Learning slides
Question of the week
How do I get data from IoT Core to display in real time on a web front end? 
- Building IoT Applications on Google Cloud video
- MQTT site
- Cloud Pub/Sub site
- Cloud Functions site
- Cloud Firestore site
Where can you find us next?
Melanie is at Deep Learning Indaba and Mark is at Tokyo NEXT. We'll both be at Strangeloop at the end of the month. Gabe will be at Cloud Next London and the IoT World Congress.
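One common shape for the pipeline behind the question of the week, suggested by the resources above, is IoT Core publishing device telemetry to Pub/Sub, a Cloud Function decoding each message and writing it to Firestore, and the web front end subscribing to Firestore for live updates. A minimal sketch of the function's core logic follows; the `device_id`/`temp`/`ts` message fields are illustrative assumptions, and the dict-based `sink` stands in for a Firestore collection so the sketch runs anywhere:

```python
import base64
import json

def decode_telemetry(event):
    """Decode a Pub/Sub-style event: IoT Core forwards device telemetry
    as a base64-encoded payload under the 'data' key."""
    payload = base64.b64decode(event["data"]).decode("utf-8")
    return json.loads(payload)

def handle_telemetry(event, context=None, sink=None):
    """Cloud-Function-style entry point. Decodes one reading and stores
    it keyed by device ID. `sink` here is a plain dict standing in for
    a Firestore collection; with the real client you would write the
    document via the collection's API instead."""
    reading = decode_telemetry(event)
    doc = {"temp": reading["temp"], "ts": reading["ts"]}
    if sink is not None:
        sink[reading["device_id"]] = doc  # fake write; real: Firestore set()
    return doc

# Simulate one device message flowing through the pipeline.
store = {}
msg = {"device_id": "sensor-1", "temp": 21.5, "ts": 1536600000}
event = {"data": base64.b64encode(json.dumps(msg).encode("utf-8"))}
handle_telemetry(event, sink=store)
print(store["sensor-1"])
```

Because Firestore pushes document changes to connected clients, the front end gets each new reading without polling; that is what makes this last hop "real time."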

Firewalls Don't Stop Dragons Podcast

What could be more crucial to a democracy than a voting system we can trust? Today I speak with Barbara Simons, President of VerifiedVoting.org, on why so many of our US election systems are vulnerable to hacking without leaving a trace. The solutions to these issues are well known and straightforward, and yet we can't seem to come together in a unified way to implement them. We'll discuss why the current systems are so bad, what needs to be done, and tell you what you can do to help. I will also tell you about a new file backup tool from Google, 14M Verizon customer records found online with no protection, why you might be wary about leaving your keys lying around in plain sight, and how to improve your privacy with Post-It Notes! Barbara Simons has been on the Board of Advisors of the U.S. Election Assistance Commission since 2008. She published Broken Ballots: Will Your Vote Count?, a book on voting machines co-authored with Douglas Jones. She also co-authored the report that led to the cancellation of the Department of Defense's Internet voting project (SERVE) in 2004 because of security concerns. In 2015 she co-authored the report of the U.S. Vote Foundation entitled The Future of Voting: End-to-End Verifiable Internet Voting, which included in its conclusions that "every publicly audited, commercial Internet voting system to date is fundamentally insecure." Simons is a former President of the Association for Computing Machinery (ACM), the oldest and largest international educational and scientific society for computing professionals. She is President of Verified Voting and is retired from IBM Research. For Further Insight: Web site: VerifiedVoting.org Follow on Twitter: https://twitter.com/VerifiedVoting Further Reading: Does your state have proper voting machines? Do they have procedures for audits? 
https://www.verifiedvoting.org/ Google's backup service: https://techcrunch.com/2017/07/12/google-launches-a-new-backup-sync-desktop-app-for-uploading-files-and-photos-to-the-cloud/ Change your Verizon PIN: https://www.verizonwireless.com/support/account-pin-faqs/ Copy a key with a photo: https://www.key.me/ Did you lose all your photos when your hard drive crashed? Did a cloud backup save your bacon when your phone was stolen? Tell me your best backup stories for a chance to win a free copy of my book! Send them to CareyParker@AmericaOutLoud.com!

Show IP Protocols
Diffie and Hellman Receive Turing Award 2015

Show IP Protocols

Play Episode Listen Later May 3, 2016


When we study IPsec, we learn that Mr. Diffie and Mr. Hellman invented a method in 1976 that is the core of Internet Key Exchange (IKE) for creating a mutually shared secret. We also have to specify and configure a DH group number in ISAKMP policy sets (crypto map in Cisco IOS).
[Image: A.M. Turing Award logo, captured from the official ACM website.]
I am not going to dig into the details of the mathematics behind the Diffie-Hellman method. I just want you to know that Mr. Diffie and Mr. Hellman received the 2015 Turing Award together.
[Image: Photo of Whitfield Diffie, captured from the official ACM website.]
[Image: Photo of Martin E. Hellman, captured from the ACM website.]
The A.M. Turing Award of the Association for Computing Machinery (ACM) is the highest honor in computer science, much like the Nobel Prize in other fields of science. The award was announced on March 1, 2016.
One more thing: in case you want to know more about the Diffie-Hellman method, I found one video on YouTube that is quite helpful for understanding it. Have fun!
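The idea behind those IKE DH groups can be sketched with toy numbers. Both parties agree on a public prime and generator (that pair is what a DH group number selects), exchange only public values, and still derive the same secret. The tiny prime and fixed private keys below are illustrative assumptions; real IKE groups use standardized primes of 1024 bits or more:

```python
# Toy Diffie-Hellman exchange: both sides derive the same shared
# secret without ever sending their private keys over the wire.
p = 23   # public prime modulus (toy-sized)
g = 5    # public generator

a = 6    # Alice's private key (kept secret)
b = 15   # Bob's private key (kept secret)

A = pow(g, a, p)   # Alice sends A = g^a mod p
B = pow(g, b, p)   # Bob sends   B = g^b mod p

# Each side combines the other's public value with its own private key.
secret_alice = pow(B, a, p)   # (g^b)^a mod p
secret_bob   = pow(A, b, p)   # (g^a)^b mod p

assert secret_alice == secret_bob
print(secret_alice)   # prints 2: the mutually shared secret
```

An eavesdropper sees only p, g, A, and B; recovering the secret from those is the discrete logarithm problem, which is what makes large DH groups safe to use over an untrusted network.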

ILTA
Driving Change in a Law Firm - Matt Coatney Interview

ILTA

Play Episode Listen Later Jul 2, 2015 11:44


Join us for this series of podcasts as we discuss tips and techniques for driving change in a law firm. Four ILTA members who hold leadership positions within their law firms share their insights, strategies and best practices to build support for information governance, knowledge management and enterprise content management initiatives. Speaker: Matt Coatney is an AI software executive, inventor and developer with 15 years of experience in bringing advanced technologies to market in the healthcare, e-commerce, finance and legal industries. He creates value for organizations by timing the introduction of AI just as technologies become viable, focusing on tangible business outcomes and applying design principles to maximize acceptance and adoption. Matt's previous successes include the intelligent decision support tools LeadScope for drug discovery, Lexis Search Advantage and Document Profiling for legal content analysis and discovery, and predictive cost modeling for the U.S. Air Force. He writes and speaks frequently about commercializing AI advances, and his work has appeared in books, technical journals and international conferences. Matt is a member of the Association for the Advancement of Artificial Intelligence and of the Association for Computing Machinery (ACM), including its AI, bioinformatics and knowledge discovery special interest groups.

CERIAS Security Seminar Podcast
Stuart Shapiro, Scenario-Driven Construction of Enterprise Information Policy

CERIAS Security Seminar Podcast

Play Episode Listen Later Feb 7, 2007 57:47


Information policy at the enterprise level is invariably an exercise in gaps and inconsistencies. The range of concerns—including security—is broad, the environment tends to be heterogeneous and dispersed, the contextual scope is significant, and the stakeholders are numerous. MITRE ran headlong into this problem as it set about conceiving and implementing a new enterprise IT architecture, with questions increasingly raised regarding what policies the new architecture had to be capable of supporting. The MITRE Information Policy Framework (MIPF) is the mechanism MITRE developed to answer these questions. The MIPF supports systematic, structured analysis and formulation of information policy in five areas: security, privacy, management, stewardship, and sharing. This presentation will discuss the structure and use of the MIPF, with an emphasis on security requirements. About the speaker: Dr. Stuart S. Shapiro is a Lead Information Security Scientist and a member of the Privacy Practice at the MITRE Corporation, a not-for-profit company performing contract technical research and consulting primarily for the U.S. government. At MITRE he has supported a wide range of privacy activities, including privacy impact assessments, for major government programs. Prior to joining MITRE he was Director of Privacy at CareInsite, an e-health company, where his responsibilities included both policy and technical issues revolving around privacy and security. He has also held academic positions in the U.S. and the U.K. and taught courses on the history, politics, and ethics of information and communication technologies (ICTs). His research and writing have focused on ICTs and privacy and on the history and sociology of software development. Among his professional affiliations are the Association for Computing Machinery (ACM)—including its public policy committee, USACM—and the International Association of Privacy Professionals (IAPP).