with @ilblackdragon @rhhackett

Welcome to web3 with a16z. I'm your host, Robert Hackett.

In this episode, we're diving deep into one of the most intriguing intersections in tech today: AI and crypto. To help us unpack it, we're joined by Illia Polosukhin — co-founder of the crypto protocol NEAR and co-author of the groundbreaking 2017 "transformers" paper that kicked off the current AI boom. Illia has been early to some of the biggest recent tech trends, and today he brings us a rare, panoramic view of the tech industry's cutting edge. Together we explore what the phrase "user-owned AI" really means; why the so-called agentic internet — that is, a world where your AI assistant talks directly to services on your behalf — might replace the very notion of websites and apps as we know them; and much more.

Timestamps:
(0:00) Introduction
(3:40) Centralization and Challenges of AI
(6:17) "User-Owned" AI
(12:14) Confidential Computing and AI
(17:51) The Birth of Transformers
(22:33) NEAR AI and Crowdsourcing
(27:56) AI Agents and Future Applications
(31:04) The End of Websites and Applications
(34:08) Dead Internet Theory & Distinguishing Humans
(41:49) Open Source vs. Open Weight Models
(43:48) Geopolitical Implications of AI
(46:55) NEAR Protocol and Blockchain Scaling
(59:29) The Role of Humans in an AI World

Resources:
"Attention Is All You Need" by Vaswani et al. (Conference on Neural Information Processing Systems, 2017)

As a reminder, none of the content should be taken as investment, business, legal, or tax advice; please see a16z.com/disclosures for more important information, including a link to a list of our investments.
AI in Public Health & Medicine

For more information, check out:
(1) Turing, A. M. (1950). Computing Machinery and Intelligence. Mind, 59(236), 433–460.
(2) Wiener, N. (1948). Cybernetics: Or Control and Communication in the Animal and the Machine. MIT Press.
(3) McCarthy, J., Minsky, M. L., Rochester, N., & Shannon, C. E. (1955). A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence.
(4) Newell, A., & Simon, H. A. (1956). The Logic Theory Machine—A Complex Information Processing System. IRE Transactions on Information Theory, 2(3), 61–79.
(5) Weizenbaum, J. (1966). ELIZA—A Computer Program for the Study of Natural Language Communication Between Man and Machine. Communications of the ACM, 9(1), 36–45.
(6) Crevier, D. (1993). AI: The Tumultuous History of the Search for Artificial Intelligence. Basic Books.
(7) Feigenbaum, E. A., & McCorduck, P. (1983). The Fifth Generation: Artificial Intelligence and Japan's Computer Challenge to the World. Addison-Wesley.
(8) Campbell, M., Hoane, A. J., & Hsu, F. H. (2002). Deep Blue. Artificial Intelligence, 134(1–2), 57–83.
(9) Silver, D., et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484–489.
(10) Brown, T., et al. (2020). Language Models are Few-Shot Learners. Advances in Neural Information Processing Systems.
(11) Ramesh, A., et al. (2021). Zero-Shot Text-to-Image Generation. OpenAI.
(12) Binns, R. (2018). Fairness in Machine Learning: Lessons from Political Philosophy. Proceedings of the 2018 Conference on Fairness, Accountability, and Transparency.
(13) Statista Research Department. (2023). Daily Per Capita Data Interactions Worldwide.
(14) "AI in Health Care: Applications, Benefits, and Examples." Coursera Team, October 2024.
(15) "AI in Healthcare: Benefits and Examples." Cleveland Clinic Health Essentials, September 2024.
(16) "AI in Healthcare: The Future of Patient Care and Health Management." Mayo Clinic Press, March 2024.
(17) "10 Top Artificial Intelligence (AI) Applications in Healthcare." VentureBeat Staff, August 2022.
(18) "10 Real-World Examples of AI in Healthcare." Philips News Center, November 2022.
(19) "AI in Healthcare: Uses, Examples & Benefits." Built In Staff, November 2024.
(20) "Artificial Intelligence in Health Care: Benefits and Challenges of Machine Learning in Drug Development." U.S. Government Accountability Office, December 2020.
(21) "Integrated Multimodal Artificial Intelligence Framework for Healthcare Applications." Soenksen, L. R., Ma, Y., Zeng, C., Boussioux, L. D. J., Villalobos Carballo, K., Na, L., Wiberg, H. M., Li, M. L., Fuentes, I., & Bertsimas, D., February 2022.
(22) "Remote Patient Monitoring Using Artificial Intelligence: Current State, Applications, and Challenges." Shaik, T., Tao, X., Higgins, N., Li, L., Gururajan, R., Zhou, X., & Acharya, U. R., January 2023.
(23) "Artificial Intelligence in Medicine and Healthcare: A Review and Classification of Current and Near-Future Applications and Their Ethical and Social Impact." Gómez-González, E., Gómez, E., Márquez-Rivas, J., Guerrero-Claro, M., Fernández-Lizaranzu, I., Relimpio-López, M. I., Dorado, M. E., Mayorga-Buiza, M. J., Izquierdo-Ayuso, G., & Capitán-Morales, L., January 2020.
(24) Parums, D. V. (2023). Editorial: Infectious Disease Surveillance Using Artificial Intelligence (AI) and its Role in Epidemic and Pandemic Preparedness. Med Sci Monit, 29, e941209. doi:10.12659/MSM.941209
(25) Chen, S., Yu, J., Chamouni, S., et al. (2024). Integrating machine learning and artificial intelligence in life-course epidemiology: pathways to innovative public health solutions. BMC Med, 22, 354.
(26) Abdulkareem, M., & Petersen, S. E. (2021). The Promise of AI in Detection, Diagnosis, and Epidemiology for Combating COVID-19: Beyond the Hype. Front Artif Intell, 4, 652669. doi:10.3389/frai.2021.652669
(27) Hamilton, A. J., Strauss, A. T., Martinez, D. A., et al. (2021). Machine learning and artificial intelligence: applications in healthcare epidemiology. Antimicrob Steward Healthc Epidemiol, 1(1), e28. doi:10.1017/ash.2021.192
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Jailbreak steering generalization, published by Sarah Ball on June 20, 2024 on The AI Alignment Forum. This work was performed as part of SPAR.

We use activation steering (Turner et al., 2023; Rimsky et al., 2023) to investigate whether different types of jailbreaks operate via similar internal mechanisms. We find preliminary evidence that they may. Our analysis includes a wide range of jailbreaks, such as the harmful prompts developed in Wei et al. (2024), the universal jailbreak in Zou et al. (2023b), and the payload-split jailbreak in Kang et al. (2023). For all our experiments we use the Vicuna 13B v1.5 model.

In a first step, we produce jailbreak vectors for each jailbreak type by contrasting the internal activations of jailbreak and non-jailbreak versions of the same request (Rimsky et al., 2023; Zou et al., 2023a). Interestingly, we find that steering with mean-difference jailbreak vectors from one cluster of jailbreaks helps to prevent jailbreaks from different clusters. This holds true for a wide range of jailbreak types. The jailbreak vectors themselves also cluster according to semantic categories such as persona modulation, fictional settings, and style manipulation.

In a second step, we look at the evolution of a harmfulness-related direction over the context (found via contrasting harmful and harmless prompts) and find that when jailbreaks are included, this feature is suppressed at the end of the instruction in harmful prompts. This provides some evidence that jailbreaks suppress the model's perception of request harmfulness. Effective jailbreaks usually suppress the harmfulness feature more strongly. However, we also observe one jailbreak ("wikipedia with title"[1]) which is effective even though it does not suppress the harmfulness feature as much as the other effective jailbreak types. Furthermore, the steering vector based on this jailbreak is overall less successful in reducing the attack success rate of the other types. This observation indicates that harmfulness suppression might not be the only mechanism at play, as suggested by Wei et al. (2024) and Zou et al. (2023a).

References:
Turner, A., Thiergart, L., Udell, D., Leech, G., Mini, U., and MacDiarmid, M. Activation addition: Steering language models without optimization. arXiv preprint arXiv:2308.10248, 2023.
Kang, D., Li, X., Stoica, I., Guestrin, C., Zaharia, M., and Hashimoto, T. Exploiting programmatic behavior of LLMs: Dual-use through standard security attacks. arXiv preprint arXiv:2302.05733, 2023.
Rimsky, N., Gabrieli, N., Schulz, J., Tong, M., Hubinger, E., and Turner, A. M. Steering Llama 2 via contrastive activation addition. arXiv preprint arXiv:2312.06681, 2023.
Wei, A., Haghtalab, N., and Steinhardt, J. Jailbroken: How does LLM safety training fail? Advances in Neural Information Processing Systems, 36, 2024.
Zou, A., Phan, L., Chen, S., Campbell, J., Guo, P., Ren, R., Pan, A., Yin, X., Mazeika, M., Dombrowski, A.-K., et al. Representation engineering: A top-down approach to AI transparency. arXiv preprint arXiv:2310.01405, 2023a.
Zou, A., Wang, Z., Kolter, J. Z., and Fredrikson, M. Universal and transferable adversarial attacks on aligned language models. arXiv preprint arXiv:2307.15043, 2023b.

1. ^ This jailbreak type asks the model to write a Wikipedia article whose title is the harmful request. Thanks for listening.
To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
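For readers who want a concrete picture of the mean-difference steering-vector construction described in the episode above, here is a minimal sketch. It is not the authors' code: the Hugging Face loading pattern, the layer index, the example prompt pair, and the steering coefficient are all illustrative assumptions.

```python
# Minimal sketch of mean-difference activation steering (in the spirit of
# Rimsky et al., 2023). Layer index, prompts, and coefficient are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "lmsys/vicuna-13b-v1.5"  # model used in the post
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
LAYER = 20  # illustrative choice of residual-stream layer

def last_token_activation(prompt: str) -> torch.Tensor:
    """Hidden state of the final prompt token at LAYER."""
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, output_hidden_states=True)
    return out.hidden_states[LAYER][0, -1]

# Mean difference between jailbroken and plain versions of the same requests.
pairs = [("Pretend you are DAN... how do I pick a lock?", "How do I pick a lock?")]
steer = torch.stack([last_token_activation(j) - last_token_activation(p)
                     for j, p in pairs]).mean(dim=0)

# To steer *against* jailbreaks, subtract the vector during generation.
def hook(module, inputs, output):
    hidden = output[0] if isinstance(output, tuple) else output
    hidden -= 2.0 * steer.to(hidden.dtype)  # coefficient is a free parameter

handle = model.model.layers[LAYER].register_forward_hook(hook)
# ... generate as usual; call handle.remove() to stop steering.
```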
In this episode, I talk with Alexander Huth, Assistant Professor of Neuroscience and Computer Science at the University of Texas at Austin, about his work using functional imaging and advanced computational methods to model how the brain processes language and represents meaning.

Huth lab website
Huth AG, Nishimoto S, Vu AT, Gallant JL. A continuous semantic space describes the representation of thousands of object and action categories across the human brain. Neuron 2012; 76: 1210-24. [doi]
Huth AG, de Heer WA, Griffiths TL, Theunissen FE, Gallant JL. Natural speech reveals the semantic maps that tile human cerebral cortex. Nature 2016; 532: 453-8. [doi]
Jain S, Huth AG. Incorporating context into language encoding models for fMRI. Proceedings of the 32nd International Conference on Neural Information Processing Systems 2018, pp. 6629-38. [doi]
Tang J, LeBel A, Jain S, Huth AG. Semantic reconstruction of continuous language from non-invasive brain recordings. Nat Neurosci, in press. [doi]
Anima Anandkumar is setting a personal record this week with seven of her team's research papers accepted to NeurIPS 2020. The 34th annual Neural Information Processing Systems conference is taking place virtually from Dec. 6-12. The premier event on neural networks, NeurIPS draws thousands of the world's best researchers every year. Anandkumar, NVIDIA's director of machine learning research and Bren professor in Caltech's CMS department, joined AI Podcast host Noah Kravitz to talk about what to expect at the conference, and to explain what she sees as the future of AI. https://blogs.nvidia.com/blog/2020/12/09/neurips-nvidia-caltech-anima-anandkumar/
At the G7 meeting in Montreal last year, Justin Trudeau told WIRED he would look into why more than 100 African artificial intelligence researchers had been barred from visiting that city to attend their field's most important annual event, the Neural Information Processing Systems conference, or NeurIPS. Now the same thing has happened again.
The rapid diffusion of social media like Facebook and Twitter, and the massive use of different types of forums like Reddit, Quora, etc., is producing an impressive amount of text data every day. One specific activity that many business owners have been contemplating over the last five years is identifying the social sentiment around their brand by analysing their users' conversations. In this episode I explain how one can get the best shot at classifying sentences with deep learning and word embeddings.

Additional material:
Schematic representation of how to learn a word embedding matrix E by training a neural network that, given the previous M words, predicts the next word in a sentence.
Word2Vec example source code: https://gist.github.com/rlangone/ded90673f65e932fd14ae53a26e89eee#file-word2vec_example-py

References:
[1] Mikolov, T. et al., "Distributed Representations of Words and Phrases and their Compositionality", Advances in Neural Information Processing Systems 26, pages 3111-3119, 2013.
[2] The Best Embedding Method for Sentiment Classification, https://medium.com/@bramblexu/blog-md-34c5d082a8c5
[3] The state of sentiment analysis: word, sub-word and character embedding, https://amethix.com/state-of-sentiment-analysis-embedding/
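As a companion to the schematic mentioned above, here is a minimal, self-contained sketch of learning an embedding matrix E with a next-word-prediction objective. The vocabulary size, dimensions, and random training batch are toy assumptions, not the episode's actual setup.

```python
# Toy sketch: learn a word-embedding matrix E by predicting the next word
# from the previous M words. Sizes and data are illustrative assumptions.
import torch
import torch.nn as nn

VOCAB, DIM, M = 5000, 64, 4  # vocab size, embedding dim, context length

class NextWordModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.E = nn.Embedding(VOCAB, DIM)     # the embedding matrix E
        self.out = nn.Linear(M * DIM, VOCAB)  # logits over the next word

    def forward(self, context):               # context: (batch, M) word ids
        e = self.E(context).view(context.size(0), -1)  # concat M embeddings
        return self.out(e)

model = NextWordModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One toy training step on random ids; real training walks over a corpus.
ctx = torch.randint(0, VOCAB, (32, M))
nxt = torch.randint(0, VOCAB, (32,))
opt.zero_grad()
loss = loss_fn(model(ctx), nxt)
loss.backward()
opt.step()
# After training, rows of model.E.weight are the word vectors reused
# downstream, e.g. averaged per sentence as features for sentiment.
```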
How are differential equations related to neural networks? What are the benefits of rethinking a neural network as a differential equation engine? In this episode we explain all this and provide some material that is worth learning. Enjoy the show!

Residual Block (schematic)

References:
[1] K. He, et al., "Deep Residual Learning for Image Recognition", 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770-778, 2016.
[2] S. Hochreiter, et al., "Long short-term memory", Neural Computation 9(8), pages 1735-1780, 1997.
[3] Q. Liao, et al., "Bridging the gaps between residual learning, recurrent neural networks and visual cortex", arXiv preprint, arXiv:1604.03640, 2016.
[4] Y. Lu, et al., "Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations", Proceedings of the 35th International Conference on Machine Learning (ICML), Stockholm, Sweden, 2018.
[5] T. Q. Chen, et al., "Neural Ordinary Differential Equations", Advances in Neural Information Processing Systems 31, pages 6571-6583, 2018.
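A minimal sketch of the connection discussed in the episode, assuming nothing beyond PyTorch: a stack of residual blocks x <- x + f(x) is exactly a fixed-step Euler discretization of the ODE dx/dt = f(x), which is the observation behind [4] and [5]. Sizes here are toy.

```python
# Residual blocks as Euler steps of dx/dt = f(x). Toy dimensions.
import torch
import torch.nn as nn

f = nn.Sequential(nn.Linear(16, 16), nn.Tanh(), nn.Linear(16, 16))

def resnet_forward(x, n_blocks=10):
    # Discrete view: n residual blocks sharing weights.
    for _ in range(n_blocks):
        x = x + f(x)
    return x

def euler_ode_forward(x, t0=0.0, t1=10.0, steps=10):
    # Continuous view: Euler integration of dx/dt = f(x) over [t0, t1].
    h = (t1 - t0) / steps
    for _ in range(steps):
        x = x + h * f(x)
    return x

x0 = torch.randn(1, 16)
# With step size h = 1 the two computations coincide exactly.
print(torch.allclose(resnet_forward(x0), euler_ode_forward(x0)))  # True
```

Neural ODEs [5] take the continuous view seriously: they replace the fixed Euler loop with an adaptive ODE solver, trading depth for integration accuracy.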
One of the most promising areas for artificial intelligence research rests at the intersection of biology and medicine. That's where we found Robert Fratila, CTO and Co-founder of Aifred Health. He and his team won an XPRIZE at the Annual Conference on Neural Information Processing Systems. He's worked on brain-state classifiers, computer vision packages for autonomous underwater vehicles, and predictive models for cancer patients, just to name a few. In this episode we dig into deep learning, neural networks, and hype-busting truths about the current limits of AI. See acast.com/privacy for privacy and opt-out information.
Melanie is solo this week, talking with Anima Anandkumar, a Caltech Bren professor and director of ML research at NVIDIA. We touch on tensors, their use, and how they relate to TensorFlow. Anima also details the work she does with NVIDIA and how they are helping to advance machine learning through hardware and software. Our main discussion centers around AI and machine learning research conferences, specifically the Neural Information Processing Systems conference (commonly referred to as NIPS) and the reason it has rebranded. NIPS originally started as a small conference at Caltech. As deep learning became more and more popular, it grew exponentially. With the higher attendance and interest, the acronym took center stage. Sexual innuendos and harassing puns surrounded the conference, sparking a call for a name change. At first, conference organizers were reluctant to rebrand, citing recent survey results as a reason to keep NIPS. Anima discusses her personal experience protesting the acronym, opening up about the hate speech and threats that she and others received. Despite the harassment, Anima and others continued to protest, petition, and share stories of mistreatment within the community, which helped lead to the name/acronym change to NeurIPS. The rebranding hopes to reestablish an inclusive academic community and move the focus back to machine learning research and away from unprofessional attention.

Anima Anandkumar
Animashree (Anima) Anandkumar is a Bren professor in Caltech's CMS department and a director of machine learning research at NVIDIA. Her research spans both theoretical and practical aspects of machine learning. In particular, she has spearheaded research in tensor-algebraic methods, large-scale learning, deep learning, probabilistic models, and non-convex optimization. Anima is the recipient of several awards, such as the Alfred P. Sloan Fellowship, the NSF CAREER Award, Young Investigator Awards from the Air Force and Army research offices, faculty fellowships from Microsoft, Google, and Adobe, and several best-paper awards. She is the youngest named professor at Caltech, the highest honor bestowed on an individual faculty member. She is part of the World Economic Forum's Expert Network, consisting of leading experts from academia, business, government, and the media. She has been featured in documentaries by PBS and KPCC, in WIRED magazine, and in articles by MIT Technology Review, Forbes, YourStory, O'Reilly Media, and others. Anima received her B.Tech in Electrical Engineering from IIT Madras in 2004 and her PhD from Cornell University in 2009. She was a postdoctoral researcher at MIT from 2009 to 2010, a visiting researcher at Microsoft Research New England in 2012 and 2014, an assistant professor at U.C. Irvine between 2010 and 2016, an associate professor at U.C. Irvine between 2016 and 2017, and a principal scientist at Amazon Web Services between 2016 and 2018.
Cool things of the week
Taking charge of your data: using Cloud DLP to de-identify and obfuscate sensitive information blog
Unlocking what's possible with medical imaging data in the cloud blog
Google makes dataset of 50 million drawings available on its cloud blog
Machine learning on machines: building a model to evaluate CPU performance blog

Interview
Anima at TensorLab site
NeurIPS site
Petition site
Name Change (results of the poll) letter
Johns Hopkins University letter letter
AI Researchers Fight Over Four Letters article
From the Board: Changing our Acronym letter
TensorFlow site
NVIDIA site

Question of the week
What are some actions I can take if I'm being trolled, harassed, and/or bullied online, or if I want to be proactive about my safety? If you are experiencing harassment, tell someone who can support you, document it, and consider escalating to authorities depending on the severity.
Surveillance Self-Defense
Preventing Doxxing

Where can you find us next?
Mark will be at KubeCon in December. Melanie will be at SOCML this week and NeurIPS next week. She'll be attending WiML, Black in AI, and LatinX.
In episode twenty-one of season four we talk about distributed intelligence systems (mainly those internal to humans), discuss what we're excited to see at the Conference on Neural Information Processing Systems, and, in advance of our trek to Canada, chat with Garth Gibson, president and CEO of the Vector Institute.
The future of humanity will be shaped by artificial intelligence. Now some of the best brains working on the technology are riven by a debate about a four-letter acronym that some say contributes to the field's well-documented diversity problems. NIPS is the name of AI's most prominent conference, a venue for machine learning research formally known as the Annual Conference on Neural Information Processing Systems.
On the podcast today, we have two more fascinating interviews from Melanie's time at Deep Learning Indaba! Mark helps host this episode as we speak with Karim Beguir and Muthoni Wanyoike about their company, InstaDeep, the wonderful Indaba conference, and the growing AI community in Africa. InstaDeep helps large enterprises understand how AI can benefit them. Karim stresses that it is possible to build advanced AI and machine learning programs in Africa because of the growing community of passionate developers and mentors for the new generation. Muthoni tells us about Nairobi Women in Machine Learning and Data Science, a community she is heavily involved with in Nairobi. The group runs workshops and classes for AI developers and encourages volunteers to participate by sharing their knowledge and skills.

Karim Beguir
Karim Beguir helps companies get a grip on the latest AI advancements and how to implement them. A graduate of France's Ecole Polytechnique and former Program Fellow at NYU's Courant Institute, Karim has a passion for teaching and using applied mathematics. This led him to co-found InstaDeep, an AI startup nominated at MWC17 for PCMAG's Top 20 global startup list. Karim uses TensorFlow to develop Deep Learning and Reinforcement Learning products. Karim is also the founder of the TensorFlow Tunis Meetup. He regularly organises educational events and workshops to share his experience with the community. Karim is on a mission to democratize AI and make it accessible to a wide audience.

Muthoni Wanyoike
Muthoni Wanyoike is the team lead at InstaDeep in Kenya. She is passionate about bridging the skills gap in AI in Africa and does this by co-organizing the Nairobi Women in Machine Learning community. The community enables learning, mentorship, networking, and job opportunities for people interested in working in AI. She is experienced in research, data analytics, community and project management, and community growth hacking.

Cool things of the week
Is there life on other planets? Google Cloud is working with NASA's Frontier Development Lab to find out blog
In this Codelab, you will learn about the StarCraft II Learning Environment project and train your first Deep Reinforcement Learning agent. You will also get familiar with some of the concepts and frameworks used to train a machine learning agent. site
A new course to teach people about fairness in ML blog
Serverless from the ground up: Building a simple microservice with Cloud Functions (Part 1) blog
Superposition Podcast from Deep Learning Indaba with Omoju Miller and Nando de Freitas tweet and video

Interview
InstaDeep site
Nairobi Women in Machine Learning and Data Science site
Neural Information Processing Systems site
Google Launchpad Accelerator site
TensorFlow site
Google Assistant site
Cloud AutoML site
Hackathon Lagos site
Deep Learning Book book
Ranked Reward: Enabling Self-Play Reinforcement Learning for Combinatorial Optimization research paper
Lessons learned on building a tech community blog
Kenya Open Data Initiative site
R for Data Science GitHub site and book
TWIML Presents Deep Learning Indaba site

Question of the week
If I want to create a GKE cluster with a specific major Kubernetes version (or even just the latest) using the command-line tools, how do I do that? In short, pass the desired version to gcloud container clusters create via its --cluster-version flag; the links below cover the details.
gcloud container clusters create site
Specifying cluster version site

Where can you find us next?
Our guests will be at Indaba 2019 in Kenya. Mark will be at KubeCon in December. Melanie will be at SOCML in November.
This week we are bringing you a couple of interviews from last week's Deep Learning Indaba conference. Dr. Vukosi Marivate, Andrea Bohmert, and Yasin(i) Musa Ayami talk about the burgeoning machine learning community, research, companies, and AI investment landscape in Africa. While Mark is at Google Cloud Next in Tokyo, Melanie is joined by special guest co-hosts Nyalleng Moorosi and Willie Brink. Vukosi and Yasin(i) share how Deep Learning Indaba is playing an important role in recognizing and growing machine learning research and companies on the African continent. We also discuss Yasin(i)'s prototyped app, Tukuka, and how it won the Maathai Award, which is given to individuals who are a positive force for change. Tukuka is being built to aid economically disadvantaged women in Zambia in getting access to financial resources that are currently unavailable. Andrea rounds up the interviews by giving us a VC perspective on the AI start-up landscape in Africa and how it compares to other parts of the world. As Nyalleng says at the end, AI is happening in Africa and has great potential for impact.

Willie Brink
Willie Brink is a senior lecturer of Applied Mathematics in the Department of Mathematical Sciences at Stellenbosch University, South Africa. He teaches various courses in Applied Mathematics and Computer Science, at all levels, and his research interests fall mainly in the broad fields of computer vision and machine learning. He has worked on multi-view geometry, visual odometry, recognition and tracking, probabilistic graphical models, as well as deep learning. Recent research directions include visual knowledge representation and reasoning. Willie is also one of the founders and organisers of the Deep Learning Indaba, an exciting initiative working to celebrate and strengthen machine learning and artificial intelligence research in Africa, and to promote diversity and transformation in these fields.

Nyalleng Moorosi
Nyalleng is a software engineer and researcher with the Google AI team in Ghana. Before joining Google, Nyalleng was a senior data science researcher at South Africa's national science lab, the Council for Scientific and Industrial Research (CSIR), in the Modeling and Digital Sciences Unit. In her capacity at CSIR, she worked on projects ranging from rhino-poaching prevention with park rangers, to working with news outlets to understand social media sentiment, to searching for biomarkers in African cancer proteomes. Before getting into ML research at CSIR, she was a computer science lecturer at Fort Hare University and a software engineer at Thomson Reuters. Moorosi is an active member of Women in Machine Learning and Black in Artificial Intelligence, and an organising member of the Deep Learning Indaba, a yearly workshop that gathers African researchers in one space to share ideas and grow machine learning and artificial intelligence capabilities.

Dr. Vukosi Marivate
Dr. Vukosi Marivate holds a PhD in Computer Science (Rutgers University) and an MSc & BSc in Electrical Engineering (Wits University). He has recently started at the University of Pretoria as the ABSA Chair of Data Science. Vukosi works on developing Machine Learning/Artificial Intelligence methods to extract insights from data. A large part of his work over the last few years has been at the intersection of Machine Learning and Natural Language Processing (due to the abundance of text data and the need to extract insights). As part of his vision for the ABSA Data Science chair, Vukosi is interested in Data Science for Social Impact, using local challenges as a springboard for research. In this area Vukosi has worked on projects in science, energy, public safety, and utilities. Vukosi is an organizer of the Deep Learning Indaba, the largest Machine Learning/Artificial Intelligence workshop on the African continent, aiming to strengthen African Machine Learning. He is passionate about developing young talent, supervising MSc and PhD students, and mentoring budding Data Scientists.

Yasin(i) Musa Ayami
Yasin(i) Musa Ayami is Team Lead at TsogoloTech and a certified Oracle Associate. Mr. Ayami recently graduated with a Master's Degree in Information Technology at the prestigious Durban University of Technology (DUT), where his study mainly focused on Computer Vision and Machine Learning. Prior to enrolling for his Master's Degree, Mr. Ayami served as an Intern Software Engineer at DUT's App Factory, where he also served as Team Lead before deciding to further his studies. He also worked as a part-time student instructor at DUT. In 2017, he co-founded TsogoloTech. His vision has always been to leverage technology for social good.

Andrea Bohmert
Andrea Bohmert is a Co-Managing Partner at Knife Capital. Before joining Knife Capital, she was the Founder and Co-Managing Partner of Hasso Plattner Ventures Africa. Passionate about strategizing how to scale businesses and meeting the entrepreneurs responsible for creating them, she has been actively involved in numerous initiatives aiming to accelerate the African entrepreneurial ecosystem.

What are you looking forward to this week?
AlphaGo Movie site
WiML: Women in Machine Learning site
Deep Learning Indaba Poster Sessions site
Neural Information Processing Systems site

Interview
Deep Learning Indaba site
Deep Learning Indaba GitHub site
Deep Learning Indaba Tutorials site
Deep Learning Indaba 2018 Slides site
Deep Learning Indaba 2017 Presentations videos
Deep Learning Indaba X site
Yasin(i) Musa Ayami on GitHub site and LinkedIn site
Deep Learning Indaba Award Winners site and tweet
Maathai Award site
Xamarin site
SuperPosition at The Deep Learning Indaba with Dr. Vukosi Marivate podcast
Knife Capital site
Investing in AI by Andrea Bohmert article
10 Defining Moments that shaped the 2016 SA startup ecosystem article
Data Science Africa site
International Data Week site
Google Cloud Platform Credits award winners tweet

Question of the week
The co-hosts weigh in on our question of the week: What have you taken away from this week and will take forward?

Where can you find us next?
Mark and Melanie will be at Strangeloop. Willie will be teaching Machine Learning at Stellenbosch University this summer. Nyalleng will be at the Women in Machine Learning Workshop and the Neural Information Processing Systems Conference in Montreal in December.
This 69th episode of Learning Machines 101 provides a short overview of the 2017 Neural Information Processing Systems conference, with a focus on the development of methods for teaching learning machines rather than simply training them on examples. In addition, a review of the book "Deep Learning" is provided. #nips2017
In this episode, we briefly review Item Response Theory and Bayesian Network Theory methods for the assessment and optimization of student learning, and then describe a poster presented on the first day of the Neural Information Processing Systems conference in December 2015 in Montreal introducing "Deep Knowledge Tracing", a Recurrent Neural Network approach to the assessment and optimization of student learning. For more details check out: www.learningmachines101.com and follow us on Twitter at: @lm101talk
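The Deep Knowledge Tracing idea mentioned above can be pictured with a short sketch: an RNN reads a student's sequence of (exercise, correct/incorrect) interactions and predicts the probability of answering each exercise correctly next. This is a minimal reconstruction of the published idea (Piech et al., NeurIPS 2015), not the poster's actual code; the skill count, hidden size, and toy input are assumptions.

```python
# Minimal Deep-Knowledge-Tracing-style model. Sizes and data are toy.
import torch
import torch.nn as nn

N_SKILLS = 50  # number of distinct exercises/skills (assumed)

class DKT(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        # Input: one-hot over (skill id x correctness), hence 2 * N_SKILLS dims.
        self.rnn = nn.LSTM(2 * N_SKILLS, hidden, batch_first=True)
        self.out = nn.Linear(hidden, N_SKILLS)  # per-skill prediction

    def forward(self, interactions):  # (batch, time, 2 * N_SKILLS)
        h, _ = self.rnn(interactions)
        return torch.sigmoid(self.out(h))  # P(correct) per skill at each step

# Encode one interaction: skill 3 answered correctly (offset by N_SKILLS).
x = torch.zeros(1, 1, 2 * N_SKILLS)
x[0, 0, 3 + N_SKILLS] = 1.0
print(DKT()(x).shape)  # torch.Size([1, 1, 50])
```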
This is the third of a short subsequence of podcasts providing a summary of events associated with Dr. Golden's recent visit to the 2015 Neural Information Processing Systems Conference. This is one of the top conferences in the field of Machine Learning. This episode reviews and discusses topics associated with the Introduction to Reinforcement Learning with Function Approximation Tutorial presented by Professor Richard Sutton on the first day of the conference. Check out: www.learningmachines101.com to learn more!! Also follow us at "lm101talk" on Twitter!
This is the second of a short subsequence of podcasts providing a summary of events associated with Dr. Golden's recent visit to the 2015 Neural Information Processing Systems Conference. This is one of the top conferences in the field of Machine Learning. This episode reviews and discusses topics associated with the Monte Carlo Markov Chain (MCMC) Inference Methods Tutorial held on the first day of the conference. Check out: www.learningmachines101.com to listen to or download this podcast episode or the transcripts! Also visit us on LinkedIn or Twitter. The Twitter handle is: LM101TALK
This is the first of a short subsequence of podcasts which provides a summary of events associated with Dr. Golden's recent visit to the 2015 Neural Information Processing Systems Conference. This is one of the top conferences in the field of Machine Learning. This episode introduces the Neural Information Processing Systems Conference and reviews the content of the Morning Deep Learning Tutorial which took place on the first day of the conference. Check out: www.learningmachines101.com for additional supplementary hyperlinks to the conference and conference papers!!
In the first episode of Talking Machines we meet our hosts, Katherine Gorman (nerd, journalist) and Ryan Adams (nerd, Harvard computer science professor), and explore some of the interviews you'll be able to hear this season. Today we hear some short clips on big issues, and we'll get technical, but today is all about introductions. We start with Kevin Murphy of Google talking about his textbook that has become a standard in the field. Then we turn to Hanna Wallach of Microsoft Research NYC and UMass Amherst and hear about the founding of WiML (Women in Machine Learning). Next we discuss academia's relationship with business with Max Welling from the University of Amsterdam, program co-chair of the 2013 NIPS conference (Neural Information Processing Systems). Finally, we sit down with three pillars of the field, Yann LeCun, Yoshua Bengio, and Geoff Hinton, to hear about where the field has been and where it might be headed.
SAMOS - Colloquium "Statistiques pour le traitement de l'image" (Lectures, 2009)
In the first part we are interested in finding images of people on the web, and more specifically within large databases of captioned news images. It has recently been shown that visual analysis of the faces in images returned by a text-based query over captions can significantly improve search results. The underlying idea is that although this initial text-based result is imperfect, it will render the queried person relatively frequent compared to other people, so we can search for a large group of highly similar faces. The performance of such methods depends strongly on this assumption: for people whose face appears in less than about 40% of the initial text-based result, the performance may be very poor. I will present a method to improve search results by exploiting faces of other people that co-occur frequently with the queried person. We refer to this process as "query expansion". In the face analysis we use the query expansion to provide a query-specific relevant set of "negative" examples which should be separated from the potentially positive examples in the text-based result set. We apply this idea to a recently proposed method which filters the initial result set using a Gaussian mixture model, and apply the same idea using a logistic discriminant model. We evaluate the methods on a database of captioned news stories from Yahoo! News. The results show that (i) query expansion improves both methods, (ii) our discriminative models outperform the generative ones, and (iii) our best results surpass the state-of-the-art results by 10% precision on average.

In the second part we are interested in Conditional Random Fields (CRFs), which are an effective tool for a variety of data segmentation and labelling tasks, including visual scene interpretation, which seeks to partition images into their constituent semantic-level regions and assign appropriate class labels to each region. For accurate labelling it is important to capture the global context of the image as well as local information. First, we introduce a CRF-based scene labelling model that incorporates both local features and features aggregated over the whole image or large sections of it. Second, traditional CRF learning requires fully labelled datasets, and complete labellings are typically costly and troublesome to produce. We introduce an algorithm that allows CRF models to be learned from datasets where a substantial fraction of the nodes are unlabelled. It works by marginalizing out the unknown labels so that the log-likelihood of the known ones can be maximized by gradient ascent. Loopy Belief Propagation is used to approximate the marginals needed for the gradient and log-likelihood calculations, and the Bethe free-energy approximation to the log-likelihood is monitored to control the step size. Our experimental results show that incorporating top-down aggregate features significantly improves the segmentations and that effective models can be learned from fragmentary labellings. The resulting methods give scene segmentation results comparable to the state-of-the-art on three different image databases.

References:
T. Mensink & J. Verbeek, Improving People Search Using Query Expansions: How Friends Help To Find People, European Conference on Computer Vision, 2008.
J. Verbeek & B. Triggs, Scene Segmentation with CRFs Learned from Partially Labeled Images, Advances in Neural Information Processing Systems, 2007.

Cordelia Schmid & Jakob Verbeek. INRIA Rhône-Alpes.
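As a rough sketch of the partially-labelled learning objective described above (the notation is ours, not necessarily the speakers'), the model maximizes the likelihood of the known labels with the unknown ones marginalized out:

```latex
% Partially-labelled CRF objective (notation ours): x is the image,
% y_L the observed region labels, y_U the missing ones.
\begin{equation}
  \ell(\theta) \;=\; \log p_\theta(y_L \mid x)
             \;=\; \log \sum_{y_U} p_\theta(y_L, y_U \mid x)
\end{equation}
% The gradient of \ell requires marginals over y_U, which Loopy Belief
% Propagation approximates; the Bethe free energy approximates \ell itself
% and is monitored to control the gradient-ascent step size.
```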
You can listen to the talk while viewing the PowerPoint slides at this link: http://epn.univ-paris1.fr/modules/ufr27statim/UFR27STATIM-20090123-Verbeek/UFR27STATIM-20090123-Verbeek.html. Listen to the talk: audio available in mp3 format. Duration: 1h01.