Uncovering Hidden Risks


Welcome to Uncovering Hidden Risks, a podcast series focused on identifying the risks organizations face as they navigate the internal and external requirements they must comply with. We'll take you on a journey through insider risk to uncover some of the hidden security threats that Microsoft and organizations across the world are facing, and we'll surface best-in-class technology and processes to help you protect your organization and employees from risks posed by trusted insiders. All in an open discussion with top-notch industry experts!

Raman Kalyan, Talhah Mir


    • LATEST EPISODE: May 26, 2021
    • NEW EPISODES: monthly
    • AVG DURATION: 26m
    • EPISODES: 9



    Latest episodes from Uncovering Hidden Risks

    Episode 8: Class is in session

    May 26, 2021 · 32:01


    When Professor Kathleen Carley of Carnegie Mellon University agreed to talk with us about network analysis and its impact on insider risks, we scooched our chairs a little closer to our screens and leaned right in. In this episode of Uncovering Hidden Risks, Liz Willets and Christophe Fiessinger get schooled by Professor Carley about the history of network analysis and how social and dynamic networks affect the way that people interact with each other, exchange information and even manage social discord.

    0:00 Welcome and recap of the previous episode
    1:30 Meet our guest: Kathleen Carley, Professor at Carnegie Mellon University; Director of Computational Analysis of Social and Organizational Systems; and Director of IDeaS for Informed Democracy and Social Cybersecurity
    3:00 Setting the story: understanding network analysis and its impact on company silos, insider threats, counterterrorism and social media
    5:00 The science of social networks: how formal and informal relationships contribute to the spread of information and insider risks
    7:00 The influence of dynamic networks: how locations, people and beliefs impact behavior and shape predictive analytics
    13:30 Feelings vs. facts: using sentiment analysis to identify positive or negative sentiment in text
    19:41 Calming the crowd: how social networks and secondary actors can stave off social unrest
    22:00 Building a sentiment model from scratch: understanding the challenges and ethics of identifying offensive language and insider threats
    26:00 Getting granular: how to differentiate between more subtle sentiments such as anger, disgust and disappointment
    28:15 Staying relevant: the challenge of building training sets and ML models that stay current with social and language trends

    Liz Willets: Well, hi, everyone. Uh, welcome back to our podcast series Uncovering Hidden Risks, um, our podcast where we uncover insights from the latest trends, um, in the news and in research through conversations with some of the experts in the insider risk space. Um, so, my name's Liz Willets, and I'm here with my cohost, Christophe Fiessinger, to just discuss and deep dive on some interesting topics.             Um, so, Christophe, can you believe we're already on Episode 3? (laughs) Christophe Fiessinger: No, and so much to talk about, and I'm just super excited about this episode today and, and our guest. Liz Willets: Awesome. Yeah, no. I'm super excited. Um, quickly though, let's recap last week. Um, you know, we spoke with Christian Rudnick. He's from our Data Science, um, and Research team at Microsoft and really got his perspective, uh, a little bit more on the machine learning side of things. Um, so, you know, we talked about all the various signals, languages, um, content types, whether that's image or text, that we're really using ML to intelligently detect inappropriate communications. You know, we talked about how the keyword and lexicon approach just won't cut it, um, and, and kind of the value of machine learning there. Um, and then, ultimately, you know, just how to get a signal out of all of the noise, um, so super interesting, um, topic.             And I think today, we're gonna kind of change gears a bit. I'm really excited to have Kathleen Carley here. Uh, she's a professor across many disciplines at Carnegie Mellon University, um, you know, with research focused on network analysis and computational social theory. Um, so, so, welcome, uh, Kathleen. 
Uh, we're super excited to have you here and, and would love to just hear a little bit about your background and really how you got into this space. Professor Kathleen Carley: So, um, hello, Liz and Christophe, and I'm, I'm really thrilled to be here and excited to talk to you. So, I'm a professor at Carnegie Mellon, and I'm also the director there of two different, uh, centers. One is Computational Analysis of Social and Organizational Systems, which is, you know, it brings computer science and social science together to look at everything from terrorism to insider threat to how to design your next organization. And then, I'm also the director of a new center that we just set up called IDeaS for Informed Democracy and Social Cybersecurity, which is all about disinformation, uh, hate speech, and extremism online. Liz Willets: Wow. Professor Kathleen Carley: Awesome. Liz Willets: Sounds like you're (laughs) definitely gonna run the gamut over there (laughs) at, uh, CMU. Um, that's great to hear and definitely would love, um, especially for the listeners and even for my own edification to kinda double-click on that network analysis piece, um, and l- learn a little bit more about what that is and kind of how it's developed over the past, um, couple years. Professor Kathleen Carley: So, network analysis is the scientific field that actually started before World War II, and it's all about connecting things. And it's the idea that when you have a set of things, the way they're connected both constrains and enables them and makes different things possible.             The field first started it was called social networks. This is long before social media. And, um, people were doing things like watching kindergartners play with each other, and they realized that the way which kids played with which, which kids bashed each other over the head with the, their sand shovel was really informative at effect at telling how they would actually do in the various kind of studies they needed to do. The same kind of thing was applied to hermit crabs and to deers and other kinds of animals to identify pecking orders, and, from those groups, and identify which animals had the best survival rate.             Today, of course, the field's grown up a lot, and we now, uh, talk about kind of networks+. So, we apply network science to everything from, you know, how your company ... Where are the silos in your company? Who should be talking to 'em? We also apply to things like insider threat and look at it there to say, "Ah, well, maybe these two people should be talking, but they're not. That's a potential problem," a, and we apply to things like counterterrorism. We apply it to social media and so on. So, people now look at really large networks and very what are called high-dimensional or meta networks such as, who's talking to whom, who's talking about what, and how those ideas are connected to each other. Liz Willets: Awesome. Yeah, I think, I know Christophe and I, we're very interested around that space and thinking about who should be talking to one another, um, you know, as we think about communication risks in an organization, especially in the (laughs) financial services industry. You've got things, um, that, you know, you're mandated by law to, um, kind of detect for like collusion between two parties whether it's your sales and trading group who just should not be, um, communicating with one another. So, I think that certainly applies, um, to your point earlier around the insider threat space. 
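To make the "who's talking to whom" idea above a little more concrete, here is a minimal Python sketch using the networkx library: a communication graph plus one extra mode of data (team membership) can surface both silos, teammates who never talk, and links that policy says should not exist, such as sales and trading. Every name, team and rule below is invented purely for illustration; this is not the tooling discussed in the episode.

    # Hypothetical sketch: a "who talks to whom" network with one extra mode (teams).
    import networkx as nx

    # Observed communication: one edge per (sender, recipient) pair.
    comms = [("alice", "bob"), ("bob", "carol"), ("dave", "erin"), ("alice", "dave")]
    G = nx.Graph()
    G.add_edges_from(comms)

    # A second mode of data: which team each person belongs to (invented).
    team = {"alice": "trading", "bob": "trading", "carol": "sales",
            "dave": "research", "erin": "research"}

    # Silo check: "maybe these two people should be talking, but they're not."
    for a in G.nodes:
        for b in G.nodes:
            if a < b and team[a] == team[b] and not G.has_edge(a, b):
                print(f"possible silo: {a} and {b} share a team but never communicate")

    # Risk check: communication across a barrier that policy says should not exist.
    forbidden = {("sales", "trading")}
    for a, b in G.edges:
        if tuple(sorted((team[a], team[b]))) in forbidden:
            print(f"possible collusion risk: {a} ({team[a]}) <-> {b} ({team[b]})")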
Professor Kathleen Carley: Well, one of the great things in, in, uh, using social networks, especially depending what data you have access to, you may be able to find informal linkages. So, not just who's, uh, formally connected because they're like in an authority relationship, like you report to your boss, but, you know, who you're friends with or who you go to lunch with or, you know, and all these kind of informal relationships. And we often find that those are as or more important for affecting, you know, house, how information goes through a group, how information gets traded, and even for such things as promotion and your health. Christophe Fiessinger: And to not only to, to add to, uh, what you were saying, Kathleen, is like the context is usually important to make an informed decisions of what's going on in that network. Professor Kathleen Carley: And then, cer- Christophe Fiessinger: Isn't that what you think about it? Professor Kathleen Carley: Yeah, certainly. In fact, the context is very important, and it's also important to realize that one context doesn't, um, capture all of s- somebody's interactions, right? So, for example, when Twitter started, people were trying to predict elections from, uh, interactions on Twitter among people. Well, the problem was not only was not everybody on Twitter, so you didn't have a full social network, not all communication even with people who were on Twitter, that's not the only way they communiticated with each other. They might have also gone to the bars together or, or whatever. Liz Willets: Um, I was actually kinda reading through some of your research (laughs) as I was prepping for this interview and, um, read, um, some of your research around the difference between social network analysis and dynamic network analysis. And so, as you think about, kind of as we're talking, contexts and, you know, it's not just maybe the social connections, but it's adding in now the organization or the location or someone's beliefs. Um, I'd love if you could just kind of, you know, double-click there for us and tell us a little bit more about that. Professor Kathleen Carley: Yeah. So, when, um, when the field started, right, people were really dealing with fairly small groups. And so, it was not unusual to say go into a small, like, startup company, and you would have maybe 20, 25 people. Um, for each one of 'em, you would know who was friends with who and who went to 'em for advice, and that was your data set, right? It was all people, and it was all just one or two types of links. Technically, we call that one-mode data 'cause there's only one type of node, and there's two types of links. So, it's t- ... It's multiplex and one mode.             Um, but now what's happened, as the field has gotten grown up in some sense, uh, we're dealing with much larger data sets, and you happen to have multiple modes of data. So, you'll have things like people, organizations, locations, beliefs, resources, tasks, et cetera, and when you have all of that, you have multiple modes of data. And in fact, this is great because you need multiple modes of data to be able to do things like do predictive analytics, but in addition, you have way ... And you have lots of different kinds of ties. So, I not only have ties between people, I have ties of people to these things like what resources they have, what knowledge they have, and so on. So, it's called by bipartite data.             
But then, I also have the connections among those things themselves, like words to words, and because you have all of that high-dimensional data and you have it through time, you now have a kind of a dynamic, high-dimensional networks. And so, the big difference here is that you've got more data, more kinds of data, and you've got it dynamically. And we even talk about it sometimes as geospatial because sometimes, you even have locations and you have to take into account, uh, both the distance physically as well as the distance socially. Christophe Fiessinger: Interesting. And Kathleen, I, I, I can't resist- Professor Kathleen Carley: Mm-hmm (affirmative). Christophe Fiessinger: I mean, I got kids and, and, uh, uh, I'm originally from Europe, and the way my k- kids interact with their family non-members, grandmothers in Europe is obviously very different than how I did it when I was growing up. So, to your point on all those dimensions is you also see a difference where a person might talk one way on a channel or, uh, an app and talk another way in another app, and then layer that, you know, I would talk differently on a PC where I get a full form. I can be very verbalist in my email or whatever versus my phone wherever I'm located. Are you seeing some of those patterns as well influence? Professor Kathleen Carley: Absolutely. Yeah, and then they're ... Yeah. And you, you've probably even seen these in your own work lives because, for example, you'll communicate one way on LinkedIn. You'll communicate a different way on Facebook, a different way on Twitter, and a different way in person. So, it also matters what media you're on, and it also matters whether or what kind of others you surround yourself with. I mean, I know people who use different variants of their names on- Christophe Fiessinger: Mm-hmm (affirmative). Professor Kathleen Carley: ... different platforms to signal to themselves, "Oh, when I'm on this one, I don't talk about money," or, "When I'm on this one, I don't talk politics," you know? And so, people not only change how they talk, they change what they talk about, and they change who they talk to. Christophe Fiessinger: Yeah. And I think the personas as well. I've seen my younger one who plays, uh, who does a lot of gaming. Professor Kathleen Carley: Yep. Christophe Fiessinger: Typically, they have their own persona, and, and then obviously, there's a different realm then of, of, of a different network, but they even put a different hat going into that mode of, of talking in the context of a game. Professor Kathleen Carley: Well, and for there, it's just doing a game, right? But what we're actually seeing on social media is, you know, you do see adversarial actors- Christophe Fiessinger: Uh-huh (affirmative). Professor Kathleen Carley: ... under fake personas doing things like trying to do fishing expeditions or trying, you know, trying to convince you that they're just one of the other people in the neighborhood- Christophe Fiessinger: Yeah. Professor Kathleen Carley: ... and they really aren't, you know, and try, and trying to suck you into things. Christophe Fiessinger: Yeah. Professor Kathleen Carley: So, we see a lot of that as well. Christophe Fiessinger: Yeah. Liz Willets: Grooming. Christophe Fiessinger: I guess grooming is also not a new problem but also something that, that's present in those communities or anywhere. Professor Kathleen Carley: Yeah. 
Liz Willets: Definitely, and I think what we've seen especially with the pandemic is, yes, you might have these different personas, um, but now, like your, your home is become your workplace. And so, how you might have typically behaved, um, you know, when you'd come home at the end of a long day versus now, you're in the context of work. Um, you know, I think we've seen a lot of organizations think about the risks that, that that could pose, um, in addition to all the other, um, you know, (laughs) stresses that people have on their day-to-day lives.             Um, but I think it's interesting, um, to your point earlier around, you know, having all the context. Um, you know, we're seeing signals come through from Teams, email, Zoom, uh, you know, social media, et cetera, and, uh, um, also detecting for things like repeated bullying, um, behavior. And so, it's not just, uh, a way f- to your point and around using the analytics to predict something, but it's also to say, "Hey, this is a pattern, and, uh, you know, we should probably step in and do something about it." Professor Kathleen Carley: Yeah, absolutely. And I think people are becoming more aware of these patterns themselves because they're actually not just seeing their own communication. They're actually seeing their kids' communication or their parents communication or whatever. And so, they're starting to realize that the people around them may be comm- communicating in ways that impacts them, and so there's a variety of now new technologies that people are talking about trying to develop to try to help people manage this more collectively. Liz Willets: Definitely. And I think, um, you know, another area that I'd love to explore with you is just around sentiment analysis. So, you know, you have all these signals, but, um, how do you know if someone's talking about something positively or negatively, um, and g- kind of would love to kind of hear if you've done any research in that spaces? Professor Kathleen Carley: Oh, yes, we ... Yeah. I and my group, of course, we do a lot of work on sentiment. So, um, so, sentiment is one of those really tricky things when you're, uh, when you're not there because it depends on how many different modalities you have. Like, if you only have text, it's harder to detect than if you have text plus images, which is still harder than if you also have sound. So, the ... So, it's kind of tricky, and there's new techniques for all of those.             But let's just think about text for the moment. The way people often de- try to detect sentiment and then where they started out was just by, um, counting the number of positive versus negative words. Okay? And that's kinda okay, but it more tells you about overall, was the message kind of written from an upbeat or a downbeat kind of way. That's really all it really tells you, but people thought that that meant that if there was a something they cared about, like let's say I wanna know if it's about vaccines and are they happy about the vaccines or upset. Well, they would just say, "Here's a message. It has the word vaccine in it. Oh, there's more happy words than sad words, so it must be positive toward vaccines." No. Not even close.             Because locally, it coulda been, "I'm so happy I don't have to take the vaccine." That woulda come out as overall positive, but it's really negative about the vaccine. So, then, the people came up with loads. So, then, we work on locals then, but how do I tell for a particular word?             
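As a concrete illustration of the word-counting approach Professor Carley describes, here is a tiny Python sketch of a lexicon-based scorer and how it misreads the vaccine example: the message reads as upbeat overall even though the stance toward the vaccine is negative. The two word lists are toy lexicons, not a real sentiment dictionary.

    # Toy lexicon-based sentiment: counts positive vs. negative words, nothing more.
    POSITIVE = {"happy", "glad", "great", "love"}
    NEGATIVE = {"sad", "angry", "hate", "awful"}

    def naive_sentiment(text: str) -> str:
        words = text.lower().replace(".", "").replace("'", " ").split()
        score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
        return "positive" if score > 0 else "negative" if score < 0 else "neutral"

    msg = "I'm so happy I don't have to take the vaccine"
    # Prints "positive": the message is upbeat overall, yet it is negative about the vaccine.
    print(naive_sentiment(msg))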
But the thing is when I make a statement like that, that's out of context still because there could've been this whole dialogue discussion, right? And in the ... And when we actually then looked at, at, at these kind of sentences within the context of the discussion, over 50% of time, we had to change our mind about what the sentiment really was in that particular and what was really meant, you know?             And then, there's issue of sarcasm and humor, which we were terrible at detecting, right? Liz Willets: (laughs) Professor Kathleen Carley: And so, peep ... And one of the ways people start to detect that is by looking at what's written and then looking for an emoji or emoticon, and if it's at the opposite sentiment of the what's written, you go, "Ah, this must be a joke." Okay? Christophe Fiessinger: Or just sarcastic again. Professor Kathleen Carley: Yeah. So, it goes cra- ... It goes on and on from there, but there's a couple of a ... There's ... That's kind of the classic line. And now, of course, we do all that with machine learning as opposed to just based on just keywords.             But there's two other things that are in the sentiment field that people often forget about. One is, um, these subconscious almost supplemental cues that are in messages. So, when you write things and use images, your reader will pick up on things in it and it will cause them to respond and with particular emotional reactions.             So, for example, you've probably gotten an email or a text from someone where it was in all caps, and your, and your initial response is, "Oh, my gosh. They must be mad at me," right? Or, "What did I do wrong now?" It's like, "Oh, okay." But that's a subliminal cue, okay? It's like things like all caps, use of pronouns. There're special words that people use that will evoke emotions in others, so we look for these subliminal cues also.             And, uh, an emergent field is looking for these in images, like the use of light versus dark images, the use of cute little kitties, right? Christophe Fiessinger: Yeah. Professor Kathleen Carley: There's a whole bunch of things that people know now make them happy. And then, so, that's another aspect of it.             And then, the third aspect of it is that, um, sentiment is actually very tied to your social networks. Your emotional state is tied to your social networks. So, the more I can get you excited either really happy or really angry, the more I can change your social network patterns. So, we can actually look at for our detections in changes in social network patterns as a way of figuring out something about sentiment as well. Liz Willets: Interesting. So, are you saying essentially that through your social networks, it kind of like reinforces or, or strengthen, strengthens your connections with that group that you're identifying yourself with? Professor Kathleen Carley: So, I'm saying that, well, it does. It's kind of a cycle because your mind likes to, um, maintain balance, okay? It likes to be emotionally balanced. You don't ... You really don't like to be overly excited in any direction, right? Most people don't. And so, if something's making you very uncomfortable, you will either ... If it, like, your connection with someone's making you, uh, very uncomfortable, you will either change your opinion to be more like theirs so you're less more comfortable, or you will drop your connection with that person. 
So, your affect of your emotional state modulates your social networks, and your social networks that affect what information and emotions come to you and modulate what emotions you have. So, it's kind of this cycle. Christophe Fiessinger: Then- Professor Kathleen Carley: And so, we actually can watch this happening in groups where I can form them into ... I can prime groups to be ready to be emotionally triggered simply by building up social network connections among them. And then, I can emotionally trigger them, and the people in them will either get more involved in the group or they'll say, "I'm not really feeling comfortable anymore. I'm gonna leave." Christophe Fiessinger: Mm-hmm (affirmative). I'm sure you've got a trove of data to research with COVID or with recent election in the U.S. that would- Liz Willets: (laughs) Christophe Fiessinger: ... that would prove those theories of the relationship between your social network and h- your, your sentiment, right? Professor Kathleen Carley: Yes. Yeah. Yeah. Christophe Fiessinger: Well, actually, going back, tying this to, um, to what you were mentioning earlier, Kathleen, like, sometimes, we say that the conversation at the edges are, are the one, um, are the highest risk one, and the ones that are happening on the fringes and, you know ... A- And then, you add to that like something you mentioned earlier which is a, and also looking at how you, how you are potentially detecting like social unrest and things like that. And, and because those are like at the fringes, it might start very small in a network with very few people, but it could definitely have a network effect very quickly. How do you find those needles that that did, didn't exist before the, a theory, a pattern, an opinion? Professor Kathleen Carley: So, the short answer is it's really hard, and we're not good at it yet. (laughing) Christophe Fiessinger: Okay. Professor Kathleen Carley: Um, but there's a couple of techniques that first off, sometimes, you find 'em by luck. You just happened on 'em. Sometimes, you find 'em just through, um, good journalistic forensics, um, and sometime, but sometimes, we can aid and help that a bit by actually looking for, um, critical secondary actors. Christophe Fiessinger: Sure. Professor Kathleen Carley: And these are like there's these kinda network metrics for finding these kinda critical secondary actors, and we look for those because those are the kind of actors that could emerge into leaders of these kinds of things. So, they're kind of ... It's not quite anomaly detection, but it's kind of like anomaly detection for networks. Christophe Fiessinger: Oh. Is it kind of like that secondary actor is potentially a placebo that could flip and you're trying to either a, a change compared to that, that baseline? Professor Kathleen Carley: I think that's probably the wrong, the wrong model of it. Christophe Fiessinger: Okay. Professor Kathleen Carley: Like, uh, a s- secondary actor is often someone who does things like brokerage relationships between two big actors, okay? Christophe Fiessinger: Ah, okay. Professor Kathleen Carley: Yeah. Christophe Fiessinger: So, that person would be potentially more of a f- will ... Pride, whatever, would be a fire starter and will accelerate on that. Professor Kathleen Carley: Yeah. Exactly. Christophe Fiessinger: Two people having a point of view to suddenly a wildfire is spreading out across the entire network. Professor Kathleen Carley: Exactly. Yeah. Christophe Fiessinger: Okay, I get it. Thanks. 
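Professor Carley's "critical secondary actors" can be made a little more concrete with a standard network metric. In the Python sketch below, a broker who connects two otherwise separate clusters stands out on betweenness centrality even though they have only two direct ties. The toy graph is invented, and this is only one illustration of the general idea, not the specific metrics her group uses.

    # Hypothetical sketch: brokers between clusters score high on betweenness centrality.
    import networkx as nx

    G = nx.Graph()
    # Two dense clusters...
    G.add_edges_from([("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
                      ("b1", "b2"), ("b2", "b3"), ("b1", "b3")])
    # ...connected only through a single broker.
    G.add_edges_from([("a3", "broker"), ("broker", "b1")])

    scores = nx.betweenness_centrality(G)
    for node, score in sorted(scores.items(), key=lambda kv: -kv[1])[:3]:
        print(f"{node}: {score:.2f}")
    # The broker tops the list: a candidate secondary actor worth watching, since
    # anything moving between the two groups has to pass through them.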
Liz Willets: Yeah, but back to your point around some of the challenges with the, for example, detecting sarcasm, and is it an emoji? Um, would love to hear your thoughts on just some of the other challenges more generally if you're thinking about building, uh, a, a sentiment model from scratch, um, whether it's for, you know, threats or offensive language, um, or things like burnout and suicide. Um, how do you go about doing that, and how do you go to do about doing that in an ethical, um, manner? Professor Kathleen Carley: Okay. So, um, so, one of the challenges is culture and language because the way we express sentiment vary, differs, even though there's like basic emotions that are, that are built in cognitively in our brain. The way we express those is socially, culturally defined. Christophe Fiessinger: Mm-hmm (affirmative). Professor Kathleen Carley: So, one of the big issues is making sure you understand the culture and the language that's associated with it. So, that's part of it.             The second, a second, uh, critical thing is the fact that, um, when people express themselves, when you're using, and if you're mainly using online data, um, people can go silent, in which case, you don't have any data. Your data could just be a sample. They could choose to enact one of their personas and be lying. Christophe Fiessinger: Yeah. Professor Kathleen Carley: So, there's lots of ways in which your data- Christophe Fiessinger: Mm-hmm (affirmative). Professor Kathleen Carley: ... itself could be wrong, okay? And that's another big challenge in the area. So, those, I would say, are, uh, so those are examples of some of the challenges in addition to having to have the whole discussion and having to, you know, be careful what you're looking at sentiment around and so on.             So, from an ethical perspective, um, I would say that part of this is, is that when you're collecting data and trying to analyze it and create, like a model for one of these issues, one of the biggest chall- one of the biggest issues is making sure that you haven't over focused on a certain class of people, like only focused on young white guys or only focused on, you know, um, agent, uh, Hispanic women. You wanna make sure that you're as k- much as possible balanced across the different kinds of publics you want to serve. So, that's, that's part and the ... That's one of the challenge, or one of the kind of ethical guidelines and challenges at the same time.             Um, the other part is if you were actually going to, to intervene, then you'd need to think about intervention from a, you know, what does the community consider appropriate ethically within that community for the way you intervene? And the answer may be very different if you're talking about, you know, intervening with children versus intervening with, uh, young adults versus intervening with people with autism. So, so you need to look at it more from a community perspective. So, those are two I would raise. Liz Willets: That's fine. Yeah. I think, um, you know, especially at Microsoft, we are committed to having unbiased, um, training data so that we aren't, you know, discriminating by against someone because they have these, um, certain characteristics, um, and definitely keep that top of mind, um, as well as, you know, remediation and, and how do you go about now that you've identified that this person is at risk for whatever, uh, reason? 
Now, how do you reach out to them and give them the support they might need, or how do you alert, um, you know, someone who, who might need to step in? And so, I think that's been, um, a really interesting challenge that we're digging into on our end as well.             Um, and I think to the first piece you were talking about just more generally the challenges, um, I know you've done some research around control theory, um, and would love to get your perspective on, you know, especially, uh, in some of these more granular sentiments. Like, how do you differentiate between anger, disgust, disappointment, um, and, and really, um, kind of define exactly what you're looking for in the communications to pull that out? Professor Kathleen Carley: Yeah. So, um, basically, we, we start with what are thought to be the basic emotions, the ones that are built in cognitively? So, and we would take those ones, and those, you can distinguish fairly reasonably on the basis of the cues I was talking about, and they're kind of big swaths of things. Of course, most of the basic emotions are ones that are kind of more on the negative side, so it's really more on the positive side, discriminating, you know, happy from ecstatic from mildly amused. That, it's much harder there 'cause that's, none of those are ba- basic, just happiness is basic, right? All the others are variations of happiness.             So, we start with the basic of the emotion and try to discriminate into those categories, and to go further than that, we often find we don't need to. If we need to, um, then really, it's because the context demands that you have to pay attention to a parti- ... So, you're looking for something particular in a particular environment. And so, then, we let the context dictate what the difference is that it's interested in.             Um, so, for example, if I, if I was doing this for Disney for, you know, people's response to a new ride, for example, that context would dictate that what I really wanna focus on is not just happiness but their satisfaction and pull that out. And so, then, I would actually develop my technology around that, around the, the different people who fell into the different categories, and I might do it first by getting survey data or something like that. Christophe Fiessinger: Yeah. Professor Kathleen Carley: But, you know, you said something that made me realize that I hadn't mentioned one of the major challenges- Christophe Fiessinger: That was good. Professor Kathleen Carley: ... that, um, people often overlook because we're so in love with machine learning, right? And we so think, "Training sets," right? Well, the trouble is, in a social space, your training sets are yesterday's news. Christophe Fiessinger: Yeah. Professor Kathleen Carley: They're never up-to-date. They're always, they're always a mess, and a lot of things where you wanna use sentiment and wanna look at behavior of people, you don't have time to build a training set. So, this is an area where we really need new technologies like match functions and things like that, or where you can just get the bare minimum training set and then do some kind of leapfrogging on it. Christophe Fiessinger: Yeah. I think it- Liz Willets: Yeah. Christophe Fiessinger: I think this is to, to that point. I can relate to that. I think the ... 
And also, what you were s- saying early on the key part where you look at demographics or what is that target audience with that pattern you're trying to detect is even that let's say that sp- specific demographics, you did a good job on day zero. We know language is this constant evolving function, and just because, to your point, you know, it was yesterday's data set. Just because you would put ball on sweats to do a white paper to detect blah for those demographics. Professor Kathleen Carley: Yep. Christophe Fiessinger: That was great at that point in time, but I'm sure it already changed rapidly because of, of the today's availability of social network and things like that, you know. My, when I was visiting Europe, my, my nephew and niece speak English from what they've seen on YouTube and Netflix. Professor Kathleen Carley: Yeah. Christophe Fiessinger: So, just it'll almost feel like language is even moving faster with that, uh, availability of, of that, all those tools worldwide that's making researchers I'm sure his job even harder to stay up-to-date. Professor Kathleen Carley: Absolutely. (laughs) Yeah. The level of new jargon and new phrases out, it's crazy. Liz Willets: (laughs) Christophe Fiessinger: Yeah. Liz Willets: And that's not just in English, too, you know? Professor Kathleen Carley: That's right. That's right. Liz Willets: We were talking last week with Christian around languages (laughs) and, you know, how many languages there are in the world and how you have to kind of build your models to be trained to kind of reason over, uh, you know, diable, double-byte characters and, um, you know- Professor Kathleen Carley: Yep. Liz Willets: ... Japanese and, and Chinese characters. And so, it just (laughs) it's never ending. Christophe Fiessinger: Yeah. Professor Kathleen Carley: And sometimes, the fact that you have the multiple character sets and multiple languages can be diagnostic, right? So, for, like, when we look at, um, response to say natural disasters in various areas, typically, people, when they communicate online, will communicate in one language with others in the same language. And there'll be a few people who will communicate in multiple languages, but they'll have different groups like, "Here's my English group. Here's my Spanish group." Okay?             But during a disaster, you'll see, actually see more messages come out where you've got mixed part English, part Spanish in the same message- Christophe Fiessinger: Mm-hmm (affirmative). Professor Kathleen Carley: ... and, and so it can be diagnostic of, "Oh, this is a bilingual community," for example. Liz Willets: Interesting. Christophe Fiessinger: Interesting. Liz Willets: Well, great. I know, um, Kathleen, I have certainly learned a lot and wanna thank you again for, for joining us today. Um, Christophe, I thought that was a great conversation. Christophe Fiessinger: Yeah. I, after that, I wish I was a student, and I could join, uh, CMU and be one of your students and write a PhD. It sounds like a infinite number of fascinating topics so and, and research topics, so it sounds- Professor Kathleen Carley: Well- Christophe Fiessinger: ... very fascinating.

    Episode 7: Say what you mean!

    May 26, 2021 · 28:53


    "Oh my gosh." "Oh my gosh, I'm dying." "That's so funny!" In just three short lines our emotions boomeranged from intrigue, to panic, to intrigue again… and that illustrates the all-important concept of context! In this episode of Uncovering Hidden Risks, Liz Willets and Christophe Fiessinger sit down with Senior Data Scientist Christian Rudnick to discuss how machine learning and sentiment analysis are helping to unearth the newest variants of insider risk across peer networks, pictures and even global languages.

    0:00 Welcome and recap of the previous episode
    1:25 Meet our guest: Christian Rudnick, Senior Data Scientist, Microsoft Data Science and Research Team
    2:00 Setting the story: unpacking machine learning, sentiment analysis and the evolution of each
    4:50 The canary in the coal mine: how machine learning detects unknown insider risks
    9:35 Establishing intent: creating a machine learning model that understands the sentiment and intent of words
    13:30 Steadying a moving target: how to improve your models and outcomes via feedback loops
    19:00 A picture is worth a thousand words: how to prevent users from bypassing risk detection via GIPHYs and memes
    23:30 Training for the future: the next big thing in machine learning, sentiment analysis and multi-language models

    Liz Willets: Hi everyone. Welcome back to our podcast series, Uncovering Hidden Risks, um, our podcast, where we cover insights from the latest in news and research through conversations with thought leaders in the insider risk space. My name is Liz Willets and I'm joined here today by my cohost Christophe Fiessinger, um, to discuss some really interesting topics in the insider risk space. Um, so Christophe, um, you know, I know we spoke last week with Raman Kalyan and Talhah Mir, um, our crew from the insider risk space, just around, you know, insider risks that pose a threat to organizations, um, you know, all the various platforms, um, that bring in signals and indicators, um, and really what corporations need to think about when triaging or remediating some of those risks in their workflow. So I don't know about you, but I thought that was a pretty fascinating conversation. Christophe Fiessinger: No, that was definitely top of mind and, and definitely an exciting topic to talk about that's rapidly evolving. So definitely something we're pretty passionate to talk about. Liz Willets: Awesome. And yeah, I, I know today I'm, I'm super excited, uh, about today's guest and just kind of uncovering, uh, more about insider risk from a machine learning and data science perspective. Um, so joining us is Christian Rudnick, uh, senior data scientist on our security, uh, compliance and identity research team. So Christian, welcome. Uh, why don't you- Christian Rudnick: Thank you. Liz Willets: ... uh, just tell us a little bit about yourself and how you came into your role at Microsoft? Christian Rudnick: Uh, yeah. Hey, I'm Christian. Uh, I work in a compliance research team, and, well, I just kinda slipped into it. Uh, we used to be the compliance research and email security team, and then email security moved to another team, so we were all moved to the compliance role. Uh, but at the end of the day, you know, it's just machine learning, so it's not much of a difference. Liz Willets: Awesome. And yeah, um, you know, I know machine learning and sentiment analysis are big topics to unpack. 
Um, why don't you just tell us a little bit, since you've worked so long in kinda the machine learning space, around, you know, how, how that has changed over the years, um, as well as some of the newer trends that you're seeing related to machine learning and sentiment analysis? Christian Rudnick: Yeah. In, in our space, the most significant progress that we've seen in the past years was us moving towards more complex models. More complex models and also a more complex way of analyzing the text. So if you look at the models that were very common about 10 years ago, they basically would just look at words, like, uh, a set of words. Uh, so the order of the words didn't matter at all, and that's changed. The modern algorithms, they will look at sentences as a sequence of words, and they will actually take the order of the words into account when they run analysis. The size of models has also increased dramatically over the years. So for example, I mentioned earlier that I've worked in email security; the models that we had shipped, they were often in the magnitude of kilobytes, versus the really modern techniques to analyze offensive language. They use deep neural nets, and the models, they can be sizes of several gigabytes. Christophe Fiessinger: What's driving that evolution of the models? Uh, you know, I'm assuming a, a big challenge, uh, or a big goal, is to make those models better and better to really reduce the noise and things like false positives or, or misses. Is that what's driving some of those things? Christian Rudnick: Yeah. So at the end of the day, you know, the model size translates into complexity. So you can think of it, um, the smaller models, basically they have very few levers on how to modify their decision. If you have a very large model, it will just have that many more levers. If you wanna capture the variation that you have in your data set, often you need a lot of these levers, and new models provide them. It's not just that, uh, there's one thing I didn't mention explicitly about the newer models... So traditionally, old models, they were trained on a relatively small set of data that's split into two parts, the positive set and the negative set. And basically the machine learning model was kinda trying to draw a boundary between them.             The more modern models actually work rather differently. Uh, we do something called pre-training, which means that we train a model on neutral data, which is neither positive nor negative, to just capture elements of language. So once the model is loaded up with a huge, huge amount of data, a huge amount of this neutral data, then we start feeding in positives and negatives to draw the boundary, but it can use all this information that it's gained from the general language to make that decision. Liz Willets: That's super interesting. Um, you know, when I think about technology and kind of leveraging, you know, the machine learning to get an early signal, um, you know, something like discovering a canary in a coal mine, um, you know, how do you go about, it sounds like we're feeding positives and negatives on top of neutral data, but how do you go about finding, like, the unknown unknowns and, um, you know, maybe identifying risks that you may or may not have been aware of previously, um, with these types of models? Christian Rudnick: It, it's, at the end of the day, it's the neutral data. So the way you can see it is that you feed it a few, say, positives, um, known positives. 
And that gives you an idea of where, you know, we know the possible attacks are, but then what's happening is it's using all this language it's learned from the neutral data to consider, like, okay, we have these data points, but everything that is semantically close to them is most likely also something that we wanna target. And, and that's really, that's really the recipe. I mean, the ML that we're using, it doesn't have magical capabilities. It can't really detect patterns that we haven't seen before. That's possible in other parts of the insider risk space, if you rely on anomaly detection. Um, so anomaly detection, in some sense, is a negative approach.             So now, in our approach, we have the positives, and that's our starting point, and from the positives we're trying to see how far we can generalize to, to, to get a wider scope. In, um, what I mentioned, in, uh- Christophe Fiessinger: Anomaly detection. Christian Rudnick: ... anomaly detection, thank you so much, Christophe. It, it's kind of the opposite. You're trying to learn from the negatives. You're trying to understand what the typical state of the, of the system is, and everything which deviates from it is an anomaly that you might wanna look into. So that has more ability to detect things which are completely unknown. Christophe Fiessinger: Yeah. Liz Willets: Love it. That's super interesting from both, both perspectives. Christophe Fiessinger: That's, uh, I think, just to step back and, and to make, um, the audience appreciate, um, the complexity: you know, a simple sentence. Like if I sent a, a Teams message to Liz and say, "I will hurt you." So first of all, there's no foul language. It's perfectly okay, obviously, but those words, that sentence targeted at someone else, could mean a potential, uh, threat- Christian Rudnick: Right. Christophe Fiessinger: ... um, or harassment. And so for the audience, the challenge here is not to detect every time the word, uh, "hurt," because "hurt" could be, uh, used in a perfectly acceptable context, but here, targeted at someone, uh, that set of words potentially could be a risk. And I think- Christian Rudnick: Right. Christophe Fiessinger: ... that's the, that's the journey you've been on, uh, as well as, uh, the rest of the research team. And that's where you can't just look at single words; you've got to look at a sentence, right, Christian? Christian Rudnick: Yes. That's exactly right. So older ML algorithms, they will just see the "I," the "will," and the "hurt" kind of independently, and then do a best guess based on the presence of any of these words. More modern algorithms, they will actually look at the sequence "I will hurt." They're perfectly capable of learning that the combination of these three words in that order is something that's relevant, versus if they come in a different order or, depending, uh, you know, in a different context, then it might not be. And let me pick up what Liz had mentioned earlier. So modern algorithms, if you train it on something like "I will hurt you" as a positive, it'll understand that there's a lot of words which are similar to "hurt," which kind of have the same meaning. So it will also pick up on something like "I will kill you," uh, "I will crush you," even though you haven't fed those into the positive set. Christophe Fiessinger: But that all falls into that kind of threat, which- Christian Rudnick: Yes. Christophe Fiessinger: ... 
stepping back, it's a risk as soon as someone starts using that language; maybe, maybe they actually mean those things and they're gonna escalate or transition to a physical threat. Christian Rudnick: That's a real possibility. Yes. Christophe Fiessinger: Okay. Liz Willets: Definitely. Yeah. I think it's interesting, 'cause I kinda feel like where you're headed with this is that you can't just use keywords to detect harassment. You know, it's kind of like thinking about overall sentiment, and, and tackling sentiment is not, um, you know, an easy thing to do; you know, looking at keywords won't cut it. Um, and would love to get your perspective, Christian, you know, from an intelligence and modeling view, around identifying that intent versus just the keyword level. Um, you know, how do you get a string of words together that might indicate that, uh, that someone's about to, you know, harm someone else? Christian Rudnick: Yeah. So first of all, you're right. Keywords by themselves, they're usually not sufficient to solve this problem. There are very narrow, very focused problems where keywords might get you a long way. Like, say, if you care just about prof- let's take the example of profanities. You care just about profanities. There's a lot of words that you can put in a keyword filter, and it's gonna do a fine job. This classifier is actually gonna do quite well. You're gonna start seeing borderline cases where it's gonna fail. So, you know, there, there are some words that are profanities in one context, but are perfectly normal words in another context. Um, I mean, I don't wanna use profanities, but most of you might know that a donkey has a synonym which actually is a swear word.             So if you include it in your list, then obviously you will hit on the message every time, even when someone actually means to use the word in the "donkey" sense. But for profanity, keywords can get you a long way. If you look at things like threats, it's pretty much what Christophe said earlier. Um, all three words, "I will hurt," uh, four words, "I will hurt you." Each of those words will appear most of the time in a perfectly, uh, normal context where no harassment or no threat is present. Christophe Fiessinger: Right. Christian Rudnick: So you can't put any of those into your keyword list. You can say, okay, I can evolve my model from a keyword list to a key phrase list. You can say, uh, I will actually take small phrases and put them into my list. So instead of just "will," or just "hurt," you will put in "I will hurt you" and "I will kill you." But now the problem is that, you know, there's a lot of different ways in which you can combine seemingly normal words into a threat. And it is incredibly hard to enumerate all of them. And even if you were to enumerate all of them, you know, language evolves; it might be something that is good today, but in maybe half a year, your list will, you know, no longer be up to date. If you have ML models, this problem gets solved in a very convenient way.             So first of all, the model by default kinda understands variations of language due to this pre-training, so it'll already capture a lot of variations that correspond to one of your input examples. And second of all, it's relatively easy to retrain these models based on new information that's coming in. So if you install, like, say, a feedback loop, you give customers the possibility of saying, okay, hey, look, this is another example, uh, that I've found that I would like to target. 
It can very easily be incorporated into the model and then not only catch this, but a lot of additional variations of the new stuff that came up. Christophe Fiessinger: Yeah, I think, yeah, I think, uh, I think what's important here is this is not static, it's a moving target, because, like you say, Christian, language evolves. You know, there's always a new generation, there's new slang, thanks to social media, that spreads rapidly, and new ways to hurt or insult someone or to harass or whatever it is. Um, and it evolves. So I think it's, it's, you're right that it's a moving target. So it's all about the learning part of machine learning, to either, like you say, identify new patterns that didn't exist before because language evolves, or dismiss what we call false positives. So if I'm a seller and say, "I will kill this quota," I mean, like, I'm gonna exceed my quota, and maybe the model caught that, and we need to say that's okay. That, that, that sentence, "I'm gonna kill my quota," is okay. Uh, hurting someone else, not okay. Liz Willets: Yeah. And I'd love to learn a little bit more, you mentioned this feedback loop, kind of, can you tell us a little bit about, behind the scenes, what that looks like? You know, how, how you might see, uh, a model improve based on those, um, feedback points that, um, you know, end users might be giving to the model? Christian Rudnick: Uh, I'll try my best (laughs). So, like, you know, think about it being a lens, and the lens doesn't quite hit the target. If you feed it, if you feed it a new item back, it will move this lens slightly closer to the target. And if you keep doing it, it's gonna do that until it actually hits the target. And not just the target; once again, the model can generalize, so it will hit everything that's kind of similar. Christophe Fiessinger: Yeah. Just to add to that, I think, um, in addition to the model, again, uh, listeners gotta remember that it's, it's an evolving target, and, as Christian says, you're seeded with data, and we do our best to have representative data. But again, the world of languages is so fascinating because the permutations are infinite. You know, we haven't even talked about multi-language support and globalization, but you can imagine that, uh, even in words, a lot of people might swap letters with, uh, symbols, or, or just try to get away with, with whatever, um, they're trying to do. But basically, the point is, the combinations are infinite.             So the only way to, to tackle that is to continue to learn and evolve. And for us to learn, that's when we need feedback, not just from, let's say, one industry in one region, uh, but from all industries across the world, as much from a school district in the US as from a manufacturer in the UK or whatever. Um, so it's, it's definitely, uh, a fascinating field where, you know, we'll continue to invest. Liz Willets: Yeah. Christophe Fiessinger: What do you think, Christian? Christian Rudnick: Yeah, no. I completely agree. And at the end of the day, it's the same image, so the difference is you have a target, which is moving, and you have your lens, which is kind of like trying to catch up to it. It's a bit of the curse of ML that you're always a bit behind. So you always have to rely on people giving you samples, which usually means it's violations which have already occurred. But at the same time, the retraining cycles, they're, they're fairly short. 
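As a rough illustration of the feedback loop Christian describes, here is a minimal Python sketch: fold reviewer-confirmed examples into the training data, retrain, and keep the new model only if it does at least as well as the old one on a fixed test set. It uses a plain scikit-learn text classifier on invented toy data as a stand-in; the production models, data and review process discussed in the episode are far more involved.

    # Hypothetical feedback-loop sketch with a simple quality gate.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    train_texts = ["I will hurt you", "great job on the report",
                   "see you at lunch", "I will kill you"]
    train_labels = [1, 0, 0, 1]  # 1 = threatening, 0 = benign (toy labels)
    test_texts = ["I am going to hurt you badly", "thanks for the update"]
    test_labels = [1, 0]

    def fit(texts, labels):
        return make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(texts, labels)

    current = fit(train_texts, train_labels)

    # Feedback arrives: an admin confirms a miss and dismisses a false positive.
    feedback_texts = ["I will crush you", "I will kill my quota this month"]
    feedback_labels = [1, 0]

    candidate = fit(train_texts + feedback_texts, train_labels + feedback_labels)

    # Quality gate: promote the retrained model only if it has not regressed.
    if candidate.score(test_texts, test_labels) >= current.score(test_texts, test_labels):
        current = candidate  # publish the new model
    # otherwise discard the candidate and review the feedback instead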
So you can adapt quite quickly to new information and adjust to new items that you would like to catch with your model. Christophe Fiessinger: Yeah. Is it, is it a good analogy, Christian, to draw from things we do on the security front, where malware, phishing or viruses are an evolving target? Christian Rudnick: Oh, absolutely. Uh, [inaudible 00:16:22] the risks in cybersecurity, or, yeah, the overlap is massive, if you think about it. I mean, the way I like to think about it is that security kind of deals with the external attackers, versus insider risk, which deals with internal attackers. So you can see that, that the overlap is, you know, very big; almost everything we do in compliance, we do in security in a very similar way. So for example, we have a lot of ML models deployed into production. They get retrained on a regular basis with new data, but in security, you know, there's a lot of other features that you can use as attack vectors, and then we have a lot of models built around those. Christophe Fiessinger: Christian, how about, one topic that I think we also hear a lot is: sure, you get feedback, but what if that feedback is biased, and someone's trying to, instead of improving the detections, introduce bias, whether it's of a racial or, or sexual nature, whatever. H- h- h- how do you make sure you mitigate that type of, I guess, junk feedback or biased feedback? Christian Rudnick: Yeah. Junk feedback is, is indeed a problem. There, there's a few things that you can do. Uh, first of all, we don't usually accept feedback from everyone; the feedback we accept is usually from admins, and admins, we know, our understanding is that they have a certain amount of knowledge that they can use to vet feedback. Christophe Fiessinger: Hmm. Christian Rudnick: And that's particularly true if they get the feedback they're passing on from end users. So usually, they won't just blindly trust them, but, but they will look at it, at it, and then only if it's right- Christophe Fiessinger: And [inaudible 00:17:57] trash. Christian Rudnick: Right, tri- [inaudible 00:17:59] trash. Thank you. So that's one way. Um, then generally, we don't just, so we're not retraining the model on the data and then just automatically pushing it. There's actually a whole system which ensures that whatever new model we've built is better than the previous model. So if someone feeds in poor feedback, you would expect that the model gets worse, does worse on the test set. And in that case, we would not publish this model; we would just discard the feedback and move on. That might, sure, slow down the process, but at the same time, it ensures that the models won't degrade and will actually get better. Christophe Fiessinger: No, so again, what you're saying is, we do have a rigorous process to make sure that- Christian Rudnick: Yes. Christophe Fiessinger: ... feedback doesn't blindly, uh, roll into production, and we check the quality along the way to make sure it's converging, not diverging. Christian Rudnick: Yes. Liz Willets: Definitely. Yeah. And I think having those responsible AI and ML practices is, again, to your point earlier, Christophe, something that's always top of mind for us, anything concerning privacy (laughs), uh, really, in this day and age. Um, but to kinda just change gears a little bit here. Um, last week, when we spoke with Raman and Talhah, we got into the conversation around, like, GIPHYs and memes, et cetera. 
Um, and you know, thinking about how we can prevent users from trying to bypass detection, um, whether it's putting inappropriate language into images, um, and, you know, trying to think about how you might extract that text from images. Um, we'd love to hear if you can talk a little bit to, to that side of things. Christian Rudnick: Yeah. Um, I'm actually not an expert in the area, but, uh, image recognition is, is in general very mature. It's actually a lot more involved than, than text processing. Almost everything we have done in text processing we kinda stole from the people that had previously done it in image processing. Like, for example, the pre-training that I, that I mentioned earlier. And in particular, there are excellent models which, uh, can extract text from images. So I, I don't know what the Microsoft version is called, but it is very, very good. You can almost be guaranteed that if you have an image, we can extract the text that appears in the image and then just process it through our regular, uh, channels.             So that's regarding text in images. When it comes to images themselves, that's something that actually our team doesn't do directly, but there are lots of models which, uh, target, let's say, problematic images. So what I've mostly seen is detection of adult images and gory images. Christophe Fiessinger: Yes. Christian Rudnick: And usually these classifiers, they actually operate in almost the same way as the [inaudible 00:20:47] I mentioned earlier. They start, so they're usually very big models. They start by pre-training on just any kind of images. So they use this huge collection of public images to train the model, and it just kinda learns patterns. And in this case, you know, patterns are literally, like, visual patterns; they'll understand round shapes, square shapes. It will understand, it will have a vague understanding of the shape of a human in all sorts of different configurations. And, you know, of course, it can also understand the different color shadings. So models like that, they'll probably learn that if you have, uh, a human shape with a lot of red on it, then it's probably more likely that it's a gory image, as opposed to a human with a lot of purple on it or green on it. Liz Willets: That just kind of reminded me of something, you know, when, when you see those images and you're extracting that text, we're also still able to provide that feedback loop. Um, because I do remember we had one case where, you know, we were working with this school district, and they all of a sudden started seeing a lot of homework assignments, um, being flagged for gory images. And it came down to the fact that the teacher was using red pen to kind of, you know- Christian Rudnick: Yes. Liz Willets: ... mark up the student's test or quiz- Christophe Fiessinger: Yeah. Liz Willets: ... or whatnot. And so there's always, you know, that feedback loop top of mind. Christian Rudnick: Yes. Christophe Fiessinger: Yeah. I think that ties back, I think, to, uh, exactly what Christian was saying: that obviously, with the pandemic now, everything is online, and you're doing annotation of a math exercise with a red pen. Uh, I guess the initial training set didn't take into account that type of data, like a school district, uh, using modern digital tools to do math assignments. And so that's a perfect case where, yeah, it detected those as potentially gory, because there was a lot of red ink on a, on a white background with formulas. 
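A minimal sketch of the "extract the text, then run it through the regular text channels" idea for memes and screenshots. Christian doesn't name the Microsoft service, so this uses the open-source Tesseract engine via pytesseract purely as a stand-in OCR step; the file name and the hand-off to a text classifier are hypothetical.

    # Hypothetical sketch: OCR an image, then score the extracted text.
    from PIL import Image
    import pytesseract

    def text_from_image(path: str) -> str:
        # Pull whatever text the OCR engine can find out of an image file.
        return pytesseract.image_to_string(Image.open(path))

    # extracted = text_from_image("meme.png")          # hypothetical file
    # verdict = text_classifier.predict([extracted])   # hand off to the usual text model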
Christophe Fiessinger: Uh, and, but again, it gets back to what Christian was talking about: then we pass that feedback along. So pretty much like the text detection needs to evolve, that image detection of what is defined as gory needs to learn to ignore formulas with red annotations, to be refined to avoid that in the future, because that's what we would consider a false positive. So it equally applies to any model, whether it's text or image: there is always that virtuous cycle of, of constantly learning new patterns. And this one, that's a good example of a use case that we missed when we built those models. Liz Willets: Christian, um, you know, I'm certainly learning a lot today (laughs), um, through this conversation. Um, but I'd love to learn what's next. Um, you know, whether that's in your role or, um, just with regard to machine learning and, and sentiment analysis. Um, but what do you think kinda the next big thing will be? Christian Rudnick: That's a very good question (laughs). So, uh, from our perspective, our main effort is to get other features into the system, even when it comes to text processing. So as you mentioned earlier, in, um, security we have a much richer set of features that we've been using for quite a while now. We wanna make the same journey for our text models. So if you look at a communication, for example, you can deduce whether it should hit on a certain policy or not, but you actually get more powerful models if you look not just at that one message, but at the entire conversation, or at least, um, you know, the part of the conversation which is near your target message. Like, for example, the language that is acceptable between students and the language that's acceptable between a student and a teacher are different; it might not necessarily be the same. So there's a very rich set of, um, possibilities that arise from looking at all of this metadata surrounding a message. Christophe Fiessinger: Yeah. I mean, I'm glad you mentioned that, getting more context, because we did have a, uh, um, an example from a school district where, um, a student literally said something like "I will kill you" in, in Teams. And that was detected. Then the next question was, what was the context around that? And sure enough, uh, the context was two students playing a video game. Um, so suddenly I went from a high alert, you know, the student is gonna- Christian Rudnick: Yeah. Christophe Fiessinger: ... hurt this other student, whereas no, they're just having fun. So I definitely second that: just adding the couple of messages above and before that- Christian Rudnick: Right. Christophe Fiessinger: ... you see that they're just playing a video game. And even though that language might not be acceptable, it's definitely not as bad as, uh, an intent to hurt someone. It was, I wanna hurt that virtual character in the video game. So yeah, definitely, uh, I second that more context will definitely help really decide if this is really a, a high-severity issue and, more important, what to do next in terms of remediation. And, Christian, the one thing we didn't really talk about, but we know that language is not just US English. What are we doing to, to cater to other languages that our customers speak worldwide? Christian Rudnick: Right.
So we started all our efforts in English, but we're currently working on globalizing our models, which means that we want to provide the same protections for users in lots of other languages. We have, like, three tiers of languages, and we're currently very focused on the first tier, but eventually we plan to get to all three tiers. And in principle, you have two ways of approaching this problem. The simplest thing you can do is basically build one model per language, and that's something which works reasonably well. But in principle, what we aim for is models which can deal with all languages at once. So there's been a lot of research in this area; they're called multi-language models. They use the very same techniques that you use for, um, just English [inaudible 00:26:58], but then they have a few additions that make them suitable for applying in a context with a lot of languages.             And basically, there are very powerful models which you can use to translate from one language to another, and we've taken a few of the ideas from these models and incorporated them, which enables the model to basically, in some sense, relate all the languages, uh, to each other at once. So these models, they will understand, I mean, understand in a, in a machine learning way of thinking about it, that one word in English, as well as its translation into Greek or Spanish or French, they all kind of are, are the same. And, and then this provides that opportunity. So in particular, it means that you can train models in, uh, say, a set of languages and you'll actually get decent performance in the other languages, even though the model might not have seen samples, or has generally seen very few samples, from those other languages. Liz Willets: Uh, the more- Christophe Fiessinger: That's great. Liz Willets: ... the more and more I listen, the more complex it gets. You know, you're using machine learning to, you know, look at different languages, uh, text versus images, ingesting things from different platforms. It's just mind-boggling (laughs) how much goes into this, um, and I really wanted to thank you, Christian, for taking the time to, to chat with us today. I don't know about you, Christophe, but I learned a lot. Christophe Fiessinger: Fascinating. Fascinating. Liz Willets: Awesome. Yes. Well, thank you so much, Christian. And, um, thank you to our listeners. Um, we have an exciting lineup of, um, podcast episodes coming your way. Uh, next time we'll be talking to Kathleen Carley, who's a professor in social behavior analysis at Carnegie Mellon University. So, um, definitely tune in.
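The multilingual approach Christian outlines, where one model relates many languages so that training signal in one language transfers to others, can be illustrated with a publicly available multilingual model. The checkpoint name and labels below are assumptions made for the sketch, not what the product uses.

```python
# Illustrative sketch of cross-lingual transfer: a zero-shot classifier backed
# by a multilingual encoder scores the same kind of content in several
# languages, even though the candidate labels are only ever written in English.
from transformers import pipeline

detector = pipeline(
    "zero-shot-classification",
    model="joeddav/xlm-roberta-large-xnli",   # assumed public multilingual checkpoint
)

labels = ["harassment", "profanity", "neutral"]
for message in ["You are completely useless.", "Eres un inútil.", "お前は役立たずだ。"]:
    result = detector(message, candidate_labels=labels)
    print(message, "->", result["labels"][0], round(result["scores"][0], 2))
```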

    Episode 6: Cracking down on communication risks

    Play Episode Listen Later May 26, 2021 32:50


Words matter. Intent matters. And yes, most certainly, punctuation matters. Don’t believe us? Just ask the person who spent the past five minutes eating a sleeve of cookies reflecting on which emotion “Sarah” was trying to convey when she ended her email with, “Thanks.” In this episode of Uncovering Hidden Risks, Raman Kalyan, Talhah Mir and new hosts Liz Willets and Christophe Fiessinger come together to examine the awesomely complex and cutting-edge world of sentiment analysis and insider risks. From work comms to school chatter to social memes, our clever experts reveal how the manifestation of “risky” behavior can be detected.   0:00 Hello!: Meet your new Uncovering Hidden Risks hosts 2:00 Setting the story: The types and underlying risks of company communication 6:50 The trouble with identifying troublemakers: the link between code of conduct violations, sentiment analysis and risky behavior 10:00 Getting the full context: The importance of identifying questionable behavior across multiple platforms using language detection, pattern matching and AI 16:30 Illustrating your point: how memes and GIPHYs contribute to the conversation 19:30 Kids say the darndest things: the complexity of language choices within the education system 22:00 Words hurt: how toxic language erodes company culture 26:45 From their lips to our ears: customer stories about how communications have impacted culture, policy and perception Raman Kalyan: Hi everyone. My name is Raman Kalyan, I'm on the Microsoft 365 product marketing team, and I focus on insider risk management from Microsoft. I'm here today, joined by my colleagues, Talhah Mir, Liz Willets, and Christophe Fiessinger. And we are excited to talk to you about hidden risks within your organization. Hello? We're back, man. Talhah Mir: Yeah, we're back, man. It was super exciting, we got through a series of, a, a couple of different podcasts, three great interviews, uh, spanned over multiple podcasts, and just an amazing, amazing reaction to that, amazing conversations. I think we certainly learned a lot. Raman Kalyan: Mm-hmm (affirmative). I, I learned a lot. I mean, having Dawn Cappelli on the podcast was awesome, talked about different types of insider risks, and what I'm most excited about today, Talhah, is to have Liz and Christophe on the, on the show with us 'cause we're gonna talk about communication risk. Talhah Mir: Yeah, super exciting. It's a key piece for us to better understand sort of the sentiment of a customer, but I think it's important to kind of understand that on its own, there's a lot of interesting risks that you can identify, uh, that are completely sort of outside of the purview of typical solutions that customers think about. So really excited about this conversation today. Raman Kalyan: Absolutely. Liz, Christophe, welcome. We'd love to take an opportunity to have you guys, uh, introduce yourselves. Liz Willets: Awesome, yeah, thanks for having us. We're excited to kind of take the reins from you all and, and kick off our own, uh, version of the podcast, but yeah, I'm, I'm Liz Willets. I am the product marketing manager on our compliance marketing team and work closely with y'all as well as Christophe on the PM side. Christophe Fiessinger: Awesome. Hello everyone, I'm, uh, Christophe Fiessinger and, similar to Talhah, I'm on the engineering team focusing on our insider risk, um, solution stack. Raman Kalyan: Cool. So there's a, there's a ton, a breadth of communications out there.
Liz, can you expand upon the different types of communications that organizations are using within their, uh, company to, to communicate? Liz Willets: Yeah, definitely. Um, and you know, kind of as we typically think about insider risks, you know, there's a perception that it's, um, related to things like stealing information or, um, you know, IP, sharing confidential information across the company. Um, but in addition to some of those actions that they're taking, organizations really need to think about, you know, what might put the company, the brand, the reputation at risk. And so when you think about the communication platforms, um, you know, I think we're really looking to collaboration platforms, especially in this remote work environment- Raman Kalyan: Hmm. Liz Willets: ... where employees, you know, have to have the tools to be enabled to do their best work at home. Um, so that's, you know, Teams, uh, Slack, Zoom, um, but then also, you know, just other forms of communication. Um, we're thinking about audio, video, um, those types of things, to identify where there might be risks and, and how you can help an organization remediate what some of those risks might be. Raman Kalyan: Awesome. And Christophe, as we think about communications risk more broadly, what kinds of threats have you started seeing, um, organizations being more concerned about? Christophe Fiessinger: Yeah, so exactly to what you just mentioned, and, and Liz, so again, there's two, two main use cases. The first is fulfilling regulatory compliance, and the regulators definitely have been putting more scrutiny and, and fining, uh, organizations large and small that don't abide by those, uh, laws, whether it's in the US, whether it's in Europe and Canada. So there's definitely an increase in enforcement. So, you know, a common use case that we're seeing, uh, with the recent events and the pandemic, is banks wanna enable their workforce to work remotely, and one of the tools that they need is the ability to do meetings and voice and, and chat. As soon as you introduce a, a new tool like Teams for productivity, you need to, uh, look at, uh, patterns that, um... that fall under those regulations, things like insider trading and collusion.             So definitely, the change in the workforce and, and being remote has accelerated adoption of Teams, and certainly people want a, uh, a way to look at those behaviors and, and avoid getting fined. And then the parallel work stream, which is also what, uh, Liz was mentioning, is, you know, there has been, um, significant change, and that has naturally put some stress. Uh, it could be personal stress, you know, my kids are at home screaming, or the dog, or whatever. Um, maybe I don't have a, uh, a nice room like here today where I can have a podcast; you know, maybe I'm, maybe I'm sitting in the kitchen and my young kids don't understand what it means to hush. So that puts personal stress on me.             Maybe I'm stressed because I don't know if I'm gonna have a, a job tomorrow, maybe I've already been [inaudible 00:05:15]. That potentially could trigger me to, to forget that with the tool I'm using to get work done and to communicate with my peers, there are some rules of engagement, if you like, and there's things that are not acceptable per the employee, uh, code of conduct. And again, all this stress and the fact that maybe I'm lying on my couch
gives me the false sense that it's casual, but now I'm having a meeting with Liz and Raman, and there's certain language that's just not acceptable at our organization.             So I think that's, that's a new trend that we're seeing, which is also backed up by, by regulation in certain countries, um, to make sure there's no abuse of language. And the most common use case is, uh, in the world of education: the, the district, the school, the principal are responsible, uh, if bullying or, or misbehavior is reported, and to really help mitigate it so it doesn't escalate in- into something bad. So, uh, those are examples of what we're seeing, eh, um- Talhah Mir: [inaudible 00:06:19], Christophe, um, you know, you and I have talked a lot about this sort of interplay and, and looking at, um, these communication risks; it's sentiment at the end of the day. And I know when we talk to our customers, it's, it's a very common ask around being able to understand, uh, these leading indicators. Now, Raman and I talk about insider risk management as a game of indicators, and, um, the, the more leading the indicator, the more impact it's gonna have on being able to help you identify issues proactively. So talk to me a little bit more about how some of these code of conduct violations are actually sentiment that can help you identify somebody who's a potential insider risk in the organization. Christophe Fiessinger: Yeah, so at the high level, uh, if we take a concrete example, let's say, you know, I use some profanities with peers, and, and... or sexual content, and it's just not acceptable. And, again, assume that Christophe is stressed, just a bad day, kids are screaming, whatever, I'm just stressed in my personal life and I've crossed that line. Now, the question is, was it accidental, Christophe suddenly reached the tipping point and started using foul language, or no, Christophe, uh, did use foul language today, but he's been using foul language against Liz for the past 30 days. And not just over Teams or emails, but over whatever, the... all the different communication channels that my employer has given me.             So I think there's two things: is it accidental, and I think you, you guys talked about that, or is it [inaudible 00:08:02]? And most of the time, you know, we're humans and we have good intent; a lot of the time it is accidental. Uh, so it's just a matter of very quickly, hopefully, uh, seeing that behavior and notifying the [inaudible 00:08:14], whatever is your, your process, of telling that person's manager that, "Hey, you stepped out of bounds, uh, first warning, you know, maybe retake the employee training, reread the code of conduct, and all good then, and, and move forward."             To your question, so that's the scenario. What's hard, because of the richness of the language, and we're humans and language keeps evolving, is that it's not just looking for specific profanities. There's some usual suspects that have no room in the workplace, but there's more patterns like abuse and harassment where I might not even use profanity, but the way I, I, um, criticize Liz or Raman clearly goes way beyond constructive criticism. Talhah Mir: Mm-hmm (affirmative). Christophe Fiessinger: And then, so how do you detect that? Because I might be using perfectly, uh, okay dictionary words, uh, but when you read it as a whole, as a sentence, it's horrendous, or it's just not acceptable. Um, so that's... To answer
your question, to really get to the crux, which is that intent, that sentiment, you need to certainly look at the context and the intent. You need to see, is it a one-off with Christophe against that person, or no, it has been a pattern of repeated, uh, uh, communication risk against that individual. And so that's where, um, the problem is a fascinating problem and ever evolving, because human language is this dynamic dimension that keeps evolving every day. And as you can see, and I'm sure you have kids, with social media, whatever's the new buzzword, that certainly becomes part of the common language, and guess what, we need to adapt to detect those new patterns. Raman Kalyan: Yeah. That's, that's fascinating, man. I think a couple of questions, one for you, and, and one for Liz. You mentioned a couple of things. One is that there's this accidental or inadvertent type of, "Hey, maybe I'm not meaning what you think I'm meaning." So I'd love to kind of tease that out in terms of, like, how do we deal with that from a privacy perspective, right? So, you know, um, you don't assume that the individual is actually doing something wrong, you wanna investigate it further. That's a question for Liz, and then a question for you would be really around, okay, you talked about context: how has the technology evolved to be able to really sort of understand that context? Because I know there's a lot of tools out there that promise, you know, offensive language detection or, like, you know, sentiment analysis, but they really focus in on pattern matching. And I wanna try to contrast, you know, how are we approaching that from a, from a, uh, machine learning perspective or AI perspective. So maybe, Liz, you can go first on the privacy side. Liz Willets: Yeah, definitely. I think that's a great question. Um, you know, we at Microsoft always keep the privacy of our customers top of mind and so wanna ensure we're, um, you know, delivering solutions to our customers that really have those capabilities built in. So, you know, when we think about, um, you know, communications, we think about, um, you know, making sure that all of the, um, communications that organizations are seeing in their solution are pseudonymized, um, meaning that they are de-identified. And so, um, when you think, too, about, you know, the fact that this is on by default, um, you know, that customers are opted into, um, then you have to think about those people who are actually reviewing, um, and scoping the policies out to their workers, their analysts, their investigators. And so we definitely also keep, um, role-based access control top of mind, so that only the right people, um, within an organization are able to see, um, you know, certain policies, f- flagged violations, um, and then, you know, we, we have audit reports where we can ensure that those investigators and analysts aren't misusing the data that they have at hand.             But then also thinking about, you know, one of the, the more important differentiators is that insiders are actually in a position of trust. And so, you know, they're making use of privileges that have been granted to them to really perform their role, and if they are abusing them, um, you know, we definitely wanna make sure we're catching that while, at the same time, ensuring that those privacy principles are in place. Raman Kalyan: Awesome, that's great. Uh, really, that's, uh, great to hear.
And then, Christophe, as we talk about the evolution of the technology, you know, talk to me a little bit more about how we've evolved the technology to get at what you said, which was this context, this sentiment. Like, how do we get to that? Christophe Fiessinger: Yeah. Actually, I don't wanna talk about technology, I just wanna talk about the problem we're trying to solve. So, leaving the technology aside, yeah, it's all about context, because detecting negative sentiment is, is already a challenge in itself, and I think one of the future podcasts will go into that. But the question is, then you put that into context. Was it just the first time, Christophe just having a bad day, he crossed the line, he needs to be reminded that this is, uh, not acceptable, and problem solved, and he never does it again? Or no, he crossed the line, and guess what? Last Friday he put in his resignation and it looks like he started downloading a lot of documents that were marked as confidential. So suddenly you're getting language risks, you know, a code of conduct violation, but you add that to the fact that he's gonna leave and he's also downloaded things that could potentially signify, um, theft.             So certainly getting that whole context of that individual... at the end of the day, what all that context gives you is that your remediation action can be very specific, versus just saying, "Christophe, stop using foul language." You know, suddenly we need to maybe pull in our compliance team or legal team or a security team or Christophe's manager, versus just slapping him on the wrist for foul language. So context is very... uh, is hugely important to help you get to the proper remediation and the proper process based on that initial red flag, which was foul language, for instance. And so obviously that's, that's the, you know, the ideal, the, the uber solution that, um, a lot of us are trying to solve, because the more context you have, the better the [inaudible 00:14:59] position to really find those needles in the haystack and then take the appropriate action, versus dismissing foul language when this person is on the road to actually burning down the house. Raman Kalyan: Yeah, that's, that's actually a really important point. I think the whole context, it's not even just the context of the communication, it's the context of the sequence of events surrounding that communication and what might've happened before and might be happening after. Christophe Fiessinger: Yeah. And just to add to that, [inaudible 00:15:25], to mention one thing I wanna be clear about to the audience: uh, we're fully aware at, at Microsoft that it's not just the way you communicate in, in 365, such as Yammer or on email and, and Teams, but we also potentially help you... Like I said, if you give a, a, a work phone to your employees and they have SMS or they have WhatsApp, or they use [inaudible 00:15:49] technology or professional apps like Instant Bloomberg- Raman Kalyan: Mm-hmm (affirmative). Christophe Fiessinger: ... you gotta be holistic, because again, you might see one thing in one channel, but it's actually probably hiding maybe the forest of abuse, or maybe my initial thing to Liz was on Teams, but the really bad behavior happened over SMS. So giving you the ability to look holistically and make sure you've... you reduce the blind spots as much as possible is also something that's, uh, dear to our heart.
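The kind of context-weighing Christophe describes, where a single flagged message is read together with surrounding signals before choosing a remediation path, might look conceptually like the sketch below. The signals, weights, and thresholds are invented for illustration and bear no relation to the actual product logic.

```python
# Purely illustrative triage: combine one flagged message with other context
# signals (repeat offenses, pending resignation, unusual downloads) to pick a
# remediation path instead of reacting to the message in isolation.
from dataclasses import dataclass

@dataclass
class UserContext:
    flagged_messages_30d: int          # prior code-of-conduct flags in the last 30 days
    resignation_filed: bool            # departing-employee indicator
    confidential_downloads_7d: int     # recent downloads of confidential documents

def triage(event_severity: float, ctx: UserContext) -> str:
    score = event_severity
    score += 0.5 * min(ctx.flagged_messages_30d, 5)        # repeated pattern, not a one-off
    score += 2.0 if ctx.resignation_filed else 0.0
    score += 1.0 if ctx.confidential_downloads_7d > 10 else 0.0
    if score >= 4.0:
        return "escalate to compliance, legal, and security"
    if score >= 2.0:
        return "notify manager and remind of code of conduct"
    return "log only"
```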
Raman Kalyan: Yeah, so having that sort of one pane of glass, you don't have to have multiple solutions and platforms that- Christophe Fiessinger: Yeah. Raman Kalyan: ... you're trying to manage, and manage workflows, manage integration and signals; you can actually take one pane of glass and look across multiple communications and leverage the technology to identify the risks that are most important to you, right? Christophe Fiessinger: Yes. Talhah Mir: So, um, Christophe, you and I talk like multiple times a day, and, and a lot of it is words, a lot of it is passionate words, but a lot of it is memes and GIPHYs that we send back and forth. So how do you think about, in the context of, um, the communications and words and whatnot, how do you think about, uh, memes and GIPHYs? 'Cause some could be funny, but some could be crossing the line, right? Christophe Fiessinger: No, you're, you're spot on, and, and it's definitely... Back to what Liz was mentioning, we know that communication is not just written anymore, right. And, and, you know, some of us have been in the workforce longer than others... and some of us have kids, and we've definitely seen the shift- Talhah Mir: Yeah. Christophe Fiessinger: ... that it's no longer just an email or a one-page memo; uh, now we have a plethora of channels for how we can do work, but like you say, a lot of how we communicate is not written. And so for the audience, what, uh, Talhah is referring to, it could be an image, and very commonly, a lot of people, um, will annotate an image, will literally put text on an image, and that text could be a risk, could be very nasty, could be inappropriate, could contain customer information, could contain confidential information. Um, so how do we detect that if Christophe is just sending images in Teams all day or over email, but there's actually nothing written?             Um, so we're actually working on, on, on this problem, and we have a number of solutions, because there's basically two patterns. First of all, there's the obvious image, you know, maybe it is, is racist or adult or, or gory in nature, and that again has no place in the organization. So just recognizing, uh, the content of that image. But like we say, in addition to that, we're also working on doing, uh, what we call, uh, in technical jargon, optical character recognition. So extracting whatever the text is, whether it's a written sketch or, or typed on top of the image, and then once you get that extracted text, running that through our detections; we say, "Does it match a code of conduct violation? Does it match potential regulatory, uh, compliance violations?" And so forth.             So yes, we're absolutely looking at other forms of communication that are included in the tools we use day in, day out, uh, such as images. And you're probably thinking, how about video? And yes, this is also, uh, something we're, we're, um, working on for the future. The goal is to reduce as much as possible those blind spots. And that's what effectively we're doing, you know... If the end user thinks they can outsmart the system by just putting whatever, some social security number from their favorite customer, or a bank account, or swear words in an image, and not in written text, then we wanna mitigate that to, again, close all those blind spots. Liz Willets: Yeah, and I would add to that too, it's, it's not just... English isn't the only spoken language.
So thinking about globalizing, um, some of that as well, 'cause I know, um, we were talking to a customer in the EDU space and they were saying, "Hey, you know, students are trying to (laughing) bypass the system. They are cyberbullying and, and writing harassing messages in Japanese, um, translating that through, you know, a translate app and sending that to their peers." And, um, you know, being able to detect things like that, not just in English, um, is certainly something that's also come, um, to the forefront for us. Christophe Fiessinger: Yeah, that's... Thi- this is a true story that Liz is telling, and it was interesting for us. And that's when you learn so much from kids; uh, their creativity to abuse the system or be colorful is amazing and endless. But yeah, this is a true story of a school district in the Midwest, and, and we're definitely, to Liz's point, being Microsoft, we know we, we wanna cater to, uh, customers worldwide, and we already had strong demand in Asia, which has laws to protect against harassment, so there's Japan and others. And we were, we were wanting feedback from some customers, and in one of those customer interactions we asked the school district, "Hey, we're looking at introducing, um, abuse detection, uh, in those languages, would you be interested? Including Asian languages."             And the customer, to our surprise, said, "Yeah, I'm very interested in that." It's like, how come a customer in the Midwest in the US is interested in, in Japanese and Korean and Simplified Chinese? And to Liz's point, some students might not even be native in those languages, but they can definitely use a search engine. And instead of saying what I think about Talhah in plain English, I'll translate it and put the translated version with the, with the Katakana or Kanji, which are the alphabets in Japan, and think I can get away with it because no one else besides Talhah will figure out that I'm, I'm being very nasty, and my school administrator is definitely not fluent in that language and will think it's harmless. So yeah- Talhah Mir: Now we gotta, gotta go back and search our chat history, man. Now, now, Japanese characters are making sense. I gotta go (laughing) translate them. Liz Willets: (laughs). Christophe Fiessinger: I mean, it wasn't just in French. Talhah Mir: (laughs). Raman Kalyan: Now I have to look at my kid's chat history and be like, "What are you... What is that?" Christophe Fiessinger: Yeah, anytime you find some language you don't speak, question yourself. Uh, it might not be love words after all. Talhah Mir: (laughs). Liz Willets: (laughs). Christophe Fiessinger: I'm just saying. Raman Kalyan: Well, as you know, one of the things that we've talked about is, uh, the importance of supporting company culture, right? And how toxic communications, um, can erode that, you know, culture and the trust in your organization. I'd love to talk a little bit more about that and, you know, get your perspective on that, and also talk about how, you know, some of the remediation actions we have within, you know, this solution can help organizations really address, uh, or support a positive company culture. Liz Willets: Yeah, definitely.
I think there are a lot of cultural implications, um, for, uh, a corporation or, um, an organization, and, and definitely having the ability to support their, um, company culture, but also to support their employees in times when, you know, they might be going through an external stress factor, you know, COVID being a great example. Um, you know, an organization that might be looking at, um, you know, their company culture impact in this day and age, they want their employees to have the tools and, and support to do their best work, whether that's webcams, computers, conference calls. Um, and you know, now in the context of remote work, you know, you're in the privacy of your own home, um, and there are definitely distractions all around. And at the same time, you have to remember, "Hey, this is a work environment." Raman Kalyan: Mm-hmm (affirmative). Liz Willets: Um, so there are definitely some things that you should and shouldn't say in the context of work that might be okay in your personal life. Um, but you know, in the workplace there still is a code of conduct charter, you've signed it, um, you know, you take training, hopefully on the first day of work. Um, and so in this context, how do you remind people, um, you know, that there has been this change to remote work but the same standards still apply, um, you know, whether that's fostering diversity and inclusion within your company. Um, and, and you certainly wanna make sure that you're investigating and remediating something, um, that your employees know is, um, wrong, you know, something like sexual harassment, um, you know, lots of, kind of, potential infractions. Um, and to kind of...             One, from a brand reputation perspective, you know, this person might go off and write some social tweets or whatnot, um, and have a pretty big and bad impact for your organization. Um, so it's kind of one thing to have a code of conduct, a charter, um, but another is to really live by it and, and show your people that, um, you know, it's, it's really something that you're invested in. Um, and so I think also it's not all that (laughs). Um, so, you know, we're under stress, job security concerns, scared of, um, you know, a loved one or a parent getting sick, and so maybe you're not intentionally trying to hurt your peers, um, but just, you know, perhaps used an inappropriate word or expressed your frustrations at work.             Um, and so I think that that's kind of where you can also come in and provide support. You know, maybe it's a little slap on the wrist, but just a reminder of what your company charter is, um, maybe, you know, encouraging you to retake some of the trainings, um, and really just kind of making sure that all around, um, you know, employee wellbeing is, uh, kind of top of mind for the company. Talhah Mir: Yeah, and on that note, Liz, I know you talked to me about the fact that, you know, technology like this, solutions like these, are not just about finding the bad; it's about, you know, uh, an organization using it as an opportunity to show a commitment towards a positive employee culture and saying, "We're gonna put money behind what we say is important to us, which is a positive company culture." But some of the stories that I've heard from you were just amazing, where companies are looking to do this, whether it's education or government or the private, uh, sector, just being able to back that up and say, "We actually care, we're gonna look out for these things."
And to your point, it's not just, "When we find something bad, we're gonna take some, you know, dramatic action." It's, like, when we find something, it's an opportunity for us to educate and kind of uplift the culture. So I think that's a, that's a really important one for you to call out there. Liz Willets: Exactly, yeah. And I think, um, you know, especially as you think about living and breathing your corporate culture and, and your principles, um, it's important 'cause, you know, other employees are expecting you to take action on, on certain things, and, um, you kind of have to uphold your standards as well to, to match their expectations. Talhah Mir: Hmm. So what are some stories that you guys have heard or come across from customers? Something, uh... And I don't know, I don't know which ones of those you can actually talk about here... You guys have shared a lot of those offline and stuff, and I talked about quite a few, but what are some, some great examples of positive impact that you've seen, that you guys can share? Christophe Fiessinger: Uh, I'll share one. I'm not gonna mention the customer, uh, due to sensitivity, but to your point, and to what Liz was saying, you know, it doesn't take much... You just look at the headlines in the newspaper and you can see there's potential regions, potential, uh, industries that, that have had bad press, and, uh, probably for good reasons, because of, of not doing anything about those, um, abusive behaviors. Uh, so I, I've been involved with one customer, um, I'll just say North America, but it was exactly to get ahead of that. They, they haven't been in the headlines, their industry has been in the headlines, and it's just a mandate from their leadership team to say, to your point, "We wanna be proactive, so we want a virtuous cy- uh, cycle of making sure we live by, to Li- to Liz's point, live by our code of conduct." So it's more like, "I wanna get ahead of the game because I wanna show all my employees I've got their back and this is a healthy environment; please don't go to my competitor. Like, we've got your back, and let me prove it to you that we're, um, fostering that healthy environment."             The example, that example I mentioned earlier, it's, it's not a company, but it's the same theme, where in Japan in April of 2020, a new law went into effect around, uh, what they call power harassment. And so the question is, great, there's this new law that says if your manager or your manager's manager is, is abusing you, uh, it's illegal; then the next question comes, uh, what are you gonna do about it, uh, as an employer? So in Japan, you know, because it, it takes time to put processes and, and solutions in place to look for that, initially it starts with the large corporations. I think it's like a three-year, four-year phase-in by the time it gets to, uh, small and me- medium-size businesses. Liz Willets: Yeah. And I think one of my favorite, um, customer stories was one that really, in my mind, helped enable their creativity. Um, you know, we were talking to a sports league kind of right at the beginning of the pandemic. You know, they knew that it was gonna be a washout season; all games, everything was being canceled basically, except for golf at that point in time, um, and there was obviously a worry around, you know, contact sports and, and spreading of the virus. And so, um, we had this one sports league come to us and say, "Hey, you know, we've got these season ticket holders, they're huge fans.
We feel like we're letting them down. You know, they don't have a season to, to kind of, um, rally around this year. And so we're thinking about, um, you know, how can we get them to interact with players, coaches, um, you know, coaching staff, et cetera?" Um, and so they wanted to enable that sort of scenario, but at the same time were concerned around, you know, "We need to moderate content to ensure there's no abusive language, either between fans, between players, um, staff, et cetera."             And so I think that was an interesting use case where, hey, yeah, you wanna detect certain things in communications and this might be completely out of your wheelhouse. Um, but being able to feel comfortable coming to a company like Microsoft and saying, "You know, what can we do here?" Um, and so I thought that was an, uh, enlightening, uh, case for, um, us as well. Talhah Mir: This is terribly exciting stuff, man. I know the four of us have talked about this quite a bit, but to me, sentiment analysis is the holy grail of insider risk. Being in this space for a couple of years now, um, the sooner you detect these things, the more impactful you will be, and it's all about the behavior. And one of the, the first areas, the first sort of physical manifestation of a behavior, is in the communication of an individual. So that's why sentiment analysis [inaudible 00:30:59] to such amazing, amazing people. It's also incredibly difficult, as you guys know. So you guys are on the tip of the, sort of, [inaudible 00:31:05] spear as it comes to this stuff, but we're super excited about some of the opportunities that you guys are driving towards and how we can leverage that to kind of broaden our detection when it comes to identifying and managing insider risk [inaudible 00:31:18]. Thank you guys, this is very exciting stuff, looking forward to the rest of the podcast as well. Raman Kalyan: Yeah. And I was just gonna say, thank you so much for coming onto the show. We really appreciate having you here, and, Liz and Christophe, we can't wait to hear the different podcasts you have coming up, uh, like Talhah said. Exciting space, definitely, uh, a space where there's a lot of innovation happening, and we're excited to see what you have coming up. So thank you again. Liz Willets: Awesome. Yeah, thanks. Thanks for having us on, and, um, we're excited to kind of... We've taken the torch from y'all and have a great lineup of speakers, um, over the next couple of weeks. Um, Talhah, to your point, sentiment analysis is definitely an area where we're gonna deep dive with, um, Kathleen Carley, a professor at CMU. Talhah Mir: Thanks. Liz Willets: Um, we're gonna go deep on machine learning with one of our data scientists, Christian Rudnick, um, so we definitely have some exciting, uh, conversations to come. Talhah Mir: Awesome, awesome. Raman Kalyan: And so thank you everyone for listening. Uh, this is another episode of the Uncovering Hidden Risks podcast. We've had, uh, some awesome guests on the, on the show today. Again, uh, Liz Willets and Christophe Fiessinger, and Talhah and I are excited to have you listen, uh, to their podcasts, as well as, if you haven't heard our, uh, previous podcasts, you can find them on your favorite, uh, YouTube channel. So... Or favorite podcast channel, wherever you wanna see it.

    Episode 5: Practitioner's guide to effectively managing insider risks

    Play Episode Listen Later Sep 21, 2020 23:02


    In this podcast we explore steps to take to set up and run an insider risk management program.  We talk about specific organizations to collaborate with, and top risks to address first.  We hear directly from an expert with three decades of experience setting up impactful insider risk management programs in government and private sector.

    Episode 4: Insider risk programs have come a long way

    Play Episode Listen Later Sep 21, 2020 30:14


    In this podcast we discover the history of the practice of insider threat management; the role of technology, psychology, people, and cross-organizational collaboration to drive an effective insider risk program today; and things to consider as we look ahead and across an ever-changing risk landscape.

    Episode 3: Insider risks aren’t just a security problem

    Play Episode Listen Later Sep 21, 2020 31:46


    In this podcast we explore how partnering with Human Resources can create a strong insider risk management program, a better workplace and more secure organization.  We uncover the types of HR data that can be added to an insider risk management system, using artificial intelligence to contextualize the data, all while respecting privacy and keeping in line with applicable policies.

    Episode 2: Predicting your next insider risks

    Play Episode Listen Later Sep 21, 2020 30:07


    In this podcast we explore the challenges of addressing insider threats and how organizations can improve their security posture by understanding the  conditions and triggers that precede a potentially harmful act.  And how technological advances in prevention and detection can help organizations stay safe and steps ahead of threats from trusted insiders. 

    Episode 1: Artificial intelligence hunts for insider risks

    Play Episode Listen Later Sep 21, 2020 30:02


    In this podcast we explore how new advances in artificial intelligence and machine learning take on the challenge of hunting for insider risks within your organization.  Insider risks aren’t easy to find, however, with its ability to leverage the power of machine learning, artificial intelligence can uncover hidden risks that would otherwise be impossible to find.

    Uncovering Hidden Risks Trailer

    Play Episode Listen Later Sep 21, 2020 2:09


    Welcome to Uncovering Hidden Risks, a broader set of podcasts focused on identifying the various risks organizations face as they navigate the internal and external requirements they must comply with.   We’ll take you through a journey on insider risks to uncover some of the hidden security threats that Microsoft and organizations across the world are facing.  We will bring to surface some best-in-class technology and processes to help you protect your organization and employees from risks from trusted insiders.  All in an open discussion with topnotch industry experts!
