Podcasts about database systems

Organized collection of data

  • 26 podcasts
  • 34 episodes
  • 44m avg. episode duration
  • 1 new episode monthly
  • Latest episode: Mar 27, 2025
Popularity of “database systems” podcasts: trend chart, 2017–2024



Latest podcast episodes about database systems

Software Engineering Daily
Turing Award Special: A Conversation with Jeffrey Ullman

Software Engineering Daily

Mar 27, 2025 · 37:45


Jeffrey Ullman is a renowned computer scientist and professor emeritus at Stanford University, celebrated for his groundbreaking contributions to database systems, compilers, and algorithms. He co-authored influential texts such as Principles of Database Systems and Compilers: Principles, Techniques, and Tools (often called the “Dragon Book”), which have shaped generations of computer science students. Jeffrey received the 2020 ACM A.M. Turing Award, shared with Alfred Aho, for fundamental algorithms and theory underlying programming language implementation.


The Data Stack Show
186: Data Fusion and The Future Of Specialized Databases with Andrew Lamb of InfluxData

The Data Stack Show

Apr 24, 2024 · 58:26


Highlights from this week's conversation include:

  • The Evolution of Data Systems (0:47)
  • The Role of Open Source Software (2:39)
  • Challenges of Time Series Data (6:38)
  • Architecting InfluxDB (9:34)
  • High Cardinality Concepts (11:36)
  • Trade-Offs in Time Series Databases (15:35)
  • High Cardinality Data (18:24)
  • Evolution to InfluxDB 3.0 (21:06)
  • Modern Data Stack (23:04)
  • Evolution of Database Systems (29:48)
  • InfluxDB Re-Architecture (33:14)
  • Building an Analytic System with Data Fusion (37:33)
  • Challenges of Mapping Time Series Data into Relational Model (44:55)
  • Adoption and Future of Data Fusion (46:51)
  • Externalized Joins and Technical Challenges (51:11)
  • Exciting Opportunities in Data Tooling (55:20)
  • Emergence of New Architectures (56:35)
  • Final thoughts and takeaways (57:47)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

commercialrealestatetraining's podcast
Three Essential Strategies for Commercial Real Estate Sales Agents

commercialrealestatetraining's podcast

Apr 12, 2024 · 22:52


In today's program we cover three sales-related topics that can help you build momentum with clients and listings in brokerage. These are the sections of the program:

  • Strategies to assess whether you are good enough in real estate sales, and how you can improve.
  • How to build an authentic database system that will actually help you find more listing opportunities.
  • How you can stay ahead of the real estate marketing game as an agent today.

Take these points and turn them into real estate systems and tools that you personally use. Real estate, and particularly commercial real estate, is immensely rewarding if you take the time to focus on your systems and activities. These three modules will help you do that.

The Data Stack Show
185: The Evolution of Data Processing, Data Formats, and Data Sharing with Ryan Blue of Tabular

The Data Stack Show

Apr 10, 2024 · 89:43


Highlights from this week's conversation include:

  • The Evolution of Data Processing (2:36)
  • Ryan's Background and Journey in Data (4:52)
  • Challenges in Transitioning to S3 (8:47)
  • Impact of Latency on Query Performance (11:43)
  • Challenges with Table Representation (15:26)
  • Designing a New Metadata Format (21:36)
  • Integration with Existing Tools and Open Source Project (24:07)
  • Initial Features of Iceberg (26:11)
  • Challenges of Manual Partitioning (31:49)
  • Designing the Iceberg Table Format (37:31)
  • Trade-offs in Writing Workloads (47:22)
  • Database Systems and File Systems (55:00)
  • Vendor Influence on Access Controls (1:01:58)
  • Restructuring Data Security (1:03:39)
  • Delegating Access Controls (1:07:22)
  • Column-level Access Controls (1:14:19)
  • Exciting Releases and Future Plans (1:17:47)
  • Centralization of Components in Data Infrastructure (1:25:37)
  • Fundamental Shift in Data Architecture (1:28:28)

Live UNREAL with Glover U
How To Work Your Database with Proven Real Estate Systems | Jeff Glover & Matt Hodges

Live UNREAL with Glover U

Feb 8, 2024 · 17:08


Welcome to another episode of the Live UNREAL podcast! In this episode from the Live Unreal Summit, Jeff invites Matt Hodges, a boutique brokerage owner in Traverse City, MI, to share how the Glover U systems have shaped his real estate business and his successes. Matt explains how he succeeds through the use of database systems in what's considered a resort, or small-town, market. Through newsletters featuring pertinent information, community events, and more, Matt Hodges and his team have more than doubled their production since 2022. Listen in to see how you can too! The Glover U mission is to impact the lives of millions by helping them live their UNREAL life! We hope you are inspired by the Live Unreal Formula! Whether you're an established Realtor or new to the real estate game, this podcast is designed to empower you with knowledge and inspiration.

DevZen Podcast
Your Own Neighbor (Сам себе человек-сосед) — Episode 0399

DevZen Podcast

Oct 23, 2022 · 64:50


In this episode: a quick overview of a home Ubiquiti network (routers, access points, cameras), a review of two videos from the “Intro to Database Systems” lecture series, and a more detailed look at a talk on Litestream for SQLite. Show notes: [00:02:00] What we learned this week — UniFi Comparison Charts — McCann Tech https://store.ui.com/collections/unifi-protect/products/uvc-g4-bullet [00:26:01] [Briefly] Introduction to Distributed Databases (CMU Intro to Database Systems… Read more →

All Things Crime
Tom Myers: How Database Systems Help Solve a Case Efficiently Pt. 2

All Things Crime

Oct 12, 2022 · 27:16


Angie Dodge suffered a terrible fate at the hands of her killer. In one of the notorious cold cases of the 20th century, the investigation led to a false accusation, and the wrongly identified man spent 20 years in prison before the authorities finally traced the real perpetrator. Thanks to DNA matching technology and the convict's confession, the case was finally solved. In the second part of this two-episode interview with Tom Myers, the discussion turns to the importance of interviews and interrogations. In a world where crimes continue to plague people, it's up to the authorities to be extra vigilant and discerning in their investigations. Liars can hide really well, but authorities should know how to identify them. Technology paves the way for accuracy: equipment that analyzes genetics can trace the culprit, and probing skills are also integral to identifying the suspect.

“You can't expect to be in a bad situation and come out of that okay. You wouldn't be swimming with sharks that are really hungry.” - Tom Myers

Three Takeaways:

  • Be prepared when interviewing people involved in a crime.
  • Database systems help narrow down the culprits of crimes.
  • Technology continues to innovate and helps resolve investigations.

In this Episode:

  • 0:00 Introduction
  • 0:37 The importance of preparation
  • 7:08 Ideal environment for interviews
  • 10:08 How to indicate planned crimes
  • 16:31 The Idaho case
  • 18:26 Social conventions have changed

Resources: Check out M-Vac here: https://www.m-vac.com/

All Things Crime is a new, comprehensive video series that will explore every aspect of crime and the ensuing investigation, one video interview at a time.
The host, Jared Bradley, is the President of M-Vac Systems, maker of a wet-vacuum-based forensic DNA collection system. He has traveled the world training law enforcement and crime-lab DNA analysts at all levels to use the M-Vac to help solve crimes. Along the way he has met people from all walks of life with experience investigating crimes, and he is putting that knowledge to use by sharing it in these videos. If you are interested in more videos about the M-Vac, DNA, and investigations, also check out the M-Vac's channel at https://www.youtube.com/c/MVacSystems...

All Things Crime
Tom Myers: How Database Systems Help Solve a Case Efficiently Pt. 1

All Things Crime

Oct 5, 2022 · 25:59


Two innocent girls were kidnapped at a mall and never found. It was the '70s, back when serial killers could easily leave their tracks unseen and their traces undiscovered. Years later, the investigation led to a culprit: a dangerous suspect who murdered the girls and burned their remains in his home. Such cases in the previous century required a lot of time and effort to reach a conclusion; with the advent of modern investigative technology and techniques, similar cases can be resolved far more quickly. Tom Myers believes that a database is just as important as tracing evidence and history. As a former FBI ERT leader, Tom has an extensive background in law enforcement: he has been a CSI extraordinaire and a ranger, and his experience in the US Army showed him what it means to protect and serve the country. Tom is impressed with the innovative tools and equipment used to bring an investigation to a conclusion. In this episode, he shares his insights on the best practices detectives use to trace the culprit of a crime and solve a case efficiently. A database system, for example, tracks a person's history, making it easier to piece the puzzle together accurately.

DevZen Podcast
Changing the Culture (Изменение культуры) — Episode 0385

DevZen Podcast

Jun 12, 2022 · 107:28


In this episode: gallium nitride for chargers, reviews on Amazon and Ozon, Rosetta 2 for macOS Ventura, the subjectivity of errors and changing a culture, interesting finds from Intro to Database Systems, and listener questions. Show notes: [00:01:47] What we learned this week — [Bug]: time_bucket_ng for daylight saving hour return time from future · Issue #4392 · timescale/timescaledb · GitHub… Read more →

Percona's HOSS Talks FOSS: The Open Source Database Podcast
The HOSS Talks FOSS Ep26 - David Zhao, Database Systems Expert, ZettaDB

Percona's HOSS Talks FOSS: The Open Source Database Podcast

Jun 30, 2021 · 17:32


David Zhao (ZettaDB) has worked on database kernels for years, developing code and enhancements for Berkeley DB, MySQL, and TDSQL. David is back with a new hybrid database called Kunlun, which aims to combine the best of both the PostgreSQL and MySQL worlds into a better database. David sits down with Matt Yonkovit (the HOSS) to talk about the Kunlun project, distribution, implementation, scalability, and more. If you are interested in learning more, David also delivered a talk at Percona Live 2021 entitled “Performance Comparison of MySQL and PostgreSQL based on Kernel Level Analysis,” which is also available now.

IGeometry
B-tree vs B+ tree in Database Systems

IGeometry

Jun 27, 2021 · 32:38


In this episode of the backend engineering show I'll discuss the difference between the B-tree and the B+ tree: why they were invented, what problems they solve, and the advantages and disadvantages of each. I'll also discuss the limitations of implementing a B-tree versus a B+ tree, and how Discord ran into a memory limitation using B-tree indexes in MongoDB.

Check out my Udemy Introduction to Database Engineering course at https://husseinnasser.com/courses and learn the fundamentals of database systems to understand and build performant backend apps.

  • 0:00 Data structures and algorithms
  • 1:30 Working with large datasets
  • 6:00 Binary tree
  • 8:30 B-tree
  • 19:30 B+ tree
  • 22:00 B-tree vs B+ tree benefits
  • 25:00 MongoDB B-tree indexes trouble
  • 30:00 Summary

Working with a billion-row table (members only): https://youtu.be/wj7KEMEkMUE
Indexing video: https://youtu.be/-qNSXK7s7_w
Discord moving from MongoDB to Cassandra: https://www.youtube.com/watch?v=86olupkuLlU and https://blog.discord.com/how-discord-stores-billions-of-messages-7fa6ec7ee4c7
MongoDB indexes: https://docs.mongodb.com/manual/indexes/
Postgres B-tree implementation: https://www.postgresql.org/docs/13/btree-implementation.html
B-tree and B+ tree visualizations: https://www.cs.usfca.edu/~galles/visualization/BPlusTree.html and https://www.cs.usfca.edu/~galles/visualization/BTree.html
Support my work on PayPal: https://bit.ly/33ENps4
Become a member on YouTube: https://www.youtube.com/channel/UC_ML5xP23TOWKUcc-oAE_Eg/join
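To ground the B-tree vs B+ tree comparison above, here is a minimal sketch (mine, not the episode's) of the property usually highlighted: a B+ tree keeps all records in sorted, sibling-linked leaves, so a range scan descends the tree once and then walks leaf links sequentially, instead of re-traversing internal nodes. The `BPlusIndex` class and its two-level layout are deliberate simplifications for illustration.

```python
# Illustrative two-level B+-tree-like index: one internal level of
# separator keys over sorted, sibling-linked leaves.
import bisect

class Leaf:
    def __init__(self, keys):
        self.keys = sorted(keys)
        self.next = None  # sibling pointer: the B+ tree's key feature

class BPlusIndex:
    def __init__(self, keys, leaf_size=4):
        keys = sorted(keys)
        self.leaves = [Leaf(keys[i:i + leaf_size])
                       for i in range(0, len(keys), leaf_size)]
        for a, b in zip(self.leaves, self.leaves[1:]):
            a.next = b
        # separator keys: smallest key of each leaf after the first
        self.separators = [leaf.keys[0] for leaf in self.leaves[1:]]

    def range_query(self, lo, hi):
        """Descend once to the starting leaf, then follow sibling links."""
        leaf = self.leaves[bisect.bisect_right(self.separators, lo)]
        out = []
        while leaf is not None:
            for k in leaf.keys:
                if k > hi:
                    return out
                if k >= lo:
                    out.append(k)
            leaf = leaf.next  # sequential walk, no re-descent
        return out

idx = BPlusIndex(range(0, 100, 5))   # keys 0, 5, 10, ..., 95
print(idx.range_query(12, 31))       # -> [15, 20, 25, 30]
```

In a real B+ tree the internal levels form a multi-level search tree and leaves hold full records rather than bare keys, but the sibling-link walk in `range_query` is the mechanism that makes B+ trees the default index structure for database range scans.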

Microsoft Research India Podcast
Dependable IoT: Making data from IoT devices dependable and trustworthy for good decision making. With Dr. Akshay Nambi and Ajay Manchepalli

Microsoft Research India Podcast

Jun 14, 2021 · 27:31


Episode 009 | June 15, 2021

The Internet of Things has been around for a few years now and many businesses and organizations depend on data from these systems to make critical decisions. At the same time, it is also well recognized that this data, even up to 40% of it, can be spurious, and this obviously can have a tremendously negative impact on an organization's decision making. But is there a way to evaluate if the sensors in a network are actually working properly and that the data generated by them are above a defined quality threshold? Join us as we speak to Dr. Akshay Nambi and Ajay Manchepalli, both from Microsoft Research India, about their innovative work on making sure that IoT data is dependable and verified, truly enabling organizations to make the right decisions.

Akshay Nambi is a Senior Researcher at Microsoft Research India. His research interests lie at the intersection of Systems and Technology for Emerging Markets, broadly in the areas of AI, IoT, and Edge Computing. He is particularly interested in building affordable, reliable, and scalable IoT devices to address various societal challenges. His recent projects are focused on improving data quality in low-cost IoT sensors and enhancing the performance of DNNs on resource-constrained edge devices. Previously, he spent two years at Microsoft Research as a post-doctoral scholar, and he completed his PhD at the Delft University of Technology (TU Delft) in the Netherlands.

Ajay Manchepalli, as a Research Program Manager, works with researchers across Microsoft Research India, bridging research innovations to real-world scenarios. He received his Master's degree in Computer Science from Temple University, where he focused on database systems. After his Master's, Ajay spent the next 10 years shipping SQL Server products and managing their early-adopter customer programs.
Transcript

Ajay Manchepalli: The interesting thing that we observed in all these scenarios is how the entire industry is trusting data, and using this data to make business decisions, and they don't have a reliable way to say whether the data is valid or not. That was mind boggling. You're calling data as the new oil, we are deploying these things, and we're collecting the data and making business decisions, and you're not even sure if that data that you've made your decision on is valid. To us it came as a surprise that there wasn't enough already done to solve these challenges and that in some sense was the inspiration to go figure out what it is that we can do to empower these people, because at the end of the day, your decision is only as good as the data.

[Music]

Sridhar Vedantham: Welcome to the Microsoft Research India podcast, where we explore cutting-edge research that's impacting technology and society. I'm your host, Sridhar Vedantham.

[Music]

The Internet of Things has been around for a few years now and many businesses and organizations depend on data from these systems to make critical decisions. At the same time, it is also well recognized that this data, even up to 40% of it, can be spurious, and this obviously can have a tremendously negative impact on an organization's decision making. But is there a way to evaluate if the sensors in a network are actually working properly and that the data generated by them are above a defined quality threshold? Join us as we speak to Dr. Akshay Nambi and Ajay Manchepalli, both from Microsoft Research India, about their innovative work on making sure that IoT data is dependable and verified, truly enabling organizations to make the right decisions.

[Music]

Sridhar Vedantham: So, Akshay and Ajay, welcome to the podcast.
It's great to have you guys here. Akshay Nambi: Good evening Sridhar. Thank you for having me here. Ajay Manchepalli: Oh, I'm excited as well. Sridhar Vedantham: Cool, and I'm really keen to get this underway because this is a topic that's quite interesting to everybody, you know. When we talk about things like IoT in particular, this has been a term that's been around for quite a while, for many years now and we've heard a lot about the benefits that IoT can bring to us as a society or as a community, or as people at an individual level. Now you guys have been talking about something called Dependable IoT. So, what exactly is Dependable IoT and what does it bring to the IoT space? Ajay Manchepalli: Yeah, IoT is one area we have seen that is exponentially growing. I mean, if you look at the number of devices that are being deployed it's going into the billions and most of the industries are now relying on this data to make their business decisions. And so, when they go about doing this, we have, with our own experience, we have seen that there are a lot of challenges that comes in play when you're dealing with IoT devices. These are deployed in far off locations, remote locations and in harsh weather conditions, and all of these things can lead to reliability issues with these devices. In fact, the CTO of GE Digital mentioned that, you know, about 40% of all the data they see from these IoT devices are spurious, and even KPMG had a report saying that you know over 80% of CEOs are concerned about the quality of data that they're basing their decisions on. And we observed that in our own deployments early on, and that's when we realized that there is, there is a fundamental requirement to ensure that the data that is being collected is actually good data, because all these decisions are being based on the data. 
And since data is the new oil, we are basically focusing on, ok, what is it that we can do to help these businesses know whether the data they're consuming is valid or not and that starts at the source of the truth, which is the sensors and the sensor devices. And so Akshay has built this technology that enables you to understand whether the sensors are working fine or not. Sridhar Vedantham: So, 40% of data coming from sensors being spurious sounds a little frightening, especially when we are saying that you know businesses and other organizations base a whole lot of the decisions on the data they're getting, right? Ajay Manchepalli: Absolutely. Sridhar Vedantham: Akshay, was there anything you wanted to add to this? Akshay Nambi: Yeah, so if you see, reliability and security are the two big barriers in limiting the true potential of IoT, right? And over the past few years you would have seen IoT community, including Microsoft, made significant progress to improve security aspects of IoT. However, techniques to determine data quality and sensor health remain quite limited. Like security, sensor reliability and data quality are fundamental to realize the true potential of IoT which is the focus of our project- Dependable IoT. Sridhar Vedantham: Ok, so you know, once again, we've heard these terms like IoT for many years now. Just to kind of demonstrate what the two of you have been speaking about in terms of various aspects or various scenarios in which IoT can be deployed, could you give me a couple of examples where IoT use is widespread? Akshay Nambi: Right, so let me give an example of air pollution monitoring. So, air pollution is a major concern worldwide, and governments are looking for ways to collect fine grained data to identify and curb pollution. So, to do this, low-cost sensors are being used to monitor pollution levels. There have been deployed in numerous places on moving vehicles to capture the pollution levels accurately. 
The challenge with these sensors are that these are prone to failures, mainly due to the harsh environments in which they are deployed. For example, imagine a pollution sensor is measuring high pollution values right at a particular location. And given air pollution is such a local phenomenon, it's impossible to tell if this sensor data is an anomaly or a valid data without having any additional contextual information or sensor redundancy. And due to these reliability challenges the validity and viability of these low-cost sensors have been questioned by various users. Sridhar Vedantham: Ok, so it sounds kind of strange to me that sensors are being deployed all over the place now and you know, frankly, we all carry sensors on ourselves, right, all the time. Our phones have multiple sensors built into them and so on. But when you talk about sensors breaking down or being faulty or not providing the right kind of data back to the users, what causes these kind of things? I mean, I know you said in the context of, say, air pollution type sensors, you know it could be harsh environments and so on, but what are other reasons for, because of which the sensors could fail or sensor data could be faulty? Akshay Nambi: Great question, so sensors can go bad for numerous reasons, right? This could be due to sensor defect or damage. Think of a soil moisture sensor deployed in agricultural farm being run over by a tractor. Or it could be sensor drift due to wear and tear of sensing components, sensor calibration, human error and also environmental factors, like dust and humidity. And the challenge is, in all these cases, right, the sensors do not stop sending data but still continues to keep sending some data which is garbage or dirty, right? And the key challenge is it is nontrivial to detect if a remote sensor is working or faulty because of the following reasons. First a faulty sensor can mimic a non-faulty sensor data which is very hard to now distinguish. 
Second, to detect sensor faults, you can use sensor redundancy which becomes very expensive. Third, the cost and logistics to send a technician to figure out the fault is expensive and also very cumbersome. Finally, time series algorithms like anomaly detectors are not reliable because an anomaly need not imply it's a faulty data. Sridhar Vedantham: So, a quick question on one of the things that you've said. When you're talking about sensor redundancy, this just means that deploy multiple sensors, so if one fails then you use the other one or do you use data from the other one. Is that what that means? Akshay Nambi: Yeah, so sensor redundancy can be looked at both the ways. When one fails you could use the other, but it also be used to take the majority voting of multiple sensors in the same location. Going back to my air pollution example, if multiple sensors are giving very high values right, then you have high confidence in the data as opposed to thinking that is a faulty data. So that's how sensory redundancy is typically used today. Sridhar Vedantham: OK, and there you just have to take on faith that the data that you're getting from multiple sensors is actually valid. Akshay Nambi: Exactly, exactly. You never know that if all of them could have undergone the same fault. Sridhar Vedantham: Right. Ajay Manchepalli: It's interesting that when we think about how the industry tries to figure out if the sensors are working or not. There are three distinct approaches that we always observe, right? One is you have the sense of working, but you also try to use additional surrounding data. For example, let's say it's raining heavily, but your moisture sensor is indicating that the moisture level is low. That data doesn't align, right. The weather data indicates there's rains, but the moisture sensor is not giving you the right reading, so that's one way people can identify it's working or not. 
The other is what we just talked about, which is sensor redundancy- just to increase the number of sensors in that area and try to poll among a bunch of sensors. That also makes sense. And the third one is what typically you really can trust and that is you deploy someone out there physically, go look at the sensor and then have it tested. And if you start thinking about the scenarios we are talking about, which is remote locations, far away locations- imagine deploying sensors across the country, having to send people out and validate and things like that. There is cost associated to sending people as well as you have that sort of a down time, and so being able to, you know, remotely and reliably be able to say that the sensor is at fault, is an extremely empowering scenario. And as we look at this, it's not just sensor reliability, right? For example, if you think of a telephone, a landline, right, you have the dial tone which tells you if the phone is working or not, right? Similarly, we are trying to use certain characteristics in these sensors, that tells us if it's working or not. But the beauty of this solution is it's not just limited to being a dial tone for sensors, it is more than that. It not only tells you whether it is working or not, it can tell you if it is the sensor you intended to deploy. I mean, think of it this way. A company could work with a vendor and procure certain class of sensors and they have an agreement for that. And when these sensors are deployed, the actual sensors that get deployed may or may not be in that class of devices, intentionally or unintentionally, right? How do you know that? If we understand the nature of the sensor, we can actually remotely identify the type of sensor that is deployed and help industries essentially figure out whether the sensor that's deployed is the sensor you intended to. So it's more than just whether the sensor is working, you can identify it, you can even figure out things like data drift. 
This is a pretty powerful scenario that we are going after. Sridhar Vedantham: Right, and that's a lovely teaser for me to ask my next question. What exactly are you guys talking about and how do you do this? Akshay Nambi: Right, so our key value proposition right is basically a simple and easy way to remotely measure and observe the health of the sensor. The core technology behind this value proposition is the ability to automatically generate a fingerprint. When I say a fingerprint, what I'm referring to is the unique electrical characteristic, exhibited by these sensors, both analog and digital. Let me give an example. So, think of analog sensors which produce continuous output signal proportional to the quantity being measured. Our key insight here is that a sensor's voltage response right after powering down exhibits a unique characteristic, what we refer to as fall curve? This fall curve is dependent upon the sensor circuitry and the parasitic elements present in the sensor, thereby making it unique for each sensor type. So think of this as basically as fall curve acts as a reference signature for the sensor when it is working. And when the sensor goes bad, this fall curve drastically changes, and now by just comparing this fingerprint, we can tell whether a sensor is working or faulty. Ajay Manchepalli: The interesting part about the fingerprint that Akshay just mentioned is that it is all related to the physical characteristics of the sensors, right? You have a bunch of capacitors, resistors, all of those things put together to build the actual sensor device. And each manufacturer or each sensor type or each scenario would have a different circuit and because of that, when you power down this, because of its physical characteristics, you see different signatures. So this is a unique way of being able to identify not just what type of sensor, but even based on the manufacturer, because the circuitry for that particular manufacturer will be different. 
Sridhar Vedantham: So, just to clarify, when you're saying that sensors have unique fingerprints, are you talking about particular model of a sensor or a particular class of a sensor or a particular type of sensor? Akshay Nambi: Right, great question again. So, these fingerprints are unique for that particular type of sensors. For example, take soil moisture sensor from Seed Studio, for that particular type of sensor from that manufacturer, this signature remains the same. So all you have to do is for that manufacturer and for that sensor type you collect the fingerprint once. And then you can use that to compare against the operational fingerprints. Similarly, in case of digital sensors we use current drawn as a reference fingerprint to detect whether the sensor is working or not, and the key hypothesis here behind these fingerprints, is that when a sensor accumulates damage, we believe its physical properties also change, leading to a distinct current profile compared to that of a working sensor? And that's the key property behind developing these fingerprints and one of the key aspects of these fingerprints is also that this is unaffected by the external factors like environmental changes like temperature, humidity, right. So these fingerprints are unique for each sensor type and also are independent of the environmental changes. In that way, once you collect a fingerprint that should hold good irrespective of your scenario where you are deploying the sensor. Ajay Manchepalli: One other thing that I want to call out there is the beauty of this electrical signatures is based on the physical characteristics, right? 
So, it's not only when this device fails that the physical characteristics change, and hence the signature changes. The beauty of this is that over time, when things degrade, that implies that the physical characteristics of that sensor or device are also degrading, and when that happens, your electrical signature also shows that kind of degradation. That is very powerful, because now you can actually identify or track the kind of data drift that people are having. And when you observe such data drift, you can have calibration mechanisms to recalibrate the data that you're getting and continue to function while you send people out to get it rectified, and things like that. So, it almost gives you the ability to have a planned downtime, because you're not only seeing that the sensor has failed, but you are observing that the sensor will potentially fail down the line, and you can take corrective actions. Sridhar Vedantham: Right, so basically you're getting a heads up that something bad is going to happen with the sensor. Ajay Manchepalli: Exactly. Sridhar Vedantham: Great. And have you guys actually deployed this out in the field in real world scenarios and so on to figure out whether it works or not? Akshay Nambi: Yeah, so this technology is already deployed in hundreds of devices in the space of agricultural farms, water monitoring and air pollution monitoring. To give you a concrete example, we are working with a company called Respirer, who is using Dependable IoT technology to provide reliable, high-fidelity pollution data to its customers and also policymakers. So, for example, Respirer today is able to provide, for every data point they measure, the status of the sensor: whether the sensor is working or faulty. This way users can filter out faulty or drifted data before consuming it. This has significantly increased the credibility of such low-cost sensors and the data they are generating. 
And the key novelty to highlight again here is that we do this without any human intervention or redundancy. And in fact, if you think about it, we are not even looking at the sensor data. We are looking at these electrical characteristics, which are completely orthogonal to the data, to determine whether the sensor is working, faulty, or drifted. Ajay Manchepalli: The interesting part of this work is that we observed in multiple real-world scenarios that there was a real need for reliability of such sensors, and it was really impacting their function. For example, there is a team that's working on smart agriculture, and the project is called FarmBeats. In that case, we observed that they had these sensors deployed out in the fields, and out there in the farms you have harsh conditions, and sensors could easily get damaged, and they had to actually deploy people to go and figure out what the issue is. And it became very evident and clear that it was important for us to be able to solve that challenge of helping them figure out if the sensor is working or not, and the ability to do that remotely. So that was sort of the beginning, and maybe Akshay, you can talk about the other two projects that came after that. Akshay Nambi: Right, so another example is Respirer, who is using Dependable IoT technology to provide reliable, high-fidelity pollution data to its customers and policymakers. They now measure the sensor status every time they measure pollution data, to determine whether the data came from a working or a faulty or a drifted sensor. This way the users can filter out faulty or drifted data before they consume it. And this has significantly increased the credibility of low-cost sensors and the data they are measuring. To give another example, we're also working with Microsoft for Startups and Accenture for a particular NGO called Jaljeevika, which focuses on improving the livelihood of small-scale fish farmers. 
They have an IoT device that monitors the temperature, TDS, and pH of water bodies to provide advisories for fish farmers. Again, since these sensors are deployed in remote locations and farmers are relying on this data and the advice being generated, it is very critical to collect reliable data. And today Jaljeevika is using Dependable IoT technology to ensure the advisories generated are based on reliable IoT data. [Music] Sridhar Vedantham: Right, so this is quite inspiring, that you've actually managed to deploy these things in, you know, real life scenarios and it's already giving benefits to the people that you're working with. You know, what always interests me with research, especially when you have research that's deployed in the field- is there anything that came out of this that surprised you, in terms of learning, in terms of the outcome of the experiments that you conducted? Akshay Nambi: Yeah, so I can give you one concrete learning, right, going back to air pollution sensors. We have heard of partners identifying these sensors going bad within just a few weeks of deployment, and previously they had no way to figure out what was wrong with these sensors. Using our technology, in many cases they were able to pinpoint, yes, these are faulty sensors which needed replacement, right? And there was also another interesting scenario where the sensor was working well- it's just that because of dust, the sensor was showing wrong data. And we were able to diagnose that and inform the partner that all you have to do is just clean the sensor, which should bring it back to the normal state, as opposed to discarding it. So that was a great learning from the field for us. Ajay Manchepalli: The interesting thing that we observed in all these scenarios is how the entire industry is trusting data, and using this data to make business decisions, and they don't have a reliable way to say whether the data is valid or not. That was mind boggling. 
We're calling data the new oil, we are deploying these things, we're collecting the data and making business decisions, and you're not even sure if the data that you've based your decision on is valid. To us it came as a surprise that there wasn't enough already done to solve these challenges, and that in some sense was the inspiration to go figure out what it is that we can do to empower these people, because at the end of the day, your decision is only as good as the data. Sridhar Vedantham: Right. So, you know, one thing that I ask all my guests on the podcast is- the kind of work that you guys do and you're talking about is truly phenomenal. Is there any way for people outside of Microsoft Research or Microsoft to actually be able to use the research that you guys have done and to be able to deploy it themselves? Akshay Nambi: Yeah. Yeah. So all our work is in the public domain. We have published numerous top conference papers in the areas of IoT and sensors. And all of these are easily accessible from our project page, aka.ms/dependableIoT. And in fact, recently we also made our software code available through an SDK on GitHub, which we call Verified Telemetry. So IoT developers can now seamlessly integrate this SDK into their IoT device and get the sensor status readily. We have also provided multiple samples on how to integrate it with a device, how to use a solution sample, and so on. So if you are interested, please visit aka.ms/verifiedtelemetry to access our code. Sridhar Vedantham: Right, and it's also very nice when a research project name clearly and concisely says what it is all about. Verified Telemetry- it's a good name. Akshay Nambi: Thank you. Sridhar Vedantham: All right, so we're kind of coming to the end of the podcast. But before we, you know, kind of wind this thing up- what are you looking at in terms of future work? I mean, where do you go with this? 
Akshay Nambi: So, till now we have mostly focused on specific scenarios in environmental monitoring and so on, right? One area which we are thinking deeply about is autonomous and safety-critical systems. Imagine a faulty sensor in a self-driving vehicle, an autonomous drone, or an automated factory floor, where data from these sensors is used to take decisions without a human in the loop. In such cases, bad data leads to catastrophic decisions. Recently we explored one such safety-critical sensor, the smoke detector. As we all know, smoke detectors are deployed in numerous scenarios, right from hospitals to shopping malls to office buildings, and the key question which we went after is: how do you know if your smoke detector is working or not? To address this today, especially in hospitals, people do a manual routine maintenance check where a person uses an aerosol can to trigger the smoke alarm and then turns it off in the back end. Sridhar Vedantham: OK, that does not sound very efficient. Akshay Nambi: Exactly, and it's also a very laborious process that significantly limits the frequency of testing. And the key challenge, unlike with other sensors, is that you cannot notice failures unless there is a fire event or smoke. Sridhar Vedantham: Right. Akshay Nambi: Thus it is imperative to know whether your detector is working or not in a non-smoke condition. We have again developed a novel fingerprint which can do this, and this way we can detect if a sensor is working or faulty even before a fire event occurs and alert the operators in a timely manner. So for those who are interested and curious about how we do that, please visit our webpage and access the manuscript. Sridhar Vedantham: Yeah, so I will add links to the web page as well as to the GitHub repository in the transcript of this podcast. Akshay Nambi: Thank you. 
Sridhar Vedantham: Ajay, was there something you wanted to add to that? Ajay Manchepalli: Yeah, in all the early deployments that we have made, we have seen that sensor fault is one of the primary issues that comes into play, and that's what this has been addressing. But there are many other scenarios that come up that are very relevant and can empower these scenarios even further, and those are things like data drift, or observing that the sensors are not connected correctly to the devices, and some form of sensor identification. These are some of the things that we can extend on top of what we already have. And while they are incremental changes in terms of capability, the impact and the potential they can have for those scenarios is tremendous. And what keeps it exciting is that all the work that we are doing is driven by actual needs that we are seeing out there in the field. Sridhar Vedantham: Excellent work. And Akshay and Ajay, thank you so much once again for your time. Akshay Nambi: Thank you Sridhar. Great having this conversation with you. Ajay Manchepalli: Yep, thanks Sridhar. This is exciting work, and we can't wait to do more and share more with the world. [Music]  
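The fall-curve comparison described in the interview can be sketched in a few lines: record a reference voltage curve once for a known-good sensor type, then flag a sensor as faulty when its measured fall curve deviates from that reference by more than a threshold. This is an illustrative sketch of the idea only, not the Dependable IoT implementation; the RMS distance metric, the threshold, and the sample curves are all assumptions.

```python
def rms_distance(curve_a, curve_b):
    """Root-mean-square distance between two equally sampled voltage curves."""
    assert len(curve_a) == len(curve_b)
    return (sum((a - b) ** 2 for a, b in zip(curve_a, curve_b)) / len(curve_a)) ** 0.5

def classify_sensor(measured, reference, threshold=0.1):
    """Compare a measured fall curve against the reference fingerprint.

    Returns 'working' when the curves agree within the threshold, else 'faulty'.
    """
    return "working" if rms_distance(measured, reference) <= threshold else "faulty"

# Hypothetical reference fingerprint: a decaying voltage after power-down.
reference = [3.3 * (0.5 ** t) for t in range(10)]
healthy = [v * 1.01 for v in reference]   # small measurement noise
damaged = [v * 0.5 for v in reference]    # drastically changed decay

print(classify_sensor(healthy, reference))  # working
print(classify_sensor(damaged, reference))  # faulty
```

A real deployment would also distinguish "drifted" from "faulty" by tracking how this distance grows over time, as Ajay describes above.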

IGeometry
Optimizing Communication and Networking in Database Systems

IGeometry

Play Episode Listen Later May 18, 2021 41:25


In today's show, I discuss the nature of communication in database systems and how the pattern completely changed with the 3-tier web architecture. I also discuss whether multiplexing protocols such as HTTP/2 and QUIC can help alleviate some of the inefficiencies introduced. * Intro 0:00 * Communication Protocols 2:00 * 3-Tier Web Architecture 8:00 * Connection Pooling 14:50 * Database Connection Multiplexing 23:40 * Will Databases Handle High Concurrency 32:00 Support my work on PayPal https://bit.ly/33ENps4 Become a Member on YouTube https://www.youtube.com/channel/UC_ML5xP23TOWKUcc-oAE_Eg/join
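The connection pooling segment in this episode covers a pattern that can be illustrated with a minimal sketch: instead of opening one database connection per request, a small fixed set of connections is shared and reused. The `FakeConnection` class below is a hypothetical stand-in for a real driver connection; this is a simplified illustration, not any particular pooler's implementation.

```python
import queue

class FakeConnection:
    """Stand-in for a real database connection (illustrative only)."""
    opened = 0

    def __init__(self):
        FakeConnection.opened += 1  # count how many real connections were made

class ConnectionPool:
    def __init__(self, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(FakeConnection())

    def acquire(self):
        return self._idle.get()   # blocks when the pool is exhausted

    def release(self, conn):
        self._idle.put(conn)      # return the connection for reuse

pool = ConnectionPool(size=2)
for _ in range(100):              # 100 "requests" are served by just 2 connections
    conn = pool.acquire()
    pool.release(conn)

print(FakeConnection.opened)  # 2
```

The point of the pattern is visible in the final count: connection setup (TCP handshake, authentication) happens twice, not a hundred times.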

IGeometry
Write Amplification Explained in Backend Apps, Database Systems and SSDs

IGeometry

Play Episode Listen Later Apr 5, 2021 22:22


Write amplification is a phenomenon where the writes that physically happen are multiples of the writes actually desired. In this episode, I'll discuss 3 types of write amplification and their effects on the performance and lifetime of storage media. 0:00 intro 2:00 Application write amplification 4:30 Database write amplification 9:30 SSD Disk write amplification 16:00 SSD hates BTrees 20:00 summary Resources https://en.wikipedia.org/wiki/Write_amplification https://www.cybertec-postgresql.com/en/hot-updates-in-postgresql-for-better-performance/ https://youtu.be/5Mh3o886qpg --- Send in a voice message: https://anchor.fm/hnasr/message
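The write amplification factor the episode describes is simply the ratio of bytes physically written to bytes the application intended to write. A toy calculation (the page and block sizes are made-up but typical) shows how it compounds across the layers in the episode outline:

```python
def write_amplification(physical_bytes, logical_bytes):
    """WA = physically written bytes / logically requested bytes."""
    return physical_bytes / logical_bytes

# Hypothetical example: the application updates 100 bytes ...
logical = 100
# ... the database rewrites a whole 8 KiB page (e.g. a B-tree leaf) ...
db_physical = 8 * 1024
# ... and the SSD, to update that page, erases and rewrites a 256 KiB block.
ssd_physical = 256 * 1024

db_wa = write_amplification(db_physical, logical)           # 81.92x
end_to_end_wa = write_amplification(ssd_physical, logical)  # 2621.44x
print(db_wa, end_to_end_wa)
```

The same 100-byte logical update ends up costing orders of magnitude more physical I/O, which is why the episode treats application, database, and SSD amplification as multiplying layers.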

BI or DIE
56. Insights SAP and Strategy SAP DWC - Talking with Mohamed Abdel Hadi, SAP | EN

BI or DIE

Play Episode Listen Later Jan 28, 2021 51:23


As Global Vice President for SAP Data Warehouse Product Management & Strategy, Mohamed Abdel Hadi (Mo) is responsible for the entire data warehouse and on-premise portfolio. He has been focusing on analytics and data warehousing at SAP for more than 11 years now. With this background, he provides a strong technical, market and customer focused vision for new products and technology driven development projects. Leading a global team, he focuses on delivering best-in-class analytics and data warehousing products to customers and strengthening SAP's position in the converging data and analytics markets. Mo graduated from Flensburg University of Applied Sciences with a degree in Business Intelligence, Database Systems and Algorithms. He lives in Frankfurt am Main, Germany and enjoys traveling to exciting countries in his spare time. Furthermore, he performs magic tricks, challenges friends in table tennis and loves spending time with them. In our chat with Mo, we discussed the cultural specifics of projects around the globe. For whom do references count, who is into comprehensive POCs, who are early adopters of the latest products, etc.? In addition, it becomes clear why SAP has taken the path towards the Data Warehouse Cloud (DWC), why SAP focuses especially on customer use cases, and why it shows openness towards other product vendors. There are clear project recommendations for customers who want to use SAP Analytics Cloud and Data Warehouse Cloud. But we also talk about existing customers who are using SAP BW or older frontend tools and discuss possible implementation approaches for the new SAP products.

IGeometry
Pessimistic concurrency control vs Optimistic concurrency control in Database Systems Explained

IGeometry

Play Episode Listen Later Aug 20, 2020 16:00


In this video, I discuss the different concurrency control methods for database transactions, specifically pessimistic vs optimistic concurrency control, and the pros and cons of each. 0:00 Intro 3:00 Concurrency Control 5:30 Pessimistic Concurrency Control 9:20 Optimistic Concurrency Control Resources https://en.wikipedia.org/wiki/Optimistic_concurrency_control https://www.baeldung.com/java-jpa-transaction-locks https://docs.oracle.com/javaee/7/api/javax/persistence/OptimisticLockException.html https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use https://www.2ndquadrant.com/en/blog/postgresql-anti-patterns-read-modify-write-cycles/ --- Send in a voice message: https://anchor.fm/hnasr/message
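The optimistic approach covered in this episode can be sketched with a version column: read the row and remember its version, do the work without holding a lock, and commit only if the version is unchanged, retrying on conflict. This is a minimal in-memory sketch of the validate-then-write idea, not a real transaction manager; the `Row` class and retry policy are illustrative assumptions.

```python
class Row:
    """A single versioned record, as an optimistic scheme would store it."""
    def __init__(self, value):
        self.value = value
        self.version = 0

def optimistic_update(row, transform, max_retries=10):
    """Read-modify-write guarded by a version check instead of a lock."""
    for _ in range(max_retries):
        seen_version = row.version
        new_value = transform(row.value)   # work done without blocking other writers
        if row.version == seen_version:    # validate: nobody wrote in between
            row.value = new_value
            row.version += 1
            return True
        # Conflict: another writer bumped the version; loop and retry from fresh state.
    return False

row = Row(value=10)
optimistic_update(row, lambda v: v + 5)
print(row.value, row.version)  # 15 1
```

The pessimistic alternative would instead take a lock before the read, trading the cost of retries for the cost of blocking, which is the trade-off the episode weighs.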

How I Launched This: A SaaS Story
Open Source In-Memory Database Systems with Redis Labs Chief Product Officer Alvin Richards

How I Launched This: A SaaS Story

Play Episode Listen Later Aug 16, 2020 50:07 Transcription Available


We're back with How I Launched This: A SaaS story! This week, Stephanie Wong (@swongful) talks with Alvin Richards about Redis Labs, a company optimizing the Redis open source in-memory database system to build better managed tools for enterprise clients. Alvin begins the show describing how his love of solving complex development problems and great people skills have put him in a unique position to act as intermediary between engineers and clients, gaining insights into real-world problems and how to solve them. Looking to the future, Alvin's team also anticipates client needs, creating database products that will continue to help clients as their projects evolve. Later in the show, Alvin describes how the Redis system built in the cloud was reworked to also provide on-prem offerings. We learn how Redis Labs was able to fill a gap in the market by offering a database product that both developers and clients could understand, adapt, and use. Alvin introduces us to other Redis Labs products, including Redis for Enterprise, which allows tiering between memory forms, in-memory caching, scaling, and more for a flexible database experience. We wrap up the show with a discussion of what it's like coordinating the development of such a large open source project and why Redis Labs supports open source. Alvin offers advice to other companies, stressing the importance of building solutions with both the creator and client in mind and educating clients and developers to use the software effectively. We talk about the future of open source in SaaS companies and how important it will be for scaling SaaS technology. Alvin concludes by encouraging everyone to ultimately find joy in what they do. Episode Links: Redis, Redis Labs, Docker, Memorystore, MongoDB, Elastic, Redis University

The Women in Tech Show: A Technical Podcast

Many applications we use are distributed systems. Some examples are ATM systems, social media, and currency trading. Depending on the type of application, we might need to process system events in a logical order, also known as causal order. Aly Cabral, Lead Product Manager at MongoDB, explained how causal order is implemented in a database system. We talked about several design implementations that she and her team at MongoDB considered. Aly explained concepts like distributed systems, clock synchronization, and dependency tracking. MongoDB is a sponsor of the show.
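Causal ordering of the kind discussed here is commonly implemented with logical clocks rather than wall-clock time. Below is a minimal Lamport-clock sketch: local events tick the clock, and receiving a message advances the receiver past the sender's timestamp, so every receive is ordered after its send. This is a textbook illustration, not MongoDB's actual mechanism (which, as the episode discusses, involves its own clock-synchronization design).

```python
class LamportClock:
    """Logical clock: local events tick, received messages merge clocks."""
    def __init__(self):
        self.time = 0

    def tick(self):
        """A local event happened."""
        self.time += 1
        return self.time

    def send(self):
        """Tick and return the timestamp to attach to the outgoing message."""
        return self.tick()

    def receive(self, msg_time):
        """Merge the sender's timestamp so this event is causally after the send."""
        self.time = max(self.time, msg_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
a.tick()            # a: 1  (some local event)
t = a.send()        # a: 2, message carries t=2
b.receive(t)        # b: max(0, 2) + 1 = 3
print(a.time, b.time)  # 2 3
```

Comparing timestamps then yields an ordering consistent with causality: the receive (3) sorts after the send (2).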

ERP-Podcast.de
#98b - Eine kurze Geschichte der Datenbanken: Ein Interview mit Prof. Dr. Gottfried Vossen

ERP-Podcast.de

Play Episode Listen Later Jun 18, 2019 41:29


Databases are meant to manage large volumes of data permanently and as efficiently as possible. What began, the day before yesterday so to speak, with hierarchical and relational databases has long since flowed into distributed database concepts, big-data approaches, and in-memory technologies. In the first part, I talk with my colleague Gottfried Vossen about the historical challenges and the current state of the art. In the second part, my colleague Gottfried Vossen and I discuss new approaches, as well as the potential of mobile databases, artificial intelligence, and data marketplaces. Enjoy! Related episodes: #25: Einblicke in die SAP #34: Der Goldfisch im Internet #58: Technologiemigration #97: Das Kettengespenst - ein Einblick in die aktuellen Entwicklungen im Bereich Blockchain Recommended reading: Lemahieu, W., vanden Broucke, S. & Baesens, B.: Principles of Database Management. 2018. https://www.amazon.de/Principles-Database-Management-Practical-Analyzing/dp/1107186129 Elmasri, R. & Navathe, S. B.: Database Systems. 2016. https://www.amazon.de/Database-Systems-Ramez-Elmasri/dp/1292097612/ref=sr_1_1?qid=1559588206&refinements=p_27%3AShamkant+B.+Navathe&s=books&sr=1-1&text=Shamkant+B.+Navathe Chen, P.P.: The Entity-Relationship Model. 1976. http://bit.csc.lsu.edu/~chen/pdf/erd-5-pages.pdf Codd, E.F.: A Relational Model of Data for Large Shared Data Banks. 1970. https://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf If you enjoy our episodes, we would appreciate a 5-star rating so that others become aware of this podcast and we can keep improving it. Time required: 1-2 minutes. Link to the page here. In this spirit: keep connected. Warm regards, Axel Winkelmann

ERP-Podcast.de
#98a - Eine kurze Geschichte der Datenbanken: Ein Interview mit Prof. Dr. Gottfried Vossen

ERP-Podcast.de

Play Episode Listen Later Jun 11, 2019 39:11


Databases are meant to manage large volumes of data permanently and as efficiently as possible. What began, the day before yesterday so to speak, with hierarchical and relational databases has long since flowed into distributed database concepts, big-data approaches, and in-memory technologies. In the first part, I talk with my colleague Gottfried Vossen about the historical challenges and the current state of the art. In the second part, my colleague Gottfried Vossen and I discuss new approaches, as well as the potential of mobile databases, artificial intelligence, and data marketplaces. Enjoy! Related episodes: #25: Einblicke in die SAP #34: Der Goldfisch im Internet #58: Technologiemigration #97: Das Kettengespenst - ein Einblick in die aktuellen Entwicklungen im Bereich Blockchain Recommended reading: Lemahieu, W., vanden Broucke, S. & Baesens, B.: Principles of Database Management. 2018. https://www.amazon.de/Principles-Database-Management-Practical-Analyzing/dp/1107186129 Elmasri, R. & Navathe, S. B.: Database Systems. 2016. https://www.amazon.de/Database-Systems-Ramez-Elmasri/dp/1292097612/ref=sr_1_1?qid=1559588206&refinements=p_27%3AShamkant+B.+Navathe&s=books&sr=1-1&text=Shamkant+B.+Navathe Chen, P.P.: The Entity-Relationship Model. 1976. http://bit.csc.lsu.edu/~chen/pdf/erd-5-pages.pdf Codd, E.F.: A Relational Model of Data for Large Shared Data Banks. 1970. https://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf If you enjoy our episodes, we would appreciate a 5-star rating so that others become aware of this podcast and we can keep improving it. Time required: 1-2 minutes. Link to the page here. In this spirit: keep connected. Warm regards, Axel Winkelmann

LeadGenius Radio
How to create products you have complete control over.

LeadGenius Radio

Play Episode Listen Later May 16, 2019 22:57


Mark Godley, CEO of LeadGenius, chats with Maria Grineva, CEO and Co-Founder at Orb Intelligence, about her experiences as a serial entrepreneur in the data space over the last decade. What sets Maria apart from the pack is that she has built her companies without venture capital funds. This has allowed Maria to create companies at her own pace, grow organically, and create products that she and her team have complete control over. Maria has a perspective on the industry that we all wish we had...pure and free! Join us this month to get the scoop on how Maria is growing and expanding Orb Intelligence on her own terms. About our guest, Maria Grineva: Maria Grineva (Grin-yella) is a computer scientist and tech entrepreneur. Over the last 10 years, she has built two successful technology businesses. Currently, as Founder and CEO of Orb Intelligence, Maria's mission is to bring high-quality firmographic data, and advanced tools to use it, to fuel B2B advertising, marketing, and sales. Before starting Orb, Maria worked as a Senior Scientist at Yandex, where she led a team of engineers and scientists building a social search product called Wonder. In 2009, Maria co-founded TweetedTimes, a personalized news service which was acquired by Yandex in 2011. In 2010, Maria co-founded Teralytics, a big-data consulting service for Swiss corporations. Maria has a Ph.D. in Computer Science from the Russian Academy of Sciences with a specialization in database systems, followed by two years at ETH Zurich as a postdoc in the Database Systems group. She has published her research work at prestigious academic conferences: WWW, SIGMOD, VLDB, and others.

GALs  (Audio) - Channel 9
Interview with Harini Gupta, Senior Program Manager at Microsoft

GALs (Audio) - Channel 9

Play Episode Listen Later Dec 12, 2017 20:23


In this episode of GALs, Soumow interviews Harini Gupta, Senior Program Manager at Microsoft. Harini has been working at Microsoft for 11 years, with plenty of experience in developing and shipping software that is used by billions of people. She has spent most of her career as a software developer/engineer, switched roles in the last 2 years, and is currently a Program Manager within the Database Systems organization. Her team is responsible for building tools & services that enable enterprises to modernize their data platform. She has been invited as a speaker at the STARWEST/STARCANADA software quality engineering conferences to talk about various software agility topics. In addition to Harini's day job as a software person, she is actively engaged in many hobbies and interests. Harini is very passionate about having an impact for a humane cause and giving back to the community that we live in. Since 2015, she has been involved in helping out with Making Connections at the University of Washington Women's Center. This program is aimed at increasing college enrollment and career interest in STEM (Science, Technology, Engineering, and Math) fields for underprivileged children. Some of her responsibilities include mentoring students, being an MS campus tour guide for middle/high school students, and attending annual fundraising events. Harini has also had an immense love for animals since she was a little girl. She has expressed her caring and tenderness for animals in many ways over the years – whether it was growing up with a dog/pet, nurturing and feeding lost kittens, returning a lost dog to its family, dog sitting for friends, or volunteering at animal shelters. She was also fortunate to have several volunteer opportunities at animal shelters, where her responsibilities included dog walking and cleaning up. 
Currently Harini serves as a board member of a local non-profit animal shelter (Homeward Pet Adoption Center). On the personal front, Harini lives in Redmond with her husband, who is also in the software industry; two children who are 9 and 5 years old; a furry child, a Golden Retriever puppy; and 2 fish. When she is not working at Microsoft or volunteering, her time and energy is spent with her family. They love biking, reading, gardening & grilling (love Seattle summers!). Harini also enjoys any and all cardio activities such as Zumba, Yoga, Running, and other exercises. Follow Soumow and Harini on Twitter: @gupta_harini, @soumow

GALs   - Channel 9
Interview with Harini Gupta, Senior Program Manager at Microsoft

GALs - Channel 9

Play Episode Listen Later Dec 12, 2017 20:23


In this episode of GALs, Soumow interviews Harini Gupta, Senior Program Manager at Microsoft. Harini has been working at Microsoft for 11 years, with plenty of experience in developing and shipping software that is used by billions of people. She has spent most of her career as a software developer/engineer, switched roles in the last 2 years, and is currently a Program Manager within the Database Systems organization. Her team is responsible for building tools & services that enable enterprises to modernize their data platform. She has been invited as a speaker at the STARWEST/STARCANADA software quality engineering conferences to talk about various software agility topics. In addition to Harini's day job as a software person, she is actively engaged in many hobbies and interests. Harini is very passionate about having an impact for a humane cause and giving back to the community that we live in. Since 2015, she has been involved in helping out with Making Connections at the University of Washington Women's Center. This program is aimed at increasing college enrollment and career interest in STEM (Science, Technology, Engineering, and Math) fields for underprivileged children. Some of her responsibilities include mentoring students, being an MS campus tour guide for middle/high school students, and attending annual fundraising events. Harini has also had an immense love for animals since she was a little girl. She has expressed her caring and tenderness for animals in many ways over the years – whether it was growing up with a dog/pet, nurturing and feeding lost kittens, returning a lost dog to its family, dog sitting for friends, or volunteering at animal shelters. She was also fortunate to have several volunteer opportunities at animal shelters, where her responsibilities included dog walking and cleaning up. 
Currently Harini serves as a board member of a local non-profit animal shelter (Homeward Pet Adoption Center). On the personal front, Harini lives in Redmond with her husband, who is also in the software industry; two children who are 9 and 5 years old; a furry child, a Golden Retriever puppy; and 2 fish. When she is not working at Microsoft or volunteering, her time and energy is spent with her family. They love biking, reading, gardening & grilling (love Seattle summers!). Harini also enjoys any and all cardio activities such as Zumba, Yoga, Running, and other exercises. Follow Soumow and Harini on Twitter: @gupta_harini, @soumow

Data Skeptic Bonus Feed
Rohan Kumar, GM for the Database Systems Group at Microsoft

Data Skeptic Bonus Feed

Play Episode Listen Later Jun 12, 2017 28:48


Discussion of database as a service, database migration, threat detection, R/python in SQL Server, and use cases

CTO Connection
From Startup CTO to Acquired by Google — Anant Jhingran, CTO @Apigee

CTO Connection

Play Episode Listen Later Feb 16, 2017 39:33


Anant Jhingran has been in the tech industry for 27 years and has a Ph.D. in Database Systems from Berkeley. He's been at IBM, CTO of Apigee, and then, after Apigee was acquired, at Google. Apigee manages an API & data layer for various large enterprises. They look at the data exhaust of those systems and understand patterns so people can improve what they're doing through the APIs. Apigee was recently acquired by Google, and Anant is working on integrating its technologies into Google's infrastructure.

Three Devs and a Maybe
102: Postgres Performance Tuning and Query Planner with Bruce Momjian

Three Devs and a Maybe

Play Episode Listen Later Jun 11, 2016 54:48


In this week's episode we are very lucky to be joined by Bruce Momjian to discuss Postgres performance tuning and the query planner. We start off discussing how Bruce got interested in database systems, a brief history of Postgres, and his involvement with the project over the years. Following this, we highlight the three main areas which affect database performance - hardware, server configuration, and SQL/indexing. With this knowledge in hand, we then delve into the query planner, demystifying some of the terminology and concepts used (i.e. cost, scan methods and join methods). Finally, we summarise how these concepts are used by Postgres to decide which query plan to pick for a supplied query.
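The planner's core job described in this episode, picking the cheapest plan by estimated cost, can be caricatured in a few lines. The constants below echo the spirit of Postgres's default `seq_page_cost` (1.0) and `random_page_cost` (4.0), but the cost model itself is a deliberate simplification for illustration, not the real planner's formulas.

```python
SEQ_PAGE_COST = 1.0     # cost of one sequential page read (cf. Postgres seq_page_cost)
RANDOM_PAGE_COST = 4.0  # cost of one random page read (cf. Postgres random_page_cost)

def seq_scan_cost(table_pages):
    """A sequential scan reads every page of the table, in order."""
    return table_pages * SEQ_PAGE_COST

def index_scan_cost(matching_rows, index_pages=3):
    """Descend the index, then fetch one (random) heap page per matching row."""
    return index_pages * RANDOM_PAGE_COST + matching_rows * RANDOM_PAGE_COST

def choose_plan(table_pages, total_rows, selectivity):
    """Pick the scan method with the lower estimated cost."""
    matching = total_rows * selectivity
    if index_scan_cost(matching) < seq_scan_cost(table_pages):
        return "Index Scan"
    return "Seq Scan"

# Selective predicate: the index wins. Unselective predicate: sequential scan wins.
print(choose_plan(table_pages=1000, total_rows=100_000, selectivity=0.0001))  # Index Scan
print(choose_plan(table_pages=1000, total_rows=100_000, selectivity=0.5))     # Seq Scan
```

This is why, as Bruce explains, the same query can flip between scan methods as statistics (row counts, selectivity estimates) change.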

Rebuild
127: Post-mature Optimization (omo)

Rebuild

Play Episode Listen Later Jan 19, 2016 95:04


We welcomed Hajime Morita as a guest and talked about f.lux, Michael Stonebraker, GitHub, JIRA, performance optimization, and more. Show Notes Facebook admits that its app is draining your iPhone's battery 意見を持つ iOS 9.3 f.lux Sherlock Twilight Night Light mode in Google Play Books Readings in Database Systems, 5th Edition Michael Stonebraker dear-github: An open letter to GitHub from the maintainers of open source projects JIRA vs GitHub issues Python Dev Moving to GitHub Performance is a Feature John Rauser, "Investigating Anomalies" The Tail at Scale Accidentally Quadratic 開発環境のデータをできるだけ本番に近づける - クックパッド開発者ブログ Fastly | Real-time CDN Velocity 2011: Artur Bergman Systrace | Android Developers Dianne Hackborn (Architect of Android platfrom) ChakraCore/perftest.pl

IS 301 course podcasts & RSS feeds by Prof. Ed Nickel
Database Systems and Business Intelligence Week 5

IS 301 course podcasts & RSS feeds by Prof. Ed Nickel

Play Episode Listen Later Sep 23, 2013


Databases, data warehouses, data marts, and data mining as information sources to make intelligent business decisions.

Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Similarity search and data mining techniques for advanced database systems.

Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02

Play Episode Listen Later Dec 21, 2006


Modern automated methods for measurement, collection, and analysis of data in industry and science are providing more and more data with drastically increasing structure complexity. On the one hand, this growing complexity is justified by the need for a richer and more precise description of real-world objects, on the other hand it is justified by the rapid progress in measurement and analysis techniques that allow the user a versatile exploration of objects. In order to manage the huge volume of such complex data, advanced database systems are employed. In contrast to conventional database systems that support exact match queries, the user of these advanced database systems focuses on applying similarity search and data mining techniques. Based on an analysis of typical advanced database systems — such as biometrical, biological, multimedia, moving, and CAD-object database systems — the following three challenging characteristics of complexity are detected: uncertainty (probabilistic feature vectors), multiple instances (a set of homogeneous feature vectors), and multiple representations (a set of heterogeneous feature vectors). Therefore, the goal of this thesis is to develop similarity search and data mining techniques that are capable of handling uncertain, multi-instance, and multi-represented objects.

The first part of this thesis deals with similarity search techniques. Object identification is a similarity search technique that is typically used for the recognition of objects from image, video, or audio data. Thus, we develop a novel probabilistic model for object identification. Based on it, two novel types of identification queries are defined. In order to process the novel query types efficiently, we introduce an index structure called Gauss-tree. In addition, we specify further probabilistic models and query types for uncertain multi-instance objects and uncertain spatial objects. Based on the index structure, we develop algorithms for an efficient processing of these query types. Practical benefits of using probabilistic feature vectors are demonstrated on a real-world application for video similarity search. Furthermore, a similarity search technique is presented that is based on aggregated multi-instance objects, and that is suitable for video similarity search. This technique takes multiple representations into account in order to achieve better effectiveness.

The second part of this thesis deals with two major data mining techniques: clustering and classification. Since privacy preservation is a very important demand of distributed advanced applications, we propose using uncertainty for data obfuscation in order to provide privacy preservation during clustering. Furthermore, a model-based and a density-based clustering method for multi-instance objects are developed. Afterwards, original extensions and enhancements of the density-based clustering algorithms DBSCAN and OPTICS for handling multi-represented objects are introduced. Since several advanced database systems like biological or multimedia database systems handle predefined, very large class systems, two novel classification techniques for large class sets that benefit from using multiple representations are defined. The first classification method is based on the idea of a k-nearest-neighbor classifier. It employs a novel density-based technique to reduce training instances and exploits the entropy impurity of the local neighborhood in order to weight a given representation. The second technique addresses hierarchically-organized class systems. It uses a novel hierarchical, supervised method for the reduction of large multi-instance objects, e.g. audio or video, and applies support vector machines for efficient hierarchical classification of multi-represented objects. User benefits of this technique are demonstrated by a prototype that performs a classification of large music collections. The effectiveness and efficiency of all proposed techniques are discussed and verified by comparison with conventional approaches in versatile experimental evaluations on real-world datasets.
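One of the abstract's themes, comparing multi-instance objects (sets of homogeneous feature vectors), can be illustrated with the sum-of-minimum-distances measure that is common in multi-instance similarity search. This is a generic sketch, not the thesis's actual algorithms, and the "clip" data is invented for the example:

```python
import math

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def sum_of_min_distances(obj_a, obj_b):
    """Symmetric multi-instance distance: each instance is matched to its
    nearest neighbor in the other object, and the averages of both
    directions are combined."""
    forward = sum(min(euclidean(a, b) for b in obj_b) for a in obj_a)
    backward = sum(min(euclidean(b, a) for a in obj_a) for b in obj_b)
    return (forward / len(obj_a) + backward / len(obj_b)) / 2.0

# Two multi-instance objects, e.g. feature vectors of key frames of video clips.
clip1 = [(0.0, 0.0), (1.0, 1.0)]
clip2 = [(0.0, 0.1), (1.0, 0.9)]   # similar to clip1
clip3 = [(5.0, 5.0), (6.0, 6.0)]   # far from clip1
assert sum_of_min_distances(clip1, clip2) < sum_of_min_distances(clip1, clip3)
```

The measure is robust to instance ordering and to objects with different numbers of instances, which is why set-based distances of this kind are a natural fit for multi-instance data.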

Microcomputers and Applications CSIS1180007 Fall 2006
Lecture Twelve - Database Systems 120506


Play Episode Listen Later Dec 8, 2006


Database and Information Systems: what is a database and why it is beneficial to use databases; components of a database; database management systems; data warehouses and data marts. Running time: 39:48. This lecture was given on December 5, 2006.
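The components the lecture lists (tables, a database management system, and queries feeding warehouse-style aggregation) can be sketched with Python's built-in sqlite3 module. The table and data below are illustrative assumptions, not taken from the lecture:

```python
import sqlite3

# An in-memory database: sqlite3 plays the role of the DBMS, managing
# storage, the schema, and query execution for us.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 100.0), ("east", 250.0), ("west", 80.0)])

# Aggregations like this, run over historical data, are the kind of question
# a data warehouse or data mart is built to answer at much larger scale.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('east', 350.0), ('west', 80.0)]
conn.close()
```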

CERIAS Security Seminar Podcast
Marianne Winslett, Traust and PeerTrust2: Applying Trust Negotiation to Real Systems


Play Episode Listen Later Apr 20, 2005 54:02


Automated trust negotiation is an approach to authorization for open systems, i.e., systems where resources are shared across organizational boundaries. Automated trust negotiation enables open computing by assigning an access control policy to each resource that is to be made accessible to "outsiders"; an attempt to access the resource triggers a trust negotiation, consisting of the iterative, bilateral disclosure of digital credentials and related information. In our recent work in applying the TrustBuilder system for trust negotiation to real-world systems, we have encountered the need to make trust negotiation facilities available to legacy peers, which has led to the development of the Traust system. We have also encountered the need to include helpful third parties in the negotiation process, such as credential wallets, remote authorization servers, and brokers. PeerTrust2 is our effort to design a language that allows us to reason about trust negotiations involving helpful third parties, while supporting exposure control, delegation, proof hints, declarations of purpose, sensitive policies, and other potentially useful aspects of access control. In this talk, I will demonstrate Traust and describe its internal design, and then describe PeerTrust2.

About the speaker: Marianne Winslett has been a professor at the University of Illinois at Urbana-Champaign since 1987. Her current research interests include security in open systems and data management for high-performance parallel scientific applications. She was an editor for ACM Transactions on Database Systems from 1994 to 2004, and has been the vice-chair of ACM SIGMOD since 2000. She received an NSF Presidential Young Investigator Award in 1989.
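The iterative, bilateral disclosure of credentials described above can be sketched as a toy negotiation loop. This is a simplified illustrative model; the `Party` class, its policies, and the credential names are assumptions made for the sketch, not the TrustBuilder or PeerTrust2 design:

```python
class Party:
    """One side of a negotiation: credentials it holds, plus a release
    policy per credential (what the other side must show first)."""
    def __init__(self, name, credentials, policies):
        self.name = name
        self.credentials = credentials
        self.policies = policies        # credential -> required set; absent = public
        self.disclosed = set()

    def disclosable(self, seen):
        """Credentials whose release policy is satisfied by what we've seen."""
        return {c for c in self.credentials
                if c not in self.disclosed and self.policies.get(c, set()) <= seen}

def negotiate(client, server, target):
    """Alternate disclosures until the server can release `target`, or
    neither side can disclose anything new (negotiation fails)."""
    seen_by_client, seen_by_server = set(), set()
    while target not in seen_by_client:
        step = client.disclosable(seen_by_client)
        client.disclosed |= step
        seen_by_server |= step
        gain = server.disclosable(seen_by_server)
        if not step and not gain:
            return False                # deadlock: no progress possible
        server.disclosed |= gain
        seen_by_client |= gain
    return True

# The resource is guarded by a policy; the client's badge is itself sensitive,
# so the server must identify itself first.
server = Party("server", {"server_id", "resource_ticket"},
               {"resource_ticket": {"employee_badge"}})
client = Party("client", {"employee_badge"},
               {"employee_badge": {"server_id"}})
assert negotiate(client, server, "resource_ticket")
```

The loop terminates because disclosures only grow: either a round adds a credential on some side, or the negotiation is declared failed. Real systems add exactly the machinery the talk covers, such as third parties, exposure control, and sensitive policies.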