Microsoft Power BI: AI powered analytics - BRK2015
Good morning. My name is Bogdan. I run the engineering team that is building the AI and dataflows features in Power BI, and I have my colleague Justyna Lucznik with me. Thank you very much for showing up early in the morning. I understand that probably half of you live west of here, so it must be even earlier for you than it is for us. We love seeing so many people here, we love the interest that we've seen in the community for the AI features, and we hope that you are going to find something useful in our presentation today.
So, what we'll cover today: we'll talk just a bit about AI in Power BI and what it means, we are going to show you a few demos of what we intend to offer the analyst, and then we'll show a few more demos for end users.
I don't know if you are familiar with this, but Microsoft has been investing in machine learning and AI for BI for about 20 years. The first features related to machine learning and AI appeared in 1999, in a product called SQL Server 2000. Microsoft has one of the best research organizations in the world, and we have some of the most talented data scientists across everything AI. We have a very successful BI product, and we decided to bring these together: we decided to give BI users the power of everything that Microsoft Research produces. This is the main reason for our investment here. We also believe that AI is transformational for everything, particularly for the BI industry, because we try to enable you to do more with your data, to find insights, and to find things that are difficult to discover otherwise.
Cool. Before I start, can you guys hear us OK? Because I saw some people... yeah, awesome, all right, thank you. OK. So, just to give a bit of an overview before a deep dive into what kinds of capabilities we provide and which personas we cater towards. And from
the Power BI world, we really try to cater to the whole spectrum of users, from our business users to our analysts to our data scientists, but to cater to them in slightly different ways. If we start with the end users, all the way here on the left-hand side, our mission really is: how can we just provide insights for our business users? How can we allow them to proactively find insights without having to slice and dice all the data themselves, and just be able to learn new things from their data? Similarly, from a natural language perspective, we want it to be very easy and intuitive to get answers out of your data. If I just want to type in a question, that should be super seamless for me inside Power BI.
Moving on to the analyst persona, it's really about providing our analysts with a toolkit of capabilities so they can do more with things like machine learning in Power BI. But our analysts are not our data science persona. It's not about enabling them with Jupyter notebooks and having them write Python code. It's really about how we can bring in the power of Azure services in particular. We have so much going on in Azure with Cognitive Services, with automated ML, with Azure ML. How can we bring a lot of those capabilities to our analysts, but in ways that are familiar and intuitive to them? How can they just use one-click functions and operations to do things like text analytics, or to build their own machine learning models directly inside Power BI? For
our data scientists, we're not really trying to compete with tools like Python or R, or with things like the Azure Machine Learning service. We're using Power BI more as a way to complement those users: being able to integrate with R and with Python, and giving data scientists the ability to share their findings with business users, leaving them in the tools that they're familiar with while allowing them to work with Power BI too.
For the rest of the presentation we're mostly going to do a deep dive into demos, so it's going to be very demo driven, and it's going to be split in two. Bogdan is going to cover a lot of the analyst capabilities, and we're going to focus a lot on the automated machine learning capabilities. Out of curiosity, how many of you have heard about the automated ML capabilities in Power BI? OK, quite a few of you. Have any of you played around with it yourselves? OK, fewer hands. So what we want to do today, and what Bogdan will be going through, is not just to show you the capabilities of automated ML in Power BI, but to actually take you through an end-to-end use case. I would love for you, as Bogdan is talking, to think about how this relates to your business and the types of problems you solve. We really want to show you not just "hey, there's a wizard, and this is how you build a machine learning model", but how you can really get business value out of automated machine learning.
Then, from my perspective, we're going to go through more of the end-user persona, some of the more fun, visual demos. I may be biased; Bogdan's going to do great, I promise you. But we're going to look at things like, again,
a little bit of what we went through in the keynote demo, so there'll be a little bit of repetition, but I want to give you a little more context on our thinking behind things like the Q&A visual, or behind the decomposition tree. So we'll talk more about our AI visualization investments and what we're going to be doing over there. With that, I'm going to hand over to Bogdan to go through some of the more analyst-oriented pieces.
There are two things that I'd like you to remember, if possible, at the end of my 15-minute demo. First, we have amazing automation capabilities in Power BI to get rid of the most daunting tasks related to machine learning. Everything that you have to do anyway, everything that is routine, we automate for you, and I'll go through those. But that's just a nice part of Power BI: we try to take care of your needs. There is a second part, which is more important here. Machine learning, besides being a nice, catchy demo, besides being fun and everything else, is a powerful tool for solving problems with your data. So I'm going to switch to the demo now.
What I'm showing you here is a real problem, and I'm going to tell you what is real and what is not truly real about what I'm describing. Tanzania is a great country in Africa. It's one of the largest countries in Africa. It has lots of water, but it's not uniformly distributed; many of the people in Tanzania, something like 50%, do not have access to clean water. There is a dataset that was offered publicly by the water ministry of Tanzania. The dataset contains records for about 60,000 pumps. These are water pumps installed in villages and various other places, and they're trying to offer clean water to people there. They have pumps that are likely to break in the future, and they would love it if they could predict
which pumps are likely to break, in order to intervene and fix those pumps. It is much cheaper to send somebody to fix the pump before it breaks than it is to leave a whole community without water. So this is a real problem.
Now, the fictitious part of this is that I'm going to pretend that I'm working for a foreign organization that has a limited budget of 1 million dollars, and I'm going to pretend that fixing one pump takes $1,000. I made up these numbers, but the problem that I'm trying to tackle is: how
should I focus my efforts if I have a limited budget? This is a problem that appears frequently in marketing and in any kind of business setting. The purpose of the demo is to show you how I learn from the historical data to figure out where I should focus my attention, and then how I apply those learnings directly to the problem in order to do as much good as I can with a limited budget.
The historical data that I'm using is stored right now in a Power BI dataflow. Excuse me for a second for a shameless plug for dataflows. Dataflows is another product that we are working on. It's a great product: it's basically Power Query, which you know from Desktop, running in the sky. There is convenience there, because if you build your self-service ETL in the cloud, many people can use it in their reports. But from an AI perspective, it means that instead of running Power Query on your laptop you are running Power Query on Azure, and that means the computing power behind it is significantly larger than what you get on your desktop. This is what allowed us to do machine learning very easily in dataflows. How many people here use dataflows or have heard about dataflows? Fantastic, so I don't need to go too much into details.
Now, the dataset that I have, again, contains historical pump records. For each pump I know the amount of water it is pulling, when this information was collected, who installed it, what population it serves, and so on and so forth. I know geographical information, like what kind of land is there and whether it is dry or not, and I know whether the pump is functional or not. Just for the sake of simplicity, I have created a calculated column which tells me "this is not functional", and this is the column that I created. As you see, a non-functional
pump is marked as true. These are the pumps that I'm looking for, the pumps that I'm trying to detect: the pumps that are not going to work anymore.
Now, in order to launch machine learning on top of a dataflow, I need to go to this icon here, the small brain icon. If I press the button, I can add a machine learning model. At this point I'm instructing Power BI to create a machine learning model on top of that dataflow, on top of that entity. The first thing that I have to do is to specify what it is that I'm trying to predict. In this case I want to predict whether a pump is going to be non-functional; I want to predict if the pump is going to fail.
When I hit Next, Power BI is going to do a few things. The first is that it is going to look at the data inside that column. There are many machine learning techniques. Based on the data inside, Power BI may decide to suggest that I use classification, regression, forecasting, or other techniques. In this case Power BI detected that my current column is a Boolean, with two states, true and false, and this suggests that I'm doing binary prediction. I'm not limited to that. I can go here, click this link, and pick a different model; the data type behind this column supports a few classes of algorithms. The one I choose now is binary prediction because, indeed, it's a two-state problem. We do support regression, and very soon we are going to add forecasting to the AutoML toolkit. We do support general classification and we do support binary prediction, and I'd like to describe for a second the difference between general classification and binary prediction. General classification is the problem of differentiating fruits between oranges, apples, bananas, and whatever. I want
to have a clear demarcation line; the classification models are going to truly try to separate all the classes as well as possible. The binary prediction models are, in a sense, very similar to general classification.
Now, first, they focus on two states, so I can say that I'm looking for apples and not-apples. But the second thing about binary prediction is that I'm kind of hinting to the model what I care about. I want to find apples. Apples are important to me; I want to throw away the non-apples. That means that the report we are going to produce allows me to assign a value to a prediction, to the correct finding of an apple, just like in this case I'm going to assign a value to the identification of a pump that doesn't work. So I'm going to use binary prediction for this task, as I said, because I do have a value for the pump that I'm detecting.
I'm hitting Next. The next thing that Power BI does is actually just a bit of magic. It will look at all the data in my dataflow and analyze it. It's going to analyze the correlations between the fields in my dataflow and my target. If we look here for a second, we can see that this is a recommended feature, amount_tsh. It means that our analysis found a certain connection between that feature and my target. We see that funder, for instance, is not a recommended feature. Funder is not recommended because it has high cardinality: I have lots of distinct values there that do not seem to connect in any way to my data, so there's not much to learn from those distinct values.
I'm going to scroll down to one of my favorite pieces of information found by this analysis, which highlighted one of my mistakes, because I rushed a bit for this demo, but everything is for the better. Look at this: high correlation. The Boolean column that I created, I created by putting an if statement in Power Query. I'm basically saying: if the status group is "non functional", then true. Power
BI looked at this and detected that, from a statistical perspective, the values in my target are very tightly correlated with the status group. So it detected this as what is called a leaky predictor: a column that foretells the future, and it shouldn't be there because I probably do not have access to it at prediction time. I may disagree with Power BI, I may go and check that box, but Power BI decided that this is going to give me a false accuracy in the model.
And just to add to this: this is an enrichment, an addition we made to automated ML, and it's actually a direct result of the feedback that you, as our users, gave us. You told us that you want Power BI to take you on more of a journey as you use automated ML, that you want Power BI to actually help you figure out what is a good feature and what's not. Many analysts are just starting out with machine learning in Power BI, so this is also our way of using Power BI to help you learn which features are good in a machine learning model and which are not. We're going to try to add more and more of this sort of guidance and prescription along the way to help you on this machine learning journey in Power BI.
Thank you. As you know, we are shipping every month; actually, for the service, we are shipping every week. Every week we are going to try to add something new to this experience based on your feedback. Now, again, I selected my features. Now, there
are a few other things that I'd like to show you. If you feel that data is missing from your model, you can go to this advanced configuration and switch immediately to the rest of the Power Query experience, where you can bring more data in. To my knowledge, this is the only tool today that allows you to shape your data, bring your data from different places, and model it at the same time, so we really needed to integrate the machine learning experience with the fantastic data shaping experience you have in Power Query.
Let's assume that I'm happy with the data that I have here and I want to move on to creating a machine learning model. I need to name my model: "Predict will fail", or something like this. OK, the name already exists. I can assign labels to my target; I'm going to say true and false. Just to avoid any confusion, please keep in mind: true means it will fail, false means it won't. Then, if I hit Save and Refresh, we are going to start training the machine learning model. I'm not going to train it right now, and I'll explain in a second why, but the result of training the machine learning model is a report which looks more or less like this.
Now, I'm going to go through a few important sections of this report. The top of the report tells me that Power BI did what is to be expected from a standard machine learning development process. It tells me, as you will see here, that we split the data into training and testing, and that we picked an algorithm. Any statistician will tell you that we need to do that, and I'm not going to spend too much time explaining why we did this; I'm happy to take any questions later. But then Power BI tells me which are the key influencers: it tells me which of the features inside my data actually strongly impact the outcome. If you look here, we see that the quantity of water, the waterpoint type, and everything else do impact the target. The
tornado chart tells me which are the most important features. If you look at this quantity field for just a second, on the right-hand side we tell you how it impacts the target. Yellow means lots of true, so yellow means that you are going to get lots of pump failures. The way I interpret this is that if the land is very dry, then the pump is very likely to fail. Now, this is kind of obvious, but there is an easy solution: if you put a sensor on that pump, you can actually prevent the failure, because somebody, knowing that the land is very dry, still decided to install a pump because it's needed. So this is a very actionable piece of information that I'm getting here.
Let me step out of this, go one step further, and show you how Power BI tells me something about the accuracy of the model. A model is never perfect, but it may be very useful. The model is going to make predictions, but you have to keep in mind that the model does not know anything about true or false. The model is a mathematical construct which is going to yield a number: it's going to produce a number between 0 and 1, and we choose to call that number a probability. In real-life situations we tend to believe that 50% means something. We tend to believe that everything above 50% is a good candidate for true and anything below 50% is not such a good candidate for true, but in reality that 50% cutoff is a bit arbitrary.
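A minimal Python sketch of the point just made, using made-up scores rather than anything from the demo: the model only produces a number between 0 and 1, and the true/false label depends entirely on the cutoff applied afterwards. The function name `label_by_cutoff`, the sample scores, and the choice to treat a score equal to the cutoff as true are illustrative assumptions, not Power BI behavior.

```python
def label_by_cutoff(scores, cutoff=0.5):
    """Turn model scores in [0, 1] into True/False labels.

    Assumption: a score equal to the cutoff counts as True.
    """
    return [score >= cutoff for score in scores]


# Hypothetical scores for six items; not taken from the pump dataset.
scores = [0.05, 0.30, 0.48, 0.52, 0.77, 0.97]

for cutoff in (0.2, 0.5, 0.9):
    flagged = sum(label_by_cutoff(scores, cutoff))
    print(f"cutoff={cutoff:.1f}: {flagged} of {len(scores)} flagged as true")
```

The same six scores yield five, three, or one "true" labels depending on where the cutoff sits, which is why 50% is a convention rather than a property of the model.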
If you think of this, the model is a good mathematical construct; it's optimized for something. Think of this 50% threshold as a bar, and remember, we are looking for pumps. If you raise that bar, everything that exceeds the bar has a very high likelihood of being a correct prediction. Think in terms of correlating test scores with children's knowledge of a subject. It's never perfect, but you can say that among people who score 90% or higher, there is a higher probability of good mastery of the subject than among people who score zero to 10%. Again, there are always mistakes, but raising the bar increases the purity of the results above the bar, and this purity, the correctness of the true predictions above the bar, is called precision.
Now, precision is really nice to have: it's really nice that all my predictions are true. The problem with precision is that if you raise the bar too high, you are missing many children, in my previous example, who are really good but just did not do well on one test. The measure that tells me how many of the really good children I've identified with my bar is called recall. As you can imagine, playing with this bar is basically always a trade-off between precision and recall.
Now, if I'm going to use my model in a marketing campaign and I raise the bar too high, all the customers I target will respond positively, but I'm going to miss many opportunities. If I lower the bar, then
I'm going to find all the customers who are likely to respond, but I'm going to spend lots of money on the marketing campaign. Going back to my pump situation: if I raise my bar, all the pumps that I identify as needing repair will truly be defective pumps, so going there will not be a waste of money, but there are many pumps that I will miss. If I lower the bar too much, I will fix all the pumps, but I'm going to run out of budget very soon.
This is what is shown in this matrix. The precision and recall numbers are here, and this slider allows me to model the trade-off. If I raise the bar to, let's say, 77, you see that my precision moved from 81 to 95, but my recall is 55. That means I'm only finding about half of the entities that I'm looking for, the pumps that are likely to break. If I lower the bar to 20-something, then my precision drops, but my recall is almost 100%.
We listened to you, and you gave us amazing feedback on how to make this better, and I'm going to show you in a second how to quantify the impact of this threshold. But before that, I want to tell you exactly what happened. If you are a statistician, if you expect some specific results,
specificity metrics for your report, you will find those on our second page. Our product is not mainly intended for statisticians, but for those of you in the room who have a statistical background, you'll find everything you are looking for on the second page.
Last but not least, I mentioned when I launched the wizard that I would not train the model right now. The reason is that behind the wizard is a technology called AutoML, automated ML. This technology comes from Azure. Many of you may have read in papers about machine learning, or about how machine learning is taking over the world, that there are many, many algorithms that can be applied to machine learning: trees, regression, networks, deep learning, deep belief networks, Bayesian, lots of them. All of them have tons of parameters and configurations and settings that you need to tune for your problem. It probably takes a PhD to understand one or two of those algorithms in depth, and we don't expect our users to have the time, or to have an army of PhDs working for them. So Microsoft Research produced this automated ML algorithm, which you can think of as a supermodel that learned from all the models built by Microsoft Research. That supermodel is guiding our search through the space of algorithms and parameters, and it takes a very short path. But the short path in this case is 39 iterations. As you can see on the screen, the accuracy goes up, and the automated ML system always tells us "try this, try this", and it leads us in the right direction, but it takes 39 iterations. That's about 25 minutes of compute time over this 60,000-row dataset. If
you are interested in what else automated ML did for you, we can go down and see lots of technical specifications, but again, I promised that I'm going to focus on the business outcome. So let's go back to the first page and see how we can tie this precision and recall trade-off to the real business problem.
We had some very interesting conversations with some of you in the room about how to make this easier to use for the business user, and we had very strong suggestions to associate a cost and a benefit with the deployment of the machine learning model. This is a feature that will come out in a few weeks; it is called our cost-benefit analysis. Let's assume for a second that I'm going to use this model for a marketing campaign. Let's
assume for a second that I have a list of my customers, and I have 10,000 customers that I may reach out to with this marketing campaign. That is the number at the bottom of the screen. Let's assume that when I reach out to a customer, there is a cost. Maybe I make a phone call, maybe I try to sell them some sort of magazine, I don't know, but there is a marketing cost for that, and let's assume for a second that it is just $1, though we can put in any number. You see here that there is a unit cost field where I can specify exactly how much it costs me to act on a prediction. That is to say, if my model tells me this guy is going to be true, how much does it cost me to act on that prediction? The second parameter is the benefit: if this guy is really true, if my model was correct, what is my benefit? How much do I expect to make out of that marketing campaign? In this case the number is 2.
So if we do this, and you think again of my metaphor of a moving bar: if my bar is very high, if my probability cutoff point is at 1, here, then I will not find enough customers; nobody is good enough for me. If I lower the bar to zero, then I'm going to spam everybody, all 10,000 customers. I'm going to pay lots of money for the marketing campaign, but I'm not going to get money back except from those who are truly going to be correct predictions. And there is this curve, which very frequently in machine learning has this shape, and this curve has a sweet spot, a point where the profit is maximized. Power BI looks at this automatically and tells you that, under this modeling, if you are targeting a population of 10,000 people and you are going to pay this cost, then the minimum probability threshold required for maximum profit is 0.51. So it found automatically, under these cost modeling conditions,
which is the best cutoff point for me. But again, going back to the example that I mentioned, I'm not trying to make a profit here. I'm trying to do something else: I'm trying to maximize the impact that I have with a limited budget. So I'm going to go to a cost-only version of this chart. Again, I have a cost, and the cost is, let's say, $1,000 for one pump.
Now you can see that if I try to fix all the pumps, I have to spend $10 million. That's lots of money; my budget is 1 million. So what can I get for 1 million? I'm going to go here; you know, one million is probably in the middle, and you
see that if I'm using a probability cutoff of, let's say, 0.97, then my costs are almost exactly 1 million. So I'm going to take this 0.97 threshold, go back here, enter 0.97, and see how my model looks. Well, I will have 99% precision. That tells me that I'm going to waste almost no money, or, if you want, I'm going to waste 10,000 out of my million, and I'm going to reach 25% of the broken pumps. That means that with 10% of the budget required to address all the pumps, I will have fixed 25% of the problem. That's a multiplier that has business impact.
Now that I've set up my threshold, I'm instructing my model to say that every pump with a probability higher than 97% should be labeled as a candidate for repair, and everything below should not be marked as a candidate for repair. And I'm going to apply the model. When I apply the model, I choose one of the entities in my dataflow, new pumps. You see that the threshold from my modeling is copied into this UI right now, and if I hit Apply Model, a new entity will be created in my dataflow, which is called new pumps enriched with the "Predict will fail" model.
This contains pretty much all the information that I have about the pumps, and it contains three more columns: the outcome, which is "will break" or not, according to my threshold; the actual prediction score, which is a number between 0 and 100 that reflects the probability multiplied by 100 to make it easy to report; and the prediction explanation, which is a big piece of information that tells me why this particular pump is likely to break. I remind you that at the very beginning of looking at the training report, I showed you something called the key influencers in the model, which are the columns that actually impact the target. The Microsoft Research technology that we are using to produce explanations
produces global explanations, which is what matters most across all the historical pumps, but it also has the capability to produce local explanations: why is this particular pump going to break?
Now, with all this information, I can go back to my original report and blend the predictions into it. Then I see these 258 pumps in my data that are really likely to fail. Again, "at risk" here means at risk according to my modeling; it means that they have a very high probability of failing. I see here how they're broken down by region, and I'm going to pick this first pump here. It's a pump in Dar es Salaam, which is the capital of Tanzania. We look at the explanations. This pump has a probability of failure of a hundred percent, if you look at the score. If we look at the breakdown, we see most of the features describing that pump. We see how the quantity of water pushes up the risk of failure. We see how the GPS height pushes down the risk of failure, because most of the pumps at that altitude are actually not likely to fail, but there is something awkward about this one. We see how the extraction type also pushes down the risk, and we see how all the features of that pump add up to my score.
So with this, I know which pumps I should address, which are the 258 places that are going to allow me to maximize the impact of my budget. And with this I know why each and every pump is likely to fail, so when I'm sending a person there to fix it, I can give them indications and tell them what to look for. In this case I should probably tell them to install a sensor because the land is dry.
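To tie together the threshold, precision, recall, and budget reasoning from the demo, here is a minimal Python sketch. It is not how Power BI computes its report; the function names, the eight-row dataset, the unit cost, and the budget are all made-up assumptions chosen so the arithmetic is easy to follow. The idea is the one described in the talk: lower the cutoff as far as the budget allows, then read off precision and recall at that cutoff.

```python
def precision_recall(scores, actual, cutoff):
    """Precision and recall when every score >= cutoff is predicted True."""
    predicted = [score >= cutoff for score in scores]
    true_positives = sum(p and a for p, a in zip(predicted, actual))
    flagged = sum(predicted)
    positives = sum(actual)
    # Assumption: report 0.0 when a ratio is undefined (nothing flagged,
    # or no actual positives) instead of raising an error.
    precision = true_positives / flagged if flagged else 0.0
    recall = true_positives / positives if positives else 0.0
    return precision, recall


def lowest_cutoff_within_budget(scores, unit_cost, budget):
    """Lowest cutoff at which acting on every flagged item stays in budget.

    Returns None if even the single highest-scoring group is unaffordable.
    """
    best = None
    for cutoff in sorted(set(scores), reverse=True):
        cost = unit_cost * sum(score >= cutoff for score in scores)
        if cost > budget:
            break
        best = cutoff
    return best


# Hypothetical failure scores and ground truth (True = the pump really fails).
scores = [0.99, 0.98, 0.97, 0.90, 0.60, 0.40, 0.20, 0.10]
actual = [True, True, True, False, True, False, True, False]

cutoff = lowest_cutoff_within_budget(scores, unit_cost=1000, budget=3000)
precision, recall = precision_recall(scores, actual, cutoff)
print(f"cutoff={cutoff}, precision={precision:.2f}, recall={recall:.2f}")
```

With these invented numbers the budget covers three repairs, so the cutoff lands at the third-highest score; every flagged pump is a real failure, but only three of the five failing pumps are reached. That is the same shape of trade-off as in the demo, on a much smaller scale.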
So everything I've shown you here, like I said, is based on a real problem. The data is real, and the accuracy that AutoML generated for us is absolutely real. The costs and benefits and the budget constraints are kind of made up, but in reality, in any business problem where you intend to use predictive analytics, you do have such constraints. You do have a budget that you can associate with your marketing campaign, and you are looking for a sweet spot to figure out which of these customers you should target. I'd like you to retain two things out of this demo. Again, the way we automate things in Power BI: we are automating really powerful technology. We are harnessing lots of things from Microsoft Research, and we are trying to really make it easy for you. But we are not making it easy just for the sake of making it easy; we are making it easy because this technology has a real impact on your business problems. Thank you.
So, I mentioned that this is the automated ML technology. You will hear about automated ML in Power BI, and you will hear about automated ML in Azure, and I just want to explain a bit what is common and what is different between automated ML in Power BI and in Azure. It's the same technology, literally the same code. If you choose to use it in Azure, it means that you're probably closer to being a data scientist. You have more flexibility in Azure: you can choose how long you want to run your algorithms, and you can even choose the algorithm that will be picked by automated ML. It's a more flexible way
of using automated ML. If you use it in Power BI, we are focusing on the business user. If you are going to use it in Azure, you are going to need your own Azure subscription. If you use it in Power BI, you don't need to pay anything extra, because if you have a Premium capacity, we host it there for you. It's part of the value that we are offering our Premium customers: you buy your cores, and you're free to run whatever you want.
The machine learning model that you build in Azure using automated ML can be hosted in Azure, and you can share it with other applications. The machine learning model that you build in Power BI is initially hosted in Power BI. I'm going to talk in a second about how you can move things back and forth. You do need an Azure subscription, as I said, or you do need a Premium subscription in Power BI, because in both cases it takes significant computation power to get this model and to optimize the accuracy.
Now, for the future. Today, if you have any model in Azure ML, you can... I think we switched, oh yeah, sorry. If you have any model in Azure ML, you can bring it to Power BI. But for the future, we want to enable you to take models built in Power BI and push them to Azure. The reason you would do this is either that your model is very important for the business and you want to upgrade it from a business analyst's problem solver to something that is actually used across the business, or you want to publish it as a web service, or you may want a data scientist to curate your model and refine it further. So, probably by the end of the summer, we are going to have the ability to export the machine learning models created by automated ML in Power BI to Jupyter notebooks. We are going to add this Export Model verb here. We are going to produce a Jupyter notebook which encodes everything that we have done automatically for you, and you can hand this over to a data scientist. For
the second part of the demo, I'd like to talk a bit about Azure ML and Cognitive Services. The integration of Power BI with these features goes public this week in general availability. You can use it already, I think, and if not, you are going to be able to use it tomorrow. This is one of the easiest ways to use machine learning. This is about using machine learning models that are already built. If you have data scientists in your organization who are building Azure ML models, you can very easily use those models to enrich your Power BI data. If you don't have data scientists, but you have rich unstructured data inside Power BI, you can use Cognitive Services to extract value from that data. We've shown this demo many, many times, so probably many of you have seen it, but I just want to show you how easy it is to invoke a machine learning model in Power BI. Of course, and because I've shown you dataflows in the first part of my presentation, what I'm going to show right now is an upcoming feature. This is not yet there; it's going to show up in a few months. The experience is the same, but I want to show you that we're working in that direction. This is invoking Cognitive Services and invoking Azure ML models from the Desktop, so you will have the ability to do this without any kind of dataflows. You are going to do it directly on your data.
The dataset that I've loaded here contains information about hotel reviews. I know the location, and I know the comment that customers made about their room. Some comments are positive, some of them are negative. I even have things like latitude, longitude, and the URL of the picture that the customer may have associated with that comment. With this upcoming feature (again, this is going to be available in a few months in Desktop, and it is available today in dataflows), you will be able to click AI Insights in order to enrich your dataset with new columns. When you click the AI Insights button, we are giving you access to machine learning models that are already hosted on one of your Premium capacities, such as the Cognitive Services, and we are also giving you access to any Azure ML model that you are entitled to look at. So we are going to scan Azure ML, figure out which models you have access to, get a list of those models, and wrap all of those models in functions. For instance, I have an Azure ML model here that can do image classification. I can pass anything from my dataflow or my Power Query as an argument to this model, and if I hit OK, then we are going to invoke that model and do image classification over the information from Power Query. Or, if I'm doing Cognitive Services, I can choose to score sentiment, and I'm going to hit Run. At this point all the guest comments in my data are going to be pushed to Cognitive Services. We are going to do sentiment analysis and yield back a number. If the number is closer to 0, that means the comment is mostly negative; if the number is closer to 1, that means the comment is mostly favorable. Now, Cognitive Services are a set of machine learning models built across Microsoft properties. These models are trained on top of Word documents, on top of Bing searches, so they're
some of the best models that you may find for sentiment analysis, for image tagging, and for other tasks like this, and we try to make those available to you for free in the Premium subscription. So as you see here, I have some scores associated with comments. Let's pick this one here, 0.04. That's a very negative sentiment. If I go here, I see that the comment is that the first impression was not great.
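As an illustration of how a sentiment score between 0 and 1 might be used once it lands in a new column, here is a minimal Python sketch. It is not the Power BI or Cognitive Services API: the function names, the 0.4 and 0.6 cut-offs, and the sample rows are all assumptions made up for this example. It only shows the idea of labeling a score and averaging scores per hotel.

```python
from collections import defaultdict
from statistics import mean


def label_sentiment(score, negative_below=0.4, positive_above=0.6):
    """Turn a 0..1 sentiment score into a coarse label.

    The cut-offs are illustrative assumptions, not values from the talk.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("sentiment score must be between 0 and 1")
    if score < negative_below:
        return "negative"
    if score > positive_above:
        return "positive"
    return "neutral"


def average_sentiment_by_hotel(rows):
    """Average the scores per hotel; rows are (hotel, score) pairs."""
    grouped = defaultdict(list)
    for hotel, score in rows:
        grouped[hotel].append(score)
    return {hotel: mean(scores) for hotel, scores in grouped.items()}


# Hypothetical rows; only the 0.04 score comes from the demo.
reviews = [("Hotel A", 0.04), ("Hotel A", 0.90), ("Hotel B", 0.75)]
averages = average_sentiment_by_hotel(reviews)
```

A score such as the 0.04 from the demo falls in the "negative" bucket under these assumed cut-offs, and the per-hotel averages are the kind of aggregation a report could then chart.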
So with this, I can now use this quantification of sentiment in the report, and I end up with very rich reports like this, which allow me to see aggregations of sentiment by hotel and so on and so forth. You've probably seen this demo a number of times already. Again, thank you very much. You want to talk a bit about the services? So just to wrap up that bit really quickly, I'm just gonna put up the cheat sheet for Cognitive Services, because I know people like to take pictures of these slides. But again, the differences between Cognitive Services in Power BI and Cognitive Services in Azure are very similar to what we talked about with automated ML. It's a different persona. One is really the data scientist or data developer leveraging tools inside Azure, coding using things like C# or Python. Power BI is really designed for business analysts. As Bogdan showed you, there's no coding required at all; it's just a one-click operation to do things like text analytics and image detection, just coming out of the box. Again, one is living completely inside Power BI, so it makes a lot more sense if everything is constrained to Power BI. If you want to use these Cognitive Services in, I don't know, other parts of your workflows, maybe a web app that you've built, Azure will give you more versatility, because it's kind of separated away from just Power BI land. And once again, Power BI requires Power BI Premium to do all the computation, and Azure requires an Azure subscription, so there are slightly different models for pricing as well. So with that, let's shift gears. Let's talk a little bit about our end users. If you went to the Power BI session yesterday, you will have seen some of these demos already, but
I want to talk a little bit deeper about why we are investing in these capabilities, what the rationale is, and also talk a little bit about what else we're going to be adding to some of these features too. So the really big area of investment for us from an end-user perspective can all be encapsulated under this concept of AI visualizations. We really want to be able to democratize,
you know, getting insights, for both analysts in terms of preparation of data and business users for consuming that data, in a very easy and intuitive way. Analysts know how to build visualizations in Power BI today, and users know how to consume them. They do this every day when they look at their reports and dashboards. So our thinking behind this was, well, if we can build intelligence into a visual but keep the constructs and paradigms the same, where an analyst can just author a visual and an end user can consume it, then we can bring some of these capabilities and democratize them in ways that are more accessible than if we added maybe, you know, not to bash the wizard with automated ML or anything, but it just opens it up to more people this way. And so I want to really just dive into this sort of box over here and just go through a couple of demos. So first of all, how many of you have seen demos of key influencers, are familiar with it? OK, so about half of the room, actually. So it's worth talking a little bit about what the visualization is. This is the first AI visualization that we worked on. It's in private preview at this point in time, and I'm gonna jump into an environment over here to just do a demo. What this visualization is all about is allowing you to figure out, if there's a metric you're interested in, what actually influences that metric. So maybe I'm interested in looking at something like churn: what influences a customer of mine to churn or not churn? I can also look at numeric data: maybe what influences a house price to increase or decrease. Measure support is something that we are working on, and it's gonna be coming soon. But for the purpose of this demo, I'm actually going to be looking at a categorical field. And so jumping back, Bogdan briefly touched on the hotel
review scenario. We didn't want to go into too much detail because we've done this demo a lot, but I wanted to show you some new functionality that's coming to key influencers as a result of the feedback, again, that you guys have given us. In this case I have a dataset that's looking at hotel guests. So let's imagine I'm a hotel manager for a second, and I have a dataset looking at all the guests who visited my hotel and who have come back the next year. If we jump into the dataset view just for a second over here and look at our customers table, what we have is a field that tells us whether a customer who visited us last year came back this year or not. So it's literally just a field saying "not returning" or "returning". And we have a lot of information about the customer: whether they purchased a spa visit when they were here, whether they rented sports equipment, which country they are visiting us from. There are lots of potential attributes that could be influencing whether they came back or not. So I've started authoring this visual here already, and the key influencers visual you can find over here; it's this little icon that you see over here. It's a private preview, a public preview feature, sorry, so if you go into your Desktop and enable it under your preview features, then you can start using it. It's not available if you just open Desktop without enabling it. But what I can do here is select which field I want to analyze. I've already gone ahead and added that over here; you can see I'm analyzing my customer field, and that's what's yielding this question that comes up on the screen, which is saying: what influences customers to return? And
now I can just start dragging in various fields that I think might have an influence. So maybe what was their initial interest in visiting my hotel in the first place? I have a couple of options here, and we can see the visualization automatically, behind the scenes, is actually running a logistic regression, and it's telling me: well, we found, based on your data, that customers who come and visit you to relax are more
likely to come back than those who come to visit for sightseeing, for sports activities, for honeymoons. So what we're doing on the left-hand side is actually giving you the output of what the model has found, in terms of the ranked list of influencers. On the right-hand side, again, during our user studies we heard: great, this is useful, but I need to kind of validate this with my own data. So this is really about helping you get to insights quicker. It's not about building a predictive model. Everything you can see here is going to be descriptive. We're not giving you predicted probabilities like we do with automated ML. It's just helping you figure out what's important in your data and giving you a visualization that you could have built yourself, like the one on the right-hand side, but doing it quicker and helping you find those insights faster. So what I can start doing is just bringing in more fields: maybe whether they've purchased a spa visit, maybe their reservation type, maybe the country that they're visiting from. And every time I bring in one of these fields, and this is what's super important about all the AI visualizations you're gonna see coming out from us, the model just reruns every time. It's fully interactive. We just throw away the model we just built and completely rebuild it on the fly, because this should just behave like any other Power BI visualization that you're used to. So we can see the list of influencers has actually changed now. The top factor is now whether you're visiting us from the UK: you're over two times more likely to come back the next year than if you are visiting us from any other country. So what we've actually heard in a lot of feedback from customers is: this is really, really useful, but the piece of information I'm missing is understanding
how many customers this actually impacts in my dataset. I'm going to use a different example. Someone came up to me after a session I did at Build, and they were talking about NPS score. They were telling me: hey, we're using key influencers, and it's telling us that if the customer hits this bug, for example, they're like 50 times more likely to give us a bad NPS score, but there are only about 10 customers in my dataset who have hit this. Now, we do run significance tests on top of the data, so we use the industry-standard p-value
of 0.05. So if you've done statistics before, that's kind of the benchmark for the industry. The important message here is that we do filter out things that are not significant. But again, different customers will have different ideas of what significance means in terms of their own business value. So what we've added to the visualization, and this is coming soon, is, if we jump into formatting options, you're going to be able to enable counts on top of the visual. So I'm going to just toggle this over here and enable these counts. It's very subtle, because we don't want to overwhelm the visual, but you can see there's a thin overlay on top of the bubbles that gives you an indicator of how many users this is impacting in my dataset. So you can see for the UK it's really, really small; it's just about visible over here. But if we look at something like purchasing a spa visit, just by looking at the bubble you can see that's about half of the population. What you can also do is sort by impact or by count, if you enable counts. So if you sort by count, you can immediately get an indicator of which particular influencers impact the most people in my dataset. You can toggle between both views, and we're hoping that's gonna allow you to focus on the influencers that really matter for you. In my case, maybe, even though down there I see country is Japan is an influencer, and actually country is UK is an influencer too, they're only a small slice of the data. And if you hover over them, we've also given you tooltips, which hopefully will load, there we go, that give you an approximation of how much data this contains.
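To make the three numbers in this part of the talk concrete (how many times more likely, whether the effect passes the 0.05 threshold, and how much of the data it covers), here is a small Python sketch. It is an assumption-laden stand-in: the speaker says the visual runs a logistic regression, whereas this sketch uses a plain ratio of return rates and a two-proportion z-test. The function name and the guest counts are invented for illustration.

```python
import math


def influencer_summary(with_attr, without_attr, alpha=0.05):
    """Summarize one candidate influencer.

    Each argument is a (returning_guests, total_guests) pair: one for guests
    who have the attribute (e.g. country is UK) and one for everyone else.
    """
    r1, n1 = with_attr
    r0, n0 = without_attr
    p1, p0 = r1 / n1, r0 / n0
    if p0 == 0:
        raise ValueError("baseline return rate is zero; ratio undefined")

    # "x times more likely": ratio of the two return rates.
    lift = p1 / p0
    # Share of the dataset this influencer applies to (the "count" overlay).
    share = n1 / (n1 + n0)

    # Two-sided two-proportion z-test using the pooled return rate.
    pooled = (r1 + r0) / (n1 + n0)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n0))
    if se == 0:
        raise ValueError("no variation in outcome; test undefined")
    z = (p1 - p0) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))

    return {
        "lift": lift,
        "share": share,
        "p_value": p_value,
        "significant": p_value < alpha,
    }


# Hypothetical counts: 20 UK guests of 1,000 total, i.e. about 2% of the data.
uk = influencer_summary(with_attr=(16, 20), without_attr=(343, 980))
```

With these made-up counts the UK group comes out as more than twice as likely to return and passes the 0.05 test, yet covers only about 2% of guests, which is the situation the counts overlay is meant to surface.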
So we can see over here, if I zoom in, I think it's gonna get messed up, but we can try, because
I know it's very hard to read. Now we can zoom in with tooltips, but it's about 5% of your data, you can believe me. Over here for the UK, this is about 2% of the data. So maybe for me the influencer I really want to focus on is this one over here, purchasing a spa visit, because the amount of people who have purchased a spa visit (maybe my hotel has this really amazing spa), that's about 48 percent, and the people who have done that are much more likely to come back. So with these two pieces of information, we hope that we can make this visualization a lot more valuable for you guys. So this is gonna be coming soon. And then I want to jump into some of the capabilities, again, that we haven't released yet. You probably saw these at the keynote. The first one of these is the Q&A visualization. So again, a little bit of the reasoning for why we're investing in a visualization. If you are familiar with Q&A in Power BI, currently you can use Q&A on a dashboard. You can also add a Q&A button directly inside your report. But some of the feedback we heard, again from you guys, is that a Q&A button takes you kind of out of the context of the report. You click on this button, you ask your question, and then you kind of have to close the experience and go back into your report. So it's not a seamless experience that's just part of the whole report. And the analyst doesn't have much control over this experience. They add the button, they can maybe customize and style the button, but once Q&A comes up, that's it; that's controlled by us, basically. So by adding a Q&A visualization, as you can see over here, this is just gonna be another visual inside the list of visualizations. This gives the analyst full control to make this visualization just an integral part of the report. Like I showed in the session yesterday, there's going to be a lot of customization
options. You can make this ugly very, very quickly. I'm just gonna change around some of these things, and I'm gonna revert to default in a second, but you can see now you can change the font colors, the underline colors, the background colors, the suggestion boxes, the font colors of things, and backgrounds. You can make it look very ugly very, very quickly, so I'm just gonna revert back to defaults. Again, theming: if you apply themes, the visualization will just update. So again, if you're an analyst and you're building this visual, you can just make it a seamless experience, and that's what we're really aiming for. We're adding enhancements such as these suggested questions, where again I can just select various questions, and again these are generated by Power BI. So it helps your business users just get started: what kinds of things can I ask Q&A? This really helps you figure out how to get started with it if it's the first time that you're seeing it. And again, because it's just part of your report, the cross-filtering works. If you remember, with Q&A as a button, it's almost that out-of-body experience where you use it and you close it. Here, if I only want to look at, let's say, Electronic Arts, the
visualization is just going to cross-filter, and that's gonna work both ways. So if I select something from the Q&A visual, it's gonna cross-filter the rest of my report, and if I select another visual, it's gonna cross-filter the Q&A visual. So it just makes it so much more seamless as an experience. How many of you guys have used Q&A tooling? OK, a couple of you, thank you. We're gonna make this experience a lot easier. Again, if you're not familiar with the Q&A tooling today, it's basically editing a YAML file, so it's not really a nice UI experience, with a lot of manual work involved. With the new Q&A tooling that you see over here, this is gonna be completely UI-driven. You're going to be able to do a ton of things just with really, really simple gestures. The thought here was to just use Q&A to train Q&A itself. So if we ask a question like, let's say, global sales by, let's say, something like awesome publisher, to use an example from yesterday, and we submit this question, you can immediately see that "awesome" was not recognized. We don't know what "awesome" means; we don't have anything defined around "awesome" in the data model. So there's not much Q&A can do to help us here, because it needs to understand this vocabulary. So when you submit this question, again we want to make it really simple for you to figure out how you can add new vocabulary and new synonyms in this experience. So we actually use Q&A to almost give back a statement that you just have to answer. So you need to tell us what publishers that are awesome have: they must have some sort of attribute that makes them awesome that you can help us define. So in this case I can do something like sentiment, if I can spell, is over 0.5. Oh man, my spelling is really off today. There we go. And if I apply this fix, that's
going to automatically show me what my new preview of the results looks like. So you can see the visualization has updated, and now this definition just lives inside my data model. Q&A will just know how to treat the word "awesome"
from now on. Similarly, if I want to see what kinds of questions my users are asking, because that's ultimately what we want to do, right? As an analyst you want to make sure that when your business users are using Q&A, they can have an experience where they can just have all of their questions answered. But you might not be the subject matter expert, you might not have all of their rich vocabulary, and you can't necessarily anticipate what kinds of questions they'll be asking. So if we jump into this "fix misunderstandings" dialog, this is all the questions that are being asked inside the Power BI service as a user is interacting with Q&A. So there's gonna be a way to link up your Power BI report here with the dataset that's published in the service. I think in the long run this should be done automatically, but in the first iteration you'll probably be able to just link it. And you can see all of the questions that were asked, and specifically all the things that Q&A didn't understand. This is really valuable for lots of different reasons. One thing is, if there's a question that your users are continuously asking, maybe you should think about adding a visualization to the report to just help them answer it, because clearly that's something that's important to them. But purely from a Q&A perspective, it really allows me to figure out what other vocabulary I need to add. And again, if I click on something like "indie publisher", so here maybe I want to fix the word "indie"; that's a word that's being used a lot in various questions. I can just submit this, and again I can define an indie publisher as one that has published maybe less than two games, and I'm going to apply this definition. And again, from now on we know how to treat indie publishers when our business users ask us about this. You
see a preview of your result, and if we jump back, hopefully the Internet's a little bit fast today, yes it is, we can see this has been applied to all of the different definitions that have used the word "indie". So from now on we just know how to treat this. And the final thing I wanted to show you is: imagine I didn't ask about publisher, but instead I used the word "producer". So now we can use this new word we've defined, so maybe global sales by indie producer. "Indie" is now recognized, "producer" is not, but we still get the right answer. And again, this is going to be super cool; this is going to be a huge improvement to Q&A, because from now on we're gonna be integrating with the Office synonyms. That means there's a whole lot of new vocabulary that's just automatically added to your model. We don't understand the word "producer"; we don't have that defined in the data model. But we know from the Office synonyms that "producer"
is a synonym for "publisher". So we've actually answered the question, as you can see over here: global sales by indie publisher. I can add that over here and automatically update the question. So again, there's just gonna be a whole enhanced new vocabulary that your business users are going to be able to use out of the box. So these are capabilities that are gonna be coming soon. At the end of the presentation I'm gonna have a slide with our emails, because we really, really value your feedback. If there are pain points, use cases, or features that you want to see, just seriously, just email us. We really, really love talking to customers; it helps us make these products better. There's also gonna be a private preview for the new Q&A visual, so if you're interested in participating in that, again just get in touch with us. I'll have the emails up at the end. So the other visual that again is very exciting for us, it's now number 2 on UserVoice, is of course the decomposition tree. Again, if you went to James's keynote or Amir and Arun's session yesterday, you will have seen this. And this is a visualization that has a lot of history. If you're not familiar with it, there was a tool. How many of you are familiar with PerformancePoint, ProClarity? OK, a couple of people, not as many people. Just to give the rest of you the context, this used to exist. Microsoft acquired a company that had this visualization. We kind of didn't end up doing much with it, and we made a lot of people very unhappy. So there's a subset of the community that's super, super excited to get this visualization back. But for the people who haven't used it before, it's a great way to be able to explore your data. It's a great substitute for things like the matrix visualization,
because what this allows you to do, similarly to, let's say, a matrix, is break things down by various different dimensions, but it's a much more visual experience. What's also special about this is you can just easily explore different paths and have the visualization update. If you again select a visual, let's say from the left-hand side, sales by developer, and I want just my Ubisoft games, again the whole path just updates, and you can very easily and intuitively track that path. But let's jump into, again, just looking at how you can author this from scratch, and I just want to highlight a couple of things here. So how do you set up this visualization? Again, this is just going to be a visualization in Power BI you can set up. You select a measure. For now it's gonna be measures; we're gonna be adding support for categorical fields as well, but for now it's gonna be measures. In this case, again, this is the video games dataset from yesterday. I'm interested in where my users disagree with critics. So the bigger the discrepancy between scores, the more differently users rated games from critics. If it's a negative score, critics preferred the game; if it's a positive score, users preferred the game. And I've already said I want to potentially break this down by various different dimensions in my dataset, so things like year of release, month, rating, publisher name, and so on. Usually in Power BI, if I want to break something down, this is basically me defining a hierarchy. So we'd start with year of release, and we'd drill down into maybe 2011, then we'd drill down into a month, and so on. But what's special about the decomposition tree is you can break down into any dimension
that you want, so you're not bound by that. So it's really a great tool for exploring your data. If I want to look into something like genre, I can start with that dimension. And this capability is going to be available for both analysts, of course, but also business users, so it gives your business users so much more exploration power. Now,
based on my conversations so far with users, it sounds like you want versatility in terms of how much control you give your users. Sometimes you want them to just explore the tree in its entirety; you don't want them to change it and drill into various dimensions. Sometimes you want to just give them a blank canvas and let them explore whatever they want. So we're also gonna give you the ability to lock levels. If you want to say my users have to start with year, but after that they can go into whatever they want, you'll be able to control that for your business users. If you want to build a whole tree and say, no, you have to explore this particular tree, maybe I've built this and I've locked all of these levels, and I just want you to be able to explore various paths, you'll be able to do that too. So there's gonna be a lot of versatility for your business users with this. And now, where does the AI bit come in? One way for me to explore the tree is, if I'm interested in maybe my sports games and want to see where this gap grows, basically I can try out year of release and month and kind of just explore this data. But we just want to make it so much quicker for you to explore this with Power BI and the sprinkling of AI we're going to introduce. So we're going to give you a couple of different types of paths that you can explore. Maximum and minimum splits are basically where I want to follow a maximum path or a minimum path. So in this case, if I want to see where the discrepancy is growing, where the score becomes more negative, I'll basically say: hey, help me follow the minimum path; what's the next dimension I should look at? If I want to maybe find an interesting maximum within the minimum, then I could select maximum path. Can I use this for root cause analysis? Yes.
This is exactly what you should be using this for. This is a great visualization for root cause analysis. Best split is going to really show you where you see a lot of spread in your data, where you see a lot of variation. So it's not necessarily the maximum or minimum, but what's an interesting dimension for me to look at in terms of variance. So there are going to be various different ways for you to explore this data. If I select something like, again, minimum split, it tells me platform is interesting, and specifically
PC sports games; this was kind of an interesting subgroup we found. And it's up to me how far down I want to go with this tree. I can just ask Power BI to keep following this path. And in this case, again, if you saw the keynote demo, what does this path unveil? We see it's all the different FIFA games where critics and users continuously disagree. Critics give scores in the 80s, users give scores in the 40s. It seems like a continuous pattern that's going on. Now, over here you can see that the various paths that are using the AI splits are gonna be marked with this little light bulb. And in the longer run, because I've actually done three AI splits, all of them will have been... this is still a preview build, so this is a little bit, you know, just behind in terms of
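The minimum, maximum, and best splits described above can be sketched in a few lines of Python. This is only one plausible reading of the spoken description, not the algorithm Power BI uses: here a minimum (or maximum) split picks the dimension value whose average measure is lowest (or highest), and a best split picks the dimension whose group averages vary the most. The function names, the field names, and the sample rows are all hypothetical.

```python
from statistics import mean, pvariance


def group_means(rows, dimension, measure):
    """Average the measure for each distinct value of a dimension."""
    groups = {}
    for row in rows:
        groups.setdefault(row[dimension], []).append(row[measure])
    return {value: mean(vals) for value, vals in groups.items()}


def extreme_split(rows, dimensions, measure, kind="min"):
    """Return the (dimension, value) with the lowest or highest average."""
    pick = min if kind == "min" else max
    candidates = []
    for dim in dimensions:
        means = group_means(rows, dim, measure)
        value = pick(means, key=means.get)
        candidates.append((means[value], dim, value))
    _, dim, value = pick(candidates, key=lambda c: c[0])
    return dim, value


def best_split(rows, dimensions, measure):
    """Return the dimension whose group averages are most spread out."""
    def spread(dim):
        return pvariance(list(group_means(rows, dim, measure).values()))
    return max(dimensions, key=spread)


# Hypothetical rows: gap = user score minus critic score (negative means
# critics preferred the game).
games = [
    {"platform": "PC", "year": 2011, "gap": -40},
    {"platform": "PC", "year": 2012, "gap": -38},
    {"platform": "PS3", "year": 2011, "gap": -5},
    {"platform": "PS3", "year": 2012, "gap": -7},
]
min_path = extreme_split(games, ["platform", "year"], "gap", kind="min")
max_path = extreme_split(games, ["platform", "year"], "gap", kind="max")
widest = best_split(games, ["platform", "year"], "gap")
```

On this toy data the minimum split lands on platform PC, mirroring the demo where the gap is most negative for PC sports games, while year shows no spread at all and so would not be suggested as a best split.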
2019-06-15 19:54