Microsoft Power BI: Connect and transform data from hundreds of data sources using - BRK3003
Good. Afternoon everyone. How. You guys doing. Having. Fun at the conference, so far. Alright. Great well this is gonna be the best session of the day I'm just warning you. It's actually happy hour starts, a little bit early today so take, this as the first happy hour of the day we're. Gonna be talking about connecting and transforming, to data, using. Power query how, many of you have some data, right. So you need to bring it in to either, power bi excel, power apps and. Many other places right so this is the right session you've come to the right place my name is Miguel Joby's I'm the program manager lead, for power query M and data flows and I have my good friend Matt Mason downstairs. Who is going to come later on to do some advanced M demos, as well. What. Are we gonna do we're actually going to talk a little bit about why, data preparation is important, we're gonna talk a bunch about power query we're gonna do overview we're gonna do a bunch of demos and then we're gonna show you recently, shipped capabilities. And we're also going to talk about some things that are coming soon how, many of you are already familiar with power query. Okay. Bunch of you so for those of you who are not we're, gonna start with an overview and we're gonna kind of get, you back, on track with the rest of the room and then we're gonna go into more of the deep dive before. We do that let's just talk about why, data preparation, it's important. Let's. Think about in today's bi and analytics well. What's. The data preparation, process, look like so first of all you need to find and connect to some data that's, easy, once, you find and figure out which data you need to connect to you got you got a really, nice experience across, tools to connect to data doesn't. Really matter where you're always were you're connecting to because data, is always in the right shape no, matter what you're gonna do it's ready for exactly what you need to do in terms of analytics and consumption if, you ever need to do some reshaping, of the data that's easy, and quick it can be done without effort regardless of which, data source or a bunch of data sources you might want to combine that's, super easy as well, the. Volumes of data are relatively small easy, to manage there's never problems, with that and. Usually, data only comes from a single source so you know refreshing, and aggregating, and normalizing data across sources, it's, right. Alright, this was my test I see some, of you are already like what is this guy talking about reality. Is quite the opposite, finding, and connecting to data is really hard for, end-users they never know where is the right source of data once, they figure actually what the right source is they don't have credentials so they need to go call somebody else even. Once they have the credentials and they connect to the data understanding, the data that's there and browsing, and exploring, it it's actually, you. Know it's not easy you have many different experiences, across, products, they're fragmented, they're inconsistent. Data. Very often an actual you strike this and say always needs, to be reshaped, so that it can fit what you're trying to do you. Know there's never a one-size-fits-all. Schema. For your data as much as we try you know there's always gonna have refinements. You have to make on top of that the. Reshaping, of data is very. Hard to make repeatable, you, need to take care of a few things in order to make sure that your your. Your, queries and your transformation steps are future-proof and. Also, combining data across multiple sources, is difficult, because sometimes. You don't have the right joint Keys across or, even if you do there are slightly different variation, so you have to handle things like fuzzy. Matching logic and a bunch of stuff like that D to paint values and many things the, last point here is about Big Data so, when somebody thinks about Big Data you think data volume but, it's not only about the the data volume itself the para bytes of data it's, also about the variety, the different, types of data sources that you're connecting, to the different shapes and having, to bring them all together as well, as the velocity and, not just how quickly something changes, but also the. Mismatch at which, different data sources changed, there may be data sources that change only once a month or once a quarter think you know financial, results, and things along those lines there's, data, sources that change every single second, with IOT and many other types, of data out in the wild so how you bring it all together how you reconcile, it and make it fit what, you're trying to do it's where most of the complexity, is furthermore.
Here's. A quote from garner, you know when, you're trying to analyze some data 80%, of the time is spent on just preparing, the data so you can analyze, it isn't. That crazy, what if we gave you some of that 80% of a time back, well. That's what we're trying to do so how does power query help you with that if, you think about those three weeks that we were talking about the, variety, of the data the different, set, of data sources that you need to connect to that's, something that we are trying to help you with because we have now over, a hundred different data sources, built. In but, we also have extensibility, mechanisms so you can actually plug in many, other things so we do support things like REST API or, data feeds ODBC. Already we drivers, so, you can plug literally hundreds. Or thousands, of data sources and we also have a full extensibility. Model, we call it the power query SDK, so, that you can build your own custom connectors. If you wish to actually simplify, some, of that underlying logic and, you can easily redistribute. That to other users either within, your organization, as a custom connector or if you actually want it to make this available to the wall at large to every Power bi customers, say you own data. Service, or the data source that you'd like to bring, in to power bi you can actually use our certification. Program to actually bring, it all into, power bi. In, addition, to that we also provide a highly, interactive, and intuitive user, experience, that's called the power query editor it's, all point-and-click, you know interact, with your data manage, data work, with a preview of the data that gives you enough. Of a sovereign understanding. Of your data so, that you can apply the right transformations. To fix the schema, to fix outlier, duplicated, values errors and so on and so forth this, experience, that I'm talking about is actually consistent. Across, all of the data sources and the capabilities, are also consistent, so it doesn't really matter if, you're connecting, to a smart data source back-end such as say a sequel. Server database, or an Oracle, database where, you have compute. You have a sicko endpoint, behind the scenes that you can query and aggregate. And filters data for you so you can delegate, the, execution, of those queries or whether, you're connecting to something as simple as a text file well, we give you the same capabilities, we're gonna do a group by over a text file will allow you to do it will compensate, will run those queries locally, for you. The last point but certainly not least combining. Data across data sources is super easy with power query because of the same reason it's just a single, experience where we allow you to do things like merge, effectively. A join in database terms appending. Data or Union. In database terms but we also allow you to call and functions, and parameterize, queries, across and I will get into that as part of some of the demos power, queries the uniform, get data experience, and technology underlying. Is the mashup engine the engine and the M formula, language that powers connectivity, across power, bi analysis. Services, data flows and many other products, across Microsoft. Let's. Do a little bit more of a power query overview, per se. We. Do have a power query desktop, component, that's probably the one that you're the most familiar with because it's been around for a few years now we, actually just, a couple months ago we passed our five-year, anniversary, on. Power query desktop, so, it's available within. Power bi if you use power bi desktop to author datasets and reports you're using the power query desktop, get data experience on the par query editor experience. To, reshape, and transform, and combine data.
It's. Also available as part of analysis, services, you can author analysis. Services models, both on-premises. As well as in the cloud with our analysis, services, and finally, within Excel, is the default, get data experience in Excel through the data tab its power bi power query. As. We were saying earlier we actually have over 100, data sources now lots, and lots of transformations. Over 300. Transformations, allowing, you to join the area combine reconcile, clean data and we, compensate, whatever the, underlying data source doesn't support those capabilities or we, rely on the underlying compute. Engine such as sequel, or spark, or any others to push down on execution, wherever possible. So. A little bit of walkthrough. Across the different. Integrations. For power query power, query in excel how many of you use power query in excel all right most of you great. Power. Query in. Power. Bi desktop with. The new theme that which is launched, today. Power. Query within. Got. Ahead with in secret Server analysis, services and, Asherah here so here is the sequel server data tools within visual studio where you actually, use, the power cord experience to define, those. Large enterprise, data models in analysis, services and. The. Last common, piece across all of them is the query editor which, is where you will do the bulk of the transformations. And reshaping of data within, power query. Now. With all that said the, underlying, brains behind, power, query is the mashup engine the mashup engine is the query runtime, that, powers the underlying data. Operations that, power power query it, provides, those uniform, capabilities, no matter which connector, you use you're going to be able to run the same set, of M functions, and transformations. To. Operate on top of that data whether, we, can compensate or, we can actually push down the execution, to the underlying data source really depends on the data source you're working with the, formula. Language power in power credits call em M, is the formula, language that allows you to define all of those transformations, and, operations, so in a way you, can think about the power query editor user. Experience, as a design. Time tool to generate M code, that, then gets run through the mashup engine to, actually pull data across all of these data sources and combine it together that's. Pretty, much all it is we already talked about most of the other things here including the custom. Connector SDK, and the number of transforms, so, with all that said I must confess I don't like slides, so, I love demos let's actually go do a few demos. So for the first demo I'm. Actually going to be showing you, getting. Started of power bi desktop with, with power query so. Let's, actually look at a very interesting example. We. Have a data set like this one this. Data set it's interesting. Let. Me zoom in a little bit. So. You know. The. Data is in a particular, shape here and we as humans, could, actually guess, what this whole thing is about let's, actually think about this highlighted, cell thirty. Three thousand two fifty two so if. We look from the top we see this row here that's the CRL and portland and then it has some empty, cells and. Okay. So those are cities, or regions great, now, let's look at row two beverages, condiments. Produce, seafood. Confections. Produce, again over there so. Great so those those sound like product, categories, great. Now, let's look at the values in row in the column one, under or column a they're.
Integral, Numbers but. They're actually not accidental. Numbers they start somewhere around 2000. Past 2000, all the way to 2013. And they're, actually incremented. By one, row. By row so we say, okay these are years perfect, so, now we can understand, that this, value here this 33,000. Value is probably something like total. Sales in the, year. 2008. In. In. Portland, for, the seafood product, category, right okay. So now we actually need to massage this data using. Power bi so, that we, can actually build a real report, on top of this this data is pivoted. Is transpose. It has a lot of built-in. Knowledge, that we need to unravel so, that we can actually normalize. These into a table that then we can actually build a chart to say well, tell me total sales in CRO, across, all the product categories or tell me the average. Total. Sales by category on a given here across all of our locations, let's. Actually go and build that the, first thing I'm going to do is grab the, file. Path here, all. Right let me go back copy, the path and. Now I'm going to transition back to power bi, stop and we're. Actually looking at the start page, dawn, means that, call. Out here for the business application submit, you guys should go to that conference I heard there's some real. Nice sessions, over there we'll, actually go get some data so I'll click get data and now, we're looking at those hundred, plus connectors, that we were talking about we're, actually going to connect to an excel file and. Okay. So here I have my file, the multi-level, spreadsheet. So. Now we are connecting, and analyzing, that spreadsheet. And we're going to return a. List of things that we found that are interesting, here there's actually just one thing there is my spreadsheet. Number one and. We're, gonna go edit because, we need to do some transformations, within the power query editor. In. This case we actually tried, to guess but we didn't guess right so. You can actually delete a couple of steps to go back to the raw data state. Now. Within here there's a few things that, we need to do the. First one that we need to do is actually transpose. So. That we turn rows, into columns. Now. We do have this first row with the years we, can actually turn that first row into the headers so we're going to use the user service headers, transformation. Wait. And, now we have this first column that holds, the the regions so remember we said the. Blank, cells were actually, Seattle. And then Portland, and then Vancouver, so we can fill those by using the field down, transformation. Within the ribbon and that. Starts looking much better so now this column is actually our. City. And. This. One is the product categories. And. Now we do have different. Columns one for each year with. The specific, data for that year now, here's the interesting. Transform. We're, gonna select these two columns and then we're going to say well those are my two dimensions. And then I have a third dimension which is the year that right now is on a per column. I need to UM pivot that data so. You can select this too and then say and, pivot, now, there's three ways you can do it you can unpeople, the ones that are selected, now which is not what I want to do here I could, unpeopled, the other columns, and I, couldn't be what only the ones that I've selected right, now so what I'm gonna do is on people the other columns, now, the way I've done it with on people the other columns ensures. That next. Year when i refresh, this report again this data set again and somebody, added a new row of data to the excel spreadsheet or a new column I guess with, with, the actual sales numbers. For say 2019. The data will come through and it will be automatically, and pivoted for you we thought you having to come back and make any changes as, stupid. As it sounds it's actually huge differentiators there's no other bi tool out there that, knows how to do this thing that I just mentioned here, anyways. We're. Actually going to clean up these column names so this is my year and these are our total, sales numbers.
And. Now we do have data in the right shape so, that's all we wanted so it's that easy for us to just create a complex, set, of transformation. Steps look, at how many steps we played here on the right on the apply the steps, area. So, we do have everything, that we did we can always go back and modify steps, again so for example I could come back to transpose, table, or come back to promote headers and make some changes there as needed at, the end of the day what we're doing is just as we said. Using. The power query editor UI, to create M code, or. M language code which, is what eventually gets runs through the massive engine once. You select one of these steps you can look at the formula, bar to see the, equivalent m so in this case use, first of all as headers in M is the table dot promote headers function, you. Can also see the full advanced, editor view with the entire set. Of M statements. Here with all the different. Steps one per row and. Actually. Even make changes here if you want so we'll towards, the end mat will actually do a deep dive into M but. Just wanted to give you a quick high-level okay. Let's. Actually do a little bit more of a calm a more complicated demo, so. In this case let's, say that, you'd, like to actually build an analysis, on top. Of this, table, here sitting on a website this. Website holds, data for the domestic. Grass. For. Domestic, movies for. Year 2019. Let's. Actually grab this URL and. We'll. Transition to a new bar where does the report we're going to go get data and, we're going to use the web connector. So. I specify, my URL and kind, of similar, to the previous example. With, the excel file now we're gonna go analyze this website, and, this website it's, written with HTML. HTML. Page so we're actually going to try and parse and understand, the HTML, tables, within, that page. All. Right so one of these tables I believe is table 5, or. Table 2. Table. 3 is the, one that actually holds, the data that I'm interested, in there's also a web view so you can actually see this table in context, I'm. Gonna go select that one and. I'll click Edit. Alright. And. That's actually giving me a preview here so I need to do a few things the first thing I need to do is actually promote. The first row to become the headers and. Then. I'm only interested in a few of these columns so I can select the, ones that I want and I could say remove other columns, so, again another one of those future-proof. Operations, because if a new column showed up in that HTML, table in the future I just studied that I'm I don't care about it I'm just excluding, everything except what I just selected. Let's. Actually clean up the titles as well the, column headers so this is the studio, and. Then. Here we have total, gross. And. Then. We have the number of theatres where that movie is currently, being displayed now. There's something interesting here, we didn't get those, columns promoted, we didn't get types detected, and you might be wondering why why didn't we automatically, detect those things well, turns, out there's. Issues here. And if I scroll all the way down I'm gonna see a. Number. Of rows here there's actually four rows that. Contain a, different type of data within, each of these columns so what's going on there if, I switch back to this page and we.
Scroll Down all. The way to the bottom we see that actually, there, is this rows here towards the bottom where, actually although, they've been formatted, as part of that HTML, table they don't belong into the regular table schema that we have for each of the rows with, movies information, that we saw what. We need to do is actually exclude, those so. We can use within, the Home tab the, remove. Rows. Remove. Bottom rows operation. The. Reason why I'm using remove, bottom. Rows is actually, that turns. Out we, have different, pages with, different sets of movies here so we have 1 through 100 100. 200. But the last page only has 39. Movies, so, if I were to just say keep the top hundred that wouldn't work because guess what the last page actually has fewer movies on it. Great. So, now that we've pulled the information that we want it for, that single, page it's, all about making. These pull the other pages we're gonna go turn, this into a smaller problems, so the first one that we're going to solve is actually, make this URL not. Hard code the parameters, so there's a parameter. Here within within, the URL that's the page and the number of that page page equals 1 we're. Actually going to switch to the advanced be on the web connector, and we're, going to bring break, this into multiple, fragments. Or, parts. For that URL. So. Now I've turned that into two parts and then I'm gonna actually, turn this into a third part which. Is just going to take the rest of the URL so, now say. That I wanted to switch this to 2 and. Click. OK. The. Courrier still works but when I move to the last step you see that now we're pulling movies starting from a hundred and one because, we've moved to the second page on. That on that specific, table, now. Let's actually make, this a parameter, would it be nice if it was not hard-coded, right I. Could. Switch this to a new parameter. And I'm gonna call it page, and. It's gonna be a text but up in parameter, because i'm concatenating it. With the rest of the URL now, let's say that i wanted to give it a current value of three and. I'm gonna click okay so, a few things happen, the first one is we now see a parameter, page with the value of three and if I switch to the final page then of course now we're getting movie starting at 201, because it's another. Page after that great. Now. That we have this query that understands. A parameter, that's, called page and pulls, data from a different page from that website, based on the value let's, actually turn, this query into, a function, it's. Really easy right click. Create. Function and because. There was a parameter, already define we actually grabbed that parameter as the function. Parameter, so, let's actually create a parameter the function, name get paid and the parameter is actually just page. Quit. So now we've turned this into a function great and, I can invoke this function let's, say I wanna pull page, number four invoke. That so we get a new query that actually is the result of invoking that so I have movie. Starting a 301. Rating, now let's, actually do. Something else we're, gonna do a blank query. Because. Now we're going to pull all the pages now, I don't know top of mine how many pages there are but, I kinda know that it's, probably, not gonna be more than a thousand, movies that year so, let me actually create, a list, of values between 1 and 10, and. What. We're gonna do we're gonna turn this into a table. And. Now that we have a table here let me assign it a type text because that's the input parameter, type that we need now, I'm gonna go and invoke that get, paid function, over, each of these rows so. We can go add a new column and we're gonna use the invoke, custom, function experience, which, effectively, will allow me to actually, invoke, any, of the other custom, functions that I've defined within this power, bi desktop file in. This case I just have one get page it, asks me for which parameter, I would like to use I'm going to say use the value from column 1 for this specific, row and hit, OK and now, we actually invoke and. Get a bunch of results back so you see on a row by row basis if I click into the cell background you, actually see the contents, for that specific. And. You actually make that larger for that specific, page so, we do see all those hundred movies now, we don't have a problem here as you can see with. Profiling, we, actually have errors, those, errors if I click on the background are, because, they, actually we could then find that page because there's only four pages that, whole data remember we had only about three, hundred and forty movies so what I can do at this point is to select this column and say I'm gonna remove, the, rows that contain error, values, and, when I do that dynamically, we'll just down set or, downsize, to only, the rows that actually, contain the, the right results, now from here I can expand this.
And. That will just give me the full set so now we. Actually get to see the full count of movies here so. We have three hundred and thirty nine values. Which, are the three hundred thirty nine movies that we had all. Right is, that good for an overview of power, query let's actually now go, talk about. A few more things. So. Because one one power query was not enough. We. Actually. We. Actually came up with a second power query power query online now, power query online is exactly, the same experience. On the same principles, but it's a web-based experience. It. Supports, all the same set of transformations, 300-plus, transformations. That we've have, in in the EM engine in. Terms of connectors, we are only are about forty five connectors, right now we're playing, catch-up with power query in the desktop but, you know we keep adding more and more every month cutting. Up to those hundred, plus that we have in desktop for. Those connections. That go against on-prem, data sources we use the on-premises, data gateway that, you probably already used to from. Using the rest of a power bi stack we've. Integrated power, query online with a few different products at this point data. Flows in power bi which we do have a deep dive session, tomorrow afternoon, the. Common data service so you can actually ingest. Data using, power query and the on-premises, gateway into, the common data service then it you can create power-ups. And flows on top of that data flow. Within. The sequel server connector, which I'm actually going to demo in. The next 30 seconds and assure. Data factory for transformations, which is currently in private review that will be going into public preview later this calendar year. Let's. Switch to a demo I'm going to show you them off power query in flow to, give you an idea of the kinds of things that you can do there. How. Many of you use flow. Alright. One. Of you should use it, check. This out work let's do more that's. What got me in. Alright, so with that said we're looking at flow so flow is basically. End. User, or business, user friendly. Experience. This. Means that to. Create automation. So it's a kind of you know orchestration. Logic, very, lightweight, so, that you can do things responding, to events and kick off new actions. Based on those events so, for example, let's, go and create a new flow that's you know based, on a schedule so we could say schedule from blank I'm gonna, call these and bus 2019. Starting. Right now at 10:00 a.m. and, we'll, run these once an hour we're. Gonna create a new flow so now we're looking at the flow designer, the, flow design time experience. Because. I'm gonna show you power query although there's three 200-plus. Connectors, here but I'm gonna show you power query which is bound. To the sequel connector, to. Allow you to reshape, and transform data on top of cycle server I'm, gonna use the transform data using power query action. Which, within the sequel, server connector, now. I'm already using a connection, that I have already configured, this is actually, going against, one, of my other sequel, databases if. This was an on-prem connection. To a non-prime sequel server you could have done, that as well through the on-premises, data gateway, now. I'm going to go into the query editor experience. Where. I'm gonna be able to very similarly to the experience that you saw earlier. Be able to pick a few tables from this sequel server database, or a, short sequel database and I'm gonna be able to apply transformations. So, what we want to do actually is create, this flow that once, a month is going to look at our monthly sales numbers, and it's gonna identify our.
Top, 10 customers, and it's going to automatically, send them a thank-you email that's. A flow that we're gonna build so, for that we need the customer, table and we. Also need the order detail table, where. We have all of our lines. Of orders and then, the orders, header or, the sales order header table. If. I click Next we're gonna go into the query editor again, you'll feel right at home here this is the power query online query, editor and we're, gonna go and fix a few things so. First of all we, go to the order detail table this is where we have for. Each line of order we're gonna have the, Product, ID that, was associated with that order there's multiple, of them it's a one to many relationship between, order, header and order detail and we, have the, product ID the unit price and the line tool there's. A unit price discount, but I don't feel like giving this count to anyone today so just, live. With that and, we're, gonna say you know select these two columns and I'm gonna add a new column that's. Gonna do a standard operation to multiply. So. It's just unit, price oh. Sorry. I screwed my, but I, wanted. To actually pick unit price and quantity, you. Can actually remove the steps if you do something wrong. Take. Unit price and. Order. Quantity, and now. We're going to multiply them. Adding. A new column. Great. So now that's our new line total it's, called multiplication. And now what we can do with the order IDs we can do a group I, to. Calculate, the order total. And. The. Operation, is going to be a sum of the, new column that we just created which, is multiplication. All, right so now we have our sales. Order IDs with the order total now if we move to the second table which was the order header here. Really all I care about is the the. Sales order ID and. The. Customer, ID column. Which. Is right here so I can select these two and remove the other columns and now, we're gonna do a merge so. We're gonna merge with the order, detail. Query. Matching. By the order ready. We. Select those two we can specify the join kind that we want and we support many of them will, just do a left outer. Click. Okay so now that's gonna give us our. Extended. Table where we have a new column that we can expand. And. I'm gonna just expand the order total column so, that's going to give us for each order. We're gonna get the order Toro and. Now we can do a group by customer, ID. So. We can calculate total, sales for. That. Specific, customer again, it's going to be the sum of the, order total that we just pulled from the other table. Great. So now we have total. Sales by customer now, let's actually go to the customer, table which is where we have all the actual contact info for these customers we're. Going to keep the customer, ID and then, we're going to keep oh wait, look at this how. Many columns can you have to actually represent a customer. Name yes, five. Of them would, it be nice if I just had one right I can. Easily merge columns and my separator, is the space and I'm gonna call this full. Name and. Hit. Okay. So. There is my customers full name column great. So now I want the customer ID and the full name and I also would like to have their, email, address.
Remove. The other columns, let's actually bring the, total sales from the other query which. Was the sales order header matching. By customer, ID on both tables. Click. Okay, so. Now for each of those customers we, do have their, total sales number let me expand, here. And. I just want total sales. Now. We have them but we have them sorted by customer, ID really, what we would like to do and many of them have haven't, done any recent. Purchase is to just sort descending, on, total sales. And. Now we'll have the actual, total. Spent from, those customers, now we said we only wanted to send a thank you email to the top ten so, we can do keep, the top rows and we'll, just keep the top ten gross. The. First ten rows and that. Actually give us our, we. Said these the top. Customers. Great. So, now we create this we're, just loading that one top customers, table. And. Now that I have it in context I can actually go and create another, step, within my flow to, say send. Email. Send. An email and. Then. Within here I can actually liberate, the dynamic, content that I had from the other table so I have, the, email address that I can use here. The. Subject would. Be thank. You for, your money. And. The. Body because, we actually have, also. Alright. We. Can just say thank you I miss. The quantity. Cause I wanted to say ten. Thousand times thank you given, the actual total spent number, anyways. But you get the drill so you basically use, power, query here we use it to do a three-way join with multiple aggregations, we also clean. Some columns to actually then feed subsequent. Steps within flow think about how, powerful, that there is and I did it without a, single line of code right amazing. Let's. Actually switch gears now. That. Was the overview demos that I had now let's actually go talk a little bit more about. Other features that, we've recently, shipped. Recently. Release features, this, is just a high-level view of everything that we've done since, the beginning of the year there's. Many features here there's 20 or so new connectors, many of them actually built by third parties and went through our certification. Program. So things like MarkLogic, SurveyMonkey. Guidance. VI connector, for OB IE and many more the. Things that I've highlighted here in purple are, actually, besides. EM intelligence, which Matt will show later today the other things are actually what, we call a smart data prep features, and. We're actually going to go and do a demo of those but. Before that there's something else we wanted to announce which is a new, power query website called. Power query dot-com, where. You can go and find everything about power query across all products. That we integrate with so let's actually go take a look at that. So. I'm switching again here to my browser. Or. Query. Dot-com. And that will actually put you on to our new website which which is launched today. Here. You can find everything you need to know about power query including, recent, news recent. Releases, what's. All about you know what connectivity, options we have what products we integrate with so, you can go for example learn about our integrations, within power bi everything. There is to know you can find here we, also have a. Lot. Of information about connectors, some of the AI features, that we're going to be demoing later. In these sessions things, like profiling. By example, capabilities. The. Resources, page is probably the most useful in my opinion so here we have not, only resources, provided. By Microsoft, but also community. Resources, books that are available out there that, other people are writing about power query forums. Community, resources, links. To all the relevant help. Pages and community, forums about all the product in the relations that we have I. Actually. Wanted, to do, a little of a little, bit of a tribute to this website so my next demo is actually going to be scraping. Data from the connectors, page here. We actually have information, about all of the connectors available within, power query and I'm going to bring that data into, power query and then we're going to do some analytics. So. Let me switch to one. Of these reports. Go. Edit queries. Let. Me start fresh. So. Many. Intermediate. Queries, in there so, I actually start a new desktop report so again we're going to be pulling data from that connectors, page we're going to be using the web. By example, connector. Because. With weapon empal besides what I showed you earlier on the other web, connector. Demo that I did where it was all nicely formatted as an HTML table the, reality, is there's many websites like this one where there, isn't such thing as a table so, for that we're going to be using web by example, all. Right switch, here. So. Let's go get some data. We. Use the web connector again.
Provide. The URL to the connectors, page and then. We're gonna go analyze, that page. We're. Gonna find the few suggested, tables actually. And. Then. I'm gonna be able to go into the full web by example, experience, as well, so. Right now we are analyzing, that page we're pulling some HTML tables. We, found some suggested, tables I actually wanna go other, table, using examples. And. That. Puts me into this interactive. Mode where I do have or, I will have the, upper half of this dialog is actually going to give me a preview of this website and the bottom have is an empty table, so. Let's actually look at it here. Scroll. Down a little bit more. Run. Into some projector, issue. So. Now we have the upper half here where we can actually look at examples. All. Right and then. The bottom half I can start just typing in whatever I want so let's say that I wanted to pull that first connector, name that's, access database and. The. Second one is active directory and, the. Moment I move to another cell, you see what happen here we. Actually brought in all of the other values, automatically. We just understood. What you were trying to do what what I was trying to do and we brought them in but, we can do much more than that so let's say that's the name column let's, actually also bring the author. And. That's. The by Microsoft, value that you see in there. And. The moment we do that we just fill all of them because at this point we already understand, the repetition, pattern based on the connector. Name so for example if I keep going down we're, going to find the VI connector, by guidance it's here as well now. Let's actually do one more column this, one is actually going to be with products, it's available in and I, don't need to type all of the same samples for the same row so in this case because, I'm lazy I'm just going to type power, bi next. To the Adobe Analytics connector because as, you can see I'm here that's the only one where it shows up right now and we, get all of the other ones so this is actually going to be the products. And. The. Last thing that I wanted to pull from here is the, connector. Category, so here's the top left corner there's data base versus. Other versus online services, so, I'm gonna call this the category and. This. One is data base and. Just, hit enter and we get all of them great, so at this point I'm ready to just go click OK and we're, gonna come back to here and I do have my table now with that preview let's, actually go and do a little bit more of transformations. We'll. Switch back to. My, query editor here, okay, so that's our one table great, so let's actually go and do a few things the first one by. Microsoft, so actually I don't like the by thing in there, it's, actually going to strip it up with. By example, you can not only extract data from an external web. Page like I showed you but you can also extract. Data from other columns, within your existing tables, in the quarry editor so if I come to the, out column tab we do have column from examples, that, will put us into the same mode so here I can just start typing what. I want extract in this case Microsoft. And. It automatically, detects, what we were trying to do which is get. Me the text after the limiter and the delimiter, is the space. So. Now I can call this column the owner. And. Of course I can get rid of the original column which I, didn't want anymore now, another, thing that I wanted to do actually we, don't have the connector name column, and. If we scroll down a little bit you see some. Examples like Dremel which, has better in parentheses so it's basically is telling us if we. Do have better in parentheses the connector is better if not, it's actually a generally, available connector, or a DA connector, so I'd like to propagate. That value as a new column I'm going to use conditional column, to, just define that as a rule so, I'd say my. Column I'm going to call it maturity, level. If. The. Column name, contains. A. Value. Of beta, in parentheses, then. The connector, is in Vera else.
It's, Actually, GA and, that's. Our maturity level column now alright, super easy what. Else can we do okay so we do have this column with products, here now, I'm, actually going to create a, little, tooltip for each of these connectors, using by example. So. If I come in here and I'm going to type something on the Adobe non-latex one so we could say Adobe. Analytics. Built. By. Microsoft. Is. A. GA connector. In. Power, bi. Period. When. I move to a next row we, actually, got the. Full. Expression. Here so we're actually constructed, that sentence, based. On just that one example so I can actually see for. All the other ones that, it actually pulled the right values in they're super. Easy so I'm gonna call these comments. And. I'm gonna hit okay so. Now we do have that new comment column and. Now that I have that these, products, column is actually not in a right shape so that I can aggregate David again would like to build a report that allows, me to slice by products, and see the, different connectors that are available in each of the products to, do that we're actually going to use use, split. Columns, by. The limiter, and. The, delimiter here is a comma, but. Instead, of a splitting into new columns I'm gonna split into new rows. So. Now we've expanded, and, whoops. For. Each other connectors, we're gonna have a different, row for each of the products that it integrates with now, we split it by comma and we do have a spaces, and so on so what we need to do here is within the transform tab we're actually going to go format. This to, trim values so, now we remove the leading and trailing white, spaces so, now we do have a, much, cleaner. Data. Set that we can use and we can start we could start doing our grids in the interest of time I'm not gonna do that but I'm just going to do some quick things. Right here. So. We can look at, profiling. So. We can turn on profiling, with column quality all are valid we, can look at the distribution of values and, we can also look at profiles so. I guess okay we could move to a previous step before my split so, we can look at those values again. Let. Me actually change this column type to text. So. Now we can look at yes we have a hundred and nine connectors, as we were saying most. Of the connectors, have been built by, Microsoft. Actually, when, we look at the deep-dive, sorry. The wrong column but. The owner, most. Of the connectors and being built by Microsoft, 85, out of those 109. Plan view is actually built 3 and then a bunch of other folks have built at least one connector, which, is great to see maturity. Level most of our connectors, are generally available 75. Of them there, is 34, of them who were actually at still. Better. And. In terms of the categories, that we support most of them are online based connectors, a bunch of databases other. Connectors, file a couple, of connectors, for power bi with power bi data sets and data flows. So. Just wanted to show you a quick demo of a, bunch, of things first one of all which, I want you to go back to is power query com go check it out we're. Super excited about the new website web, by example, data. Profiling, and a bunch of transformations, along the way let's. Actually come back here. And. Run through the rest, of. The. Presentation, so we talked about the power query overview, we've talked about some recently shipped features, we're, actually going to talk for the rest of the session about what's coming next or. Coming. Soon so. The first one not coming soon but actually coming today which was already released, is, the power bi desktop June. Release we've, made a couple of connector, enhancements. They are including the, common data service connector, is now generally available several.
Enhancements To, the cosmos the V connector, performance, improvement, more pushed down and falling improvements. On that connector, and. Hanshan's, to M intelligence, with parameter, hints in the formula bar a, new certified, connector, released, as. Part of a product enter soft built by a third party as we who wanted to bring their, connector into power bi and last. The. Power bi desktop theme, which is the new like theme is, now also integrated with the power query editor now. What. Else is coming later this year and this category I like to split in three main buckets, the first one is new connectors. Then what enhancements, are we making to existing connectors, and then what else are we doing in terms of data preparation capabilities. And general query, editor enhancements. In terms of new connectors, we're, working on a new assure derelict storage Gen 2 connector, we do have a Gen 1 connector now this, connector. Will go into public preview next, month as part of the July power bi desktop release, the, at scale connector which has been around for the last few months is going general, availability, later, this year we're, working on a new hive, ll, AP connector, as well as a new Athena connector, amazon athena connector, and i'm sure there'll be many others built by. Parries and certified, through was just like based on the last few, months trends, over there and the healthy pipeline in. Terms of enhancements, to existing, connectors, the, cosmos the B connector, will go ga in the next couple months as well we're, working on direct query over over Postgres equal s. AP HANA connectivity. Will be enhanced will support HCI containers, which is a new 2.0. HANA concept. We're. Going to add support to connect gateway for the db2 connector there's. No flick connector is getting Asha Active Directory or AD sign-on. In. Single sign-on capabilities, as well in. Terms of data prep enhancements, more, stuff around data profiling that would work in towers better filtering. Visualizations, the ability to interact with those column charts and apply operations, from them we're. Actually going to do Bend diagrams, within the merge dialogue for the join types so. Let me show you that so, the left is the current dialogue with that little drop-down at the bottom for the join kind we're, actually going to do the icons for better clarity on the different types of merge. Fuzzy. Clustering so we, did, release fast emerge early this year fuzzy clustering will actually allow you to cluster, values within, a single column similar, values based on the fuzzy or. The similarity, score, koala, diagnostics which is a feature that will, allow you to get better insight, into how, your queries are performing, how many or. What steps have been pushed down to the underlying data source, how long it takes to run according the underlying data source versus locally and will allow you to better troubleshoot, and diagnose performance. Issues within your queries with. That said the. Last demo it's. Going to be a M deep dive and M intelligence, demo by mr. Magnuson. He. Said great thank you very much you know we need help with the mic and is this on now ok, great. Thank. You Miguel Miguel, is I always enjoy watching Miguel's demos even though I've seen them so many times they, are always, so, interesting. And perfect, and so. My demo today is gonna be pretty quick so. Hi. Everybody Matt Mason's my name I've been demoing power, query and M for a while. One of the things I first. Time I saw a power query first time I saw the language I fell in love with, M and I've. Been heavy M user since and, one of things I like doing is promoting the use of M and one of the things that we've been doing to further promote M is to make it a better experience. So. As you saw on some of the demos today we. Now have intellisense. Fram in power bi desktop right. So that's in the advanced editor in, the formula bar in the add column, dialog, we're, now bringing intellisense, to all the M surfaces, and so what I was going to show today as, you see this is Excel. Which. Is also has a power query experience. And. An upcoming feature, which will be going out sometime, after July depending, on what Excel ring you're on is intellisense. In Excel so, I can type this is a string. You. See the nice highlighting. And. I can do number. Two. Text. And. I get syntax. Highlighting, and everything, and I, can zoom in just to show you guys so, yeah now you have full intellisense, with a drop-down. Like. Any other programming, language one. Of the new features that we've just added in recently. I believe we went out in the June release is, in. The formula bar you get full intellisense. Drop down hover, capabilities. As well so you hover your mouse over you get, context.
About What the different values are. And. As, a, says I have a couple of minutes I was just gonna do a quick. Building. M from. Scratch, type thing so, one, of the useful, functions, we have is list dot, dates. I. Can. See. The arguments, for list dates, is start. Date count and steps so my start dates will be two. Nineteen, one, one my. Count that's two 365. 365. Days. And. My. Duration. Step, will be one day whoops. Wrong. Argument there zero, zero, zero. And. So now I have basically. Automatically, generated the list of dates. And. What I'm gonna do is actually build my own date dimension using purely with M rather than importing from a table or anything like that right creating a date dimension a custom date dimension you can add your custom attributes, to very, useful thing and I'm just gonna do that purely nm so. I'll start by converting this to a table. Rename. This to, date I, will. Change the, tranquilly. Transform, the type to date as well and, bring. Up our handy, add custom. Column dialog, alright, so custom column lets me type in an expression. And. I can start. Typing, in date functions so date year. For example. And. My. List of available columns. Comes up I have date. And. Let's call this. Year. So. It extracts the Year value I. Can, add another column for. Let's. Call this month, name. So. Dates dot, month, name. And. I. Could still double, click my, columns, and like that but my, columns also now, come up as, part, of my. Intellisense. Suggestions. One. Of the neat things a month, thought name as I can see here, I. Can, let month date date. Dot month name can take in a culture so if I wanted to. Pass. In a culture value, like. French. I, now. Get French. Date names alright, so. There's a lot of I, could keep going building. My entire date dimension I will copy and paste. As. A quick, and cheap. Bring. Up the advanced editor. Well. Dennis I want to call out some of the advanced editor features as well so. We have these display options, so, I can do things like, have. Line numbers. Enable. A word wrap so values wrap around, even. Display, a little mini-map, in the in the corner there for very long queries it lets you see a miniaturised view of your query you can scroll. Around, I'll. Replace this with it you can also copy paste and, I can copy and paste. So. I have copy and pasted in an existing function. Handy. Function that, lets me actually generate, my, full.
Date Dimension now. Because. My function as you see here I've declared that, the, start date parameter, is a date and end date is a date the, power query you I automatically, generates, the appropriate, controls, for you all. Right so rather, than I can just enter text but because, power, query knows it's a date it also gives, you a parameter. Prompt here. So. I could pick that's. Optional. Culture I'll leave it as my, default and, now. I've generated a full date dimension alright with all the different values all the different calculations, that have put in so. This. Is just scratches the surface of the things that you can do with em Miguel, said there's over 300 transformations, built into the product we try to expose as much as you can with the UI there's, a bit of a discussion on Twitter during that box office mojo demos showing that we've been doing the demo for years and every, time we do it it, gets less. And less type II and more and more just UI clicky, because we start, adding, more and more functionality exposing, more functionality, in the UI making, a product, easier to use. So. Yeah Excel intellisense. Coming, soon next. Up is power query online, intellisense. - so. Yeah stay tuned for that thanks. Matt, so we are almost at time just, wanted to point you guys to the related, sessions done by our team, between. Today and tomorrow many, of them happened in the past so you should either, check out the recording, or take. A time machine up to you. Either, one of the two works if you, guys like my jokes, too more on doing another session. In the afternoon about. Data flows and. We'll have some big, announcements, there around data flows as well so to recommend you guys to go there. Besides, that just all the other sessions for the team around the gateways CDM do. All right capabilities, building. Connectors, both on power bi as well as power, absent flow and many more sessions. We, do have three minutes left so we'll open up for questions and. Whether. You, like the jokes or not just don't forget to evaluate, the session will tell jokes up or down for next time accordingly. Thank you Thanks so. Questions. First question is always how so let's actually go with the second question. Any. Second question. All. Right well if there are no questions we could go on location so, my question is why did we do power for e-comm and not get in transform datacom, oh. Sure. We should buy that domain as well, question, over here is, it possible to specify. Login. Credentials. On the web. Connector. Sorry. I didn't hear, Billy, the question was is it possible to specify login. Credentials, as part of the web connector so, for example your connection to a web page that's protected, by a basic author things like that. No. It is not, especially. If it's form based off of cookie. Based off we, don't maintain that browser session, so. Those those, types of pages currently, are not supported. Okay thank you. What. About fiscal. Time. Periods. I. Do. So built-in function, built-in end functions for generating fiscal, time periods I'm. Not sure about that. If. You want to come by the data integration booth and tell me exactly what you're looking for we can find the matching function. Question. Over there. I don't. Know if this is too early but can, you elaborate any, more on what the way, your fuzzy matching. We'll look like, yeah. So the question is if. We could go deeper into fussy matching and describing, the capabilities there we. Do have plenty of dogs there but it's basically if, you look at the merge dialogue, experience, in power bi desktop, we. Actually provide, not only strict matching but also fussy matching and you can configure many knobs including, what, similarities dress code you'd like to have whether, to consider multiple words, a single wore uppercase, or lowercase, sensitivity.
And Also synonyms. And things, around that level so, we. Do have a lot of dogs on that and the other thing we will need to say is that that feature is also coming to itself later this year but yeah go go check them out and if you have a specific questions that we can chat, in. Terms of actual implementation, it's a derivative of MSRs. Fuzzy logic libraries, so if you've ever used the fuzzy logic libraries. For Excel or add-ins for Excel or an SSI ass it's, all as an evolution, of those technologies, okay, thank you. Question. The. Data quality function. I've used it a lot but it'd be great if it you could actually. Export. That. Attribute, immediately. In order to be able to communicate with data providers, about. How their quality, is so. The, comment if I understand right is about the the all the data profile in charts and that data being able to export, that in a in a way to. To a report or something that you can save off yes, in fact maybe even just, a button to push it to positive there's actually, if you yeah, check. Out a blog post that Reza, Roth wrote, actually a couple of weeks ago a magician, underlying, function. Table dot aggregate, profile. That allows you to pull all of that information I actually saw that but. The, ability actually, did get a graphical of, the data to maybe, even a few more statistics, around it would be great yeah, yep. Yeah. That's great feedback, just. A question cite you, did so many stab in, a power query right. If. I have to repeat the same steps to. Create to clean out a different set of data is. It possible for me to SAP save, those steps and, then, I reapply. It yeah. So the question is you, were, using the same steps across. Different, reports and so so, today we don't have like a built-in way to you, know peanut, function or save your own function somewhere else we do support copy-paste across, but, that's great feedback and something we'll look at in the future go, to ideas. Or powerbi.com to, vote for that capability. Across. And. So within ya within. A single report you can always take, your logic and say create function save it as a function and then call that function again but yeah when. It comes to across, sources. That's something that we'd, love to add thank. You yeah thank you. When. I create a custom. Function can. I put it in an auto refresh. Within. Power bi, yeah. So the question is if we use a custom function can I use it on our road refresh and power bi so, the answer is functions. In general yes. This. Particular, case is there where sometimes it would work sometimes it won't particularly, things that dynamically configure, data sources are not supported, today in power bi service refresh but that's something we're working on with, the power bi service team, to address in the future. Well. There will, there ever be a feature to be able to query. A power bi data model. Where. You could get your table with tax calculations. And and, everything you've built in another report yeah. The question is whether there'll be a feature, to query a power bi data model, it's. A great question for the power bi data modeling, team because, with the custom connector SDK, they could definitely build that connector, I don't think there's anything in the works right now I think. That that's one of those things that needs to be voted, on ideas the power bi com I think that's it like, from from the technical perspective the power bi data model is is a tabular. Model and. We've just never had to build the connector in power query to connect directly to the model we have an ASM, and BX, connector, right we have we, have the Q base connector. But. Power, bi already speaks stacks already knows how to talk the tabular model we just never had to do it so that's one of those features that enough. Votes, enough voting on ideas, that power bi calm we could definitely do it it's just not on the current roadmap. Okay, thanks questions. Any, suggestions, on training and certification. Sorry. Any, suggestions, on training and certification. Training. Around, certification. We do have a dark side that has over a hundred dogs, its, dogs.
At Microsoft that come slash power query and. Yeah. Go check it out one of those pages. Is specifically, about the certification process the. Rest are all documentation. About how to build custom connectors. Yeah. Yeah. Oh. Oh. I see. Like. Like I started that Microsoft, cert thing. It's. Been a while since I've looked at it the last last I saw the. Power bi certification, usually includes some form of power. Query stuff, in their dedicated. Power query courses, or there's, a dedicated, power query one there's power bi training, courses that will walk you through the data and there's also of course the other webinars. And so on and you know getting started with power query and all the books that we have on the site Thanks. My husband pointed out but the, resources page that you saw on the power query dot-com site go check them out there's a lot of books there and we also have links, to external blogs, and folks, were doing power poly training not Microsoft but power query training, run by the community and the MVPs which are excellent. When. You're you when you're using the URL to pull. Data on the movies for 2019, is it possible to have two parameters like, one for page number and one for a year to pull all the movies from like, 1990. Through 2019, yeah, so the question is if I were able to have multiple parameters, in that URL so you saw how I was constructing. Multiple parts, you, could further break down that URL as long as the URL accepts that here and in this case it did so, yeah it's definitely something you can do all right thank you can, run into some complications, when it comes to refresh in the service depending on how dynamic your, URLs are yeah I talked, about that my, previous session around custom connectors if it gets to a point where you're just generating, dynamic URLs all the time. It might simplify your life if you just build a custom connector like you could have a box, office mojo connector, that, automatically, pages users don't have to mess with with, page. Numbers properties, and like that it just knows how built into the connector it will know how to do all that paging for you so. Regarding. Your custom, day table made in M language, is it, possible, to graph that and have it in an act x axis, will, it be categorical. Or continuous. I. Understood. Half, of those words so. Let's talk about that after it unless you understate it I, think. You asked about a. Date. Table and whether you can configure the the access to the categorical or contiguous, yes, so currently the auto date table allows you to have a line, graph and the x-axis will let you go through the hierarchy and it always keeps everything in view you.
Create Your own date, table and drop through a custom. Hierarchy, it defaults. It to categorical. So you never see everything in total view it'll then just viewed, by category. So. I'd, okay, now I'm sorry I do I understand. You. So, if you auto generate and you'll get the same behaviors, if you're importing it from an external source like, it will behave the same way so it's not gonna add any new capabilities, there okay, thank you okay. Last. Question I believe. Another. Question. Within. Power query um what kind of performance. Issues, we have writing a lot. Of Dax functions, and formulas within there. So. Well M formulas, and power query so okay. And yeah, we have em and in power query side has the data transformation, and then once. Your data is loaded into the model then you do your Dax stuff so which which level is your question for M. The, M side yeah so, I mean overall performance it, really depends on the data you're connecting to the data sources and things like that one, of the magic things that power query does is we try to push down as much as the work as possible to the backend source right. So if you're working with terabytes. Of data and, a, relational, data system, data, source and your transformation is gonna be fully pushed. Then, your performance, is limited to whatever your back-end system performance is if you're, trying to do tons of transformations, over a hundred terabyte CSV, file your performance is going to be terrible because there's no for us to push to write there's no magic we can do and would you say that doing that on the back end is better than doing it on the front end it. Really depends on the back end all right and it depends on the capability, of the, source right if you have a, so. If ultimately, your data source is a large, CSV, file and you put a web service in front of it I don't, know if you'll get any better experience. Than bringing. The entire file down and pr