Data Transformation with Power BI

Data Transformation with Power BI

In today’s post we will examine the data transformation capabilities of Power BI. We will be using a sample dataset in excel to get you started and then we will see what are the possibilities

Getting your data into Power BI

In my last post, I discussed that various data connections available in power BI to use as data source. In case, you have missed the various data types power BI supports. do check this post out.

when you open Power BI, you are greeted with a welcome message, here you can select the get data button

Power Bi Welcome screen

after you click the Get-Data Button, you will be get the list of data sources power BI can use for data input

Data types in Power BI

you can start by selecting the data input, in my case I am using an excel file so I will double click on excel or clicking on connect button below.

After you have browsed for the file you want to load, you will get to the loading screen as below

components of loading screen

The loading screen is home to several components which are explained below.

  1. If you excel file contains more than 1 sheet, all the sheets will be listed here and can be selected by checking the check box next to its name.
  2. A preview of the data being loaded
  3. Load button to load the data into model
  4. Transform Data to get into query editor to transform the data before it can be modeled
  5. Cancel button to abort the loading operation

You can either click Transform data button here to get to the query editor or alternatively if you feel the need after you have loaded the data and Power BI have classified the data in wrong data type, you can open the query editor as follows

Clicking on the Transform data button in the Home menu.

Once you click the transform Data button. you will be greeted with the query editor

With no data connections, Power Query Editor appears as a blank pane, ready for data.

A blank Power Query Editor screen

Once a query is loaded, Power Query Editor view becomes more interesting. If you connect to the following Web data source, Power Query Editor loads information about the data, which you can then begin to shape.

Here’s how Power Query Editor appears once a data connection is established:

  • In the ribbon, many buttons are now active to interact with the data in the query.
  • In the left pane, queries are listed and available for selection, viewing, and shaping.
  • In the center pane, data from the selected query is displayed and available for shaping.
  • The Query Settings window appears, listing the query’s properties and applied steps. Elements of the Power Query Editor

The following sections describe each of these four areas—the ribbon, the queries pane, the data view, and the Query Settings pane.

The query ribbon

The ribbon in Power Query Editor consists of five tabs—Home, Transform, Add Column, View, and Help.

The Home tab contains the common query tasks, including the first step in any query, which is Get Data. The following image shows the Home ribbon.

The Home ribbon

To connect to data and begin the query building process, select the Get Data button. A menu appears, providing the most common data sources.

Get data from a new source

The Transform tab provides access to common data transformation tasks, such as adding or removing columns, changing data types, splitting columns, and other data-driven tasks. The following image shows the Transform tab.

The Transform ribbon

The Add Column tab provides additional tasks associated with adding a column, formatting column data, and adding custom columns. The following image shows the Add Column tab.

The Add Column ribbon

The View tab on the ribbon is used to toggle whether certain panes or windows are displayed. It’s also used to display the Advanced Editor. The following image shows the View tab.

The View ribbon

It’s useful to know that many of the tasks available from the ribbon are also available by right-clicking a column, or other data, in the center pane.

The left pane

The left pane displays the number of active queries, as well as the name of the query. When you select a query from the left pane, its data is displayed in the center pane, where you can shape and transform the data to meet your needs. The following image shows the left pane with multiple queries.

The left pane of Power Query Editor

The center (data) pane

In the center pane, or Data pane, data from the selected query is displayed. This is where much of the work of the Query view is accomplished.

In the following image, the Web data connection established earlier is displayed, the Overall score column is selected, and its header is right-clicked to show the available menu items. Notice that many of these right-click menu items are the same as buttons in the ribbon tabs.

The center pane

When you select a right-click menu item (or a ribbon button), Query applies the step to the data, and saves it as part of the query itself. The steps are recorded in the Query Settings pane in sequential order, as described in the next section.

The query settings pane

The Query Settings pane is where all steps associated with a query are displayed. For example, in the following image, the Applied Steps section of the Query Settings pane reflects the fact that the type of the Overall score column has changed.

The Query Settings pane

As additional shaping steps are applied to the query, they are captured in the Applied Steps section.

It’s important to know that the underlying data is not changed; rather, Power Query Editor adjusts and shapes its view of the data, and any interaction with the underlying data occurs based on Power Query Editor’s shaped and modified view of that data.

In the Query Settings pane, you can rename steps, delete steps, or reorder the steps as you see fit. To do so, right-click the step in the Applied Steps section, and choose from the menu that appears. All query steps are carried out in the order they appear in the Applied Steps pane.

Renaming a step in Query Settings

The Advanced Editor

If you want to see the code that Power Query Editor is creating with each step, or want to create your own shaping code, you can use the Advanced Editor. To launch the advanced editor, select View from the ribbon, then select Advanced Editor. A window appears, showing the existing query code.

The Advanced editor

You can directly edit the code in the Advanced Editor window. To close the window, select the Done or Cancel button.

Saving your work

When your query is where you want it, you can have Power Query Editor apply the changes to the data model into Power BI Desktop, and close Power Query Editor. To do that, select Close & Apply from Power Query Editor’s File menu.

Close and load your query to Power BI Desktop

As progress is made, Power BI Desktop provides a dialog to display its status.

Progress dialogue

Once you have your query where you want it, or if you just want to make sure your work is saved, Power BI Desktop can save your work in a .pbix file.

To save your work, select File > Save (or File > Save As), as shown in the following image.

Save your Power BI Desktop report

Shape data

When you shape data in the Power Query Editor, you’re providing step-by-step instructions (that Power Query Editor carries out for you) to adjust the data as Power Query Editor loads and presents it. The original data source is not affected; only this particular view of the data is adjusted, or shaped.

The steps you specify (such as rename a table, transform a data type, or delete columns) are recorded by Power Query Editor. Each time this query connects to the data source, those steps are carried out so that the data is always shaped the way you specify. This process occurs whenever you use the Power Query Editor feature of Power BI Desktop, or for anyone who uses your shared query, such as on the Power BI service. Those steps are captured, sequentially, in the Query Settings pane, under Applied Steps.

The following image shows the Query Settings pane for a query that has been shaped—you’ll go through each of those steps in the next few paragraphs.

Query settings for Power Query Editor

Using the retirement data from the Using Power Query in Power BI Desktop quickstart article, which you found by connecting to a Web data source, you can shape that data to fit your needs.

For starters, you can add a custom column to calculate rank based on all data being equal factors, and compare this to the existing column Rank. Here’s the Add Column ribbon, with an arrow pointing toward the Custom Column button, which lets you add a custom column.

Create a custom column button

In the Custom Column dialog, in New column name, enter New Rank, and in Custom column formula, enter the following:

    ([Cost of living] + [Weather] + [Health care quality] + [Crime] + [Tax] + [Culture] + [Senior] + [#"Well-being"]) / 8

Make sure the status message reads ‘No syntax errors have been detected.’ and select OK.

Custom column dialog

To keep column data consistent, you can transform the new column values to whole numbers. Just right-click the column header, and select Change Type > Whole Number to change them.

If you need to choose more than one column, first select a column then hold down SHIFT, select additional adjacent columns, and then right-click a column header to change all selected columns. You can also use the CTRL key to choose non-adjacent columns.

change data type

You can also transform column data types from the Transform ribbon. Here’s the Transform ribbon, with an arrow pointing toward the Data Type button, which lets you transform the current data type to another.

Transform ribbon arrow

Note that in Query Settings, the Applied Steps reflect any shaping steps applied to the data. If you want to remove any step from the shaping process, you simply select the X to the left of the step. In the following image, Applied Steps reflects the steps so far, which includes connecting to the website (Source), selecting the table (Navigation), and, while loading the table, Power Query Editor automatically changing text-based number columns from Text to Whole Number (Changed Type). The last two steps show your previous actions with Added Custom and Changed Type1.

Transform ribbon

Before you can work with this query, you need to make a few changes to get its data where you want it:

  • Adjust the rankings by removing a column—you’ve decided Cost of living is a non-factor in your results. After removing this column, you find the issue that the data remains unchanged, though it’s easy to fix using Power BI Desktop, and doing so demonstrates a cool feature of Applied Steps in Query.
  • Fix a few errors—since you removed a column, you need to readjust your calculations in the New Rank column. This involves changing a formula.
  • Sort the data—based on the New Rank and Rank columns.
  • Replace data—this tutorial will highlight how to replace a specific value and the need of inserting an Applied Step.
  • Change the table nameTable 0 is not a useful descriptor, but changing it is simple.

To remove the Cost of living column, simply select the column and choose the Home tab from the ribbon, and then Remove Columns, as shown in the following figure.

Changed type

Notice the New Rank values have not changed; this is due to the ordering of the steps. Since Power Query Editor records the steps sequentially, yet independently of each other, you can move each Applied Step up or down in the sequence. Just right-click any step and Power Query Editor provides a menu that lets you do the following: Rename, Delete, Delete Until End (remove the current step, and all subsequent steps too), Move Up, or Move Down. Go ahead and move up the last step Removed Columns to just above the Added Custom step.

Move up in Query Settings

Next, select the Added Custom step. Notice the data now shows Error, which you’ll need to address.

Query settings order

There are a few ways to get more information about each error. You can select the cell (without selecting the word Error), or select the word Error directly. If you select the cell without selecting the word Error directly, Power Query Editor displays the error information on the bottom of the window.

Error information

If you select the word Error directly, Query creates an Applied Step in the Query Settings pane and displays information about the error. You don’t want to go this route, so select Cancel.

To fix the errors, select the New Rank column, then display the column’s data formula by opening the View ribbon and selecting the Formula Bar checkbox.

Formula bar

Now you can remove the Cost of living parameter and decrement the divisor by changing the formula to the following:

    Table.AddColumn(#"Removed Columns", "New Rank", each ([Weather] + [Health care quality] + [Crime] + [Tax] + [Culture] + [Senior] + [#"Well-being"]) / 7)

Select the green checkmark to the left of the formula box or press Enter, and the data should be replaced by revised values. The Added Custom step should now complete with no errors.


You can also Remove Errors (using the ribbon or the right-click menu), which removes any rows that have errors. In this case it would’ve removed all the rows from your data, and you don’t want to do that—you probably like your data, and want to keep it in the table.

Now you need to sort the data based on the New Rank column. First select the last applied step, Changed Type1, to get to the most recent data. Then, select the drop-down located next to the New Rank column header and select Sort Ascending.


Notice the data is now sorted according to New Rank. However, if you look in the Rank column, you’ll notice the data is not sorted properly in cases where the New Rank value is a tie. To fix this, select the New Rank column and change the formula in the Formula Bar to the following:

    = Table.Sort(#"Changed Type1",{{"New Rank", Order.Ascending},{"Rank", Order.Ascending}})

Select the green checkmark to the left of the formula box or press Enter, and the rows should now be ordered in accordance with both New Rank and Rank.

In addition, you can select an Applied Step anywhere in the list, and continue shaping the data at that point in the sequence. Power Query Editor will automatically insert a new step directly after the currently selected Applied Step. Let’s give that a try.

First, select the Applied Step prior to adding the custom column—this would be the Removed Columns step. Here you’ll replace the value of the Weather ranking in Arizona. Right-click the appropriate cell that contains Arizona’s Weather ranking and select Replace Values from the menu that appears. Note which Applied Step is currently selected (the step prior to the Added Custom step).

Replace values

Since you’re inserting a step, Power Query Editor warns you about the danger of doing so—subsequent steps could cause the query to break. You need to be careful, and thoughtful! Since this is a tutorial that’s highlighting a really cool feature of Power Query Editor to demonstrate how you can create, delete, insert, and reorder steps, go ahead and select Insert.

Insert a step

Change the value to 51 and the data for Arizona is replaced. When you create a new Applied Step, Power Query Editor names it based on the action—in this case, Replaced Value. When you have more than one step with the same name in your query, Power Query Editor adds a number (in sequence) to each subsequent Applied Step to differentiate between them.

Now select the last Applied Step, Sorted Rows, and notice the data has changed regarding Arizona’s new ranking. This is because you inserted the Replaced Value step in the right place, before the Added Custom step.

That was a little involved, but it was a good example of how powerful and versatile Power Query Editor can be.

Lastly, you’ll want to change the name of that table to something descriptive. When you get to creating reports, it’s especially useful to have descriptive table names, especially when you connect to multiple data sources, and they’re all listed in the Fields pane of the Report view.

Changing the table name is easy. In the Query Settings pane, under Properties, simply type in the new name of the table, as shown in the following image, and select Enter. Call this table RetirementStats.

Rename a table

You’ve shaped that data to the extent you need to. Next, you’ll connect to another data source and combine data.

Combine data

The data about various states is interesting, and will be useful for building additional analysis efforts and queries. But there’s one problem: most data out there uses a two-letter abbreviation for state codes, not the full name of the state. You need some way to associate state names with their abbreviations.

You’re in luck. There’s another public data source that does just that, but it needs a fair amount of shaping before you can connect it to your retirement table. Here’s the Web resource for state abbreviations:

From the Home ribbon in Power Query Editor, select New Source > Web and type the address, select Connect, and the Navigator shows what it found on that Web page.

The Navigator window

Select Codes and abbreviations… because that includes the data you want, but it’s going to take quite a bit of shaping to pare that table’s data down to what you want.


Is there a faster or easier way to accomplish the steps below? Yes, you could create a relationship between the two tables, and shape the data based on that relationship. The following steps are still good to learn for working with tables; just know that relationships can help you quickly use data from multiple tables.

To get this data into shape, take the following steps:

  1. Remove the top row—it’s a result of the way that Web page’s table was created, and you don’t need it. From the Home ribbon, select Reduce Rows > Remove Rows > Remove Top Rows. Remove top rows The Remove Top Rows window appears, letting you specify how many rows you want to remove.


If Power BI accidentally imports the table headers as a row in your data table, you can select Use First Row As Headers from the Home tab, or from the Transform tab in the ribbon, to fix your table.

  1. Remove the bottom 26 rows—they’re all the territories, which you don’t need to include. From the Home ribbon, select Reduce Rows > Remove Rows > Remove Bottom Rows. Remove bottom rows
  2. Since the RetirementStats table doesn’t have information for Washington DC, you need to filter it from your list. Select the drop-down arrow beside the Region Status column, then clear the checkbox beside Federal district. Filter data
  3. Remove a few unneeded columns—you only need the mapping of the state to its official two-letter abbreviation, so you can remove the following columns: Column1, Column3, Column4, and then Column6 through Column11. First select Column1, then hold down the CTRL key and select the other columns to be removed (this lets you select multiple, non-contiguous columns). From the Home tab on the ribbon, select Remove Columns > Remove Columns. Remove columns


This is a good time to point out that the sequence of applied steps in Power Query Editor is important, and can affect how the data is shaped. It’s also important to consider how one step may impact another subsequent step. If you remove a step from the Applied Steps, subsequent steps may not behave as originally intended because of the impact of the query’s sequence of steps.


When you resize the Power Query Editor window to make the width smaller, some ribbon items are condensed to make the best use of visible space. When you increase the width of the Power Query Editor window, the ribbon items expand to make the most use of the increased ribbon area.

  1. Rename the columns, and the table itself—as usual, there are a few ways to rename a column. First select the column, then either select Rename from the Transform tab on the ribbon, or right-click and select Rename… from the menu that appears. The following image has arrows pointing to both options; you only need to choose one. Many ways to rename Rename the columns to State Name and State Code. To rename the table, just type the name into the Name box in the Query Settings pane. Call this table StateCodes.

Now that you’ve shaped the StateCodes table the way you want, you’ll now combine these two tables, or queries, into one. Since the tables you now have are a result of the queries you applied to the data, they’re often referred to as queries.

There are two primary ways of combining queries: merging and appending.

When you have one or more columns that you’d like to add to another query, you merge the queries. When you have additional rows of data that you’d like to add to an existing query, you append the query.

In this case, you’ll want to merge queries. To get started, from the left pane of Power Query Editor, select the query into which you want the other query to merge, which in this case is RetirementStats. Then select Combine > Merge Queries from the Home tab on the ribbon.

Merge queries

You may be prompted to set the privacy levels to ensure the data is combined without including or transferring data you didn’t want transferred.

Next the Merge window appears, prompting you to select which table you’d like merged into the selected table, and then, the matching columns to use for the merge. Select State from the RetirementStats table (query), then select the StateCodes query (easy in this case, since there’s only one other query—when you connect to many data sources, there are many queries to choose from). When you select the correct matching columns—State from RetirementStats, and State Name from StateCodes—the Merge window looks like the following, and the OK button is enabled.

Merge window

A NewColumn is created at the end of the query, which is the contents of the table (query) that was merged with the existing query. All columns from the merged query are condensed into the NewColumn, but you can select to Expand the table, and include whichever columns you want.

New column can be expanded

To expand the merged table, and select which columns to include, select the expand icon (Expand). The Expand window appears.

Expanded column

In this case, you only want the State Code column, so you’ll select only that column, and then select OK. Clear the checkbox from Use original column name as prefix because you don’t need or want that. If you leave that checkbox selected, the merged column would be named NewColumn.State Code (the original column name, or NewColumn, then a dot, then the name of the column being brought into the query).


Want to play around with how to bring in that NewColumn table? You can experiment a bit, and if you don’t like the results, just delete that step from the Applied Steps list in the Query Settings pane. Your query returns to the state prior to applying that Expand step. It’s like a free do-over, which you can do as many times as you like until the expand process looks the way you want it.

You now have a single query (table) that combined two data sources, each of which has been shaped to meet your needs. This query can serve as a basis for lots of additional, interesting data connections, such as housing cost statistics, demographics, or job opportunities in any state.

To apply changes and close Power Query Editor, select Close & Apply from the Home ribbon tab. The transformed dataset appears in Power BI Desktop, ready to be used for creating reports.

Close and apply your settings

8 thoughts on “Data Transformation with Power BI

Leave a Reply

Your email address will not be published.