Curriculum
Course: Data Analysis 2024 (Batch B)
Login

Curriculum

Data Analysis 2024 (Batch B)

Week 10

0/36
Text lesson

Class Twenty – Six; Text (Shape data)

Shaping can mean transforming the data, such as renaming columns or tables, removing rows or columns, or changing data types. Power Query Editor captures these steps sequentially under Applied Steps in the Query Settings pane. Each time this query connects to the data source, those steps are carried out, so the data is always shaped the way you specify. This process occurs when you use the query in Power BI Desktop, or when anyone uses your shared query, such as in the Power BI service.

First step is to identify the column header and names within the data and then evaluate where they are located to ensure they are in a right place.

How to correct Header when they are not place in the right place.

1.    By select the use first Row as Headers on a Home tab or

2.    By selecting the dropdown button next to the column 1 and then select use first Row as Header.

Next step is to rename column. You can Re-name column when you see that after selecting first Row as Header and some names does not appear rightly, you can click the column to rename it with the right name.

Remove Top Rows

When shaping your data, you might need to remove some of the top rows, if they are blank or if they contain data that you do not need in your project.

To remove the excess rows, select remove rows on Home tab and click remove top rows.

(You can specify by remove top 10 rows). Same applicable to bottom rows.

Remove Columns

A key step of data shaping process is to remove unnecessary columns. Examine each column and ask yourself if you really need the data that it contains.

How to remove column

1.    Select the column you want to remove and on Home tab, select remove column or

2.    You can select the columns you need and then on the Home tab select remove column and click remove other columns.

Change data type

If you need to change a data type, select the column or columns to change. right-click a column header, select Change Type, and choose a new data type from the menu, or drop down the list next to Data Type in the Transform group of the Home tab, and select a new data type.

Split Column by Delimiter

When you have many items in a column (ie Full Name) and you want to split the name to be FirstName on its separate column from LastName.

Right click the column and select split column and choose by delimiter then select space if it is space that separate FirstName from LastName and click Ok to separate them to two columns then Rename your column.

Change value from column

Use replace value to change any value you wish to change from your column.

Null should not be in any of your aggregation column (when you use average function and you have null value in the column, it will give you a wrong answer). When you want to use aggregation functions, make sure you replace a null value with zero (0).

Unpivot Columns

This can be use when data is gotten from any data source but it is most often use when data was imported from Excel sheet.

How to unpivot

When you have a data in power query that look like

Year

2018

2019

January

15370

23462

February

14960

34871

March

13450

52343

April

12580

92710

May

11230

64815

June

16952

81634

July

17520

40291

 

If you leave the data like this, it will be difficult to create of all cells combine from 2018 and 2019.

First rename the “Year” to “Month”. Highlight the 2018 and 2019 columns the select the transform tab in power query and then select unpivot

Then rename the attribute column to “Year” and Value column to “Sale Amount”.

Views

Column Quality; it is used to show information of the column (ie Valid, error and empty).

Column profile; giving information about data but its limit to one column.

Column distribution; shows the number of distinct and unique values.

Append and Merge Query

To append data, under home tab click append query and select append query as new and a dialog box will appear with options; (a) Append with two tables (b) Append with three or more tables. Select base on the number of tables you want to append. After selecting three or more tables then two boxes will appear, the one at the left with all the names of your table and the second at the right with one table name, highlight all the table at the left box and click add in between the table to move them to the right-side box then click Ok to append your table.

Appending is to join table vertically while Merging is to Join table Horizontally.

To merge two tables, they must have header relationship.

under home tab click merge query and select merge query as new and a name of the first table will appear with a dialog box empty click the drop down on the dialog box and select the table name you want to merge with the first table, then click the column that have relationship for both tables, and select inner (only matching rows) to match your table and click Ok.