I had a conversation last week with a Excel power user who wanted to combine data he got delivered from his sales department in a csv file and data in one xlsx file with the target of the sales.
He had difficulty understanding how he can combine these two in PowerPivot. Although he had seen some demo’s and read some blog posts about fact tables, dimension tables and relationships and he still was unable to get these 2 sources combined. In this blog post we are going to combine two fact table by creating a relationship and take a look at how relationships work.
The purpose is to create a report where we can see the number of sales by month by country next to the target values.
We start by importing the files into PowerPivot, first we insert the txt file:
Then the target values by year, month and country from the xlsx file by using create linked table
This results into two tables inside PowerPivot
So we now are able to create a report using our sales we imported.
But to get values of the two different tables in one pivottable we have to relate the two to each other, PowerPivot warns us that we should create a relationship, but is unable to detect one for us.
We have to create one ourselves, this is where we have to stop thinking like a BI pro, when I started to think about a solution for this I immediately started to think about creating new dimensions, importing new tables to use as intermediate table between the two fact tables, like a kimbal star diagram:
We have to remember we are building an ad hoc report with data that doesn’t come from an ETL, we have gathered some data somewhere and all we need is a quick answer, not to build a Bi application that we have to use for all kinds of analytics in the next few year.
We have to take a few steps back and remember what the purpose of this report was, i want to get the sales to be combined by the target. Combining two tables means creating a relationship. To be able to do this we have to take a look at what relationships between tables are. In powerpivot relationships are called “lookup relationships”:
This is a great name to describe what happens with relationships. Lets say we have one table containing values per country (fact table). The country column from the fact table is filled with countrycode. We want to show the country name instead of the code in our report. To be able to do this we create a secondary table called country. We want PowerPivot to lookup values from the country table every time we show values from the fact table by countrycode. To do this we create a relationship:
In this case we would select our facttable as “table” in the relationshipscreen and we select the country table to lookup the country value in. This would make it possible to show the countryname instead of the countrycode at our pivottable.
Ok back to our example, we have two tables where we would like to lookup the target value from the “target” table next to each value from the “fact” table. To be able to lookup a value from a table you have to make sure PowerPivot can only find one unique value at the lookup table for each row in the fact table. When we would have two countrynames for FR PowerPivot wouldn’t know what to put in the pivottable as the value belonging to FR. PowerPivot has a check to make sure you do this correctly, you will get a “The relationship cannot be created because each column contains duplicate values. Select at least one column that contains only unique values.” error when you try to create this relationship. We need to find a unique value in the lookup to table to relate to our fact table.
To identify this value, also know as the key, in our example we look at the columns from both tables , we quickly can identify the unique value:
As we can see both tables have calendaryear, Monthnumberofyear and salesterritorycountry. So when we want to lookup the target we can lookup the target belonging to the combination of these 3 values. We cannot make a relationship on the Calendaryear column only since this wouldn’t be unique. We have to create a unique column from the combination of the three values, we can do this by using the & operator, resulting in the following column:
YearMonthCountry=Fact[CalendarYear] & Fact[MonthNumberOfYear] & Fact[SalesTerritoryCountry]
We need to do this at both tables and then we can create a relationship between the two tables:
Now we are able to combine the two values in one report.
Powerpivot will automatically sum all the values from the two tables by looking up all the values in the lookup table belonging to the data from the fact table that is selected in this pivottable.
The last thing to keep in mind is when we would select a productname from our fact table in our pivottable we are unable to lookup a value from the target table since the product isn´t in the “target” table.