Archive

Archive for the ‘PowerPivot’ Category

Tutorial: Optimize a Sample PowerPivot Model for Power View Reports

December 14th, 2011 No comments

Our documentation team released a great tutorial that explains all the ins and outs on what you need to do when you want to optimize your tabular model for Power View

The entire tutorial can be found here: http://social.technet.microsoft.com/wiki/contents/articles/6176.aspx

 

How to unpivot using Excel and load it into PowerPivot

December 8th, 2011 3 comments

I had an interesting question today where someone send me a workbook that contained a flattened table like this:

Now as a result we want to see the top 1 nr of products per country. The problem here is that we cannot just load this data into PowerPivot and start analyzing to get this result. In this case the columns need to become attributes in a dimension. This is a classical ETL operation that can be done by many tools like SQL and SSIS , but not by PowerPivot as a modeling tool.

So how do you fix this problem ? At first I used a Macro that we found online to unpivot the data, that did get me the result I wanted. But some time later I realized that there is a trick in Excel that I can use that I have blogged about before here, by combining this trick and another it will be pretty easy to unpivot.

In this blog post I will show you how to Unpivot a flattened table using PowePivot and Excel (Actually you would not even need PowerPivot if you don’t want to).

The first thing we do is create a regular excel pivottable out of the table I showed in the first image

Next we change “Count of Product x” to “Sum of Product  x” and change the pivottable into a classic pivottable using Pivottable options:

This will turn it into a column per level:

Next we move the values from Colum to Rows, this results in an unpivoted result.

Next we need to turn the measure “Sum of Product X” into a column. Copy the entire pivottable and past it in a new sheet keeping only the values

This will result in a flat table inside Excel. Now I select the “Sum of Product X” table and use search and replace to replace “Sum of ” with a empty space. I also renamed the product column header:

Now the last thing that I need to do is get rid of the blanks and replace them with the same values as the previous row. Here I can use another Excel trick:

Start by selecting all the columns we want to fill the blanks of, go to the home tab and select “find and select”, and click “Go To special”. Here we can select
blanks and click ok. This will result in all the blanks being selected.  Type =, press the up arrow and press Ctrl+Enter (that’s hold down Ctrl and press the Enter key). All the
empty cells will be filled with the value of one row above. Exactly what we  wanted.

Now we can choose two options, load the table into PowerPivot or just use the regular Excel pivottable. I prefer getting my data into PowerPivot so let’s do that. Select a cell inside the table and go to the PowerPivot ribbon and click Create linked table. This will automaticly create a Excel table (make sure to select my table has headers) and load the data into PowerPivot:

Now we can create a new pivottable based on this table:

As final step we need to determine the Top 1 product per country, again we are going to use an Excel function. Select a product in the pivottable and click on the filter button on row labels. Now we select value filters, click Top 10 and change the number of items from 10 to 1 and press Ok.

This will give us the result we want:

As you can a lot can be done with Excel, you just have to know your way around. While this might not be a surprise for most of the Excel pro’s out there, we BI Pro’s are eager to run things through SQL or SSIS but it can just as easily be done by Excel. Maybe now I will be Excel gansta enough to join Rob Collie’s elite Excel group :)

 

 

 

Deploy only changes to a Tabular model from BIDS and not process

October 21st, 2011 No comments

One of the question that regularly comes up is: “He I have created a tabular model in BIDS and deployed it to my test server, now I have changed a measure or a calculated column I deploy those changes and all of my tables are again loaded. I don’t want that ! I just want to deploy some metadata”.

Well actually that is exactly what we do, but the UI is a little deceiving.

When you deploy for the first time and any time after you get the same UI:

But under the covers what we do is:

  • Update the metadata of the database on the server
  • Send a Process Default command to the database
What happens on the second step at Process Default is that the engine is smart enough to discover what needs to be done to get the database in a fully processed state. If the table already contains data it will not update the database (it will NOT check if there is new data) otherwise it will process the table. At the end of the Process Default command the engine will send a Process Recalc to the database, this will make sure that all the calculated columns / relationships are made into a good state as well.
So in short, if you deploy your model for a second time it won’t reprocess the tables if you haven’t made any structural changed to the model like adding columns. Even though it looks like that in the refresh UI.
Hope this helps.

 

 

 

 

 

 

 

 

Getting up and running with SQL Server Denali for Business Intelligence (Crescent and PowerPivot in CTP3)

August 18th, 2011 No comments

I want to point to a blog post created by a colleague Pej Javaheri. This blog post will describehow to set up a complete machine with Crescent, Tabular project and PowerPivot for SharePoint:

check it out here.

 

PowerPivot DAX case: compare product sales by months since product introduction

August 14th, 2011 1 comment

A few weeks ago I got this interesting question from someone. He wants to compare the sales growth of his products in the months since the introduction of the product. Lets say “Product One” first selling date is 1/2/2010 and “Product Two’s” first selling date is 2/1/2010. If we want to compare the sales since introduction we want to compare month 0 between the two products. Month 0 for Product One is 1/2/2010, for Product Two it is 2/1/2010.

 

I created a table with some data to simulate this:

As you can see different products have different start dates. The first thing we need to do is enrich our table with a new column that will determine the amount of months between the date from the current row and the date of the first sale for this product. Is we have that month we can use this in our report to compare months, first month of sales for each product will be 0, the next month 1 and so forth. This makes comparisons very easy.

Let’s get started with getting the first date for each product. So what do we have to here ? For each sales transaction in the table we need to check all the other sales transaction from the same product to determine the first salesdate. We can do this with a Calculate column using the MINX function:

=MINX(
FILTER(Sales,Sales[Productname]=EARLIER(Sales[Productname]))
,Sales[SalesDate])

So what does this formula do ? Give me the MIN value of Sales[SalesDate] for all the rows in the Sales table where the Sales[Productname] is the same value as the Sales[Productname] from the current row.

The function to note here is EARLIER(). What does this do ? well FILTER(Sales) gives us a filter over the entire Sales table, we no longer are aware of the current row of the table we are in, the earlier goes to the previous row context, being the current row of the table we are running the formula of the calculate column in.

This results in the following value for each row:

As you can see we now have the first date for each product available. Now all we have to do is calculate the difference between the dates in months. Unfortunately DAX doesn’t have a datediff function so we have to do it the Excel way, I am using a technique described in a blog post from Vidas Matelis for it.

= (YEAR([SalesDate]) - YEAR([First sales date]) ) *12
+ (MONTH([SalesDate])- MONTH([First sales date]))

This results in the column that gives us the months for each sales transaction compared to the date of introduction

Now we can compare the sales from these products in a pivottable:

But this of course is not the end of it, we now only see the sales for each month. We would like to see the sales since product introduction to the current date.

This is pretty simple formula:

=if(HASONEVALUE(Sales[MonthSinceIntroduction]),  CALCULATE([Sum of Sales], Sales[MonthSinceIntroduction] <= VALUES(Sales[MonthSinceIntroduction])) , BLANK())

This formula will check if only one month is selected in the current context, and for this single month calculate the [Sum of Sales] where the MonthSinceIntroduction is smaller or equal to the current MonthSinceIntroduction value we are in on the column.

This will result in this pivottable:

 

But hey.. hold on..why don’t we see values for the months we don’t have sales ? The dax formula should work. I had to dig deep to find an answer and had to call in the troops :) (called Jeffrey Wang in this case).

This behavior is a caused by the MDX query that is being executed. Remember the data of the Pivottable is being queries by MDX. MDX will take care of retrieving the rows, column and filters, after it has retrieved the entire. Each DAX measure is being evaluated for each cell (or the cross section between rows, column and filters). One of the behaviors that MDX has  is that when there is no value in the table for a row / column combinations from the same table MDX Autoexists kicks in. Autoexists will make sure that if the combination between rows, columns and filters does not exists within the same table it will NOT execute the formula for that spot. Since MDX is designed for a Multidimensional model where for each fact table you have a separate dimension this made sense. For PowerPivot this makes a little less sense but still pretty simple to work around.

1 Use Crescent :) that will use DAX to create the report and does not have MDX autoexist.

2 create a second table with values 0 to 25 in it and link it to the original table and use those values in the report. That will solve the problem:

Observe I did not change the formula at all.

Now I can create a chart for it as well:

 

Everything You Always Wanted to Know About PowerPivot Data Refresh but Were Afraid to Ask

August 14th, 2011 No comments

A great new whitepaper by my colleague Mariano Teixeira Neto:

The purpose of PowerPivot for SharePoint is to allow users to share their PowerPivot workbooks in a secure and scalable environment. To ensure that you are sharing and collaborating on the most recent data available to you, PowerPivot for SharePoint provides a data refresh feature that lets you automatically update PowerPivot data in an Excel workbook. This white paper describes in detail the data refresh feature in PowerPivot for SharePoint. It starts by explaining the steps for setting up a data refresh schedule in SharePoint, and then it continues with an in-depth view on how data refresh works on a SharePoint farm. Both administrators and the business analysts who author and manage PowerPivot workbook data can benefit from learning more about setting up and using data refresh in a SharePoint environment.

Download the entire whitepaper here.

Categories: PowerPivot, SharePoint Tags:

PowerPivot DAX PARALLELPERIOD vs DATEADD

August 11th, 2011 5 comments

Something that might be interesting to know is that DATEADD and PARALLELPERIOD  look very similar but do behave differently. This blog posts looks at the difference between the two. Lets look at these two formula’s:

=CALCULATE([Sum of SalesAmount], DATEADD(DateTable[FullDateAlternateKey],-1,YEAR))
=CALCULATE([Sum of SalesAmount], PARALLELPERIOD(DateTable[FullDateAlternateKey],-1,YEAR))

They both look very similar right ? But what is the difference ? Lets look at the results:

The first and only thing you see here is the difference of 2004. Why ? To determine previous year DATEADD uses a range between the first date where you have values in the facttable to the last date where you have values in your facttable for the current year on rows, it will this range to go back one year. Whereas  PARALLELPERIOD uses the first and last date in your datetable for the current year to go back one year.

What ? Ok let’s look at it in a table for DATEADD :

I added two measures that will show you what the first and last date are that being used to go back one year. For 2004 you can see that it uses the startdate 1/1/2004 as this was the first date a sale was made in that year. But as last year we see 8/31/2004 because this was the last date a sale was done in that year (the year is not finished yet). If you now go back one year it will calculate the sum of SalesAmount for 1/1/2003 to 8/31/2003.

Now look at the first and lastdates PARALLELPERIOD uses. As you can see the it determines the current year, looks for the first and last day in the year using the date table   and use that to move back one year. Going back one year it will calculate the sum of SalesAmount for 1/1/2003 to 12/31/2003. Giving a larger result.

 

Ok looks like PARALLELPERIOD is the way to go right ? Well it depends on what you want to do. PARALLELPERIOD really goes back one entire year (based on what you select in the parameter) using the first and last date. DateAdd uses the first and last date in the current selection. What does this mean ? We can best illustrate this by adding months into the mix:

What makes these two measures return different results here is that both use a different mechanism to determine the current period. Where PARALLELPERIOD always goes back one entire year (or whatever parameter you select in the measure), DATEADD will look at the current row selection and go back one year for that selection. In this final image for the row with month 04/2004 DATEADD will show you the sum of SalesAmount for 04/2003 where PARALLELPERIOD will go back one entire year.

Categories: DAX, PowerPivot Tags: ,

PowerPivot DAX: Find and Search in Denali

August 10th, 2011 1 comment

In Denali we made a small (but significant) change to the Find and Search syntax. By adding a new optional argument “NotFoundValue” you can tell DAX to return a value instead of returning an Error when the value is not found. In 2008 R2 DAX would always return a Error.

 

What is the benefit ? When you use the find or search function in a calculated column and it will return an Error for one row it will return an Error for all the rows. To solve this you have to wrap the Find or Search formula with a =iferror() to actually be able to work with this formula. Another benefit is that you don’t have to work with errors inside the function, this can degrade performance.

Categories: DAX, PowerPivot Tags: ,