PowerPivot Inside Out (Part 2/3)

This is the second article of three. If you hadn’t read the first part yet, it is recommended that you do so. It talked about the basics of PowerPivot and already covered the first three of the ten things you need to know. So we will continue with number four…

4. Sorting values of one column by values in another

In Excel this is more or less science fiction. But in PowerPivot this works beautifully. So imagine this, you have a column with descriptions that go something like “beginner”, “intermediate” and “master”. When you put these in a pivot table, you actually get a right order, but only since that is the alphabetical order of things. But what if our descriptions were “Rookie”, “Seems to know his way around” and “Almost god”. Now if you were to insert these descriptions into a pivot table, the order of things would be anything but natural. This is a very rudimentary example of things, but it can go much further. In PowerPivot, time intelligence is a big thing. You can read more about it in the sixth section entitled Time Intelligence. But in order for many of the brilliant time intelligence functions you can use to work you have to use what’s called a Calendar Table. It’s a simple table with continuous dates. So before long you will have to deal with dates in PowerPivot. When it comes to dates, Pivot tables created from a Power Pivot data model do not act as one would expect. For Example if you add dates to the Rows and right click on those dates…

…there is no ability to Group or Ungroup them. So if you want to get data by months, you have to get month names to a separate column. Nothing wrong with that, you either use Text or Format, but when you add those to a Pivot table, you get something like this

So the months are not in the logical order. The alphabetical order of the months is no surprise, but it is a hassle. Luckily we have a way to solve this. All you need is a column in the PowerPivot table with numbers that correspond to the correct sorting order. With months this is a simple MONTH function.

Now here it is, in PowerPivot you have a Sort By Column Command. And here you can say

Sort the Month Names as you would Month Numbers. And right away we get

A great article about sorting months chronologically here.

5. Disconnected tables

Usually in a PowerPivot Data Model, you would see something like this…

…so a Fact Table and Dimensions as connected tables. This is a very simple star schema as opposed to a far more complex snowflake schema where the dimensions have their own connected tables and so on. These connections are part of the magic behind PowerPivot. But sometimes you want to have a table that is not connected to any other table. We call this tables disconnected tables. They may sometimes do havoc, as they almost inevitably lead to the “Relationships between tables may be needed” warnings, but they can be a great asset.

Here’s an example. Our data will be as simple as this

Now based on this data, we have a following analysis.

So SUM Of Sales per Year per QTR. This is a very simple measure

[SumOfSales]
=SUM(Table1[Sales]) .

But now we want to take this a step further, we want to calculate a SumOfSales for our top 10 sales in that period. Again a fairly simple measure Sales – TOP 10…

Now here comes the tricky part, what if we want to change that 10 dynamically. Top 3, 20, 50… Here we will create a new table with the values we want to use for the N.

We add this table to our data model…

And since we will not add a connection, this is now a disconnected table in our Data Model.

Now in the Pivot table we add a Slicer based on the TOPN field of the disconnected table.

Next step is to write a measure, which will return the selected value. This can be a simple MAX like so…

With this we can modify the TOP10 measure we wrote earlier, to include this selection…

With this measure, we can now use the slicer to modify the N counter and get a SumOfTopNSales. On the image below is the Sum of top 12 sales.

It’s ingenuous.

6. Time Intelligence

As mentioned in section 4, Time Intelligence is a big thing in PowerPivot and DAX. It can do wonders for you, but one condition has to be met, for you to be able to use Time Intelligence. You must have a Calendar Table! 99% of Data Models include a Calendar table. It’s a table of sequential dates (none can be missing!). Now you connect this Calendar Dimension to your Fact Table Date field and you are set to go! Now you can use functions like TOTALYTD, TOTALMTD, TOTALQTD…

As you can see, TOTALYTD expects as a second argument something called Dates. This is the Date field of your Calendar Table. As you can see it also offers you a [YearEndDate] option, where you can specify something different from 12/31/ and get YTD for a July – June year and so on.

Another great function is a DateAdd where you can say do a sum of sales, but go back 8 months or back 4 years and so on.

The 8 months back would be [Sales8MonthsBack] = CALCULATE([SUMOFSALES],DATEADD(Calendar[Date],-8,MONTH)) . This is a perfect FILTER function to use in your CALCULATE, but I’m getting ahead of myself. The goal here is to let you know that Time Intelligence is a big thing and deserves to be among the 10 things you need to know and use if you want to take your PowerPivot reports to the next level.

Now we are three steps closer to eternal happiness 🙂

Advertisements

PowerPivot Inside Out (Part 1/3)

10 things you need to know

The following article is kind of a baseline knowledge, one should possess, before attempting the creation of their first PowerPivot workbook in Excel. I always believed that understanding is key, and therefore this article is written to deepen your understanding of how PowerPivot Add-Inn for Excel works and what’s actually happening under the hood. In the process of reading this article you will also be given many reasons for using PowerPivot in Excel. Although I borrowed the title from Microsoft books, this will be just as Informative but much shorter J.

  1. First thing’s first. What is PowerPivot?

It’s an Add-In for Excel 2010 and 2013, and it brings SQL Server Analysis Services (SSAS) to Excel, but in a way that is new to Excel users as it is for T-SQL Professionals. The revolution does not stop there. It must also be said that it stores data far more efficiently than Excel does. More about that in paragraph 3. Comparing the amount of data that can be stored in PowerPivot vs. the amount in Excel goes something like 999.999.999 vs. 1.048.576. That’s not even the limit, it’s just the most I’ve seen… But rather than portraying this as a choice of one or the other, the fact is one cannot exist without the other. If you want to do amazing things in Excel, sooner or later you will have to use PowerPivot. So to answer the initial question, Power Pivot is an Add-Inn for Excel, which allows you to store large amount of data in a compact form as a metadata of an Excel file and then analyze that data with Pivot Tables, Power View sheets or PowerMap projects.

Just as an extra point I would like to point out that Power Pivot V1, V2 and Power Pivot for Excel 2013 are not interchangeable. If you have Power Pivot data in Excel 2013 you cannot open that data in Excel 2010. In the opposite case, it does offer to upgrade your data from 2010 to 2013 however. But once you’ve done so, you cannot open it in 2010!

2. What are the space and RAM requirements

Here is a basic comparison of a file size when data is in an Excel sheet, (just for fun I added a difference between formatting your data as a Table rather than just leaving it as a List) or in a Power Pivot metadata of an Excel file. As an added value, there is a significant difference in the consumption of RAM as can be seen in the table and chart.

But leaving this comparison aside for a moment, let us explain how we can save space with the import of data into PowerPivot. The thing to understand is that when you import a table from any source into PowerPivot, it’s not the number of rows that should concern you, it’s the number of columns. The reason why this is so, is the way that PowerPivot indexing works. Now if you have a column called Company in a table that has hundreds of millions of rows but there are only thousand unique companies, than this column requires almost no space on disk and RAM. But if you have a field with many unique values (like ID) than that will take up a lot of resources. You can see the difference in section 3, but let us just conclude by saying this:

You can gain most (performance vise) at importing data, by only importing those columns(!) that you really need for your data model and analysis.

3. So where and how is the data stored?

As you know all Office files from 2007 on are basically a zip folder. So if you take your Excel file called test.xlsx and rename it to test.zip, you can browse through that file.

In the xl folder you will find a folder called model. Now in that folder you can find a file that has a .data extension. This is actually an offline OLAP cube that allows you to work with your data even if you cannot connect to a data source (being SQL server, Access file or an Excel file).

But when you open an Excel file that has PowerPivot data, first it will load as if no PowerPivot id present. The Pivot tables will be present and visible, but no Data Model will be loaded. That is until you either refresh a Pivot Table, change a filter or slicer value in a Pivot Table or open a PowerPivot window. Now that’s when you can see the RAM consumption go up, but there is a way that you can even see the memory consumption for each individual column or field. Here is how. You go to C:\Users\[USER NAME]\AppData\Local\Temp. In the Temp folder you will find at least one folder called VertiPaq_* where a star represents the following 20 symbols. VertiPaq has changed is identity as xVelocity or Officially as “xVelocity in-memory analytics engine (VertiPaq)” with SQL Server 2012. But Essentially the size of this VetiPaq folder (you can guess the right one by the creation date) is very close to the RAM consumption of that Data Model. But it gets even better. Inside the VertiPaq folder there is a folder with a .db at the end of it’s name. In this folder you will find folders representing the tables in your data model. Luckily here you can see the name of the table at the beginning of the folder name. And in this folder you have individual files representing the size of each column.

And it’s clear to see that most space and RAM consuming column is Date, which is to be expected as it has most unique values.

For the cherry on the cake, you can resolve many problems in PowerPivot by deleting the VertiPaq folders (when the file is not running of course).

Don’t forget, this is only part one of thid In-depth dive into PowerPivot. Two more parts are going to follow soon.

 

Does a PowerPivot Pivot Table beat a regular Pivot Table

Hands down, YES!

Before you read this, you should understand the difference between Power Pivot Add-In and a Pivot Table in Excel. You can read a short but very down to facts post on the two in this Pivot Table vs Power Pivot post

If you are new to the PowerPivot Add-In, or if you just heard it being mentioned in passing, what you should know is that PowerPivot is a game changer. Before Excel 2010, Microsoft was gently laying down the groundwork. Being in a strictness of “You should use Tables, and you can’t go wrong” (and you should) or being that they slowly brought the file format to the xlsx with the XML and ZIP and meta data… Everybody noticed the visual change from menus to the ribbon, but underneath something far greater was about to emerge.

It all started with Excel 2010 and the PowerPivot Add-In. Up to that point, Microsoft knew that Excel is the application that no company can go without, but it had serious limitations. The 216 or 220 of rows ( 65536 and 10480576) which might sound impressive is nothing when talking about Big Data or even smaller Databases. Even worse, those are the theoretical limitations of Excel whereas the practical ones are far smaller. With a few 100000 rows filled, the file becomes too slow and Excel can no longer do Automatic calculations. Knowing this, Microsoft was also aware that there were people doing serious magic in their SQL and SharePoint servers but there was a serious gap between MDX (MultiDimensional eXpressions) and OLAP (OnLine Analytical Processing) constructions and knowledge of Excel that normal Excel users possessed. They had no means to analyze the data without the help of a SQL and SharePoint masters. And let’s face it one data source per Pivot Table is a big stretch for the best analytical engine out there.

With Power Pivot (not a mistake, it should be Power SPACE Pivot now (I guess it’s something to do with registration)) they added a so called DAX (Data Analysis Expressions) language and leveled the playing field. It had some functions that Excel users were already familiar with and some Time intelligence functions that people writing SQL queries already knew. But for both it was a learning Experience. But what it did is it brought the two worlds together.

Just a short mention, in Excel 2010 and especially Excel 2013 the word Power is the key. Now you have Power Pivot, Power Query, Power View and Power Map who combined give an Excel 2013 user the analytical power never seen before. And also worth mentioning, PowerView or PowerMap cannot function without Power Pivot. Data has to be stored in a Data Model that is PowerPivot.

Ok, but this article is not about Power Pivot and DAX, it’s about Pivot Tables, which can now come from a simple Table in Excel or from a Power Pivot Data Model. In Excel 2010 you can actually see the difference in the Pivot Table field list because it adds the Slicers Vertical and Slicers Horizontal fields.

In Excel 2013 they were removed and the difference cannot be seen this way.

So if the difference cannot be seen, is there a difference?

Yes! There are two main differences just in Excel.

Data Sets

One of the best things that you can do when your data comes from an outside, SQL or better yet OLAP source. So why does it work with Power Pivot? Well Power Pivot is actually just and offline OLAP for Excel. If you change the .xlsx to .zip and browse that zip, you will find a .data file. That is your Power Pivot data Model with all the tables and so on. But let us look at what Data Sets are used for. Given the following sample…

Let’s say it’s the 3rd September 2012. In our Pivot we have two Years’ worth of Actual and Forecast data. But for 2011, we don’t need the Forecast, but for 2012 we need both, the Actual and the Forcast. Now if we were to turn of Forecast at a filter level,

then that would turn of Forecast for the entire Pivot Table.

But that is not the desired result. Whereas we would like to hide the Forecast for 2011, we would still like to have it visible for 2012. The word “hide” slipped out in the previous sentence, but as it’s now out there, I would like to point out, that hiding it is not the path to eternal happiness. Especially if we imagined we have more Pivot Tables where we would have to repeat the same process.

Now the screenshots above were taken with a Pivot Table created from a Table in Excel. From here on, we will be looking at the same Pivot Table, but this time created from Power Pivot data (same table but inserted into Power Pivot as a linked table). Here’s where things get really interesting.

First we will go to ANALYZE (Excel 2013) or Options (Excel 2010), then Fields, Items, & Sets and then choose Create Set Based on Column Items.

What you will get is the following window

Here you can give your set a name, and then basically say which columns should be visible and which not. You can select the 2011 Forecast Row and choose the Delete Row button on top. Now since you can remove whichever row you like, you therefor remove that column from the analysis. My final view of this window was

Now the moment you press OK you get

Which is good but the best thing is in the Pivot Table Field List you get

And this Sets folder will now be visible and more importantly usable in all Pivot Tables! So the same Set of Columns I another Pivot Table is just a click away. Very useful.

OLAP Tools

Now Sets were great but here is something even greater – OLAP Tools. A very important thing to understand is that Power Pivot works with all the data in RAM. But first it must pull all the data from the source. And that’s exactly what it does. If you create a Power Pivot project where you insert data from Access or SQL Server or MySQL etc. that data is stored in the file! So if you lose connectivity or you want to display your analysis somewhere where you can’t get to your SQL Server, it will work without a hitch. So all this data is in a .data file which is in the Excel files Meta Data. But that .data file is nothing else that an Offline OLAP Cube. So if you’re working on a Power Pivot data, you’re working on an OLAP Cube. And therefore you get OLAP tools. And amongst those tools, there is one unparticular, I would like to focus on. And that is Convert to Formulas.

Now many times we wanted to insert a column of our own calculations into the Pivot Table. But that cannot be done. Well it cannot be done in a plain Excel Pivot Table but it can be done in a Power Pivot Pivot Table. In regular Pivot you would select GETPIVOTDATA and get the entire Pivot on another sheet and there you would insert your columns. But now it can be done right there. All you have to do is to choose Convert a Pivot Table to formulas and you will no longer have a Pivot Table per say. So you cannot choose the fields anymore, but guess what all the calculations are still active and fresh. So refreshing your Power Pivot data, refreshes these cells, but now that’s exactly what they are, just plain old cells. You can insert your own columns and do whatever calculations you want.

And even the Filters from the Pivot Table are still there, still active and still functional! Such power can be derived from the use of this functionality.

As you can see, there is far more to Power Pivot than meets the eye. It’s not just that it gives you the power to work on more than 1048576 rows of data and on multiple data sources at once, it even makes the good old pivot in excel that much better. So to finish of exactly where I started. Does a Power Pivot Pivot Table beat a regular Pivot Table. Once again, hands down, YES!

Dynamic ranges in Excel

For the purposes of this article (if you wish to follow long and I strongly suggest you do so) we will be using the attached workbook that has the A2:E10 Range filled with the following random data.

In real life Ranges like this are very rare and very unkind to the user, since no column has unique format of data used (numbers and text are present in every column). Our goal is to set up a simple SUM function to add up the values in Column A, but to do this in such a way that every addition of data in that column will be reflected instantly in our SUM.

To create a so called dynamic range in Excel we must give that range a name. Usually we would do this by selecting the cells that make up the range and writing the name in the Name Box. Doing this in such a way would make our SUM function look better but it gives us no dynamics. So, to make the range dynamic we will add the name manually by using the Formulas/Define Name command.

This is the command we will be using. And let’s say that to start with we will only name the first column of data. So selecting A2 (since it’s where our range will begin) we fill in the following in the dialog box.

In the Name Field we put COLUMN. And in the Refers To we put

=OFFSET(Sheet1!$A$2,0,0,COUNTA($A:$A),1).

Let’s explain the function above.

The OFFSET function returns the reference to a Cell or a Range, that is for a certain number of rows and columns away from the current Cell or a Range. Where it can be used for a simple reference to a different range, it can also be used to create Ranges. Now for the purpose of distance from the cell we use the Rows and Columns arguments but for the purpose of creating a range, we give the Height and Width arguments.

Syntax: =OFFSET(range, rows, columns, height, width)

range the starting Cell or Range

rows the number of Rows to offset by. Can be both a positive or a negative integer.

columns the number of Columns to Offset By. As above, can be positive or negative.

height the number of Rows we wish the Range to contain

width number of columns the new Range should contain

Example:

=OFFSET(A1,1,2,1,1) gives C2

=OFFSET(C3,-1,-2,1,1) gives A2

This was more or less only the use of the Rows and Columns arguments of the OFFSET function. The Height and Width arguments were only used for the purpose of defining that we want the Range to be a single cell.

But in our example, the function was different. It went as follows.

=OFFSET(
no suprises so far Sheet1!$A$2
is the original Cell or Range, it can also be $A$2 or simply A2. Then we put ,0,0,
the commas are meant to separate the two arguments which are Rows and Columns and in this case both are 0, mean that the first cell remains in place, so no offset. The next part is a bit of magic on our part, it states COUNTA($A:$A). What this does, is it tells us how many rows is in the Range that the name COLUMN describes. Since we want this argument to actually count the rows used by the range, we could simply put the COUNT function there, but the problem would be, that the COUNT function only counts numbers where as our range contains both numbers and text. Therefor COUNTA is required for the counting of the cells that contain data. So it just counts how many cells have data, no mather of the data format. There is another trick we used in this function, that is to give the range as $A:$A. This helps us in not limiting ourselves right out of the starting blocks and will work for as many cells as it will take throughout column A.

After that we have ,1) . Nothing magical here, it just states that our Range is one Column in width. Followed be the parenthesis.

Now we can write our function

=SUM(COLUMN)

This adds up all the values in column A. If we were to add a new value below the last value, it would automatically expand our COLUMN range by that cell and automatically add that value to our SUM. Even if the value we were to add was a string, the range would have expanded since the COUNTA function counts cells that are not blank regardless of whether the cell content is value or a string. Needless to say the SUM would not change.

But there is a catch with using this. Let’s say our COLUMN range is now A2:A10. What if we were to leave a blank cell (A11) between the last value and the value we would be adding (A12). The range would still expand by one row (A2:A11), but since we left one blank cell in-between our previous data and the new value, that blank cell now becomes part of the range. What that means basically is that our new value is not included in the COLUMN range and therefor in our SUM. In this case the COUNTA function is not strong enough for what we want to do. Let’s redefine our named range using the following instead of COUNTA…

=MATCH(9,99999999999999E+307,$A:$A,1) when we wish for the range to automatically expand to the last cell with a value or

=MATCH(REPT(“z”,255),$A:$A,1) when we wish for it to expand to the final string value. This is paramount to the dynamic Data Validation lists.

In both cases the first values are exaggerated but they are the largest Value and the “largest” String that you can put into a cell and therefor will work in every situation.

Here’s how this works. The MATCH function will tell us where (row number or better yet a consecutive cell in a range) the last value that is still smaller than our 9,9E+307 lies. Now the range will automatically expand to that last cell and include it in the SUM or a Data Validation list. Pure Brilliance.

Taking the Data Validation Dropdown list to the next level

Creating a dependent or linked Dropdown list

In this article I will show you how to create dependent Dropdown lists using Data Validation and an Excel function called Indirect. First let’s take a look at what we are trying to accomplish. We will start with the following workbook and data…

Now we wish to have two dropdown menus, one in cell C3 and one in cell C5. But, and here’s the catch, the dropdown in C5 has to be relative to whatever is choosen from the dropdown in C3. So if for instance someone picks France in C3, the dropdown menu in C5 has to give Paris, Lyon and Marseille, but if Slovenia is choosen in C3, Ljubljana, Maribor and Koper are in a dropdown in C5.

To achieve this we will use Data Validation with two additions. The first being the INDIRECT function and the second is using the named ranges.

Step 1: Dropdown list with Data Validation

Just for clarification I will explain how to create a dropdown list in cell C3 using basic Data Validation. So standing on C3 you go to Data/Data Validation

In the dialog box you get you choose List under Allow: and point to the cells containing the names of the countries in the Source field.

So now you get a dropdown list in cell C3 where you can choose any of the countries.

It’s very important to understand that with using Data Validation to create the dropdown, and accepting all defaults under Error Alert you have also effectively limited the input into that table to only the values from the dropdown.

Step 2: Creating the named ranges

Before we can define the dropdown in C5, we must create a few named ranges so that Indirect will point to the right range.

We want to name the range where the French Cities are France and so on. We will use Excel to do this for us. We go to Formulas/Create From Selection and choose Left Column. So create the named ranges based on the values in the first column of the data I selected.

Doing this gives us six named ranges which we can see at the dropdown in the Name Box.

Pay extra attention to the United Kingdom name which was altered in way. The name reads United_Kingdom. The reason the underscore appeared is that names in Excel cannot include spaces. Therefor United Kingdom is not a valid name for Excel and it was changed. This will have dyer consequences for our sample but it will be a learning experience. So now that we have the named ranges we need, we can go to the next step but since the next step will include the Indirect function we should first familiarize ourselves with that function.

INDIRECT function

Simplest sample is this. In A1 we have 100, in A2 we have A1. In A1 we write =INDIRECT(A2) and we get 100. So the cell that we give to the INDIRECT function actually tells the function where it should be looking. In effect, cell A2 told the INDIRECT function to go look into A1 and there it read the 100 it returned. Now we can really go to Step 3…

Step 3: creating a dependent dropdown

Same as in the above example, while standing on C5 we go to Data/Data Validation and say Allow: list. Now here the magic. Under Source: we write =INDIRECT($C$3) so we say to excel go and see what is written in C3. We already know it’s going to be a name of a Country. But with indirect we say go look at that name and see if there are any named ranges under that name. And INDIRECT tells the Data Validation which cells to take as a source and is effectively dependent on what is chosen in C3.

Now this works perfectly for all the countries except for UK, since cell C3 says United Kingdom, the named range for UK cities is under United_Kingdom (for reasons we discussed earlier) the INDIRECT function cannot work and produce a range for the dropdown.

See the attached sample Workbook

In my next article I will be discussing the Named ranges or should I say Dynamic Named Ranges which can take this dropdown a level higher since it would be dynamic and you can add Cities and they dynamically appear in the Dropdown list.

How do I “separate” Pivot Tables

During the creation of Office 2007, 2010 and 2013, a great emphasis was given to the file size. Of course you would like to make the file size as small as possible and for this purpose even the creation or better yet the behavior of PivotTables has changed since Excel 2003. The PivotTable has always created something called Data Cache based on the data provided. The need for indexing and fast creation of analysis has forced it to work in such a manner. In Excel 2003 each PivotTable had its own Data Cache but now the PivotTable that is created using the same Data Model or Data Source as another PivotTable also borrows that Pivot Tables Data Cache. Therefore, for each new PivotTable analysis that uses the “same” data, Excel saves hard dish space since it does not create its own Data Cache but rather uses the same one as previous PivotTables created on the same Data Model.

While this solution is obviously a great way to save space on the computer, this method also has two quite severe consequences for your PivotTables.

1.
Refreshing one individual PivotTable consequently refreshes all PivotTables that are based on the same data, which can be a great thing but you can easily imagine some cases where this would not be such a good thing.

2.
The grouping of records within a single field (for example, a Date field that you combine by months or quarters) now cannot be done on an individual PivotTable but immediately effects all the other PivotTables that are related to the same Data Model. If you used this field in another PivotTable, it reflects this grouping instantly. So you’ve lost the ability to group for example a Date field by months in one PivotTable and by Quarters in another.

There are even more ways how sharing a Data Cache effects our analysis but both listed above are reason enough for the need of a “separation” to arise. In fact, In this article we will discuss three different ways how to do this. First way is linked to the creation of the new PivotTable report and tells us how to create a PivotTable in such a way that it already has its own Data Cache and does not share one with the existing PivotTables. Afterwards I will give you two methods on how to “separate” PivotTables that have already been created…

Method 1 (creating a separate PivotTable report)

If you want to create a new PivotTable so that its Data Cache is separate from the other PivotTables you might have, then you must create it in a particular way, or better yet with a special command called PivotTable and PivotChart Wizard. This command has been part of Excel for what seems an eternity, however since by default it is not on any of the ribbons, you have to add it to the Quick Access Toolbar. To do this, you go to File (file)/Options (options), and then Quick Access Toolbar. Above choose Commands Not in the Ribbon

On the left side, find the PivotTable and PivotChart Wizard and with the Add button add the commands to the Quick Access Toolbar.

Afterwards we click in our data and run the command from the Quick Access Toolbar. This will sent us on a very familiar path

The first step is more or less self-explanatory, but the second one is very important, since it’s the step where we mark the area with the data for the the analysis.

As soon as you say Next, get the following (important!) notice

If you select Yes, then the PivotTable will be calculated on the same Data Cache as preexisting PivotTables and it will suffer from all the symptoms described above. If you select No, then you will create a new Data Cache for this PivotTable and therefor it will be separate from the preexisting PivotTables!

Method 2 (manual creation of a separate Data Cache for preexisting PivotTables)

The method is quite simple. Select the PivotTable that you would like to “branch off” and cut it from the workbook and paste it into a new one. Then you only have to copy it back. Sometimes this is enough, but sometimes you have to close the first workbook and save the new workbook, and then open it again and copy the PivotTable back into it.

Method 3 (“separation” of already created PivotTables with the help of VBA code)

The following VBA code does the trick for all the PivotTables in your Workbook.

Sub DataCache()

Dim PivTbl As PivotTable
Dim ws As Worksheet
Dim wsTemp As Worksheet
Dim pt As PivotTable

For Each ws In ActiveWorkbook.Worksheets

For Each PivTbl In ws.PivotTables

c = PivTbl.RowRange.Column
r = PivTbl.RowRange.Row
ws.Activate
Cells(r, c).Select
Set wsTemp = Worksheets.Add
ActiveWorkbook.PivotCaches.Create(SourceType:=xlDatabase, SourceData:=PivTbl.SourceData).CreatePivotTable TableDestination:=wsTemp.Range(“A3”), TableName:=”PivotTableTemp”
PivTbl.CacheIndex = wsTemp.PivotTables(1).CacheIndex
Application.DisplayAlerts = False
wsTemp.Delete
Application.DisplayAlerts = True
Set PivTbl = Nothing
Next PivTbl

Next ws

End Sub

 

Getting rid of old Row and Column Labels from the Pivot Table

Pivot tables still remains the go to solution in Excel for Data Analysis. This article will address the following issue with the Pivot Table command that enables you to still see the leftover entries that no longer exist in the data table. We will also talk about how to remove them…

Here is an Example of what I’m talking about. We have the following data table in Excel.

Notice that in the Month column, there are Four Months present, December, January, February and March. Now based on this data, we create a Pivot table where we calculate the Average number per Month.

Nothing out of the ordinary there. Now let’s change the data a bit. What we will do is delete the December data, as we no longer need it, and add the data for April. So now the data table we are working on looks like this

Now after refreshing the Pivot Table, we get something like this

Now that seems perfectly fine, but let us see what we get in the dropdown menu at Row Labels or Month…

And there it is, although our data no longer has any Rows of Data belonging to December, the December is still part of the Month field as you can see. The record was not deleted from the data model. Now this has some great uses as with the GetPivotData function etc. But for this article we will discuss how to get rid of the discarded data in two ways. Manually and using a VBA code or a Macro.

Getting rid of old Row and Column Labels from the Pivot Table manually

You place yourself in the PivotTable and either Right Click and select PivotTable Options or go to the Analyze (Excel 2013) or Options (Excel 2007 and 2010) Tab. In the PivotTable Options dialog box you place yourself on the Data tab.

The command we are looking for is Number of items to retain per field. The value is set to Automatic by Default and that means that Excel will decide how many items to retain. What we need to do is to change that to None and then Refresh the PivotTable.

After doing so, you can clearly see that December has disappeared from the Month field.

Getting rid of old Row and Column Labels from the PivotTable by VBA

If you have more Pivot Tables and you want to do this faster than the option above, you can use the following VBA code…

Sub DeleteOldPivotData()

Dim PivTbl As PivotTable

Dim ws As Worksheet

For Each ws In ActiveWorkbook.Worksheets

For Each PivTbl In ws.PivotTables

PivTbl.PivotCache.MissingItemsLimit = xlMissingItemsNone

PivTbl.PivotCache.Refresh

Next PivTbl

Next ws

End Sub

Sorting months chronologically and not alphabetically in a Pivot Table report based on Power Pivot data

THIS POST IS OUT OF DATE! You can read the upgraded and up to date version of this article here

Here is our problem. When you create a Pivot Table in Excel that has a date field, you can Group that field by month and the sort will be logical (January, February, …). But when you create a Pivot Table based on Power Pivot, the grouping does not work! So you have to get to the month names and a correct sorting by using a different path. We do this by using the Format function in PowerPivot, but the problem is that when you put this field in a Pivot Table, it gets sorted alphabetically. This is logical since the values are text and have nothing to do with dates as far as that Pivot Table is concerned, but this is a problem since the months are not sorted chronologically. This article will tell you how to achieve that.

Let’s start with a simple table in Excel that has only two columns. One has Date values and the other has a number of visitors on that date. Now we would like to create a Pivot Table report to see how the number of visitors is spread through the months.

Case 1: Pivot Table report based on an Excel Table

First we create a Pivot Table based on an Excel Table


The Pivot Table will show the number of visitors by months. But to do this, since we only have Dates, we have to do Grouping by months on the Dates


And right away we get the desired result.


Case 2: Pivot Table report based on Power Pivot data.

First we add our Table data to Power Pivot the easiest way – by using the Add to Data Model command on the PowerPivot tab.


Now that we have the data in the Power Pivot we can create a Pivot Table report from Power Pivot window. But when we create a Pivot Table and want to see the analysis by months we see we just can’t select the Group command. It is grayed out…


So to get to months we use a different trick, we go back to the Power Pivot window and create a calculated column using a Format function. This is a PowerPivot rendition of the Text function from Excel. The syntax is the same, just the names differ.


Excel Version


Power Pivot version.

Using the Format function we now get the month names and a new field to create a Pivot Report by. But when we create it, it looks quite disappointing.


So the numbers are OK, but the sorting is alphabetical and not the kind we want. To get the sorting right, we have to go back to the PowerPivot window and create a new Calculated column using the Month function. This way we now get a month number along each date and month name.


Now just adding that to the Pivot Table report would get rid of our problem, but let’s not forget that we want the month names as they were, only the sorting is wrong. But now we have all we need.


In the Power Pivot window, we select a value in the month name column and then select a Sort by Column command on the home tab and hey, look at that. You can now say that the Month name column will be sorted by Month No. column.


Doing that has changed our Pivot Report instantly!


And we are one step closer to eternal happiness

MS EXCEL MONTH function madness

I guess we are all familiar with the MONTH function in Excel. If you give it a date it will return the month number. So far so good. Now here’s the strange thing…

If for example cell A1 contains a name of a month (let’s say November) and you want to get the month number out of it. Now some might say VLOOKUP and they would be spot on but here what you can also do. You can say =MONTH(“1″&A1) and you get 11. Wow!


Here’s why this works. 1November is recognized as a date so the MONTH function still recieves what it was expecting, but nobody really knows this 🙂 If anybody has more of this kind of madness in Excel to report, I would be very greatfull

Increase your efficiency in Excel with these mouse based Excel shortcuts

These days when talking about efficiency in any Office application, particularly in Excel, keyboard shortcuts are a must. But since we can’t really be efficient without the mouse, let’s see what are the hidden, but great uses of the mouse in Excel.

Paste as Value

Let’s start with one of my favorites. If you’ve opened Excel more than once, you have absolutely felt the need to copy some data and paste it as a Value. Where you only need the results of a formula but not the formula itself. And I guess we all went about it the same way. Right click and Copy or Ctrl + C and then Right Click and say Paste Special and I guess you all know the drill. But here’s another way to go about doing this.

You select the Cell or a Range of Cells that you want to change to values. Then you place the mouse cursor anywhere on the edge of the selected Range but the bottom right corner. So anywhere but there! Now you right click and hold that click and then just shift focus, so swing the Range to the right (left, top, bottom) and back while holding the right click all this time. When the Range is where you want to Paste it, you simply release the Right Click. You get the following menu…

You select Copy Here as Values Only and here comes the eternal happiness we all seek J

Fill Series

When faced with creating a series in Excel, we all turn to the Home tab and look for the Fill command. But here’s how the pros do it…

Same as when using the Fill Series command, you write the first value in your series in a Cell and then place yourself on that cell. Next step is to place the mouse cursor in the bottom right corner of the cell’s border (so the exact opposite of the above example) and then Right Click, hold the click and again just shift focus (swing the Cell to the right (left, top, bottom) and back while holding the right click all this time). Now you get the following menu…

Some brilliant options there but the one we need is the bottom one Series…

Get around get around i get around

Borrowed the title from the Beach boys, but in Excel it is vital that you can Get Around J The Ctrl + Arrows combinations are a given when maneuvering a large table. But the mouse can also get you there. It’s actually quite simple. You place your self in a cell and if you want to go to the bottom of data in that column, all you have to do is place a mouse cursor on the bottom border of that cell, and then you just double click. If you placed yourself on the Right border, you would go to the last piece of data in that row and so on… As simple as that!