Cut the Size of Your Pivot Table Workbooks in Half

July 2, 2009 by datapig Leave a reply »

In honor of the upcoming 4th of July weekend, I've declared this week to be Pivot Table week. What do Pivot Tables have to do with Independence day? I'll let you know when I find out.

In the meantime, are you 'that guy' who keeps clogging everyone's email with 20 megabyte files.  I want to share a simple trick to cut the size of your Pivot Table workbooks in half.  

First a few words about the Pivot Cache:  When you initiate the creation of a pivot table report, Excel takes a snapshot of your dataset and stores it in a pivot Cache. A pivot Cache is nothing more than a special memory subsystem in which your data source is duplicated for quick access. That is to say, Excel literally makes a copy of your data, and then stores it in a cache that is attached to your workbook.

Each pivot table report you create from a separate data source will create its own pivot Cache that will increase your file size. The increase in file size depends on the size of the original data source that is being duplicated to create the pivot Cache.

Of course, the benefit you get from a pivot cache is optimization. Any changes you make to the pivot table report, such as rearranging fields, adding new fields, or hiding items, are made rapidly and with minimal overhead.

The down side of the pivot cache is that is basically doubles the size of your workbook. So every time you make a new pivot table from scratch, you essentially add to the file size of your workbook.

 

How to limit the size of your pivot table workbook:  
 
1. Copy and Paste instead of creating from scratch.  Sometimes you need to make multiple pivot tables from the same data source. Instead of making each pivot table from scratch (which adds to the file's size) copy and paste the pivot table. When you copy an existing pivot table and paste it, you essentially create a pivot table that reads from the same pivot cache. That is, you create a new pivot table without creating another memory container to add to the file size.

Note that pivot tables which share the same pivot cache will also share calculated fields, calculated items, and groupings.

 
2. Delete your source data tab.
I often see workbooks that have a pivot table tab and a 'Raw Data' tab. If your workbooks have both, you're wasting space, essentially distributing two copies of the same data.

You can delete your source data and your pivot table will function just fine.  After deleting the source data, saving will shrink the file. 

The only functionality you will lose is the ability to "Refresh" the pivot data, as the source data is not there. But DataPig, how do I get back to the raw data if I delete it?

Simple. Just double click the intersection of the row and column totals. This will create a new sheet with all the data currently contained inside the pivot cache.

 

3. Use a Duplicate Cache Finder.
For those of you who already have a workbook filled with pivot tables,  you can use a Duplicate Cache Finder to find and fix all duplicate caches.  Debra Dalgleish has a nifty workbook that will do just that:  PivotCacheFix.zip

On this 4th of July weekend, you should check out Debra's site www.contextures.com. She's Canadian, but she's smart like an American.

Advertisement

13 Responses

  1. Mauricio says:

    The link to PivotCacheFix.zip is broken.

  2. datapig says:

    Thanks Maurico. Corrected.

  3. E says:

    To reduce the file size I uncheck 'Save data with table layout' in PivotTable Options. If I want the underlying data to be available the next time I open the spreadsheet I click the 'Refresh on open' box in PivotTable Options.

  4. jeffrey weir says:

    Or you can reduce the size of your cache by orders of magnitude by modifying your 'get data' query so that it only actually gets the data you need for the specific task.

    For instance, if you're sucking every single line item from every single order for every single customer from a database to Excel, only to aggregate the data up to monthy totals across major product groups, then you've got MAJOR redundancy in your pivot cache.

    You should filter and aggregate at the database end wherever possible. An exception might be when you're examining data with fresh eyes, and you're just pulling in a whole heap of different fields to "see what you can see". This is what I've been doing at the place I'm currently contracted to, because I'm trying to find the best metrics and the best patterns from scratch. But even then you should go back later and filter out the 'boring bits' and just keep the stuff you actually need in the cache.

  5. datapig says:

    Great point Jeffrey.

  6. Tracy says:

    I'm a bit late reading this – but love the tip for deleting the source sheet. didn't realize we could do that. thanks!

  7. stotsky says:

    I've got 6 Sheets in the Excel file and each of them has pivot table. The sourse of data for each of these pivot tables is the same table in MC Access. Here comes the qestion, how can I avoid duplication??
    Thank You.
    PS Sorry for my poor englis, I'm from Russia

  8. AndyA says:

    Hi there
    I was directed to your site thru the Excel G forum, i have a a workbook with two pivot tables – however I cannot find th PT1 ( i know it exists as the wizard for PT2 says Pivot table to in thenoptions dialogue box. I tried the Pivot Cache Fix but it did not work for me – is there anyother way of deleting a pivot table and its cache?

  9. datapig says:

    Andy: The PivotTable name in the Options dialog box is not a true index number. You very well could have only one pivot table named PivotTable2. In other words, the name field has nothing to do with the number of PivotTables that actually exist.

  10. Doran says:

    Is there a way to clear the pivot table cache ?

    I would be interested in clear the pivot table cache before a pivot table refresh, that way, I would be assured that the data received would be in datasource order, especially if using 'none' as your pivot field sort order.

    I am trying to avoid using the sort option, as it sometimes interferes with the grouping option (causes it to sort in correctly).

    Any suggestions on clearing the pivot table cache?

    Thanks in advance.

  11. deadeye says:

    You may want to reconsider deleting the data tab after creating a pivot table in Excel 2003 if you have cells with more than 255 characters.

    Excel will truncate the contents of the cached version to 255 characters. I don't know if this is the case with newer versions of Excel.

  12. They notice changes in the market trends and recreate their internet marketing tools as well as website content periodically. They put emphasis on the core areas of their business that bring the most benefits. A lot many customers come in to buy when they know that they get the best value for their money. Internet business marketing is also about establishing individual contacts with your prospects as well as existing customers.

  13. Jenny says:

    Hi, thanks for the video tutorial.

    Is there an add-in for excel 2007? If so, where can I download this from?

Leave a Reply