The Huey Louie Dewey Error Message Needs an Update

August 15th, 2014 by datapig 1 comment »

Every so often, when working in Access, I get the ever delightful Huey Louie Dewey error message.

Never heard of it? Well, when you try to specify query criteria with commas and no operators or quotes, you get this error message:

The expression you entered contains invalid syntax, or you need to enclose your text data in quotes. You may have entered an invalid comma or omitted quotation marks. For example, if the Default Value property of a text field is ''Huey, Louie, and Dewey,'' it must be enclosed in quotes if you mean it as a literal text string. This avoids the confusion with the expression ''Huey Louie'' And ''Dewey''.

.

When I saw this error message today, I got to thinking. Someone put a lot of love into this error message. Check out how many words it uses to explain, in detail what is going on.

Compare this error message with other Access error messages:

Error 6: "Overflow"

Error 13: "Type Mismatch"

Error 3060: "Wrong data type for parameter"

.

The Huey Louie Dewy error message has personality. Name another error message that names Disney characters in it. Huey Dewey and Louie are Donald Duck's mischievous nephews. Yes – someone went out of their way to make sure they worked in this trio of anthropomorphic ducks into a Microsoft Access error message.

.

As charming as that may be, I feel like it's time to give this error message an update. I mean really, these Disney characters aren't exactly pop culture spring chickens. Debuting in 1937, they're pretty old ducks by now.

.

So I propose updating this error message to include a more modern trio of pop culture icons; a trio that will prompt smiles from a whole new generation of nerds.

.

Maybe something like this:

…You may have entered an invalid comma or omitted quotation marks. For example, if the Default Value property of a text field is ''Lucky, Dusty, and Ned,'' it must be enclosed in quotes if you mean it as a literal text string. This avoids the confusion with the expression ''Lucky
Dusty'' And ''Ned''.

.

.

Or you could appeal to the kids:

…You may have entered an invalid comma or omitted quotation marks. For example, if the Default Value property of a text field is ''Harry, Hermione, and Ron,'' it must be enclosed in quotes if you mean it as a literal text string. This avoids the confusion with the expression ''Harry Hermione'' And ''Ron''.

.

.

Or maybe try to get a few pre-teens:

…You may have entered an invalid comma or omitted quotation marks. For example, if the Default Value property of a text field is ''Bella, Edward, and Jacob,'' it must be enclosed in quotes if you mean it as a literal text string. This avoids the confusion with the expression ''Bella Edward'' And ''Jacob''.

.

.

Oh…I know. Use something real nerds would appreciate:

…You may have entered an invalid comma or omitted quotation marks. For example, if the Default Value property of a text field is ''Tyrion, Jamie, and Cersei,'' it must be enclosed in quotes if you mean it as a literal text string. This avoids the confusion with the expression ''Tyrion Jamie'' And ''Cersei'.

.

.

Hold the phone…I got it.

…You may have entered an invalid comma or omitted quotation marks. For example, if the Default Value property of a text field is ''Debra, Mike, and Dick,'' it must be enclosed in quotes if you mean it as a literal text string. This avoids the confusion with the expression ''Debra Mike'' And ''Dick'.

.

Now that's an error message with style!

Understanding Weighted Averages

August 12th, 2014 by datapig 3 comments »

One of the first warnings an aspiring analyst hears is, "never take an average of an average". The reason for this warning may not be intuitive, but it's important to understand why it's generally a bad thing to average a bunch of averages. In today's post, I'll attempt to explain the problems with using simple averages and how Weighted Averages can help you avoid inaccurate conclusions about your data.

.

Where Straight Averages can Go Wrong

Imagine you manage two sales reps (Jim and Tim). These two sales reps have been working for you for three months. Over the last three months, you've collected data on their win rates vs. sales calls. After three months of selling, you'd like to see which sales rep has been more effective based on win rate.

.

At first glance, you see that Jim's win rate has been 65%, 85%, and 86% over the last three months. Tim's win rate has been 65%, 84% and 83%. You'd like to get an average win rate for each sales rep; so you simply average the win rates. From the looks of it, Jim seems to be the better performer. Not only does he have better win rates each month, but his straight average win rate is 78% (slightly better than Tim).

.

But wait. Look at what happens when you look that their numbers from the perspective of total wins for the entire three-month period. Taking the total wins vs. total sales calls, you see that Tim, with a 76% win rate, is actually more effective. What the heck!? How can Jim have a better win rate each month, but still trail behind Tim when calculating the total win rate for the month?

.

This strange observation has a name in statistics. It's called Simpson's Paradox.

No - not that Simpson.

.

Edward H Simpson was a British statistician who published a paper in 1951 detailing a phenomenon in which a trend present in different groups is reversed when the groups are combined. In other words, one conclusion may be reached when data is analyzed in smaller groups, but an opposite conclusion may be reached when data is analyzed in aggregate.

.

The reason this paradox rears its head is because, often times, there is a "lurking" variable in sample data that isn't considered in the analysis. In the case of our two sales reps, they have an unequal pool of sales calls (Jim has 89 total sales calls, while Tim has 128). So while each month's win rate is correct, you can't simply average the win rates. Doing so, would ignore an important variable (the total number of sales calls during each month).

.

Weighting Values

Often times, the correct analysis is not a straight average, but an average that weighs each value based on another variable in the sample data. When you "weigh" a particular value, you give that value significance in relation to the other values in your sample. That is to say, some values count more than other values. This is similar to the way Final Exams in school would count for a larger part of your grade than Quizzes.

.

A weighted average calculation is essentially:


.

For instance, in the case of Jim (our sales rep), we need to weight his win rates so that they have more significance in months that involved more sales calls. This way, we get a more accurate representation of his overall average win rate.


.

Using the SUMPRODUCT Function to Calculate Weighted Averages in Excel

Using weighted averages in Excel is as simple as entering in the weighted average calculation.

In this example, we are calculating the average price per unit across all transactions.

You can see that the simple average gives us a price per unit of $200. But this number is misleading; given that over 70% of all units were sold at the $50 rate. The weighted average price takes into account the number of units sold at each price, giving us a more accurate representation of our overall average price.


.

An easier way to enter this formula is to use the SUMPRODUCT function. "Product" is another name for the value returned from a multiplication operation. So the SUMPRODUCT returns the sum of multiplication products.

Instead of typing out:

= (C3*B3) + (C4*B4) + (C5*B5)

You can enter:

= SUMPRODUCT (C3:C5, B3:B5)

.

The SUMPRODUCT function will multiply the values in C3:C5 with the values in B3:B5, then return the sum of the products.

.

Here is the same weighted average using the SUMPRODUCT function.

.

Well, that's it.

I'm off to the buffet.

I can't wait to increase my weighted average weight while listening to Tom Waits.

Dashboard Tools Add-In Update

August 5th, 2014 by datapig 2 comments »

I've been pretty busy with client work, so no new blog post today. I did, however, want to share some updates I've made to my Dashboard Tools Add-In.

.

If you don't know yet, the Dashboard Tools Add-in is a utility that helps you: enhance your dashboards with stylized graphics, find viable color schemes, and create dashboard mock ups.

.

For this update, I added two new tools: a Color Capture tool and a Customize Percentages tool.

.


The Color Capture Tool

The new Color Capture tool is based on the eye dropper tool developed by » Read more: Dashboard Tools Add-In Update

Quantifying Subjective Text with an Excel-Based Sentiment Analysis Tool

July 24th, 2014 by datapig 8 comments »

Sentiment Analysis is a growing field in the world of BI and Data Science. With social media serving up millions of data points by way of comments, tweets, and status updates, it's becoming increasingly important to be able to take all that subjective text and turn it into quantitative data that can be integrated with other metrics. That is essentially what Sentiment Analysis is - a term that describes the exercise of quantifying subjective text by analyzing certain keywords deemed to be either positively or negatively charged.

.

How Sentiment Analysis Works

In its simplest form, a Sentiment Analysis algorithm can be nothing more than determining if a text string has more positive keywords than negative keywords. Text with more positive words is classified as positive content, while » Read more: Quantifying Subjective Text with an Excel-Based Sentiment Analysis Tool

Generating Normally Distributed Random Numbers – Sample Data that Makes Sense

July 16th, 2014 by datapig 8 comments »

Like many of you, I'm often asked to develop mockups (working demos of dashboards and reports).

In my zeal to make my mockup as realistic as possible, I spend a good bit of time generating sample data.

.

For example, if I'm mocking up an HR dashboard, I'll dedicate some brain power to generating salary data. Now in order to make salary data realistic, there has to be a believable mix of high salary and low salary records. I can't just use the RAND and RANDBETWEEN functions Excel provides. These functions generate random data in a "Uniform Distribution". That is to say, every number has an equal chance of occurring with an equal frequency.

. » Read more: Generating Normally Distributed Random Numbers – Sample Data that Makes Sense

Lazy Summer Update

July 9th, 2014 by datapig 4 comments »

Hey there….remember me? I've been having a lazy summer; cooking my obscene body in the hot South Carolina sun. I'll be back with regular posts next week. I just wanted to get this update out.

.

New Book in August

Dick Kusleika and I wrote an Excel Formulas book. This book is for anyone who wants to jump head first into formulas. Here, we offer step-by-step instructions for creating and using 101 of the most commonly requested formulas. We include tips, tricks and detailed explanations along with actual working formulas. This is targeted at beginning to intermediate Excel users. Feel free to click on the book to see more details.

.

. » Read more: Lazy Summer Update

8 VBE Tips Every Programmer Should Know

June 19th, 2014 by datapig 13 comments »

I've covered these time-saving VBE (Visual Basic Editor) tricks across several posts over the years. I thought it would be helpful to get them all into a single post.

.

Whether you're a fresh-faced analyst new to programming, or a jaded veteran living on sunflower seeds and Mountain Dew, these tips will make programming with Excel and Access so much easier.

.

Tip 1: Block Comments

Are you still commenting code one line at a time?
» Read more: 8 VBE Tips Every Programmer Should Know

Rounding to Significant Digits

June 16th, 2014 by datapig 6 comments »

I was recently asked to show some financial figures in significant digits. I'll be honest and say I've never had to do this in my many years of reporting, so I didn't exactly know what that meant. Apparently, the idea is when you're dealing with numbers in the millions, there is no need to inundate a report with superfluous numbers for the sake of showing precision down to the tens, hundreds and thousands place.

.

As an example, instead of showing the number 883,788, you could choose to round the number to one significant digit. This would mean displaying the same number as 900,000. Rounding 883,788 to two significant digits would show the number as 880,000.
» Read more: Rounding to Significant Digits

5 Years of Bacon Bits

June 9th, 2014 by datapig 18 comments »

On June 10th 2014 the Bacon Bits blog officially hits the 5-year mark.

Yes, this humble blog of mine has been changing the lives of countless nerds for 5 years.

So today, you get to watch me stroke my own ego with this self-congratulatory navel-gazing post. Enjoy!

» Read more: 5 Years of Bacon Bits

Using Pictures to Create Info-Graph Charts

June 6th, 2014 by datapig 4 comments »

I've been playing around with using pictures in my Excel info-graphics. I rarely use pictures in charts, but with an info-graphic, it seems like pictures could be appropriate for this type of visualization.

Here are a couple of examples I pulled together to illustrate the kinds of things you could do.

.

In this example, I created a waffle chart and then filled the two data series with their own pictures. » Read more: Using Pictures to Create Info-Graph Charts