June 26, 2018

Your Data is Lying to You: Pitfalls with Advanced People Analytics


People analytics is undeniably a hot topic today. Few would dispute that the availability of HR data and the tools to manipulate it enable stronger decision making, development of better people strategies, and increased influence with business leads. Those benefits aside, users should watch for mistakes that can result in bad decisions.

Five common issues with advanced people analytics

1. False causation

Consider your latest employee engagement survey. Engagement is up from the survey you ran two years ago, and when you look at your financial data, you see that profitability also increased. Engagement caused profitability, right? Or is it that the healthier financial situation caused an increase in engagement? The tricky thing about one-to-one comparisons is that it is often difficult to tell which is the driving factor and which is the dependent variable, or whether a third factor is driving both. To reduce the likelihood of this error, rule out other possible causes. Regarding profitability, has the general economic climate improved? Were there one or two big contract wins? Was there a reorganization or process reengineering effort? Any new vendor renegotiations? Be sure you’re looking at the complete picture of your situation.
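To see how a shared driver can produce a correlation all by itself, here is a minimal Python sketch. Everything in it is simulated: the names economy, engagement, and profit, and the assumption that the economy feeds both of the others, are illustrative choices and not real data.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def residuals(y, x):
    """What is left of y after removing a straight-line fit on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return [b - mean_y - slope * (a - mean_x) for a, b in zip(x, y)]

rng = random.Random(7)  # fixed seed so the run is repeatable
N_UNITS = 2000          # hypothetical number of business units

# Assumption: the economy drives both measures; neither affects the other.
economy = [rng.gauss(0, 1) for _ in range(N_UNITS)]
engagement = [e + rng.gauss(0, 1) for e in economy]
profit = [e + rng.gauss(0, 1) for e in economy]

raw_r = pearson_r(engagement, profit)
adjusted_r = pearson_r(residuals(engagement, economy),
                       residuals(profit, economy))

print(f"Engagement vs. profit, raw:             r = {raw_r:.2f}")
print(f"Engagement vs. profit, economy removed: r = {adjusted_r:.2f}")
```

In this setup engagement and profit are clearly correlated even though neither influences the other, and the relationship largely disappears once the shared driver is accounted for. Real data is rarely this clean, but the same check (does the link survive after adjusting for the obvious alternative?) is a reasonable first test.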


2. Hasty generalization

Consider exit interviews. You review your data for last year and see that 3 of 8 people leaving under a specific manager cited “poor management” as a reason for leaving. Should you be concerned? Probably. Should you use this data point to decide that this is a manager in need of training, demotion, or firing? Certainly not. When organizations are just starting to gather data, they are often in a hurry to use it quickly. Big mistake. Especially when there are limited data points available, it’s important to gather enough data before using it to make big decisions. Consider additional factors that might be at play – location, life stage, commuting distance, performance ratings – and test additional hypotheses.
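To put a number on how little 3 of 8 tells you, here is a small Python sketch that computes an approximate 95% range for the underlying rate. It uses the Wilson score interval, which is one common choice for small samples; the function name wilson_interval is made up for this example.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)
    ) / denom
    return center - half_width, center + half_width

# The exit-interview example: 3 of 8 leavers cited poor management.
low, high = wilson_interval(3, 8)
print(f"Observed: {3 / 8:.0%}; plausible range: {low:.0%} to {high:.0%}")
```

With only eight exits, the plausible range runs from roughly 14% to 69%. That is far too wide to tell a manager with a real problem apart from one who had an unlucky year.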


3. Confirmation bias

So you’ve got a new baby – congratulations, it’s a sales training program you helped create. It has all the bells and whistles – blended learning, role plays, interactive video. You pilot it with a group of ten salespeople. In the next month, their numbers improve by 20%. You show the data to everyone. Your training was a massive success! Not so fast. What you ignored was that across the company, the average salesperson’s increase during the same time period was 18%, seven of the ten people who took the training had just come off a bad month, and the cost to develop and administer the training was incredibly high. You made the mistake of picking only the data that supported the position you wanted to support while ignoring the rest. Confirmation bias plagues us all in all aspects of our lives (not just work). To overcome it, seek out independent opinions, particularly when you have an emotional interest in reaching a specific conclusion.


4. Regression to the mean

You’ve recently implemented a monthly pulse survey, and you’re measuring participation. Good news! Response rate is 50% after 3 months, so you shift your attention elsewhere. Bad news. The initially high response rate will likely taper off over time. Performance and participation tend to regress to the mean, so whether it’s an average performer having a killer year or a new initiative that has a great deal of early enthusiasm, keep measuring and prepare yourself for a return to expected levels.
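Regression to the mean is easy to reproduce in a simulation. The Python sketch below assumes, purely for illustration, that each person's score in a period is a stable skill level plus independent luck; the population size and the spread of skill and luck are arbitrary.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

N_EMPLOYEES = 1000  # hypothetical population

# Assumption: score = stable skill + luck that is redrawn every period.
skill = [random.gauss(100, 10) for _ in range(N_EMPLOYEES)]
year1 = [s + random.gauss(0, 10) for s in skill]
year2 = [s + random.gauss(0, 10) for s in skill]

# Select the top 10% of performers based on year 1 alone.
ranked = sorted(range(N_EMPLOYEES), key=lambda i: year1[i], reverse=True)
top = ranked[: N_EMPLOYEES // 10]

overall_mean = statistics.mean(year1)
top_year1 = statistics.mean(year1[i] for i in top)
top_year2 = statistics.mean(year2[i] for i in top)

print(f"Everyone, year 1:       {overall_mean:.1f}")
print(f"Top 10%, year 1:        {top_year1:.1f}")
print(f"Same people, year 2:    {top_year2:.1f}")
```

The top group is still above average in the second period, because skill is real, but it sits noticeably closer to the overall mean than it did when it was selected. Nothing changed about these people; part of what put them at the top was luck that did not repeat.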


5. Data dredging

You have a massive HRIS export. You have detailed financial results by division and geography. You run a massive correlation analysis and find a statistically significant relationship between last year’s salary increase and this year’s absenteeism. You deliver the finding to your head of compensation and proudly report that “There’s only a 5% chance this is random!” Ah, but let’s do an experiment.

Do an Excel coin flip test:

  • Type “=randbetween(0,1)” in Cell A1 and hit Enter
  • Copy that formula across the Rows and Columns down to Cell G100, creating 100 Rows of 7 Columns
  • Sum each Row in Column H
  • Count how many times the sum in Column H comes to 7 (the equivalent of 7 flips resulting in Heads)

Statistically, there’s a 1 in 128 chance of this occurring in any single row, but do it a hundred times, and your chance of getting at least one all-heads row rises to about 54% (1 − (127/128)^100).
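If you would rather not build the spreadsheet, the same coin flip test can be sketched in Python. The constants mirror the Excel layout above (100 rows of 7 flips); the function name sheet_has_all_heads_row and the number of simulated sheets are choices made for this example.

```python
import random

FLIPS_PER_ROW = 7   # columns A to G
ROWS = 100          # rows 1 to 100
TRIALS = 5000       # number of simulated spreadsheets

# Exact calculation.
p_row = 0.5 ** FLIPS_PER_ROW        # one specific row is all heads
p_any = 1 - (1 - p_row) ** ROWS     # at least one of the rows is all heads

def sheet_has_all_heads_row(rng):
    """Simulate one sheet; True if any row comes up heads on every flip."""
    return any(
        all(rng.random() < 0.5 for _ in range(FLIPS_PER_ROW))
        for _ in range(ROWS)
    )

rng = random.Random(0)  # fixed seed so the run is repeatable
hits = sum(sheet_has_all_heads_row(rng) for _ in range(TRIALS))
simulated = hits / TRIALS

print(f"Single row all heads:   1 in {1 / p_row:.0f}")
print(f"At least one in {ROWS} rows: {p_any:.1%} (exact)")
print(f"At least one in {ROWS} rows: {simulated:.1%} (simulated)")
```

The exact figure works out to 1 − (127/128)^100, or roughly 0.54, and the simulated share should land close to it. An outcome that is rare in any single row becomes more likely than not once you give it a hundred chances.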

The takeaway: If you randomly test hundreds of variables for statistically significant relationships, you’re going to find them somewhere, but be careful. Data dredging is useful for identifying hypotheses, but don’t make the mistake of believing that the relationships that you find are, in and of themselves, meaningful. Find additional ways to test your hypotheses.
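The same effect shows up with correlations. The Python sketch below generates an outcome and several hundred candidate variables that are all pure noise, then counts how many clear a conventional 5% significance bar. The sample sizes are arbitrary, and the significance check uses the Fisher z approximation, which is a stand-in for whatever test your analytics tool applies.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def looks_significant(r, n, z_crit=1.96):
    """Two-sided 5% test of zero correlation (Fisher z approximation)."""
    return abs(math.atanh(r)) * math.sqrt(n - 3) > z_crit

rng = random.Random(1)  # fixed seed so the run is repeatable
N_EMPLOYEES = 200       # hypothetical rows in the export
N_VARIABLES = 300       # hypothetical columns tested against the outcome

# Assumption: every column is random noise, so no relationship is real.
outcome = [rng.gauss(0, 1) for _ in range(N_EMPLOYEES)]

false_hits = 0
for _ in range(N_VARIABLES):
    candidate = [rng.gauss(0, 1) for _ in range(N_EMPLOYEES)]
    if looks_significant(pearson_r(candidate, outcome), N_EMPLOYEES):
        false_hits += 1

print(f"{false_hits} of {N_VARIABLES} noise variables look significant")
```

With a 5% threshold you should expect somewhere around 5% of the noise variables to pass, which here means a dozen or so "findings" with nothing behind them. Treat each one as a hypothesis to retest on fresh data, not as a result.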

People analytics is an exciting field, filled with possibility for driving decision making at the organizational level. Keeping in mind common statistical and analytical errors and how to avoid them will help you steer the organization in the right direction.


How Aspirant Can Help

Aspirant's Organizational Effectiveness experts can help you overcome these pitfalls as part of enabling an impactful people analytics program. Use the form below to schedule a casual discussion about how we can better position your company for success.




Judy partners with executives and leadership teams to engage and inspire employees in a way that delivers sustainable strategic results. She brings deep expertise and creative ideas to solve organizational effectiveness issues and closely collaborates in a way that builds internal capabilities. Judy has spent over 25 years consulting in a variety of industries, bringing her expertise in behavior to a wide range of organizational issues including organizational behavior change, leadership, change management, culture and engagement.
