A killer app with huge upsides and dangerous downsides: Applying AI to People Analytics

February 23, 2024

Ever since ChatGPT was launched in November 2022, there has been a veritable “gold rush” into the world of generative artificial intelligence (AI).

It is too soon to predict much of anything with certainty about where AI will take us. Yet I am very confident about one particular application: applying AI to People Analytics (PA) to analyze staff and team issues and dynamics. Unlike many prognostications for AI, the potential for applying it within PA (a) is real, (b) has very big upsides that can be realized now, and (c) has dangerous downsides that are hard to prevent. This article addresses both the encouraging upsides and how to avoid the downsides.

The benefits of applying AI to People Analytics

AI can be deployed to substitute for humans doing tasks, or to augment the humans. Applying AI to People Analytics has an element of both.

Many basic data manipulation and reporting tasks were already on a path toward greater automation before ChatGPT. AI will continue and likely accelerate that trend, removing humans from more and more lower-level analytics tasks, which is a good thing. Examples include:

  • Quickly synthesizing enormous volumes of text-based feedback from employee surveys, managers, 360 evaluations, etc. In recent years many of these tasks have been automated. Yet the solutions currently on the market are still not easy to implement: they require specific vendors and their applications, are often expensive, and demand specialized programming knowledge. AI will make very powerful text analysis tools readily available for even basic users to apply. Lowering the cost of analysis, and the barriers to entry, will greatly increase the use cases, including adding much-needed qualitative analysis to enhance the quantitative analyses that currently dominate PA practice. Today’s tools rarely go beyond word counts, missing important nuances that we need to understand to act effectively on employee or customer feedback. AI already appears to be better at doing these types of analyses (see the sketch after this list).
  • Automating repetitive or lower-complexity tasks, including writing straightforward code, scheduling meetings, drafting meeting notes, creating first drafts of documents, creating data visualizations, etc. This will be a very important contribution to the effectiveness of PA groups, because every single one of them is capacity constrained: they can always use more people to run analyses, engage with stakeholders, collect new data, etc. Freeing up time spent on these tasks will accelerate PA groups’ effectiveness.
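To make the text-analysis point concrete, here is a minimal sketch, in Python, of going beyond word counts on open-ended survey comments. It assumes the Hugging Face transformers library and its off-the-shelf sentiment and zero-shot classification pipelines; the example comments and theme labels are illustrative placeholders, not a validated taxonomy, and any real use would need to be vetted against your own data, privacy policies, and model choices.

```python
# Minimal sketch: tag open-ended survey comments with sentiment and a likely theme,
# rather than just counting words. Comments and theme labels are illustrative only.
from transformers import pipeline  # Hugging Face transformers

comments = [
    "My manager never follows up on the feedback we give in 1:1s.",
    "The new scheduling tool saved our team hours every week.",
]

# Off-the-shelf pipelines; a real deployment would validate model choice on its own data.
sentiment = pipeline("sentiment-analysis")
themes = pipeline("zero-shot-classification")

candidate_themes = ["manager support", "tools and processes", "workload", "compensation"]

for text in comments:
    s = sentiment(text)[0]                               # e.g. {'label': 'NEGATIVE', 'score': 0.99}
    t = themes(text, candidate_labels=candidate_themes)  # ranked candidate themes
    print(f"{s['label']:>8}  {t['labels'][0]:<20}  {text}")
```

The point is not these particular models; it is that a few lines of general-purpose tooling can now attach sentiment and themes to free-text feedback in ways that older keyword-counting tools cannot.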

Even more promising is what AI can do for higher-order tasks: accelerating the trend toward machine learning and large statistical models, which will substantially augment what humans do today. These approaches apply a combination of linear and nonlinear methods, searching for patterns in data that can shed light on questions of employee engagement, retention, and productivity. AI will enable analyzing ever-larger and more complex sets of data.

An example

Social science models of employee behavior are complex, and they suffer from a lack of “direct” data that can be used to test and verify them. People’s work motivations are multifaceted, and people themselves rarely can accurately identify all the factors. So we have to rely on data proxies for what people will do, which is a huge data challenge.

For example, models of employee productivity or retention are based on the sum total of the experiences people have at work and the environment in which they work. This includes how they are treated by their supervisors, teammates, and others. Adding more data on team dynamics and composition, the experiences and attitudes of each person, cross-functional collaboration and its challenges, technical data on operational glitches, and more, will greatly increase the models’ predictive power. That is one huge upside potential.
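As a rough illustration of how those richer data sources might be folded into a retention model, here is a minimal sketch using a standard scikit-learn logistic regression. The file name and every column name below (engagement scores, team size, operational incidents, and so on) are hypothetical stand-ins for the kinds of data described above, not a recommended feature set.

```python
# Minimal sketch: a turnover model that mixes HRIS, survey, and operational features.
# All file and column names are hypothetical placeholders.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.read_csv("employee_history.csv")  # hypothetical extract joining HRIS, survey, and ops data

numeric = ["tenure_months", "engagement_score", "team_size", "ops_incidents_last_qtr"]
categorical = ["department", "manager_span_band", "cross_functional_role"]
target = "left_within_12_months"

pre = ColumnTransformer([
    ("num", StandardScaler(), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])
model = Pipeline([("prep", pre), ("clf", LogisticRegression(max_iter=1000))])

X_train, X_test, y_train, y_test = train_test_split(
    df[numeric + categorical], df[target], test_size=0.25, random_state=0, stratify=df[target]
)
model.fit(X_train, y_train)

# Evaluate on a holdout; note that overall accuracy alone can hide the
# false-positive problem discussed later in this article.
print("holdout accuracy:", model.score(X_test, y_test))
```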

With so much upside, what could go wrong?

The outputs from statistical analyses are only as good as the inputs. If the inputs are faulty, then the outputs might look interesting and actionable, but they will be meaningless and lead to poor decision making.

On the one hand, statistical profiling can help point to important factors that may drive an individual or group to behave in a particular way. However, prediction is never a perfect science, and must be balanced against the role of individuality and the importance of individual differences.

Philip K. Dick’s 1956 short story “The Minority Report,” which was made into a popular movie in 2002 and a television series in 2015, provides a great illustration of the promise and pitfalls of predicting human behavior from the types of data discussed here. The story is set in a future where data is used to predict criminal behavior with high accuracy, and substantial police resources are devoted to rooting out criminals before they act for the first time. But some predictions go terribly wrong, leading to tragic outcomes, including people dying and being imprisoned unjustly. The moral of the story, in statistical terms: Beware of false positives and false negatives.

The stakes in PA are not as high as in The Minority Report, but the lessons are just as relevant, because AI has the potential to greatly increase the predictive power of PA models. An AI model of turnover risk can be used to identify groups of people who are more likely to leave the organization in the near future. If that prediction is used to intervene and reduce the likelihood of them leaving, the outcome can be a win-win for the employees and the organization. However, if the prediction is used to proactively lay off people before they leave voluntarily, or to hire fewer people who fit the profile, the negative consequences could be substantial: pushing out – or never hiring – individuals who never would have left, and who would have been happy, productive contributors.
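A simple worked example, using illustrative numbers of my own choosing rather than real benchmarks, shows how quickly false positives pile up even when a turnover model looks strong on paper.

```python
# Worked example (illustrative numbers only): why false positives matter when
# acting on turnover-risk predictions.
n_employees = 1_000
base_rate = 0.10        # assume 10% actually leave within the year
sensitivity = 0.85      # model catches 85% of true leavers
specificity = 0.90      # model correctly clears 90% of stayers

leavers = n_employees * base_rate                 # 100
stayers = n_employees - leavers                   # 900
true_positives = sensitivity * leavers            # 85
false_positives = (1 - specificity) * stayers     # 90
flagged = true_positives + false_positives        # 175

precision = true_positives / flagged
print(f"Flagged as 'likely to leave': {flagged:.0f}")
print(f"Share of flagged who actually would have left: {precision:.0%}")  # ~49%
```

In this hypothetical, only about half of the employees flagged as likely to leave actually would have left. Intervening supportively on that flag is low-risk; laying people off based on it would harm many who never would have gone anywhere.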

There is a long and checkered history of data being used inappropriately to model employees. The classic example is IQ scores, which for decades were used to screen people and predict who would be a better match for the work. The problem is not a lack of correlation: scoring well on an IQ test often correlates with job fit and performance. The problem is that in virtually all cases there is no legally justifiable link between IQ scores and the actual requirements of the job.

Many other types of data quite reasonably could shed insights into people’s motivation and ability to work at a specific point in time, such as:

  • Short term or long term health issues, both physical and mental
  • Bearing the burden of caring for family members including child and elderly care
  • Having fewer financial resources, which can make it harder for people to pay for the support that makes it easier to be successful at work

Organizations are not supposed to use most such data in employment decisions. And even where it may be legal, it’s usually not ethical to do so, because people are supposed to be given the chance to prove themselves at work and not ruled out before the fact a la The Minority Report.

Which brings us to the challenges posed by large language models (LLMs), which hoover up enormous volumes of data. Consider the following information, which is available on the internet, somewhere:

  • Online browsing and purchasing behavior
  • Engagement in non-work activities across all aspects of life
  • Personal email and text communications
  • Credit ratings
  • Facial recognition databases
  • Crime databases, including suspects who have not been convicted of anything

Every single one of these types of data has either direct or indirect links to information that organizations should not use when making employment decisions, such as health status, family care responsibilities, and financial resources. On those grounds alone, there is risk in training AI on large volumes of data that can easily include many of the above sources. Add to that the risk that models could tag someone as less desirable for employment based on opposition to authority via political activities, suspicion of illegal activity, reading non-conformist material on the internet, shopping for odd items, etc.

Looking at the glass as half full, accessing these types of data is nontrivial. Collecting such data and merging it into internal information on individual employees requires taking purposeful steps. Yet, from the glass-half-empty perspective, (a) there are lots of people both inside and outside organizations who are not aware of the ethical and legal issues surrounding the use of such data, and (b) there are quite a few “shady actors” who are happy to make money however they can, which would include accessing and combining such data, creating the AI models, and selling the results to whoever will buy them.

Beware “interesting” outputs that can mislead

Company culture is defined by the nature of the work, and the choices made by senior leaders. But it’s also shaped in large part by the socioeconomic status of the organization’s members, and the way they live their lives.

Any AI model that uses non-work data to model engagement, retention, and productivity is likely to create “employability” scores that reinforce biases toward the dominant company culture, in entirely unintended ways. Once such employability scores are created, whether through malicious intent or benign neglect, they can be “validated” using the experience of current employees and future job applicants, which would provide all the proof needed to justify using them to determine job outcomes. This is why emerging laws require more “white box” approaches, in an attempt to stop exactly these kinds of bad actors. Yet inevitably someone, or a group of someones, somewhere will try. So we have to be extremely vigilant about the risks.

A necessary safeguard: Keep the right humans in the process

There should always be people who check the reliability of AI models’ output. Yet who is qualified to do so?

The most likely people to be tapped lack critical expertise – including the many People Analytics professionals who are former HR business partners (HRBPs) or consultants. Their data “expertise” usually covers only creating reports that count things.

People with data science expertise, now common in today’s PA groups, are better suited. Yet data science training alone is not sufficient to safeguard against the risks of unsupervised AI.

The problem is two types of missing expertise: social science and employment law. Insufficient social science expertise leads to “dustbowl empiricism”: approaches not tied to what organizational behavior and industrial-organizational psychology research has shown drives individual and group behaviors. Insufficient employment law expertise creates legal risk and reputational risk for the organization.

Relying solely on data science expertise is like allowing an AI model to train itself without oversight. Data scientists are trained in data manipulation and analysis, not the research that explains human behavior. Engaging a pure data scientist to train and monitor an AI model runs the risk of data mining without the proper judgment to evaluate the results.

Organizations need people with advanced social science and employment law expertise to train, evaluate, and manage any AI model applied to PA. And they have to invest substantial time and resources to ensure there are no unintended “adverse impacts” on groups that have historically experienced discrimination, including gender, race, ethnicity, age, and disability status, covering the specific set of protected classes in the United States or whichever jurisdiction applies. If you don’t make these investments proactively, plaintiffs’ lawyers will ensure much higher costs are paid later if any statistical evidence of adverse impact is found.
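For readers unfamiliar with adverse impact screening, here is a minimal sketch of one widely used first-pass check, the “four-fifths rule,” which compares selection (or model-flag) rates across groups. The group labels and counts are illustrative; a real program would pair this with proper statistical tests and, as argued above, expert legal and social science review.

```python
# Minimal sketch of the "four-fifths rule" screen: compare each group's selection
# (or flag) rate to the most-favored group's rate. Illustrative counts only.
def four_fifths_check(selected_by_group: dict[str, int], total_by_group: dict[str, int]) -> dict[str, float]:
    rates = {g: selected_by_group[g] / total_by_group[g] for g in total_by_group}
    top_rate = max(rates.values())
    # Impact ratio: each group's rate relative to the most-favored group.
    return {g: rate / top_rate for g, rate in rates.items()}

ratios = four_fifths_check(
    selected_by_group={"group_a": 40, "group_b": 22},
    total_by_group={"group_a": 100, "group_b": 100},
)
for group, ratio in ratios.items():
    flag = "POTENTIAL ADVERSE IMPACT" if ratio < 0.8 else "ok"
    print(f"{group}: impact ratio {ratio:.2f} -> {flag}")
```

A ratio below 0.8 does not prove discrimination, but it is exactly the kind of signal that should trigger human scrutiny before any model-driven decision is made.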

A human- and science-based perspective on any PA model’s predictions is always needed, whether AI is applied or not, because even a high likelihood of something happening is never destiny.

This article was greatly improved by conversations with and feedback from Alexis Fink.

Click here for the original LinkedIn post.