The Aggregate Problem

Some AI tools operate on populations, not individuals. When they're wrong, the consequences land on people who never appeared in the model.

May 15, 2026

When HR leaders think about AI risk today, the picture is reasonably consistent. There’s a tool. It scores a candidate, ranks an applicant, predicts a performance issue, or flags a retention concern. Someone is responsible for checking that the tool is accurate, fair, and explainable. If a worker is harmed, there’s a decision to trace back and, in principle, contest.

That picture is reasonable. It’s also incomplete in a specific way that’s costing real workers real outcomes.

A growing share of the AI shaping people’s working lives doesn’t fit the one-tool-one-decision frame. These are tools built to operate on populations rather than individuals — workforce planning models, skills inference engines, compensation frameworks, location strategy tools. They don’t produce outputs about who gets hired or fired. They produce outputs about how many people the function should have, how skills should be defined, where work should sit. Individuals carry the consequences without ever appearing in the model.

This is the aggregate problem. It’s mostly absent from the current AI risk conversation in HR. And it produces real consequences for workers that nobody — not the model’s developers, not the buyer, not the company’s responsible AI program — is positioned to see.

When the model is about a group

Workforce planning is the cleanest example. A model forecasts how many people a function will need over the next twelve months. The number isn’t about any specific person. But it determines whether real people get hired, whether real people get laid off, whether real people get promoted into a role that was budgeted for.

If the model overestimates attrition by ten percent, the company over-hires. Some of those hires later get cut. Others lose promotion paths because the headcount envelope was wrong. By the standards the model is evaluated against — forecast accuracy, calibration — a ten percent miss isn’t a disaster. By the standards of the people who carry the consequences, it can be.
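To make that arithmetic concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than a figure from any real model: the function name, the headcount, the attrition rate, and the reading of "ten percent" as a relative overestimate of the attrition rate. It also assumes the company hires exactly enough people to backfill the departures the model forecasts.

```python
def surplus_hires(headcount: int, true_attrition: float, relative_overestimate: float) -> int:
    """Count hires made to backfill departures that never happen.

    Illustrative assumption: hiring exactly matches forecast departures.
    """
    # The model's attrition rate is the true rate inflated by the relative error.
    forecast_attrition = true_attrition * (1 + relative_overestimate)
    forecast_departures = round(headcount * forecast_attrition)
    actual_departures = round(headcount * true_attrition)
    return forecast_departures - actual_departures


# Hypothetical function of 2,000 people with 15% true annual attrition
# and a model that overestimates that rate by 10% (forecasting 16.5%).
extra = surplus_hires(headcount=2000, true_attrition=0.15, relative_overestimate=0.10)
print(extra)  # 30
```

With these made-up inputs, a forecast that is off by only 1.5 percentage points corresponds to 30 people hired into roles that did not need filling. The model's error metric records the 1.5 points. It does not record the 30 people.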

Skills taxonomies work the same way. AI infers what skills employees have and where gaps need to be addressed. At the population level the inference is accurate enough. But suppose it misclassifies twenty percent of roles, usually the unusual or hybrid ones. Those misses show up as people routed away from development, internal mobility, and recognition. Not because anyone decided to deprioritize them. Because the taxonomy didn’t see them.

Compensation models follow the same pattern. A model can produce well-calibrated bands at the function level and still be miscalibrated for a specific demographic intersection. The result is individually identifiable pay inequities that, in aggregate, sit comfortably within statistical tolerances. The aggregate evaluation says the model works. The affected individuals are not in the evaluation.
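A small sketch shows how an aggregate check can pass while a subgroup is consistently underpaid. The records below are invented for illustration: the group labels, the pay figures, and the idea of a single "benchmark" pay level are all assumptions, and a real pay-equity review would be considerably more involved.

```python
from statistics import mean

# Hypothetical records: (group, model_recommended_pay, benchmark_pay), in thousands.
records = [
    ("A", 102, 100), ("A", 101, 100), ("A", 103, 100), ("A", 102, 100),
    ("A", 101, 100), ("A", 102, 100), ("A", 103, 100), ("A", 102, 100),
    ("B", 92, 100), ("B", 92, 100),
]


def mean_error(rows):
    """Average gap between recommended pay and benchmark pay."""
    return mean(recommended - benchmark for _, recommended, benchmark in rows)


# The aggregate view: one number for the whole function.
overall = mean_error(records)

# The disaggregated view: the same metric, computed per group.
by_group = {
    group: mean_error([r for r in records if r[0] == group])
    for group in sorted({r[0] for r in records})
}

print(overall)   # 0
print(by_group)  # {'A': 2, 'B': -8}
```

In this toy data the function-level error is exactly zero, so the model looks perfectly calibrated. The two people in group B are each recommended eight thousand below benchmark, and nothing in the aggregate number reveals it.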

Location strategy, restructuring frameworks, internal mobility platforms — same dynamic. Each operates at the level it was designed for. Each shapes thousands of individual outcomes that never appear in its evaluation.

The tools were built to be accurate at the aggregate level. They are. The harm isn’t aggregate.

Why this hasn’t been the conversation

Four reasons.

It doesn’t look like high-risk AI. Current risk classifications focus on systems that produce decisions about specific individuals. Aggregate models don’t. They produce structures and forecasts that humans translate into individual decisions downstream — and the translation step is where most frameworks stop looking.

It’s evaluated by the wrong disciplines. The people building workforce planning, skills, and compensation models are operations researchers, statisticians, and data scientists. Their accuracy criteria are aggregate-level: forecast error, calibration, predictive lift. Those measures are appropriate for what the model is designed to do. They don’t include any criterion that asks what happens to the worst-case individual when the model is wrong. That question lives in industrial-organizational (I-O) psychology, measurement science, and labor economics, disciplines that are usually not in the room.

It sits outside HR’s conversation. Aggregate models live in workforce planning, finance, and strategy — functions that have historically operated outside the responsible AI scope built around HR’s individual-decision tools. The people with standing to identify the risk don’t sit in the right meetings.

It’s hard to regulate and easy to deny. The chain from an aggregate model’s output to an individual worker’s harm runs through humans and downstream decisions that nobody is tracking. For regulators, that makes the field genuinely difficult to write rules in — there’s no single decision to point to. For employers, the same difficulty works as a defense. The model wasn’t about that worker. The decision was made by a manager. The translation logic isn’t documented as “AI.” Each link has someone or something else to point to. Hard to regulate, easy to deny — and that combination is one of the most durable reasons the conversation has stalled.

Why it should matter to you

Even without regulation, the risk is real and it lands in real places.

Operationally. A workforce planning model with a known ten percent error rate puts that error into your hiring, attrition, and promotion patterns every quarter. The cost shows up as turnover, as over- or understaffing, as decisions that look correct against the plan and wrong against reality.

Legally. When a layoff, denied promotion, or pay decision is challenged, the documentation a regulator or plaintiff’s counsel will ask for goes back through whatever logic informed it — including the aggregate models that shaped the headcount envelope, the skills taxonomy, or the comp band. The accuracy framework the model was evaluated against doesn’t satisfy that line of questioning. “Was the forecast accurate?” is not the same question as “what did the organization know about how this output would affect specific people, and what was put in place to manage that effect?” If the answer is that the model was treated as an aggregate analysis with no individual-impact review, the answer itself becomes the exposure.

Reputationally. The patterns become visible eventually. Workers compare notes. Journalists trace them. The organizations that examined how their aggregate models were affecting individuals before they were forced to will tell a better story than the ones that didn’t.

Strategically. An HR function building decisions on aggregate AI it hasn’t examined is building on a foundation it cannot defend internally. When the model is wrong, the consequences land on real people, and those people are inside the organization. They notice.

The risk of doing nothing isn’t that something dramatic happens tomorrow. It’s that the slow accumulation of small wrong outcomes — for real workers, generated by aggregate logic nobody is overseeing — becomes the basis of how the workforce operates. By the time anyone names the problem, it’s structural.

Where to start

Bringing aggregate AI into the conversation doesn’t require new technology. It requires asking different questions. Three are usually enough to surface the gap.

Which tools in our HR portfolio produce outputs about populations, structures, or forecasts that someone else later translates into decisions about people — and how accurate do those outputs need to be for the translated decisions to be defensible?

If a worker were adversely affected by one of those decisions and asked how the underlying model worked, would we have a reviewable answer?

Who in the organization is accountable for the individual-level consequences of these models — and is that accountability separate from the accountability for their aggregate accuracy?

If those questions are uncomfortable, the aggregate AI in the organization is producing real consequences under evaluation frameworks that don’t see them. The first step is to look at the right scale. Better to do it now than after something else forces the issue.

Work Science Consulting LLC (WSC) provides independent, science-based advisory to organizations navigating AI in HR and workforce contexts. worksciconsulting.com
