Why is Validating Hiring Assessments Important?

Samantha McGrail

November 8, 2023

Companies utilize human resources selection tools in the hiring process to help find the right candidate for their open positions. These selection tools must be validated to ensure they are effective.

Validity is the most vital consideration in developing and evaluating selection procedures. The U.S. Department of Labor describes the "validity" of a selection procedure as the extent to which there is empirical evidence or data that accurate inferences (read: decisions) can be made from the score for a particular employment selection purpose.

In other words, one could think of validity as being a measure of the effectiveness of a given approach. Therefore, a selection process could not be considered valid if it does not help businesses increase their chances of hiring the right person for the job. Reliability is a critical component of validity because a hiring assessment can only be valid if it produces information reliably and consistently.

Organizations that fail to use validated selection tools risk deploying a flawed employee selection process. Systemic mistakes in the design of hiring systems can lead to poor organizational outcomes, harm workforce diversity, and expose the organization to legal liability.

“If you can show that there’s a relationship between selection assessment scores and subsequent success on the job, that’s pretty strong evidence,” Marc Fogel, Director of Product, Industrial/Organizational Psychology, told Talent Select AI. While assessments that have not undergone a validation process may provide seemingly relevant information, this does not mean such tools are accurate, meaningful, or legally defensible.
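To make the idea concrete, the relationship Fogel describes is commonly summarized as a correlation between assessment scores and a later measure of job performance. The Python sketch below is illustrative only: the scores, the performance ratings, and the function name pearson_r are hypothetical and do not represent Talent Select AI data or methods.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)

# Hypothetical data: assessment scores at hire and supervisor ratings a year later.
assessment_scores = [52, 61, 70, 74, 83, 90]
performance_ratings = [2.8, 3.1, 3.0, 3.9, 4.2, 4.4]

validity_coefficient = pearson_r(assessment_scores, performance_ratings)
print(f"Correlation between scores and later performance: {validity_coefficient:.2f}")
```

A coefficient near zero would suggest the scores say little about later performance, while a clearly positive one is the kind of evidence the quote refers to. Real validation studies rely on much larger samples and more careful statistics than this toy example.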

What Validity and Reliability Mean in the Context of Employee Selection

To better understand the concepts of validity and reliability, consider the following scenario: As you prepare for your big trip to Europe, you weigh your suitcase at home to ensure it's below the maximum weight limit of 35 pounds. Based on your expensive, trustworthy, top-of-the-line scale, you know that your suitcase weighs exactly 30 pounds.

Figure: Three scales weighing the same suitcase, all showing different weights.

Anxious about your long flight, you arrive early at the ticket counter at the airport, where your suitcase is weighed before it can be tagged and sent to the plane. You put your suitcase on the scale, and it reads 80 pounds! You tell the ticket agent you know your suitcase is 30 pounds and politely ask them to try again. Now, the scale reads 20 pounds. They try one more time, and this time, the display reads 130 pounds. You point out to the ticket agent that the scale is unreliable and invalid (not consistent or accurate) and that you want to try a different one.

The agent leads you to a second scale, which promptly displays a weight of 65 pounds. You try twice more to no avail. The scale consistently shows the weight as being 65 pounds. You are nearly positive the suitcase weighs 30 pounds, so why does the scale keep reading 65? The scale is reliable but not valid (consistent but not accurate). You plead with the ticket agent to try one more scale.

You hoist your 30-pound suitcase onto the third scale, and it reads 30 pounds. The ticket agent wants confirmation and asks you to try again. You do so twice, and both times, the scale reads 30 pounds. The ticket agent agrees that your luggage is below the weight limit and checks you into your flight. You feel lucky that you eventually found a scale that is both reliable (consistent) and valid (accurate), just like the one you have at home.
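The three scales can also be described with two numbers: how much repeated readings vary (consistency) and how far their average sits from the true weight (accuracy). The Python sketch below uses the readings from the story. The one-pound tolerance and the function name describe_scale are assumptions made purely for illustration.

```python
from statistics import mean, pstdev

TRUE_WEIGHT = 30.0  # pounds, according to the home scale in the story
TOLERANCE = 1.0     # pounds; hypothetical threshold chosen for this sketch

# Three readings per airport scale, taken from the story.
readings = {
    "first scale": [80, 20, 130],
    "second scale": [65, 65, 65],
    "third scale": [30, 30, 30],
}

def describe_scale(values, true_value=TRUE_WEIGHT, tolerance=TOLERANCE):
    """Summarize consistency (spread) and accuracy (bias) of repeated readings."""
    spread = pstdev(values)           # how much the readings disagree with each other
    bias = mean(values) - true_value  # how far the average is from the true weight
    reliable = spread <= tolerance
    valid = reliable and abs(bias) <= tolerance
    return {"spread": spread, "bias": bias, "reliable": reliable, "valid": valid}

results = {name: describe_scale(values) for name, values in readings.items()}

for name, summary in results.items():
    print(
        f"{name}: spread={summary['spread']:.1f} lb, bias={summary['bias']:.1f} lb, "
        f"reliable={summary['reliable']}, valid={summary['valid']}"
    )
```

In this sketch a scale only counts as valid if it is also reliable, which mirrors the point that an assessment cannot be valid unless it produces information consistently.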

The same ideas apply to hiring assessments. Although it is not as clearly defined as the true weight of a suitcase (which, surprisingly, is also less straightforward than one might think), every candidate brings a ‘true’ level of each skill or trait to the job. No single method assesses these skills and traits perfectly, but some assessments (think weighing scales) do so more reliably and accurately than others.

At Talent Select AI, we have validated our assessment to ensure that it measures job candidates' skills and traits with high levels of reliability and validity.

Curious how we do it? Contact us to see Talent Select AI in action.

Fair, Unbiased & Legally Defensible Hiring Assessments

As part of the validation process, Talent Select AI supports its assessment’s legal defensibility by investigating and continuously monitoring the relationship between assessment outcomes and subgroup membership. As expected, our findings indicate that our selection assessment tool does not discriminate against any protected class.

As a best practice, we utilize three primary sources of guidance for conducting validation research. The sources include:

The Uniform Guidelines on Employee Selection Procedures (UGESP): outlines the federal government's position on how tests should be developed and used in making employment decisions consistent with federal laws and regulatory guidelines.
The Principles for the Validation and Use of Personnel Selection Procedures: adopted by the American Psychological Association (APA) in 2018 as an authoritative guidance document for employee selection testing.
The Standards for Educational and Psychological Testing: approved as APA policy by the APA Council of Representatives in August 2013; it represents the gold standard in guidance on testing in the United States and many other countries.

Concerning fairness, the Principles document states that:

“Fairness is a social rather than a psychometric concept. Its definition depends on what one considers to be fair. Fairness has no single meaning and, therefore, no single definition, whether statistical, psychometric, or social.”

The document continues to describe several perspectives on the topic of fairness and concludes that:

“There is agreement that issues of equitable treatment, access, bias, and scrutiny for possible bias when subgroup differences are observed are important concerns in personnel selection. Most organizations strive for a diverse and inclusive workforce and equitable treatment of cultural and linguistic minorities. There is not, however, agreement that the term “fairness” can be uniquely defined in terms of any of these issues.” (p.23)

Concerning bias, and in agreement with the Standards document, the Principles state that:

“Bias refers to systematic error in a test score that differentially affects the performance of different groups of test takers. The effect of irrelevant sources of variance on scores on a given variable is referred to as measurement bias, whereas the effects of irrelevant sources of variance on predictor-criterion relationships, such that slope or intercepts of the regression line relating the predictor to the criterion are different for one group than for another, is referred to as predictive bias.” (p.23)

Put more simply, bias is a psychometric concept that refers to qualities of an assessment that lead to systematically different outcomes across different subgroups. That said, not all selection assessments that result in systematically different outcomes for subgroups are problematic or illegal.

Consider, for example, a selection assessment designed to assess candidates’ physical capabilities for a firefighter role. While there is no doubt that many females have the physical strength to climb stairs in full firefighting protective clothing while carrying firefighter equipment, an average female candidate is likely to perform less well on such an assessment than an average male candidate, according to a study in the Journal of Functional Morphology and Kinesiology.

In this case, because the average difference between the subgroups (male vs. female) is related not only to the assessment but also to their ability to perform the job, the assessment of a candidate’s ability to perform this task would be considered both appropriate and legally defensible.
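The distinction between a difference in average scores and predictive bias can be sketched with a simple regression. In the hypothetical data below, constructed for illustration, two groups differ in their average assessment score, yet the same line relates scores to job performance in both groups, so a comparison of slopes and intercepts of the kind the Principles describe shows no predictive bias. The group labels, the numbers, and the function name fit_line are all invented.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical assessment scores and later job performance for two groups.
group_a_scores = [60, 70, 80, 90]
group_a_performance = [4.1, 4.4, 4.9, 5.6]
group_b_scores = [40, 50, 60, 70]
group_b_performance = [2.9, 3.6, 4.1, 4.4]

slope_a, intercept_a = fit_line(group_a_scores, group_a_performance)
slope_b, intercept_b = fit_line(group_b_scores, group_b_performance)

mean_score_a = sum(group_a_scores) / len(group_a_scores)
mean_score_b = sum(group_b_scores) / len(group_b_scores)

print(f"Average score: group A {mean_score_a:.1f}, group B {mean_score_b:.1f}")
print(f"Group A line: slope {slope_a:.3f}, intercept {intercept_a:.2f}")
print(f"Group B line: slope {slope_b:.3f}, intercept {intercept_b:.2f}")
```

In practice, analysts test such differences statistically on large samples rather than comparing two fitted lines from a handful of data points.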

Over the past few years, there has been a flurry of activity at the federal, state, and local levels within the United States regarding the legal defensibility of selection assessments. Notably, the focus has been on the legality of using technology, especially the application of artificial intelligence (AI) in the selection process. For selection assessments to be legally defensible, assessment developers and users should ensure the assessments they develop and use have been professionally validated, shown to be job-related, and produce little or no subgroup differences concerning members of protected classes, regardless of the assessment content or methodology.
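One common first screen for subgroup differences is the four-fifths rule of thumb described in the UGESP: a group's selection rate that falls below 80% of the rate for the group with the highest rate is generally treated as evidence of adverse impact. The Python sketch below illustrates that calculation with hypothetical counts. The group names and function names are invented, and this is not presented as Talent Select AI's monitoring method.

```python
FOUR_FIFTHS = 0.8  # threshold from the four-fifths rule of thumb

def selection_rates(outcomes):
    """outcomes maps group name -> (number selected, number of applicants)."""
    return {group: selected / applicants
            for group, (selected, applicants) in outcomes.items()}

def impact_ratios(outcomes):
    """Each group's selection rate divided by the highest group's rate."""
    rates = selection_rates(outcomes)
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("no group has any selected candidates")
    return {group: rate / highest for group, rate in rates.items()}

# Hypothetical counts of selected candidates and applicants per group.
outcomes = {
    "group_a": (48, 100),
    "group_b": (30, 80),
}

ratios = impact_ratios(outcomes)
flagged = [group for group, ratio in ratios.items() if ratio < FOUR_FIFTHS]

for group, ratio in ratios.items():
    print(f"{group}: impact ratio {ratio:.2f}")
print("Groups below the four-fifths threshold:", flagged)
```

A flagged ratio is a prompt for closer review, not a verdict by itself. As the firefighter example shows, a subgroup difference that reflects a genuine job requirement can still be appropriate and defensible.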

“Talent Select AI is committed to rigorous ongoing validation of our tools so we can continue to ensure accuracy, reliability, validity, and legal defensibility of our assessment,” Fogel explains. “With the recent executive order on AI use and pending AI legislation on the books in more than 15 states, it’s vital that organizations prioritize transparency and validity when considering any new AI-enabled selection tool.”

See first-hand how Talent Select AI ensures accuracy and validity so you can start measuring what matters and hire best-fit candidates faster.

Samantha McGrail

Samantha McGrail is a content writer based out of Boston. She graduated from Saint Michael's College in 2019 and previously worked as an assistant editor focusing on pharmaceuticals and life sciences. Samantha can be reached at samantha.mcgrail@talentselect.ai.
