Assessment methods and tests should be backed by validity and reliability data, and by research that supports the claim that the test is a sound measure.
Reliability is a very important concept and works in tandem with validity. A guiding principle in psychology is that a test can be reliable but not valid for a particular purpose; however, a test cannot be valid if it is unreliable.
Any assessment method, whether a personality questionnaire, an ability assessment, an interview or another technique, is valid to the extent that it measures what it was designed to measure.
For those interested in how useful tests are for predicting work-related outcomes, the most important type of validity is predictive validity.
Predictive validity is the extent to which a test or questionnaire predicts some future or desired outcome, for example work behaviour or on-the-job performance. This validity has obvious importance in personnel selection, recruitment and development.
Predictive validity is usually measured by the correlation between the test score and some appropriate criterion. The criterion could be performance on the job, training performance, counter-productive behaviours, manager ratings on competencies or any other outcome that can be measured.
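As a minimal sketch of the correlation described above, the following Python snippet computes a Pearson correlation between test scores and a later criterion. The data, the variable names and the `pearson_r` helper are all invented for illustration; they are far cleaner than real selection data and do not come from any study cited here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: test scores at hiring and manager ratings a year later.
test_scores = [12, 15, 17, 20, 22, 25, 27, 30]
manager_ratings = [2.5, 3.0, 2.8, 3.6, 3.2, 4.1, 3.9, 4.5]

validity = pearson_r(test_scores, manager_ratings)
print(f"Predictive validity (r) = {validity:.2f}")
```

In practice the criterion would be whichever outcome the organisation cares about, and published validities are usually corrected for factors such as range restriction, which this sketch does not attempt.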
The following table summarises some of the general research findings around the predictive validity of the different selection methods available:
| Assessment Method | Predictive Validity |
| --- | --- |
| GMA (IQ) Tests | .65 |
| Integrity Tests | .46 |
| Employment Interviews (structured) | .58 |
| Employment Interviews (unstructured) | .60 |
| Conscientiousness | .22 |
| Biographical data | .35 |
| References | .26 |
| Job experience | .13 |
| Assessment centres | .37 |
| Peer ratings | .49 |
| Years of Education | .10 |
| Emotional Intelligence | .24 |
| Work sample tests | .33 |
| Emotional Stability | .12 |
Source: Schmidt, Oh & Shaffer (2016) - Working paper
| Assessment method | Predictive validity | Criterion measure |
| --- | --- | --- |
| Integrity Tests | .58 | |
| Integrity Tests | .51 | Overall job performance |
Source: Comprehensive meta-analysis of integrity test validities by Ones, Viswesvaran & Schmidt (1993).
When we combine assessments in a battery, we can increase the validity of the testing if the tests are of approximately the same validity and have low inter-correlations.
Guilford & Fruchter (1978) summed up the different effects of lengthening tests and including more tests in a battery as follows:
Schmidt, Oh & Shaffer (2016) found that the two combinations with the highest multivariate validity and utility for predicting on-the-job performance were GMA (IQ test) plus an integrity test (mean validity of .78) and GMA (IQ test) plus a structured interview (mean validity of .76).
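To illustrate why low inter-correlations matter in a battery, here is a rough sketch using the standard formula for the multiple correlation of a criterion with two optimally weighted predictors. It is not necessarily the procedure used in the studies cited above, and the function name `battery_validity` and the input figures are hypothetical.

```python
from math import sqrt


def battery_validity(r1, r2, r12):
    """Multiple correlation R of a criterion with two optimally weighted tests.

    r1, r2: validity of each test against the criterion
    r12:    correlation between the two tests
    """
    if not -1 < r12 < 1:
        raise ValueError("r12 must lie strictly between -1 and 1")
    r_squared = (r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2)
    return sqrt(r_squared)


# Hypothetical figures: two tests, each with a validity of .40.
low_overlap = battery_validity(0.40, 0.40, 0.20)   # tests measure different things
high_overlap = battery_validity(0.40, 0.40, 0.80)  # tests are largely redundant
print(f"{low_overlap:.3f} {high_overlap:.3f}")
```

With these invented numbers the battery of weakly related tests reaches a combined validity of about .52, while the battery of highly related tests reaches only about .42, barely above either test alone.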
The following table illustrates how validity increases as test length increases. The calculations are based on typical reliability and validity figures of .70 and .40 respectively for a 5-minute test. The difference in validity between a 5-minute test and a test of infinite length is only .078 (.478 - .400).
| Test Time (Minutes) | Validity (r) |
| --- | --- |
| 1 | .270 |
| 2 | .332 |
| 3 | .365 |
| 4 | .386 |
| 5 | .400 |
| 6 | .410 |
| 7 | .410 |
| 8 | .418 |
| 9 | .424 |
| 10 | .430 |
| 11 | .434 |
| 12 | .437 |
| 13 | .440 |
| 14 | .443 |
| 15 | .445 |
| Test of infinite length | .478 |
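The relationship between test length and validity can be sketched with the classical formula for the validity of a lengthened test, which assumes that any added material behaves like the existing material. This is an assumption about how figures of this kind are typically derived, not a statement of the exact method behind the table; the function names are hypothetical. Under that assumption, the snippet reproduces the 1-minute, 5-minute and infinite-length figures from the stated starting values of .70 (reliability) and .40 (validity).

```python
from math import sqrt


def lengthened_validity(validity, reliability, factor):
    """Validity after multiplying test length by `factor` (e.g. 2 = twice as long)."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return validity * sqrt(factor / (1 + (factor - 1) * reliability))


def ceiling_validity(validity, reliability):
    """Upper limit on validity as the test becomes infinitely long."""
    return validity / sqrt(reliability)


# Starting point from the text: a 5-minute test with reliability .70 and validity .40.
one_minute = lengthened_validity(0.40, 0.70, 1 / 5)
five_minute = lengthened_validity(0.40, 0.70, 5 / 5)
ceiling = ceiling_validity(0.40, 0.70)
print(f"{one_minute:.3f} {five_minute:.3f} {ceiling:.3f}")  # 0.270 0.400 0.478
```

The ceiling shows why lengthening a test gives diminishing returns: once the test is reasonably reliable, extra testing time adds very little validity.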