In “Construct Validity in Psychological Tests” (Cronbach & Meehl, 1955), the authors address four types of test validity: predictive, concurrent, content, and construct. Each should be considered in what passes for modern “educational research,” but rarely is. Instead, educational research tends to rest on criterion-oriented validity, which, Bechtoldt says, “involves the acceptance of a set of operations as an adequate definition of whatever is to be measured.”
Predictive validity is at issue when, for example, Institutional Research analysts ask whether passing the lowest-level developmental mathematics course predicts retention and persistence. Concurrent validity is assessed when test and criterion scores are obtained at the same time. An example might be correctly identifying plus and minus signs as descriptors and as operators (the test score) while correctly solving symbolic expressions that add and subtract positive and negative numbers (the criterion).
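In practice, both criterion-oriented validities come down to a correlation between test and criterion. The sketch below is a minimal illustration with entirely hypothetical, made-up data (none of it from the article or from any real institutional dataset); with two binary variables, the Pearson coefficient it computes is the phi coefficient.

```python
def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Predictive validity: passing developmental math (0/1) against
# retention the following term (0/1) -- hypothetical records.
passed_dev_math = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
retained        = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]
print(f"predictive validity r = {pearson_r(passed_dev_math, retained):.2f}")

# Concurrent validity: a sign-identification test score against a
# signed-number arithmetic criterion, both obtained at the same time.
sign_test_score = [8, 5, 9, 3, 7, 6, 10, 4, 8, 2]
arith_criterion = [7, 6, 9, 2, 6, 5, 9, 5, 8, 3]
print(f"concurrent validity r = {pearson_r(sign_test_score, arith_criterion):.2f}")
```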
Content validity is established by showing that the test items are a sample of the domain in which the investigator is interested. Such systematic sampling is evident in the test banks popular with publishers today. What is not well established, however, is that the instructor has actually taught what is being tested.
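As a toy illustration (with entirely hypothetical topic names and counts), the sampling requirement behind content validity can be made concrete: items are drawn from each topic of the domain according to a blueprint, rather than from wherever items happen to be plentiful.

```python
import random

# Hypothetical test bank keyed by topic, and a blueprint saying how many
# items each topic should contribute -- the "systematic sampling" that
# content validity rests on.
test_bank = {
    "signed_numbers": [f"sn_{i}" for i in range(40)],
    "fractions":      [f"fr_{i}" for i in range(40)],
    "linear_eqs":     [f"le_{i}" for i in range(40)],
}
blueprint = {"signed_numbers": 5, "fractions": 5, "linear_eqs": 5}

# Draw the blueprinted number of items from each topic, without replacement.
exam = [item
        for topic, n in blueprint.items()
        for item in random.sample(test_bank[topic], n)]
print(exam)
```

Note that such sampling guarantees coverage of the item domain; it says nothing about whether the instructor taught that domain, which is exactly the gap noted above.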
Construct validation “is involved whenever a test is to be interpreted as a measure of some attribute or quality which is not ‘operationally defined’ [but instead by] the orientation of the investigator.” This is typical of educational research publications, in which pedagogy and its relation to curriculum are generally left unexamined. One typically unanswered question is whether test items are culture-free. For example, a test item may ask whether the sun can shine while it is raining. Southern children know full well that it can, and even that this can occur on opposite sides of the same street; yet the required answer is “no,” and so the item reliably predicts not learning or intelligence but where the student lives.
Cronbach and Meehl add that the interpretation placed on the data must itself be tested. Does the test “measure reading ability, quantitative reasoning, or response sets”? Can the researcher respond adequately when asked whether there is another way to interpret the correlation, or whether other evidence is available to support the interpretation? It is, in their words, “reasonable to inquire about other kinds of evidence.” But, sadly, such critical thinking is rarely brought to educational research.