Chapter III: Methodology


Seventy-seven freshmen, 3 sophomores, 12 juniors, and 46 seniors from Maharishi International University (MIU) took part in this study. Subjects were volunteers from a variety of classes.  Freshmen were all in their first-year core course curriculum, and upperclassmen were recruited from a variety of art and science classes. Twenty-three of the seniors were randomly selected (using a random number table and class list) to take the Rokeach Adult Dogmatism Scale (RADS) to determine whether there was a significant difference in open-mindedness between those who volunteered and those who were randomly selected; no such difference was found.
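
The random selection step can be illustrated with a short sketch. This is not the procedure used in the study, which relied on a printed random number table; it swaps in Python's pseudo-random generator to show the same idea of drawing names from a class list without replacement. The class list, its size (199, taken here from the senior enrollment figure, which is an assumption about what the list contained), the placeholder IDs, and the seed are all hypothetical.

```python
import random


def select_random_subsample(class_list, k, seed=None):
    """Draw k distinct entries from class_list without replacement."""
    rng = random.Random(seed)  # seeded so the draw can be reproduced
    return rng.sample(class_list, k)


# Hypothetical class list: placeholder IDs, not real student records.
senior_class_list = [f"senior_{i:03d}" for i in range(1, 200)]

# Draw 23 seniors, matching the number randomly selected in the study.
selected = select_random_subsample(senior_class_list, 23, seed=1)
print(len(selected), "seniors selected")
```

Sampling without replacement guarantees that no student is selected twice, which is what working down a random number table and skipping repeated numbers achieves by hand.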

At MIU there are 175 freshmen and 199 seniors, so this sample represented 44% of the freshmen and 23% of the seniors. Of the freshmen, 64 took the Watson-Glaser Critical Thinking Appraisal (WGCTA), 64 took the RADS, and 53 took both.  MIU regularly tests its student body for EEG coherence at the laboratory at ICSR, so EEG scores were already available. Twenty-three seniors took the WGCTA, all of whom also took the RADS, and 23 seniors took only the RADS.
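
The coverage percentages above follow directly from the counts given in the text. A minimal sketch of the arithmetic, using only those counts (the function name is illustrative):

```python
def coverage_percent(sampled, population):
    """Percentage of a class year that took part in the study."""
    return 100.0 * sampled / population


freshman_coverage = coverage_percent(77, 175)  # 77 of 175 freshmen
senior_coverage = coverage_percent(46, 199)    # 46 of 199 seniors

print(f"freshmen: {freshman_coverage:.0f}%, seniors: {senior_coverage:.0f}%")
# prints "freshmen: 44%, seniors: 23%"
```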

Instruments: Watson-Glaser Critical Thinking Appraisal

The Watson-Glaser Critical Thinking Appraisal was designed by Goodwin Watson and his research assistant Edward M. Glaser in 1941 at Columbia University & Teachers College. According to Smith (1977), this test is based on Dressel and Mayhew's (1954) definition of critical thinking and research in this area:

The essence of the democratic creed is that each person possesses potentialities for discovering his own problems, and for developing personally satisfactory and socially acceptable solutions to them, so that he has no need to defer completely to the will of an authority although he is perfectly willing to make use of expert opinion when relevant. (Dressel and Mayhew, 1954).

With this in mind, "the exercises include problems, statements, arguments, and interpretations of data similar to those which a citizen in a democracy might encounter in his daily life" (Watson and Glaser, 1964). The test has had four editions: Form Am, Form Bm, Form Ym, and Form Zm.

The test includes five subtests designed to measure different, though interdependent, aspects of critical thinking. Each form contains 100 items, and the test can be completed in about 50 minutes. The WGCTA is a test of critical thinking power rather than speed, so there is no rigid time limit.  The five subtests are described as follows:

1. Inference. (Twenty items.) Samples ability to discriminate among degrees of truth and falsity of inferences drawn from given data.
2. Recognition of Assumptions. (Sixteen items.) Samples ability to recognize unstated assumptions or pre-suppositions which are taken for granted in given statements or assertions.
3. Deduction. (Twenty-five items.) Samples ability to reason deductively from given statements or premises; to recognize the relation of implication between propositions; to determine whether what may seem to be an implication or a necessary inference from given premises is indeed such.
4. Interpretation. (Twenty-four items.) Samples ability to weigh evidence and to distinguish between (a) generalizations from given data that are not warranted beyond a reasonable doubt, and (b) generalizations which, although not absolutely certain or necessary, do seem to be warranted beyond a reasonable doubt.
5. Evaluation of Arguments. (Fifteen items.) Samples ability to distinguish between arguments which are strong and relevant and those which are weak or irrelevant to a particular question at issue (Watson and Glaser, 1964).

The subtests of the Watson-Glaser Critical Thinking Appraisal were selected from the best of a larger number of subtests included in earlier editions. The WGCTA, according to the list of references in Buros' Mental Measurements Yearbooks, had been used in many more research studies and had undergone more revision and improvement than the alternatives.  The alternatives considered, for the reader's information, were:

The Cornell Critical Thinking Test (Ennis & Millman, 1978)
Maw Critical Thinking Test (Maw, 1959)
Logical Reasoning Test (Burney)
Test of Critical Thinking in the Social Studies (Wrightstone, 1939)
Chickering Critical Thinking Behaviors (McDowell & Chickering, 1967)
The Critical Consciousness Inventory (Smith-Alschuler, 1976)
McDermott Inventory of Critical Thinking Attributes
Curry Test of Critical Thinking
Wisconsin Tests of Testimony and Reasoning, Test R-1
Literal Comprehension and Critical Reading Test (Greenblatt)
The Critical Listening Test (Richards)
A Test of Critical Thinking, Form G (American Council on Educ.)
A Test on Principles of Critical Thinking, Form F15 (Rust)


Figural reasoning tests, those which involve non-verbal reasoning through symbols and diagrams, are considered by many psychometricians to be the purest measure of critical thinking ability (Whimbey, 1976).  This study, however, was concerned with critical thinking in relation to complex verbal reasoning.

The Cornell Critical Thinking Test (CCTT) was seriously considered for use in this study, but the theoretical problems it uses to test critical thinking were felt to be needlessly negative (e.g., whether or not certain strains of ducks should be killed). Its Spearman-Brown correlations with the WGCTA were low in a study by Follman (1969).  The WGCTA was one of four unique factors in a study by Landis (1976), while the CCTT was not. In a factor-analytic study of critical thinking tests, the WGCTA accounted for a larger percentage of the total variance than the CCTT in a Kaiser varimax rotation of a correlation matrix of 22 subtests of scholastic achievement tests.

Michael, Devaney, and Michael (1980) concluded that "the CCTT measure may be substantially lacking in factorial validity, as the psychological nature of its constructs could not be readily identified."

The selection of the proper test instrument for this study was important. Ross (1975) found, in a study of 8 inductive tests and 3 deductive tests, including the WGCTA, that research in inductive reasoning is instrument dependent: not all instruments test reasoning ability in the same way. Rust (1959) found in a study of several CT tests, including the WGCTA, that "only in rare instances was the a priori reasoning of the test makers regarding the grouping of items confirmed. All items within a subtest do not measure the same abilities and therefore do not measure what they are intended to."

Frank S. Freeman (1950) wrote that in tests of CT "individual critical thinking is reduced often to a minimum."  He felt that, at best, they test the ability to discriminate arguments and recognize assumptions, and that the limitations of the tests must be recognized. In an item analysis of the WGCTA on 200 high school students in South Australia, Broadhurst (1970) concluded that the WGCTA is "not as valid a measure of critical thinking as one may desire."

Robert Ennis, the creator of the CCTT, concluded in an analysis of the WGCTA that it gives too high a score to the "chronic pathological doubter" (1958). He described the "common knowledge" criterion in the WGCTA as unsatisfactory and argued that the fifth subtest is structured so that a person answers according to his value system, not CT.  Yet history, validity, and reliability seemed to favor the WGCTA.

Split-half reliability coefficients for the WGCTA Form YM were .85 for 5297 liberal arts freshmen and .85 for 200 college senior women from ten liberal arts colleges. These were odd-even split-half reliability coefficients corrected by the Spearman-Brown prophecy formula. Corrected split-half and KR-20 total test reliability estimates for the WGCTA Form Zm were lower, .655 and .667 respectively (Follman and Miller, 1971).
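
For readers unfamiliar with these estimates, the following is a minimal sketch of how an odd-even split-half coefficient, its Spearman-Brown correction, and a KR-20 coefficient are conventionally computed from right/wrong item scores. It is not the procedure documented by the cited authors. The function names and the four-item toy data set are hypothetical and serve only to make the formulas concrete, and the KR-20 function assumes the population form of the variance.

```python
def pearson(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half):
    """Step a half-test correlation up to a full-length reliability estimate."""
    return 2.0 * r_half / (1.0 + r_half)


def odd_even_reliability(item_scores):
    """Correlate odd-item and even-item half scores, then apply Spearman-Brown."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson(odd, even))


def kr20(item_scores):
    """Kuder-Richardson formula 20 for items scored 0 (wrong) or 1 (right)."""
    n = len(item_scores)      # number of examinees
    k = len(item_scores[0])   # number of items
    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n  # population variance
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_scores) / n  # proportion passing item j
        sum_pq += p * (1.0 - p)
    return (k / (k - 1)) * (1.0 - sum_pq / var_total)


# Hypothetical toy data: 4 examinees by 4 items, invented for illustration only.
toy_scores = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
]
print(odd_even_reliability(toy_scores))
print(kr20(toy_scores))
```

The Spearman-Brown step is needed because each half contains only half the items, and a shorter test is less reliable than the full-length test whose reliability is being estimated.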


Content validity of the WGCTA is discussed in the Manual for Forms YM and ZM (Watson & Glaser, 1964):

Content validity is usually established by showing that the items in the test call for responses that represent a balanced and adequate sampling of some clearly defined universe of knowledge, attitudes, or skill.  In the area of critical thinking there is no general agreement on the definable limits of the subject-matter per se, nor is it possible to conceive of a clearly defined universe into which all aspects of critical thinking could be classified.

With respect to construct validity, the way in which the parts of a test relate to each other and to the whole, "the moderately low subtest intercorrelation coefficients, ranging from .21 to .50 support the contention that relatively distinctive abilities are being measured with sufficient overlap to warrant their inclusion in one total score" (Watson and Glaser, 1964, p. 14).

The predictive validity of the WGCTA, that is, how well it predicts future performance in some relevant area, "as of any other test or selective device, tends to be unique and must be established empirically in each situation where the test is to be used" (Watson and Glaser, 1964, p. 15). For example, George (1968) found WGCTA scores to be a valid predictor of final grades in biology, and Lysaught (1964) found them to predict success in computer programming. Lysaught and Pierleoni (1970) determined that the most powerful criteria for educational institutions seemed to be combined WGCTA and Otis IQ scores.
