The Journal of Social Psychology, 1982, 116, 255-261.


University of New South Wales, Australia



Weinstein's exhaustive demolition of the relationship between achievement motivation and intermediate levels of risk preference suffers from two methodological uncertainties. He used only pencil-and-paper measures of risk preference and long test batteries which could have led to "response exhaustion." The studies reported here (N = 137 and N = 99, respectively) used only 50 and 37 self-report items, respectively, and behavior in the ring-toss experiment as a criterion variable. It was found that neither the distance adopted in the experiment nor the subjective probability of success was related to achievement motivation score. Previous positive results cannot be interpreted because of the deficient reliability of the projective measures used and the selective reporting of results. One of the central assumptions of the Atkinson-Feather model therefore appears not to be correct.


One of the more notable features of the considerable body of published work on the concept of achievement motivation is the production of quasi-mathematical models. Central to these models put forward by Atkinson and others (3, 4) is the postulate that the tendency to approach achievement tasks will be maximized when the probability of success is .50. This postulate is held to be supported by empirical findings to the effect that those with high achievement motivation tend to prefer intermediate risks. Weinstein (17) in his Table 1 lists 18 published studies of this relationship of which only three produced negative results.

Like many other writers, however (e.g. 7, 10), Weinstein is concerned about how results such as these (based on projective tests) can be interpreted because of the generally poor internal and test-retest reliability of such tests. He therefore undertook a mammoth project of correlating 12 measures of risk-taking with three projective and six self-report measures of achievement motivation. His reasoning appeared to be that, as self-report measures are generally more reliable, they might compensate for the unreliability of the projective tests.

What Weinstein found was a general lack of relationship between any measure of motivation and any measure of risk-taking. How do we explain the conflict between his work and that previously published? The explanation that Weinstein appears to favor lies in the sociology of knowledge: He proposes that the studies published are not representative of the studies undertaken. He attributes this to the undoubted, though antiscientific, bias of both researchers and editors to report positive results.

Two other explanations that are considered by Weinstein, however, are that the expression of a particular type of motivation is "washed out" by repeated testing [Atkinson's (3) "sawtooth" effect] and that the pencil-and-paper risk-taking tasks he used gave less valid information than an actual behavioral test might have. As Weinstein is not able to dismiss either of these possibilities, further work towards testing them seems called for.


The measure of risk-taking is the Litwin (11) "ring-toss" experiment, one of the tasks used by Weinstein but only in the form of a pencil-and-paper analogue. The object of the experiment is to observe the distance from the peg adopted by players in the game of quoits when allowed to adopt any distance they like. Highly motivated players were expected to adopt an intermediate distance.

Following Weinstein's lead, the present study relied for its measure of achievement motivation on self-report tests. These are in any case not normally charged with susceptibility to response exhaustion and were additionally on the present occasion limited to a total of 50 items.

The motivation tests had been validated in Australia for general population use: the Ray (13) short "AO" scale, the Ray (14) Task-orientation and Success-orientation scales, and the Ray (15) "Catchphrase" achievement motivation scale. The second and third scales have considerable item overlap with the first scale and a total of 30 items thus sufficed to score all three. The 30 items were also scored as a single overall scale. All scales were balanced against acquiescent set and had proven reliabilities in excess of .70.

The Ss were mainly students at the University of New South Wales but a University "Open Day" was utilized to obtain some Ss outside the normal student age-range. Forty-two of the Ss were from this latter source and the sample on the present occasion has thus some slight claim to more diversity than has been the case in previous studies in this area. As Open Day visitors are however usually relatives of students, no claim that the Ss encompassed a broad socioeconomic status range should be inferred.

The set of quoits with six rings upon the peg was placed in a long corridor floored with 1' square vinyl tiles. Ss were invited to "play," with the instruction "adopt any distance you like." Distances adopted were measured by noting covertly the number of vinyl tiles between the back of the subject's heel and the peg. All distances were rounded to the nearest foot. An indefinite number of "games" were allowed, but it was distance adopted in the first game that was recorded. Ss were then asked to fill out the 50 item questionnaire. As far as possible, the ring-tossing and questionnaire answering tasks were represented as unrelated.

A total of 137 Ss were processed. There were 58 males and 79 females. When examined separately, the mean scores of the two sexes on both the motivational variables and the experimental criterion were virtually identical. Age also did not predict either achievement motivation or experimental score. For the purposes of further analyses, therefore, the data were not differentiated in terms of these variables.

The coefficient "alpha" reliability of the five tests across the sample as a whole was generally satisfactory. Results of .75 for the AO scale, .77 for the TO scale, .64 for the SO scale, .82 for the Overall scale, and .84 for the "Catchphrase" scale were obtained.
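For readers who wish to reproduce a reliability figure of this kind, coefficient alpha can be sketched as follows. This is a minimal illustration only: the function name, the data layout (one list of item scores per respondent), and the example scores are hypothetical and are not the data of the present study.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Coefficient alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Variance of each item across respondents.
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of the total (summed) scale score.
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: three respondents, two items.
example = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(example)
```

The same ratio results whether population or sample variances are used, provided the choice is consistent for the items and the total.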

The Ss were then ranked in terms of the distance they adopted from the peg. They were then divided up into seven groups with ns as nearly equal as possible. The final ns were 15, 21, 27, 21, 24, 20 and 9. These data were then subjected to one-way analyses of variance. For all motivational variables the Fs were nonsignificant, generally less than 1. There was then no relationship of any sort between level of aspiration and achievement motivation; there was no tendency for Ss adopting intermediate distances to be more highly motivated.
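The form of the analysis can be sketched as below. With seven groups and 137 Ss, a one-way analysis of variance has 6 and 130 degrees of freedom. The function name and the motivation scores generated here are hypothetical placeholders; only the group sizes are taken from the text.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of lists of scores."""
    all_scores = [x for g in groups for x in g]
    n_total = len(all_scores)
    grand_mean = sum(all_scores) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: group size times squared deviation of group mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-groups sum of squares: squared deviations from each group's own mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Group sizes as reported; the scores themselves are invented for illustration.
sizes = [15, 21, 27, 21, 24, 20, 9]
hypothetical_groups = [[30 + (i + j) % 5 for j in range(n)] for i, n in enumerate(sizes)]
f_ratio, df_between, df_within = one_way_anova_f(hypothetical_groups)
```

An F near or below 1 indicates that the group means differ no more than would be expected from the variation within groups.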


So far then it appears that Weinstein's conclusions stand supported. Reversion to an experimental measure of level of aspiration and the use of a quite brief test battery do not alter the findings. A remaining problem, however, is why distance adopted in a ring-toss task should be used as an operational definition of perceived probability of success. Extraneous variables such as skill and height of the S could obviously distort this measure. One has to assume that such differences will be randomly distributed before the measure can be accepted.

It also seems worthwhile to take some account of the recent experimental elaborations by supporters of the established theory, such as Hamilton (9). Apparently spurred by negative results such as Weinstein's, Hamilton adopted a very elaborate experimental design in an attempt to be certain he was directly measuring level of aspiration. He still found, nevertheless, no association between a .5 probability of success and level of motivation. He did however find such an association with those adopting a .4 probability of success. In something of a "near enough is good enough" spirit, he interpreted this as support for the established theory. A direct replication of Hamilton's experiment was hence not thought worthwhile. Additionally, the space-age complexity of Hamilton's method was thought to render it particularly vulnerable to the usual criticism that can be levelled at experimental methods of data collection: they are so isolated from everyday reality as to make any generalization from their results highly dubious. Nonetheless, there were three of Hamilton's innovations that seem worth emulating: his attempt to measure probability of success directly, his emphasis on individual versus group testing, and his emphasis on extensive "practice" for the S before any measure of performance was taken.

In order to avoid the complexity of Hamilton's design, probability of success was ascertained by directly asking the S (after he had had practice) how many rings he expected to get on the peg. This method has been used before and the criticisms that Hamilton lists for it were therefore taken into account in the design.

Ss were brought to the set of six quoits placed in a long corridor and were told they were to play three "games" and that they could adopt any distance that suited them; they were specifically encouraged before and during the first game to try out a variety of distances. Throughout the games the Es engaged the Ss in conversation and after two games had been played they casually asked each S as part of the conversation, "How many rings do you think you will get on this time?" The answers given were the data for the study. Afterwards the Ss were asked to fill out the questionnaire.

This procedure substantially overcomes the three criticisms of the method listed by Hamilton: there was no incentive for an S to alter his behavior to validate his stated estimate; the casual nature and appositeness of the crucial question reduced the usual defensive reasons for understating or overstating the estimate; the probability specified by the S was not an uninformed one but was gathered only after actual practice in the situation. Even if the probability was systematically over or understated, this tendency would simply have the effect of changing the probability associated with high motivation to somewhere slightly on either side of .5. Under Hamilton's "near enough is good enough" rule, this should not be a crucial difficulty. If, on the other hand, over- and underestimates were not systematic but effectively random, they should cancel one another out.

The scales in the questionnaire were the same as in Study I except that the "Catchphrase" scale was replaced by Argyle and Robinson's (2) "Fear of Failure" scale. As in the previous study, the items of the AO, TO, and SO scales were given intermingled at the beginning of the questionnaire. Fear of failure is more usually measured by way of either the Mandler and Sarason (12) Test Anxiety Questionnaire or the Alpert and Haber (1) Debilitating Anxiety scale. It was felt however that using measures of general anxiety to index specifically fear of failure involved unnecessary assumptions where "purpose built" instruments were available.

The 43 females and 56 males used as Ss were again recruited in a way designed to give a more varied group than the usual sample of introductory psychology students. They were recruited from among their student friends and acquaintances by members of a Sociology Research Methods class who also acted as investigators.

There were found to be no differences in experimental score or motivation between the sexes, so that the data were combined for further analyses. The Ss were divided into seven groups, but this time on the basis of the number of rings they expected to get on the peg. There were five who expected no rings, 21 expected one ring, 20 expected two rings, 18 expected three rings, 16 expected four rings, six expected five rings and 13 expected six rings. These groups were used in one-way analyses of variance of each motivation scale. For all scales the Fs were nonsignificant. For all scales, there was therefore no significant variation in the means observed among the seven experimental groups. Levels of resultant motivation were also compared across the seven groups. This was done by subtracting the standardized score on the Argyle and Robinson scale from the standardized score on the Ray (13) 14 item AO scale. Again the F was nonsignificant. The Argyle and Robinson scale did fulfil the theoretical requirement of being orthogonal to the achievement motivation scale. The nonsignificant correlation between the two was -.09 which contrasts with the .38 correlation reported by Hamilton.
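The resultant motivation score described above can be illustrated with a short sketch. The function names and example scores are hypothetical, and the use of the population standard deviation for standardization is an assumption; the text does not say which form was used.

```python
from statistics import mean, pstdev

def standardize(scores):
    """Convert raw scores to z-scores (population SD assumed)."""
    m, s = mean(scores), pstdev(scores)
    return [(x - m) / s for x in scores]

def resultant_motivation(ao_scores, fear_scores):
    """Standardized achievement score minus standardized fear-of-failure score."""
    z_ao = standardize(ao_scores)
    z_fear = standardize(fear_scores)
    return [a - f for a, f in zip(z_ao, z_fear)]

# Invented scores for three Ss, in the same order in both lists.
resultant = resultant_motivation([28, 31, 34], [12, 10, 8])
```

An S who is high on achievement orientation and low on fear of failure receives a high resultant score; the seven expectation groups can then be compared on this score in the same way as on the individual scales.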

The mean score observed on the 14 item "AO" scale in the present study was 30.51 (SD 5.85). This compared with 31.88 (5.69) in the previous study and 31.44 (5.83) obtained in a random doorstep study (N = 95) of the Sydney metropolitan area (13). Neither of the present means, then, differed significantly from the general population mean. Thus, although they were not planned as such, the present samples were in fact representative of the general population at least in respect of the main variable under study.
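The comparison of means can be checked from the summary figures alone. The sketch below assumes an unequal-variance (Welch) t statistic, which is not necessarily the test originally used, and takes 1.96 as the approximate two-tailed .05 critical value for samples of this size.

```python
from math import sqrt

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic for two independent means, computed from summary statistics."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# Doorstep sample (13): mean 31.44, SD 5.83, N = 95.
t_study_1 = welch_t(31.88, 5.69, 137, 31.44, 5.83, 95)  # roughly 0.57
t_study_2 = welch_t(30.51, 5.85, 99, 31.44, 5.83, 95)   # roughly -1.11

# Both fall well inside the +/-1.96 band under the assumptions stated above.
both_nonsignificant = abs(t_study_1) < 1.96 and abs(t_study_2) < 1.96
```

Both values are small in absolute terms, in keeping with the statement that neither sample mean differed significantly from the general population mean.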


It would seem that Weinstein's (17) results stand replicated even when the possibilities he could not exclude in interpreting his own work are eliminated. Once again one of the central assumptions of the complex multiplicative models of achievement behavior favored by such writers as Atkinson and Feather (4) lacks empirical support.

This replication of Weinstein's results using different methods does reinforce his conclusion that previous reports of the emergence of the expected relationship were probably largely an artifact of the worship of "positive results" prevailing in the 50s and 60s. The fact that the projective tests in this earlier research are unreliable does not mean that relationships will not occasionally be shown; rather, it means that they will not readily be replicated. If such failures of replication are systematically ignored, however, an appearance of consistency will be created where none in fact exists.

Put in the context of the systematic way in which the McClelland-Atkinson-Feather theories and assumptions have steadily been overturned by more methodologically careful work (e.g., see in addition to the references listed by Brown (6) and Weinstein (17) work by Veroff et al. (16) and others (5, 8)) Weinstein's study and its present extension would seem to leave the still popular quasi-mathematical models of this field rather isolated from reality. They may be as irrelevant as they are elegant.


