Journal of Clinical Psychology, October, 1982, Vol. 38, No. 4, 779-782.



University of New South Wales, Australia

Bloom has defended the forced-choice form of the Machiavellianism scale from observations by Ray that such scales have intrinsic validity problems -- particularly with social desirability. Bloom also has shown that the failure of the F scale to predict authoritarian behavior does not necessarily deprive it of all claims to validity. It is pointed out that Bloom's observations contain nothing new and ignore important considerations. The validity problems of forced-choice format are elaborated by means of examples, and it is pointed out that Bloom appears to lack criteria for ever allowing the F scale to be shown as invalid.

Ray (1979a) noted the general lack of relationship between the California F scale and actual authoritarian behavior. It was concluded from this that the F scale is invalid by normal standards and that an account of the correlates of authoritarianism therefore must await the use of a new, more valid, scale. The Ray (1976) "Directiveness" scale was proposed as such a scale, and a study was reported in which its correlates were examined. One of the scales included in the study was the Christie and Geis (1970) "Machiavellianism" scale in its Likert (Mach IV) format. It was noted in passing that the forced-choice "Mach V" form of this scale was not used on the general ground that such scales had inherent problems with social desirability artifact. It also was noted that even Christie and Geis themselves reported a correlation between the Mach V scale and social desirability responding.

Bloom (1980) has taken two sentences from Ray (1979a) that concern this aspect of the Mach V Machiavellianism scale and has focussed on rebutting them. He points out that the forced-choice scale correlates with some measures of social desirability set, but not with others. He concludes that the artifact that can be demonstrated in this way is for several reasons unimportant. His entire account of the matter amounts to little more than a recapitulation of what Christie and Geis already had said. What Bloom ignores, however, are the somewhat more complex reasons why social desirability set can be a problem even in the absence of initial correlations with an independent measure of it. These reasons were given by Ray (1979a) primarily in the form of references, so it is evident that a more fully spelt-out account may be needed. A brief summary of what the three references show therefore is given below:

Ray (1973) is a report of a study wherein "de-ipsatized" (Likert) forms of the Bass (1967) forced-choice task-orientation and interaction-orientation scales were constructed and validated. The reason behind the study was the observation that the forced-choice scales had been found by Bass to have rather paradoxical validity characteristics. Among other things, task-orientation was found by Bass not to correlate with achievement motivation. As writers on achievement motivation tend either to equate the two concepts or to find that task-orientation or something very much like it is at least a major subdivision of achievement motivation (Featherman, 1972; Mawhinney, 1979), it seemed possible that the forced-choice scale was not measuring what it should. When the Likert form of the scale was found in fact to correlate .615 with a fairly conventional achievement motivation scale, it was concluded that respondents to the forced-choice scale must have been choosing the more socially desirable alternative rather than the one that best described their motivations. This was in spite of the fact that Bass had made the attempt to equate his alternatives in terms of their social desirability.

The failure of Bass's attempt in this regard was held to be explainable by the work of Orvik (1972), who showed that the equation of alternatives in terms of social desirability was virtually impossible, not only because of variability in what is seen as socially desirable between groups, but also because of variability in what is seen as desirable within groups. In other words, a person still may be responding in terms of perceived social desirability even if the alternatives nominally have been equated in terms of that attribute.

The third reference (Gatz & Good, 1978) showed with another forced-choice scale (the Rotter I-E scale) that forced-choice format can hide multi-dimensionality. They concluded that the "internal" and "external" alternatives in the scale that they studied represented in fact two independent attributes rather than opposites. This also is a serious validity problem with a forced-choice scale. In a balanced Likert scale, the correlation between the supposedly opposed items always can be examined and, if need be, found wanting. In a forced-choice scale the "opposedness" of the alternatives for the particular population has to be taken on faith. Items that are seen as having opposed meanings by one sample may not be so seen by others (Kirton, 1977; Ray, 1970), so the faith required is considerable.

In the particular circumstances of the study reported in Ray (1979a), the faith required would have been of anchoritic dimensions. One would have had to assume that Machiavellianism items that had been equated for social desirability and tested for opposition of meaning among American college students preserved the same relative social desirability and relative meaning among an Australian general population sample! It may be seen, then, that forced-choice scales become rather like Thurstone scales if we are to use them with anything like proper scientific caution -- they virtually have to be reconstructed for each population to which they are to be applied. The Likert scale actually used in the study, by contrast, gave fewer problems. Only a low correlation (-.232) between Machiavellianism and Social Desirability score was observed, and even this effect could have been removed entirely by way of partial correlation if that had been required. However, because the authoritarianism measure used did not load on social desirability, social desirability could not have been a common confounding factor, so partialling out of social desirability would have been supererogatory.
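
The point about partialling can be illustrated with the standard first-order partial correlation formula. What follows is a minimal sketch only, not a reanalysis of the data: the value -.232 is the Machiavellianism/Social Desirability correlation given above, the zero correlation between the authoritarianism measure and social desirability is an idealization of "did not load", and the value .10 for the Machiavellianism/authoritarianism correlation is a purely hypothetical placeholder, as is the function name.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# x = Machiavellianism, y = authoritarianism measure, z = social desirability.
R_MACH_SD = -0.232   # reported in the text
R_AUTH_SD = 0.0      # assumption: idealizes "did not load on social desirability"
R_MACH_AUTH = 0.10   # hypothetical placeholder, NOT a reported result

adjusted = partial_corr(R_MACH_AUTH, R_MACH_SD, R_AUTH_SD)
print(round(adjusted, 3))
```

When one of the two variables is uncorrelated with social desirability, the numerator is unchanged and the denominator is close to 1, so the adjusted coefficient differs only trivially from the raw one; a raw correlation of zero stays exactly zero.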

One problem with the Machiavellianism scale on the given population that the Likert format did enable us to detect, however, was that the supposedly pro-Machiavellian and anti-Machiavellian items in fact correlated only .156 (after appropriate reverse-scoring). This indicates substantial construct invalidity in the scale -- whether used in a Likert or in a forced-choice format. The difference is that the Likert format makes the invalidity detectable. The general misgivings held about any scale in forced-choice format were shown to be fully justified in relation to the Machiavellianism scale in particular. Statements that Christie and Geis claim to be respectively pro-Machiavellian and anti-Machiavellian in fact were found to have very little relationship of any kind.
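
The kind of check that a balanced Likert format permits can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the five-point response range, the function names, and the four made-up respondents are all hypothetical. The negatively keyed items are reverse-scored, each half is summed per respondent, and the two half-scores are correlated.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def reverse_score(score, low=1, high=5):
    """Flip a response on a low..high scale (assumed here to be 1..5)."""
    return low + high - score


def keyed_half_correlation(pro_rows, anti_rows, low=1, high=5):
    """Correlate pro-keyed totals with reverse-scored anti-keyed totals.

    Each row holds one respondent's answers to the items of that half.
    """
    pro_totals = [sum(row) for row in pro_rows]
    anti_totals = [sum(reverse_score(s, low, high) for s in row)
                   for row in anti_rows]
    return pearson(pro_totals, anti_totals)


# Made-up responses in which the two halves really are opposites.
pro = [[5, 4], [4, 4], [2, 1], [1, 2]]
anti = [[1, 2], [2, 2], [4, 5], [5, 4]]
r_halves = keyed_half_correlation(pro, anti)
print(round(r_halves, 3))
```

With items that genuinely are opposites, as in the made-up data, the coefficient approaches 1. A value near zero, such as the .156 reported above, is what signals that the two sets of items are not functioning as opposites in the sample concerned.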

A parenthetical but important question at this stage might be why the Machiavellianism results were used at all in Ray (1979a) if the scale was so bad. The basic reason was simply that it was a negative result that was being reported. It was reported that the Machiavellianism scale and the Directiveness scale did not correlate. The reason for the lack of correlation may well have been the invalidity of the Machiavellianism scale itself, but the reasons for any negative result are in principle multifarious, and explorations of the reasons behind each negative result that was being reported would have overburdened that particular article with speculation. Additionally, it seemed rather apparent at the time that the most likely reason for the poor performance of such a well-known scale was the fact that it had been used only in a very abbreviated form. Using the scale in its full form should serve to correct the validity defect. Later research (Ray, 1983b), however, found that even the full form of the scale had a similar problem. It will be seen, then, that life for the present author would have been a lot simpler if the forced-choice scale had been used. However, it would have been the simple life of the ostrich. Compare also Ray (1979a) for another exploration of negative results in the study that Bloom criticizes.

In summary, then, Bloom's defence of the forced-choice "Mach V" scale adds nothing new and ignores serious problems to which all forced-choice scales are at least potentially heir.

In the second leg of his paper, Bloom offers a complicated account of how, in the abstract, scales might have validity. At its simplest he seems to be saying only that a scale should not be condemned out of hand simply because it fails to predict behavior once or twice. Because this seems too obvious a point to warrant the lengthy account that Bloom gives, a second possibility appears to be that he is referring to the very important point that the lack of a direct relationship between attitudes and behavior need not dismay us. There may be a relationship between the two, but it need not be obvious at first sight. The present author also has been a persistent advocate of this view -- with his demonstrations that achievement motivation may be at least as good a predictor of authoritarian behavior as are authoritarian attitudes and personality (Ray, 1973, 1980a; Ray & Lovejoy, 1983). We may dominate others not only because we like it as such, but also because we want to use them in an instrumental way to attain other ends. So while Bloom is right in insisting that there could be some relationships between the F scale and behavior, he again is contributing nothing new. What is needed is surely some suggestion or (better) some demonstration of a particular type of behavior that the F scale does predict. As it is now 25 years since Titus and Hollander (1957) pointed out in their influential review that the F scale correlates with other pencil-and-paper measures rather than with actual behavior, one would think that by now there should have been ample opportunity for such relationships to have been sought out and found. As has been pointed out at some length elsewhere (Ray, 1976, 1980b, 1983a), however, such relationships as are found tend either to have an element of circularity or to be compatible with an account of the F scale as measuring something other than authoritarianism (e.g., general social conservatism).

One has to ask, therefore, what evidence Bloom would accept as showing that the F scale was not valid as a measure of what it purports to measure. Is he saying that 30 years of vigorous, unremitting, and widespread research are still not enough? If so, his own contribution must be eagerly awaited.


