Journal of Applied Psychology, 1984, Vol. 69, No. 2, 353-355

REINVENTING THE WHEEL: Winkler, Kanouse, and Ware on Acquiescent Response Set

John J. Ray

School of Sociology, University of New South Wales, Kensington, New South Wales, Australia

Winkler, Kanouse, and Ware (1982) presented a method of dealing with "acquiescent response set" that consists of counting up double agreements to paired items opposed in meaning and partialling out the resulting score from any correlations the items show. The article takes no account of the important contention by Rorer (1965) to the effect that acquiescence does not generalize and, hence, cannot be so treated. Winkler et al. (1982) are aware of another method of scoring acquiescence (counting up all "agrees" regardless of item meaning) but do not compare it with their method and give essentially irrelevant reasons for rejecting it. The method they use to validate their procedure is shown to produce findings that are substantially true by definition (i.e., trivially true). It is concluded that although the article contains virtually nothing new, it may be of some interest if supplemented by further information.

Because the use of scales that makes no allowance for the possibility of acquiescence artifact (e.g., Bem, 1974; Munro & Adams, 1978) is still widespread, one welcomes any article that draws attention to the problem. The recent article by Winkler, Kanouse, and Ware (1982), however, presents as new much that is old and has a number of serious lacunae.

A basic weakness is their failure to mention or acknowledge the important article by Rorer (1965) on the topic. Winkler et al. do not recognize that acquiescent response set was once a topic of considerable interest to scale users (particularly users of the California F scale) before Rorer (1965) declared "acquiescent response style" to be a "myth". As the problem of producing a form of the F scale balanced against meaningless acquiescence was at the time proving particularly intractable, the declaration by Rorer that systematic meaningless acquiescence did not exist to any appreciable extent was a relief gladly received, and acquiescence as an area of research interest seems to have all but died out thereafter. The principal basis for Rorer's iconoclastic conclusion was that various measures of acquiescence generally fail to intercorrelate (McGee, 1962; Martin, 1964). There is, therefore, (in Rorer's view) no such thing as a trait of acquiescence or a consistent tendency to "acquiesce regardless of meaning."

It is, therefore, hard to understand why Winkler et al. (1982) appeared to regard it as self-evident that some subjects will show a consistent tendency to acquiesce regardless of meaning. That is, they fail to comment on what must be one of the most basic issues in the field; they base their work on an assumption that has been vigorously questioned.

As it happens, some recent work suggests that Rorer overgeneralized his conclusions and that acquiescence does in fact generalize within scale types, though not between scale types (Ray, 1979, 1983; Ray & Pratt, 1979; Heaven, 1983). Rorer's optimism seems excessive, and the importance of precautions against unrestrained acquiescence is not in dispute here. Readers of the Winkler et al. article, however, would surely have benefited from knowing that the procedures proposed were controversial and might be relevant only within certain carefully specified contexts.

There are other weaknesses in the article. The two "most common alternatives for dealing with response bias in attitude measurement" are described as: "Excluding the most acquiescent respondents from analyses and using the raw item correlations" (Winkler et al., 1982, p. 556). It is difficult to see how making no corrections for acquiescence ("using the raw item correlations") can fairly be described as a way of dealing with it. It would surely be more accurately described as not dealing with it.

There is in fact a second method of dealing with the acquiescence problem: deriving a separate acquiescence score simply by adding up all the responses to a balanced scale regardless of meaning or direction of wording (Martin, 1964). (One might also add the requirement that the "oppositeness" in direction of wording be demonstrated, e.g., by a significant negative correlation between the two halves of the ostensibly balanced scale; Ray, 1983.) Such an "acquiescence" score can then be used to partial out the effect of acquiescence from any correlations with other variables that the scale might have when scored for content (e.g., Ray, 1970). It would surely have been of interest to compare this method with the method put forward by Winkler et al. (1982). The method they proposed (though not original to them; see Christie, Havel, & Seidenberg, 1956) is to derive an acquiescence score for each respondent by counting double agreements to "logically" opposed items.
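
For readers who wish to compare the two scoring methods on their own data, the following is a minimal sketch only. It assumes dichotomous responses coded 1 for "agree" and 0 for "disagree", and it uses the ordinary first-order partial correlation formula; the function names and the data layout are illustrative assumptions and are not taken from Winkler et al. (1982) or from any of the other articles cited.

```python
from math import sqrt

def agreement_count(responses):
    """Total-agreement score: count every 'agree' (1), ignoring item direction."""
    return sum(responses)

def double_agreements(responses, pairs):
    """Matched-pairs score: count opposed item pairs where both items were agreed with.

    `pairs` is a list of (pro_index, anti_index) tuples (a hypothetical layout).
    """
    return sum(1 for pro, anti in pairs if responses[pro] and responses[anti])

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation between x and y with z (e.g., an acquiescence score) partialled out."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
```

Either score, computed per respondent, could be passed as `z` to `partial_corr`, so that the same content correlation can be examined after partialling out each kind of acquiescence score in turn.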

Winkler et al. (1982) were aware of this second method; in fact, they presented three reasons in their Discussion section why they do not "prefer" it. All three reasons were, however, unconvincing. They said that their method of matched pairs ensures that the scales are balanced. Although this is true, it in no way makes balancing an exclusive property of scales composed of matched pairs. Scales not composed of matched pairs can also be balanced -- perhaps even more easily. At least many previous authors have thought so (Altemeyer, 1981; Lee & Warr, 1969). Furthermore, they said that matched pairs provide a longer scale. How? Surely a scale without matched pairs can be of any length at all. Finally, they said that matched pairs mean that the respondent does not have to be burdened with extra irrelevant content. In this they probably referred to the fact that some researchers use a group of "nonsense" items from which to derive an acquiescence score. This procedure is, however, quite optional and even, in view of Rorer's remarks, potentially misleading. Researchers can and do derive their acquiescence scores from whichever balanced scale they happen to be using (e.g., Ray, 1972).

Winkler et al. (1982) may be commended for showing the rather unusual awareness that the reliability of their acquiescence score is worth exploring, but their judgment that an "alpha" of .56 is "satisfactory" is surely questionable. Shaw and Wright (1967) regarded a reliability of .75 as the minimum required in a research instrument. As it happens, there are some reasons why we might be disposed to accept a low alpha in an acquiescence measure (Ray, 1979, 1983; Ray & Pratt, 1979), but Winkler et al. seemed unaware of any need for discussion of the problem.

The main method used by Winkler et al. (1982) to evaluate the effectiveness of their method of removing acquiescence effects has its lighter side. They first found that a factor analysis of the raw data matrix produced two format factors; all the "pro" items clustered on one factor and all the "anti" items clustered on the other factor. They concluded, quite rightly, that this is a typical sign of strong acquiescent response bias in the data. It is the sort of thing observed where a high frequency of careless double agreements to ostensibly opposed items cancels out the number of opposing responses to the paired items made by more careful respondents and results in an overall orthogonality between the two types of item. Winkler et al. (1982) then counted the double agreements, partialled them out, and showed that the format factor could not now be found. Their analysis, in short, showed that double agreements have a certain effect, used a common statistical procedure to remove the double agreements, and then presented it as an achievement that the effect of the double agreement is no longer seen. This is another example of the tendency (deplored by Smedslund, 1978, in psychology generally) to present as empirical findings things that are actually true by definition.

A final deficit in their article is that it contains no significance tests for anything. We are, for example, asked to believe that their Table 2 shows older, less-educated, or black respondents to be more prone to meaningless acquiescence. Yet no 't' or 'F' test results are given. Readers might have been able to carry out significance tests for themselves if standard deviations had been included in the table, but even this was not done.
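
Had group means, standard deviations, and group sizes all been reported, a test of this kind could be computed from the summary figures alone. The sketch below is an illustration under stated assumptions: it uses Welch's unequal-variance t statistic, which is one reasonable choice and not a procedure named in the article, and the function name `welch_t` is hypothetical.

```python
from math import sqrt

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and approximate degrees of freedom from summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first group mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second group mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df
```

The resulting t would then be referred to the t distribution with the returned degrees of freedom; without the standard deviations, no such calculation is possible.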

It is to be hoped that this note and any rejoinder to it will help put the Winkler et al. work in better perspective and help to remedy some of its deficits. Although there appears to have been nothing original about the work, extra data on an old problem can still be of interest.


