The Australian and New Zealand Journal of Sociology, 1973, Vol. 9, No. 3. pp. 11-12.


J. J. Ray

School of Sociology, University of New South Wales

When, as an undergraduate, I was introduced to factor analysis, my teacher, a factor analyst of redoubtable experience, was careful to point out that factor analysis should be regarded as an exploratory tool only -- as something that may or may not turn up interesting re-arrangements of one's data. He would describe factor analysis as "a way of taking a walk through your data". Naturally, there is always a temptation to expand the role of factor analysis into something much more ambitious than this, and in this paper I wish to draw attention to one particular, but very common, example of inappropriate usage.

Initially, however, a more detailed statement of how factor analysis should be used seems required. When we have information about a large number of characteristics of a group of people, we often analyse the relationships between those characteristics in terms of an intercorrelation matrix. If we have ten characteristics, there will be a matrix of 45 correlation coefficients. Factor analysis is an attempt to reduce this matrix to more manageable and interpretable proportions. If we can decide that underlying the ten variables there are only three basic variables, the correlations requiring interpretation are reduced from 45 to three. The catch is: how do we decide that the ten variables can be reduced to three basic variables? One way is by conceptual analysis. We can say that the three variables of occupation, income and education reduce to a basic variable of social class. This is of course quite disputable, but for some purposes it may be useful. What happens, however, when one has no strong ideas about how a set of variables should be regrouped? Often, we use factor analysis. It tells us what clusters seem to exist among the total set of correlation coefficients. If, however, it is disputable that income, occupation and education go together to form a single variable, the sorts of clusterings pointed out by factor analysis are usually of even more disputable status. If they were not, we would scarcely have needed factor analysis to point them out. As it is, they often seem to make little or no sense at all, and any name applied to them is usually quite speculative and of poor fit. In a word, factor analysis is a way of suggesting hypotheses about basic variables underlying a larger set of variables.
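The arithmetic of this reduction can be shown in a few lines (a minimal sketch in Python, using the numbers from the paragraph above):

```python
from math import comb

# With k observed variables there are k(k-1)/2 distinct correlations.
k = 10
print(comb(k, 2))   # 45 coefficients in the intercorrelation matrix

# If the ten variables reduce to three basic variables, only the
# correlations among those three remain to be interpreted.
m = 3
print(comb(m, 2))   # 3 coefficients
```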

A much more ambitious but unfortunately very common use of factor analysis, however, is to treat it as a way of "finding out" what a group of attitude statements measure. Factor analysis is also, a fortiori, esteemed a proper way to construct attitude and personality scales (for just a few recent examples, take Lynn, 1969; Finifter, 1970; Neal and Rettig, 1967; Anderson and Western, 1967; Costello, 1967). It seems to me that some questions ought to be raised about the propriety of this procedure.

It will in fact be proposed here that factor analysis is both wasteful and misleading when used in the manner of the above authors. To understand this, the method of item analysis conventionally used in test-construction must be described. In this method (Guilford, 1954), the correlation of each item with some criterion is found and weakly correlating items are dropped. Usually, this criterion is the total score on the test itself. In this case a small number of items (say the weakest four) is at first dropped and the reliabilities ("alpha") of the original and shortened scales are compared. If alpha has risen, the total score of the shortened scale and the correlation of each item with it are recomputed and the whole process repeated. When dropping weak items ceases to produce rises in alpha, the final form of the scale has been reached. (This form is maximally reliable but not maximally internally consistent. The maximally internally consistent scale would in fact consist of those two items between which the highest correlation in the matrix occurs.) What this entire procedure ensures is that those items are selected which best measure what is general to the original item pool as a whole. Since that item pool was presumably written to tap a single particular construct, we are able to make an initial assumption -- subject to validity check -- that what is general to the original item pool is what we set out to measure. By the conventional method, therefore, those items are selected which best measure what we set out to measure.
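The iterative procedure just described can be sketched in code (a simplified illustration dropping one item per round; the helper names and the synthetic data are mine, not drawn from any of the studies discussed):

```python
from statistics import pvariance

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def cronbach_alpha(items):
    """Coefficient alpha for a scale given as a list of item-score lists."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - sum(pvariance(i) for i in items) / pvariance(totals))

def item_analysis(items):
    """Repeatedly drop the item correlating most weakly with the total
    score, for as long as doing so raises alpha."""
    items = list(items)
    while len(items) > 2:
        totals = [sum(scores) for scores in zip(*items)]
        weakest = min(range(len(items)), key=lambda i: pearson(items[i], totals))
        trial = items[:weakest] + items[weakest + 1:]
        if cronbach_alpha(trial) > cronbach_alpha(items):
            items = trial      # alpha rose: keep the shortened scale
        else:
            break              # alpha fell: final form reached
    return items

# Three items written to tap one construct, plus one noise item:
pool = [[1, 2, 3, 4, 5, 6], [2, 1, 4, 3, 6, 5],
        [1, 3, 2, 5, 4, 6], [4, 1, 5, 2, 6, 1]]
kept = item_analysis(pool)
```

On this toy pool the noise item is the first to be discarded, and the alpha of the retained scale is higher than that of the original pool.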

The procedure of Lynn and the other authors mentioned above offers no such guarantees. Orthogonally varimax-rotated "principal components" (one of the most common of the mathematical procedures used to accomplish factor analysis) will always tend to produce a large number of factors, on any one of which only a small proportion of the original items will load highly. So far from producing a measure of what is general to the item pool, this procedure organizes items into groups which have the least possible relationship with one another. One can decide that one or more of these groups best measures some construct that one had set out to measure, but this decision is arbitrary.

Just how arbitrary it is will be seen if we realize how much difficulty is normally experienced in "interpreting" factors. Beswick and Hills (1969) initially named their third eigenvector ("principal component" or "factor") as measuring "authoritarianism". (In a pre-publication paper read at the Brisbane (1968) conference of the Australian Psychological Society.) When it was pointed out that every item loading highly on that factor contained the word "Australian", the authors re-labelled the factor in their published paper as measuring "Australian chauvinism". There is obviously something in common between the two concepts, but it is submitted that the difference is great enough to be very disturbing indeed. The more-or-less clear-cut factor structures that one sees reported in the journal literature may represent the exceptional minority of studies conducted. The highly selected work appearing in journals cannot be taken as representative of the sort of results normally produced by principal components analysis. Those articles which do make the grade always tend to convey the impression that "factor analysis told us this". They might just as accurately have said: "It just happened on this occasion that factor analysis produced an interpretable result."

The underlying and basic difficulty, of course, is that factor analysis has no way of distinguishing between "true correlation" and "error correlation". Because of the occasion-to-occasion variability of this error correlation, we have the result described by Taft (1963: 156): "The weakest aspect of this technique (factor analysis) is undoubtedly its inability to produce clearly comparable basic dimensions between one study and another". This is a strong statement, but one that has nonetheless repeatedly been shown to be true. Conventional item analysis, on the other hand, represents a deliberate attempt to deal with the problem of error variance. Its whole rationale is that minor effects are randomized out in favour of what is general to an item pool originally designed to be concept-specific.

There is, of course, a sense in which item-analysis is a particular type of factor analysis. The vector of item-to-total correlations is in fact almost identical with the first unrotated centroid loadings of the data. The studies being criticised here however employ the vastly different (but fashionable to the point of being normative) technique of varimax-rotated "principal components".
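The correspondence claimed here can be illustrated with a toy example (the synthetic data and helper functions are my own; a serious demonstration would apply the full centroid method to real test data):

```python
def pearson(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def corr_matrix(items):
    return [[pearson(a, b) for b in items] for a in items]

def centroid_loadings(R):
    """First-factor loadings by the centroid method: each column sum of the
    correlation matrix (unit diagonal) over the square root of the grand sum."""
    col = [sum(row) for row in R]
    return [c / sum(col) ** 0.5 for c in col]

def item_total_correlations(items):
    totals = [sum(scores) for scores in zip(*items)]
    return [pearson(i, totals) for i in items]

# Three substantive items plus one noise item (synthetic):
pool = [[1, 2, 3, 4, 5, 6], [2, 1, 4, 3, 6, 5],
        [1, 3, 2, 5, 4, 6], [4, 1, 5, 2, 6, 1]]
itr = item_total_correlations(pool)
ctr = centroid_loadings(corr_matrix(pool))
# Both vectors single out the same (noise) item as the weakest.
```

The two vectors are not numerically identical on so small a sample, but both flag the noise item as the one to drop, which is the property item analysis exploits.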

Though there are good theoretical arguments for rotation in general, there is nonetheless something rather bizarre about this practice of rotating principal components (eigenvectors). The whole attraction of eigenvectors is that they represent a unique solution for any given set of data. To rotate them is in fact to destroy just this uniqueness. Had Lynn and the others used the loadings of the unrotated first eigenvector, they might have had a more defensible criterion for item selection. As it is, they no doubt used their method with some notion in mind of "finding the inherent structure" in their items. That this is a forlorn (if commendable) hope can be seen if we realize that we can take any one of our high varimax-loaded item groups and re-factor-analyse it. By so doing we will produce yet another set of factors [1]. Which set of factors should we take? Which set of factors represents the "inherent structure"? The whole trouble with this use of factor analysis is that there is as yet no objective decision procedure available for specifying the number of factors to be extracted. The proposals of Lawley and others utilizing "statistical significance" as a criterion founder on the fact that, as the sample size is increased, the number of factors shown to be significant also automatically increases -- finally approaching the number of variables in the basic correlation matrix (Nunnally, 1967). The extensive but now abandoned controversy over the structure of intelligence (the British "g factor" versus the American "primary mental abilities") should be eloquent testimony to the uselessness of factor analysis for "finding out" the structure in a set of variables. See also Armstrong (1967) and Lykken (1971).

What the question amounts to is that Lynn and the other authors mentioned have arranged their data (per varimax analysis) into only one of several quite different and equally defensible structures. They evidently had some thought that this might be the "true" structure. In following such a will-o'-the-wisp they probably failed to attain the most useful structure.

One wonders whether the criticisms of factor analysis per se by Armstrong (1967) and others in the statistical literature have been considered by many of its users. Perhaps there is something of a cultural lag here. Factor analysis was in fact invented by Spearman, Thurstone and others for the purposes of the controversy over the structure of abilities mentioned above. Since it has been by and large abandoned for the purpose for which it was invented, it seems a little odd that social scientists in other fields have taken it up. Perhaps they too will in due course experience a like sense of disillusionment with this technique. In conclusion, then, let it be conceded that from time to time factor analysis may reveal interesting patterns of relationship between items. This, however, can justify its use as an exploratory tool only. As a method of producing workable scales it is simply inefficient.



[1]. This is not so in cluster analysis (McQuitty, 1961). If one cluster-analyses a set of data and then attempts to re-cluster-analyse just the items in any one cluster, one still ends up with the same single cluster that one started out with. McQuitty's cluster analysis is also superior to factor analysis in that it requires no arbitrary a priori decisions about parameter size (i.e. what factors to rotate) or factor structure (orthogonal versus oblique). It has also been my experience that a cluster solution is in general more readily interpretable than a rotated principal components solution of the same data. This could well be because cluster analysis considers only the highest correlations in the matrix. There is thus a very much greater chance that all the observed correlations considered have some "true correlation" (and not just random error effects) underlying them. That this chance is realized in practice is also attested to by the work of Parker and Bynner (1970) and Gray and Revelle (1972). These authors are only two of several who have in recent years reported the empirically greater usefulness of cluster rather than factor analysis.
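The elementary linkage procedure is simple enough to sketch (my own minimal approximation of McQuitty's method, applied to an invented correlation matrix; the footnote's point about re-clustering can then be checked directly):

```python
def elementary_linkage(R, labels):
    """Sketch of McQuitty-style elementary linkage analysis on a
    correlation matrix R: only the highest correlations are used."""
    n = len(R)
    # Each variable's nearest neighbour (its highest off-diagonal correlation).
    nearest = [max((j for j in range(n) if j != i), key=lambda j: R[i][j])
               for i in range(n)]
    unassigned, clusters = set(range(n)), []
    while unassigned:
        # Seed: the pair with the highest remaining correlation.
        i, j = max(((a, b) for a in unassigned for b in unassigned if a != b),
                   key=lambda p: R[p[0]][p[1]])
        cluster = {i, j}
        # Attach any variable whose nearest neighbour is already in the cluster.
        grew = True
        while grew:
            grew = False
            for k in list(unassigned - cluster):
                if nearest[k] in cluster:
                    cluster.add(k)
                    grew = True
        clusters.append(sorted(labels[c] for c in cluster))
        unassigned -= cluster
        if len(unassigned) == 1:   # a single leftover forms its own cluster
            clusters.append([labels[unassigned.pop()]])
    return clusters

labels = ['A', 'B', 'C', 'D', 'E']
R = [[1.00, 0.80, 0.70, 0.10, 0.20],
     [0.80, 1.00, 0.75, 0.15, 0.10],
     [0.70, 0.75, 1.00, 0.20, 0.10],
     [0.10, 0.15, 0.20, 1.00, 0.85],
     [0.20, 0.10, 0.10, 0.85, 1.00]]
result = elementary_linkage(R, labels)

# Re-cluster-analysing just the items of one cluster returns that same
# cluster intact, as the footnote claims.
sub = [[R[a][b] for b in (0, 1, 2)] for a in (0, 1, 2)]
```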


Anderson, D. S. and J. S. Western (1967) An inventory to measure students' attitudes. St. Lucia, Brisbane: University of Queensland Press.

Armstrong, J. S. (1967) "Derivation of theory by means of factor analysis or Tom Swift and his magic factor analysis machine." The American Statistician, 21: 17-21.

Beswick, D. C. and M. D. Hills (1969) "An Australian ethnocentrism scale." Aust. J. Psychol., 21: 211-226.

Costello, C. G. (1967) "Two scales to measure achievement motivation." J. Psych., 66: 231-235.

Finifter, A. (1970) "Dimensions of political alienation." American Political Science Review, 64: 389-410.

Gray, D. B. and W. Revelle (1972) "A cluster analytic critique of the multifactor racial attitude inventory." Psych. Record, 22: 103-112.

Guilford, J. P. (1954) Psychometric methods. N.Y.: McGraw Hill.

Lykken, D. T. (1971) "Multiple factor analysis and personality research." Journal of Experimental Research in Personality, 5: 161-170.

Lynn, R. (1969) "An achievement motivation questionnaire." Brit. J. Psychol., 60: 529-534.

McQuitty, L. C. (1961) "Elementary factor analysis." Psych. Reports, 9: 71-78.

Neal, A. C. and S. Rettig (1967) "On the multidimensionality of alienation." American Sociological Review, 32: 54-64.

Nunnally, J. C. (1967) Psychometric theory. N.Y.: McGraw Hill.

Parker, S. R. and J. M. Bynner (1970) "Correlational analysis of data obtained from a survey of shop stewards." Human Relations, 23: 345-359.

Taft, R. (1963) "Applied social psychology, ecological studies of immigrant assimilation and scientific psychology." Aust. J. Psychol., 15: 149-161.

My article above produced a critical comment to which I gave the following rejoinder:

The Australian and New Zealand Journal of Sociology, 1975, Vol. 11 No. 1, p. 29


John J. Ray

University of New South Wales

Gow (1974) certainly makes more of an attempt to survey the aspirations of factor analysts than I pretended to do. He fails, however, to produce any evidence that these aspirations are realised. The one substantial instance of factor invariance he produces is that of Cattell and his 16PF. All this shows, however, is that if you use similar methods from time to time you get similar results. My query, however, concerns whether this result is in any sense the true or correct result. Gow has in no way impugned my assertion that such results can only (rarely) be useful, not correct.

Gow and I seem to be in agreement that sociologists are making increasing use of factor analysis. The difference is that I deplore this trend as a waste of time and computing funds whereas Gow, for reasons still not clear, is more optimistic.


Gow, D. J. (1974) "Some hidden dimensions of factor analysis: A reply to J. J. Ray." Australian and New Zealand Journal of Sociology, 10 (3): 184-186.
