Real Climate has published something of a review of Montford’s book. When I read it (yes, I do read Real Climate, as much as I can – that is, I tend to have enough time, I just find it difficult to digest nonsense 😉) it occurred to me that it was a perfect example of bad criticism, and thus – as I’ve mentioned before – something which I do think falls into an area where I have a little expertise.
First, it will be worth summarising one of the arguments that Montford makes. Rather helpfully for the statistically challenged, like myself, Montford takes time to explain what Principal Components analysis (PC analysis) actually does: it sifts raw statistical data in order to extract significant information (notable patterns). Crucially, each ‘sifting’ extracts less useful information than the last, so PC1 is very useful but each successive PC is less so. Montford: “while the PC1 might explain 60% of the total variance, by the time you get to PC4, you might be talking about only 6 or 7%. In other words, the PC4 is not telling you much of any significance at all”. Montford uses this very helpful analogy:
“The PCs are often described as being like the shadow cast by a three-dimensional object. Imagine you are holding an object, say a comb, up to the sunlight, and it is casting a shadow on the table in front of you. There are lots of ways you could hold the comb, each of which would cast a different shadow onto the table, but the one which tells you the most about the object is when you expose the face of the comb to the light. When you do this, the sun passes between the teeth and you can see all the individual points. You can tell from the shadow that what is being held up is a comb. This shadow is analogous to the first PC. Now rotate the comb through a right angle, so that you are pointing the long edge of the comb to the sun. If you do this, the shadow cast is just a long thin line. You can see from the shadow that you are holding a long thin object, but it could be just about anything. This would be the second PC. It tells us something about the object, but not as much as the first PC. You can rotate through a right angle again and let the sunlight fall on the short edge of the comb. Here the shadow is almost meaningless. You can tell that something is being held up, but it’s impossible to draw any meaningful conclusions from it. This then, is the third PC.”
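For anyone who would rather see the "each PC explains less than the last" point in numbers, here is a minimal sketch in Python with NumPy. It is my own illustration, not anything from Montford's book. The "comb" is a made-up cloud of random points that is long in one direction, narrower in a second and very thin in the third. The function name `explained_variance_ratio` and all the numbers are assumptions chosen purely for the example.

```python
import numpy as np


def explained_variance_ratio(data):
    """Fraction of total variance captured by each PC, largest first.

    Uses the customary convention: subtract each column's own mean,
    then take the singular values of the centred matrix.
    """
    centered = data - data.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


# Hypothetical "comb": 500 random points, long in x, narrower in y, thin in z.
rng = np.random.default_rng(0)
comb = rng.normal(size=(500, 3)) * np.array([10.0, 3.0, 0.5])

ratios = explained_variance_ratio(comb)
print(np.round(ratios, 3))  # shares of variance for PC1, PC2, PC3
```

The shares always come out in descending order and add up to one, which is the sense in which the later PCs carry less of the total variance than PC1. How small is too small to matter is a separate question that this toy example does not answer.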
This is how Tamino ‘responds’:
For instance: one of the proxy series used as far back as the year 1400 was NOAMERPC1, the 1st “principal component” (PC1) used to represent patterns in a series of 70 tree-ring data sets from North America; this proxy series strongly resembles a hockey stick. McIntyre & McKitrick (hereafter called “MM”) claimed that the PCA used by MBH98 wasn’t valid because they had used a different “centering” convention than is customary. It’s customary to subtract the average value from each data series as the first step of computing PCA, but MBH98 had subtracted the average value during the 20th century. When MM applied PCA to the North American tree-ring series but centered the data in the usual way, then retained 2 PC series just as MBH98 had, lo and behold — the hockey-stick-shaped PC wasn’t among them! One hockey stick gone.
Or so they claimed. In fact the hockey-stick shaped PC was still there, but it was no longer the strongest PC (PC1), it was now only 4th-strongest (PC4). This raises the question, how many PCs should be included from such an analysis? MBH98 had originally included two PC series from this analysis because that’s the number indicated by a standard “selection rule” for PC analysis (read about it here).
MM used the standard centering convention, but applied no selection rule — they just imitated MBH98 by including 2 PC series, and since the hockey stick wasn’t one of those 2, that was good enough for them. But applying the standard selection rules to the PCA analysis of MM indicates that you should include five PC series, and the hockey-stick shaped PC is among them (at #4). Whether you use the MBH98 non-standard centering, or standard centering, the hockey-stick shaped PC must still be included in the analysis.
The truth is that whichever version of PCA you use, the hockey-stick shaped PC is one of the statistically significant patterns. There’s a reason for that: the hockey-stick shaped pattern is in the data, and it’s not just noise it’s signal. Montford’s book makes it obvious that MM actually do have a selection rule of their own devising: if it looks like a hockey stick, get rid of it.
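The "centering" point in the passage above can also be sketched with made-up data. This is a toy under stated assumptions, not a reconstruction of the MBH98 or MM calculations: the series are random noise, two of them are given an invented upturn in the last 50 rows, and the function `pc_variance_shares` and its `centering_rows` parameter are my own hypothetical names. It subtracts either the mean of every row (the customary convention) or the mean of a chosen sub-period, then reports each component's share of the total.

```python
import numpy as np


def pc_variance_shares(data, centering_rows=None):
    """Share of total squared variation carried by each component.

    centering_rows: slice selecting the rows whose mean is subtracted.
    None means the mean over all rows (the customary convention).
    """
    reference = data if centering_rows is None else data[centering_rows]
    shifted = data - reference.mean(axis=0)
    singular_values = np.linalg.svd(shifted, compute_uv=False)
    squared = singular_values ** 2
    return squared / squared.sum()


# Hypothetical data: 200 "years" of 10 noise series,
# two of which get an invented upturn in the final 50 rows.
rng = np.random.default_rng(1)
series = rng.normal(size=(200, 10))
series[-50:, :2] += 2.0

full = pc_variance_shares(series)  # mean over the whole period
late = pc_variance_shares(series, centering_rows=slice(-50, None))  # mean over last 50 rows

print("PC1 share, full-period centering:", round(float(full[0]), 3))
print("PC1 share, late-period centering:", round(float(late[0]), 3))
```

In this toy data the choice of centering changes how much of the total is attributed to the leading component. It says nothing about the real tree-ring series, and nothing about how many PCs ought to be retained.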
So – Tamino’s argument is that because the hockey-stick shape emerges with the fourth ‘cut’ it still counts as statistically significant. Although he accepts that the standard convention is to use just two passes (= PC1 and PC2) he goes on to say “applying the standard selection rules to the PCA analysis of MM indicates that you should include five PC series, and the hockey-stick shaped PC is among them (at #4)”. (Please shout if I’ve misunderstood the substantive point that Tamino is making here.)
Can people see why I find this an inadequate response to Montford? Montford explains PC analysis at length, and a significant element of the argument is that the #4 cut doesn’t give useful data. Tamino at first accepts this (with a link expanding the acceptance) but then seems to go back on himself by simply asserting that five series should be included, and that the hockey-stick shape (#4) is significant. Why? Where is the argument for this?
There are ways in which Montford could be shot down here – and I would imagine that a competent statistician, familiar with these issues, could do it quite swiftly _if_ Montford is wrong. My point is a broader one – purely as a matter of rhetoric, Montford has the more compelling argument. He makes a point and explains it in detail – I understand the argument that Montford is making and it seems coherent. Tamino’s response is very different: in effect it is merely an assertion, which we are to take ‘on authority’. As the authority of the realclimate site is – for me – completely shot, the argument falls.
If there is another place where realclimate defends the statistical usefulness of a PC4 analysis, I’d be interested to read it.