date: Mon, 24 May 2004 09:50:06 -0400
from: "Lisa Johnson"
subject: Manuscript 1096109
to:

Dear Dr. Osborn,

Thank you for agreeing to review the following manuscript for Science: "Reconstructing Past Climate from Noisy Data"

The paper is available as a pdf file at this www site: http://www.submit2science.org/mtsreferee/. Supplementary material and other materials are attached to this email. To access this site, use the following:

LoginID: Osborn
Password: 175025

The www site also provides a copy of our instructions and a convenient form to upload your review. We recommend that you type your review in a separate word processor and paste the text into the review form online. If you have attachments such as figures or complex formatting such as equations that cannot be conveyed in text form, you may send these by email to science_reviews@aaas.org, cc'd to me at my email below, or by FAX to 202-289-3649. In your email, please be sure to indicate the manuscript number and first author that are indicated on the www form, and if you return the full review by email or FAX, please indicate your rating and recommended action.

Please note that the manuscript is a confidential document and should not be shared with any colleague. If you feel that this is necessary for a thorough review, please inform me first. Please delete the copy of the manuscript, and keep a copy of your review.

Please return your review by 06/07/2004.

Previous Comments:

Review 1:

The paper goes to the heart of a very important question for climate science, presents its arguments clearly and is likely to have a strong influence on this field in the future. I therefore recommend it for publication. A few minor things can be fixed, and I list these below.

Page 2. The second last sentence on the page is bewildering to this reviewer: when 20th century data and samples `from earlier periods` are mentioned, I get the feeling that something should have been explained a little earlier in the text.
Page 3. What is sigma_T? A brief statement should be inserted.
Page 3. Middle paragraph ends with `...also been used for climate reconstructions.` A reference or two here to such work would not be out of place.
Page 5. Middle paragraph, second line `...and add noise to the series...` Is that `white noise`? I expect it is, but in principle the authors might be considering general types of noise, and the red-ness of such noise could be a factor in the analysis, so a statement about the `colour` of the noise at that place would be nice.
Page 7. Middle paragraph. I do not believe the word `representativity` is really an attractive English word ... perhaps the authors could rephrase that sentence? (Not a major point!)
Page 7. Middle paragraph, line 3. The word `than` should be removed, I think.
Page 7. 3rd line from bottom - `the variations`. Is `variance` intended?
Page 7. 2nd line from bottom - `during in the` ... Is `in` surplus?
Page 10. Caption to Figure 1: The editor may like to advise whether the use of `.. in a test whether...` lacks an `of` or not.

Your previous comments:

The topic of this manuscript is of great interest to a broad range of scientists, and thus this manuscript is potentially suitable for publication in Science. Interest in it would be focussed on the authors' finding that Mann et al.’s 1998/1999 (hereafter MBH) reconstruction of Northern Hemisphere temperature over the last 1000 years might be biased to have too little variability on the century and longer time scale. MBH’s reconstruction was given a high profile within the IPCC third assessment report, and continues to attract much debate, as well as being used as (just one) part of the scientific support for climate-related policy decisions. Investigations that examine the reliability of MBH’s reconstruction, and that also move science forward by identifying better or improved methods, are certainly publishable.

The shortcomings of this manuscript are that the investigation of potential problems with the MBH method is incomplete, identifying possible problems but not attempting to explain or understand them, and that there is no consideration of which methods would be better, or whether all methods suffer from the same potential biases. If the authors could address these concerns, plus a number of other comments that I have listed below, then the manuscript might become suitable for publication in Science, in my opinion.

The study reported in the manuscript sub-samples and degrades output from a climate model, to generate a set of pseudo climate proxies. The authors then use the method of MBH to attempt to reconstruct the model’s known Northern Hemisphere (NH) temperature, by calibrating a transfer function relating the pseudo proxies to the NH temperature patterns. For various sets of pseudo proxies, with various levels of degradation by random noise, the reconstructions systematically underestimate the multi-century-scale variations in temperature during the simulation of the AD 1000 to AD 1990 period. Even with good proxy coverage, with no degradation of the series, the method apparently underestimates the long time scale fluctuations. This is interesting and vitally important, and it raises many questions – none of which seem to be answered in the manuscript:

(1) WHY? What is the reason for underestimating the long time scale variations? The authors get about as far as saying that all regression is usually associated with some loss of variance. Is that it? Do their conclusions thus apply to all regression methods, or just to MBH?

(2) Leading on from (1), the authors must therefore see if their conclusions apply to other methods. The simplest would be to take an unweighted average of all pseudo proxies and then calibrate this average using simple linear regression against the NH temperature. Does that also underestimate the multi-century variance?
How about weighted averages, or the principal component regression method?

(3) My pre-conception is that these methods would lose variance, but that the difference between the reconstruction and the actual temperature would be a random error, sometimes positive, sometimes negative. MBH and others already take this into account with their uncertainty ranges, plotted either side of the reconstruction. Do the results here imply that the difference between reconstructed and actual is (i) outside this uncertainty range; and/or (ii) that it is always in one direction, rather than being a random error of either sign? Related to this, perhaps the error is not always the same sign, but is always on the side nearer to zero (or nearer to the calibration period)?

(4) Questions in (3) might be clarified if the authors had computed and shown the uncertainty ranges on their reconstruction attempts. These are essential for comparing the real and reconstructed temperatures. How do the uncertainty ranges expand as the noise levels increase?

(5) Have they correctly implemented the (rather complex) MBH method? Not enough detail is given regarding the method – they simply say “the method of MBH”. While not wanting a full repeat of the method (though this might be useful as supplementary information, since it is central to the paper) some indication should be given as to how similar their implementation is to MBH: do they use the same number of temperature EOFs, the same way of computing the EOFs (from monthly data, not annual), the same calibration and verification periods, etc. Can they demonstrate correct implementation of the MBH method by reproducing the MBH results when using instrumental and proxy data?

(6) The authors do not place their results in the context of previous work (this criticism is partly related to point (1), in that an explanation of why they obtain their results needs also to take account of whether other work fits in with their explanation).
For example, Mann and Rutherford (Geophys. Res. Lett. 29, doi:10.1029/2001GL014554, 2002) use a different regression-based technique and find that the variance is well captured (though perhaps not at the longer time scales, as evidenced by apparent bias in the early reconstruction of their Figure 2). And Rutherford et al. (J. Climate 16, 462-479, 2003) discuss biased long-term variations under certain conditions (when the calibration period is trend-free, but the reconstruction period is not), but that does not correspond directly to the present case with a strong trend in the calibration period. And Zorita et al. (J. Climate 16, 1378-1390) is referenced (with the wrong page number!) by the present manuscript, though its results are hardly mentioned, even though Zorita is a co-author on the present manuscript. Zorita et al. also used the MBH method, albeit with a control simulation with less long time scale variance, and reported enhanced skill at long time scales (though perhaps, in agreement with the new work, loss of variance – see their Figure 11?). The authors really must tie in their results with an interpretation of these previous papers, otherwise we are no further forward in our attempts to understand how best to reconstruct past NH temperatures.

Some more minor problems:

(1) The equation for the variance term in paragraph 2 of page 3 is wrong.

(2) Referencing to MBH98 and MBH99 is muddled – in some places it should be MBH99 instead of MBH98, in others it should be MBH98 instead of MBH99, and sometimes it is actually correct!

(3) The “conflicting reconstructions of the NAO before 1850 (20=Schmutz et al.)” seems irrelevant as an explanation of why variance is lost, because the conflicting reconstructions of the NAO referred to are of the real-world NAO, and thus have nothing to do with what the NAO happens to do in the model used in this study.

Review 3:

General remarks:

The manuscript has merit and will probably make a useful contribution to Science, but requires major revisions to achieve this. I will address several topics for consideration and improvement below.

Major comments:

-One important point is that the authors use the 112 proxy indicators in their modelling approach back to 1000. The fact is that Mann et al. 1998 and 1999 had this number available back to 1820. For the earlier centuries, the number of proxies decreases significantly. I therefore suggest that the authors account for this fact and reduce the number of predictors in the same way as discussed in MBH98/99. Some differences reported in this manuscript may be related to this fact.

-The second point is that the authors add the same noise level to each grid box. As discussed in Jones et al. (1998, The Holocene), Mann and Rutherford (2002, Geophys. Res. Lett.), Pauling et al. (2003, Geophys. Res. Lett.) and Mann and Jones (2003, Geophys. Res. Lett.), and also mentioned by the authors, the correlation between proxy data and instrumental grid point varies. Instrumental data correlate higher than natural proxies. I therefore suggest that the authors first calculate the correlation between the proxy data used and the corresponding instrumental temperature grid point over a defined period (i.e. 1902-1980), calculate the noise level separately for each gridpoint based on these findings, and then rerun their analysis with the model. This would ensure that the analysis is ‘closer’ to reality than keeping the same noise level.

-As MBH98/99 are only one realisation of the past ‘reality’, the ECHO-G is only one realisation of the ‘model-world’. I therefore encourage the authors to do a similar analysis with other paleo model runs from the Hadley Centre (HadCM3.0) and/or the CSM1.4 from NCAR. They both have forced and control runs for 1000 years. This would allow a comparison between at least two models, which could then be related to the Mann et al. results.
This in fact would be a very nice result.

-For comparison reasons it would help to plot the Mann et al. reconstructions and the uncertainties in Figure 2.

-I encourage the authors also to discuss their results in terms of uncertainties, both in the model and in the Mann et al. reconstructions. The previous point is related to this. The authors mention the model deficiencies on page 4, but do not quantify them.

-The structure of the paper must be improved. Methodological and technical details, aspects and descriptions should be moved to the supplementary online section. Especially the first 6 pages need improvement, e.g. between the introduction and the methods section there is a paragraph that contains results/conclusions. This should be moved to the end. Further, the introduction is not clearly structured: it has several repetitions and does not pose a real question showing in which respect the paper contributes to the community.

-There are a number of current studies dealing with ‘pseudoproxies’ (Mann and Rutherford, 2002, Geophys. Res. Lett.; Rutherford et al. 2003, J. Climate; Pauling et al. 2003, Geophys. Res. Lett.), methods, assumptions and applications to large-scale temperature data. I believe it is worth considering them as well.

-Page 6, last para (The first hypothesis): More explanation of this para is needed. Maybe the authors could show this in the supplementary material section? In my opinion the inclusion of instrumental data does improve the reconstruction, at least locally or regionally. On the other hand, if all grid boxes are averaged over the NH, this contribution might get lost.

-Page 7, 2nd hypothesis: I wonder why the authors have chosen gridpoints from the SH since they intend to show NH reconstructions? How large is the improvement from including the 15 series? Further, maybe gridpoint data over the ocean rather than from land could improve the results?
As in my major point above, the number of proxies as well as the kind of added noise should be considered as well.

-A major point is also the selection of the additional gridpoints. Are they chosen randomly? If the authors had first searched for key regions for NH temperature and then selected the gridpoints, maybe this would have led to bigger improvements.

-Page 7, 2nd para, 3rd hypothesis: I did not fully understand this approach. It would be nice if the authors could explain this in more detail.

-Page 7, concluding para: I do not fully agree with the conclusions. I wonder if the same conclusions can be drawn when the points above are addressed?

-I also miss the implications of this study for future reconstructions. What do the authors suggest to capture centennial variability in reconstructions?

Specific comments on the manuscript

-Abstract: should be more clearly written.

-Page 3, last para: Who showed that there is a loss in variance in multiple regression? Refs should be added.

-Page 4, last para: Medieval Warm Period and Little Ice Age are mentioned. These terms have to be defined. See for instance the recent paper of Bradley et al. (2003, Science) and references therein.

-Page 5, last para: …,the reconstructions would coincide with the simulated temperature: As mentioned above, it would help to plot also the Mann et al. data together with the uncertainties.

-Page 6, 2nd para: I do not understand the connection to reference 20.

-Page 6, 2nd para: We conclude that the reconstruction of the climate: In my opinion, this has not been shown. More clarification is needed for this statement.

-Page 7, 1st para: This para implies the wrong conclusion that sparseness of proxy data is no problem. In fact, it is a problem as many reconstructions have shown, cf. comment on the selection of the proxies above.

-Page 6, 2nd para: The large loss of variance with m=0 reflects above all deficiencies of the reconstruction method. This should be discussed.
The testing of various reconstruction methods would eliminate the ‘method effect’ and greatly increase the confidence in the results.

-Page 6, last sentence: is it local or NH reconstructed temperature anomalies?

-Page 7, 3rd para: As the authors have shown, the first conclusion is only valid if the calibration period does not cover the whole range of variability and if the proxy data contain high noise levels. This should be stated.

-Caption Figure 3: Add “temperature” at the end of the first sentence. Additionally, the authors should check the colours in the figure. In my printed version the caption did not agree with the figure.

-The authors cite the study of Esper et al. (2002), who discuss century scale variability. However, this paper has not been used in the discussion.

-The authors may consider the references in the introduction of Mann and Rutherford (2002, GRL), which deal with reconstructions of large-scale fields based on sparse data.

----------------------------------------------------------------------

Thank you for your assistance. Please let me know if you have any further questions.

Sincerely,
H. Jesse Smith
hjsmith@aaas.org

Attachment Converted: "c:\documents and settings\tim osborn\my documents\eudora\attach\1096109s.pdf"
Attachment Converted: "c:\documents and settings\tim osborn\my documents\eudora\attach\1096109coverletter.doc"