cc: "Cawley Gavin Dr (CMP)" <G.Cawley@uea.ac.uk>, "'Philip D. Jones'" <p.jones@uea.ac.uk>, Gavin Schmidt <gschmidt@giss.nasa.gov>, "Thorne, Peter" <peter.thorne@metoffice.gov.uk>, Tom Wigley <wigley@cgd.ucar.edu>
date: Fri, 31 Oct 2008 00:48:23 -0600
from: Tom Wigley <wigley@ucar.edu>
subject: Re: Possible error in recent IJC paper
to: santer1@llnl.gov

<x-flowed>
SEE CAPS

Ben Santer wrote:
> Dear Gavin,
> 
> Thanks very much for your email, and for your interest in our recent 
> paper in the International Journal of Climatology (IJoC). There is no 
> error in equation (12) in our IJoC paper. Let me try to answer the 
> questions that you posed.
> 
> The first term under the square root in our equation (12) is a standard 
> estimate of the variance of a sample mean - see, e.g., "Statistical 
> Analysis in Climate Research", by Francis Zwiers and Hans von Storch, 
> Cambridge University Press, 1999 (their equation 5.24, page 86). The 
> second term under the square root sign is a very different beast - an 
> estimate of the variance of the observed trend. As we point out, our d1* 
> test is very similar to a standard Student's t-test of differences in 
> means (which involves, in its denominator, the square root of two pooled 
> sample variances).
> 
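For readers following the thread without the paper to hand, the statistic under discussion can be written out from the descriptions given here. This is a reconstruction, not a quotation: the first term under the square root uses the notation quoted later in this thread (1/n_m s{<b_m>}^2), while the symbols b_o for the observed trend and s{b_o} for its standard error are assumed for illustration.

```latex
d_1^{*} \;=\; \frac{\langle\langle b_m \rangle\rangle - b_o}
                   {\sqrt{\dfrac{1}{n_m}\, s\{\langle b_m \rangle\}^{2} \;+\; s\{b_o\}^{2}}}
```

The first term is the variance of the model-average trend, which shrinks as n_m grows; the second is the variance of the observed trend, which does not depend on n_m.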
> In testing the statistical significance of differences between the model 
> average trend and a single observed trend, Douglass et al. were wrong to 
> use sigma_SE as the sole measure of trend uncertainty in their 
> statistical test. Their test assumes that the model trend is uncertain, 
> but that the observed trend is perfectly-known. The observed trend is 
> not a "mean" quantity; it is NOT perfectly-known. Douglass et al. made a 
> demonstrably false assumption.
> 
> Bottom line: sigma_SE is a standard estimate of the uncertainty in a 
> sample mean - which is why we use it to characterize uncertainty in the 
> estimate of the model average trend in equation (12). It is NOT 
> appropriate to use sigma_SE as the basis for a statistical test between 
> two uncertain quantities. The uncertainty in the estimates of both 
> modeled AND observed trend needs to be explicitly incorporated in the 
> design of any statistical test seeking to compare modeled and observed 
> trends. Douglass et al. incorrectly ignored uncertainties in observed 
> trends.
> 
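The difference between the two approaches can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the paper: the function names and the sample numbers are hypothetical, and the "SE-only" statistic is a simplified stand-in for a test that uses sigma_SE as the sole measure of uncertainty.

```python
import math
import statistics


def d1_star(model_trends, obs_trend, obs_trend_se):
    """Difference statistic that includes uncertainty in BOTH trends.

    model_trends : one trend per model (hypothetical input)
    obs_trend    : the single observed trend
    obs_trend_se : standard error of the observed trend
    """
    n_m = len(model_trends)
    mean_model = statistics.fmean(model_trends)
    # Inter-model sample variance, then variance of the model-average trend.
    var_of_mean = statistics.variance(model_trends) / n_m
    return (mean_model - obs_trend) / math.sqrt(var_of_mean + obs_trend_se ** 2)


def se_only_statistic(model_trends, obs_trend):
    """Simplified statistic that treats the observed trend as perfectly known."""
    n_m = len(model_trends)
    sigma_se = statistics.stdev(model_trends) / math.sqrt(n_m)
    return (statistics.fmean(model_trends) - obs_trend) / sigma_se


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
trends = [0.1, 0.2, 0.3]
print(d1_star(trends, obs_trend=0.1, obs_trend_se=0.1))
print(se_only_statistic(trends, obs_trend=0.1))
```

With the same inputs, the SE-only statistic is larger in magnitude because its denominator omits the observed-trend variance, so it rejects consistency more readily.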
> I hope this answers your first question, and explains why there is no 
> inconsistency between the formulation of our d1* test in equation (12) 
> and the comments that we made in point #3 [immediately before equation 
> (12)]. As we note in point #3, "While sigma_SE is an appropriate measure 
> of how well the multi-model mean trend can be estimated from a finite 
> sample of model results, it is not an appropriate measure for deciding 
> whether this trend is consistent with a single observed trend."
> 
> We could perhaps have made point #3 a little clearer by inserting 
> "imperfectly-known" before "observed trend".

WE COULD ADD THIS, BUT BE CAREFUL. THE **SAMPLE** TREND **IS** PERFECTLY 
KNOWN. AFTER ALL, THIS IS A WELL-DEFINED NUMBER. WHAT IS UNCERTAIN IS 
THE POPULATION TREND THAT IT IS AN ESTIMATE OF.

> I thought, however, that
> the uncertainty in the estimate of the observed trend was already made 
> very clear in our point #1 (on page 7, bottom of column 2).
> 
> To answer your second question, d1* gives a reasonably flat line in 
> Figure 5B because the first term under the square root sign in equation 
> (12) (the variance of the model average trend, which has a dependence on 
> N, the number of models used in the test) is roughly a factor of 20 
> smaller than the second term under the square root sign (the variance of 
> the observed trend, which has no dependence on N). The behaviour of d1* 
> with synthetic data is therefore dominated by the second term under the 
> square root sign - which is why the black lines in Figure 5B are flat.
> 
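The flat-line behaviour described above can be checked with a short numerical sketch. All values here are hypothetical: the inter-model spread is set to 1.0 and the observed-trend variance is set to 20 times the variance of the model-average trend at a reference ensemble size of 20, simply to mirror the "factor of 20" mentioned in the text.

```python
import math


def d1_star_denominator(sigma_model, obs_se, n_models):
    """sqrt(model-mean variance + observed-trend variance)."""
    return math.sqrt(sigma_model ** 2 / n_models + obs_se ** 2)


def se_only_denominator(sigma_model, n_models):
    """Standard error of the model mean alone (no observed-trend term)."""
    return sigma_model / math.sqrt(n_models)


# Hypothetical setup: observed-trend variance is 20x the model-mean
# variance when n_models equals n_ref.
sigma_model = 1.0
n_ref = 20
obs_se = math.sqrt(20.0 * sigma_model ** 2 / n_ref)

for n in (5, 20, 100):
    print(n,
          round(d1_star_denominator(sigma_model, obs_se, n), 4),
          round(se_only_denominator(sigma_model, n), 4))
```

Under these assumptions the combined denominator changes by less than 10% between 5 and 100 models, while the SE-only denominator shrinks by more than a factor of four, which is why a statistic dominated by the observed-trend term shows little dependence on N.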
> In answer to your third question, our Figure 6A provides only one of the 
> components from the denominator of our d1* test (sigma_SE). Figure 6A 
> does not show the standard errors in the observed trends at discrete 
> pressure levels. Had we attempted to show the observed standard errors 
> at individual pressure levels, we would have produced a very messy 
> Figure, since Figure 6A shows results from 7 different observational 
> datasets.
> 
I HOPE THIS IS CLEAR IN THE TEXT OR CAPTION.

> We could of course have performed our d1* test at each discrete pressure 
> level. This would have added another bulky Table to an already lengthy 
> paper. We judged that it was sufficient to perform our d1* test with the 
> synthetic MSU T2 and T2LT temperature trends calculated from the seven 
> radiosonde datasets and the climate model data. The results of such 
> tests are reported in the final paragraph of Section 7. As we point out, 
> the d1* test "indicates that the model-average signal trend (for T2LT) 
> is not significantly different (at the 5% level) from the observed 
> signal trends in three of the more recent radiosonde products (RICH, 
> IUK, and RAOBCORE v1.4)." So there is no inconsistency between the 
> formulation of our d1* test in equation (12) and the results displayed 
> in Figure 6.
> 
> Thanks again for your interest in our paper, and my apologies for the 
> delay in replying to your email - I have been on travel (and out of 
> email contact) for the past 10 days.
> 
> With best regards,
> 
> Ben
> 
> Cawley Gavin Dr (CMP) wrote:
>>
>>
>> Dear Prof. Santer,
>>
>>    I think there may be a minor problem with equation (12) in your 
>> paper "Consistency of modelled and observed temperature trends in the 
>> tropical troposphere", namely that it includes the standard error of 
>> the models 1/n_m s{<b_m>}^2 instead of the standard deviation 
>> s{<b_m>}^2.  Firstly the current formulation of (12) seems at odds 
>> with objection 3 raised at the start of the first column of page 8.  
>> Secondly, I can't see how the modified test d_1^* gives a flat line in 
>> Figure 5B as the test statistic is explicitly dependent on the size of 
>> the model ensemble n_m.  Thirdly, the equation seems at odds with the 
>> results depicted graphically in Figure 6 which would suggest the 
>> models are clearly inconsistent at higher levels (400-850 hPa) using 
>> the confidence interval based on the standard error.  Lastly, (12) 
>> seems at odds with the very lucid treatment at RealClimate written by 
>> Dr Schmidt.

BEN -- DID YOU RESPOND TO THIS? BY THE WAY, I NOTE THAT GAVIN SCHMIDT IS
NOT A STATISTICIAN.
>>
>> I congratulate all 17 authors for an excellent contribution that I 
>> have found most instructive!

VERY PLEASING COMMENT !!!!
>>
>> I do hope I haven't missed something - sorry to have bothered you if 
>> this is the case.
>>
>> best regards
>>
>> Gavin
>>
> 
> 

</x-flowed>