To: P.Jones@uea.ac.uk, "Tom Wigley" <email@example.com>
Subject: Re: HadCRUT2v
Date: Tue Dec 13 13:07:32 2005
Cc: "Ben Santer" <firstname.lastname@example.org>
Attached is a plot of the monthly anomalies from the only box with non-missing data in the
bottom row of Phil's grid (centred at 87.5 S). This is from HadCRUT2v that I picked up
from the CRU data store in June this year.
Clearly the dates Tom listed are missing in my version too. Furthermore, the values from
1971-1975 are abnormal. They are not all identical, but are all near zero. Perhaps
multiplied by 0.1?
Similar problems are apparent in HadCRUT and CRUTEM2v too.
But CRUTEM2 has no gaps and no abnormal periods at the South Pole, so perhaps CRUTEM2 is
fine? Tom - if it's urgent, you could extract the South Pole time series from CRUTEM2 and
use it to overwrite the other 3 data sets until Phil corrects them.
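A minimal sketch of that stop-gap, assuming each data set has been read into a list of months, each month being a flat list of 72 x 36 = 2592 box values with the South Pole in the last box (as Phil describes below). The function name and the in-memory layout are illustrative assumptions, not the actual CRU file format.

```python
N_LON, N_LAT = 72, 36                 # 5-degree grid: 72 columns by 36 rows
SOUTH_POLE_BOX = N_LON * N_LAT - 1    # 0-based index of box 2592 (assumed last in the flat array)


def patch_south_pole(target, source):
    """Return a copy of `target` with the South Pole box taken from `source`.

    Both arguments are lists of months; each month is a flat list of
    N_LON * N_LAT box values. This layout is an assumption for illustration.
    """
    if len(target) != len(source):
        raise ValueError("data sets cover different numbers of months")
    patched = [list(month) for month in target]   # leave the input untouched
    for patched_month, source_month in zip(patched, source):
        patched_month[SOUTH_POLE_BOX] = source_month[SOUTH_POLE_BOX]
    return patched
```

Applied with CRUTEM2 as `source`, this would fill both the gaps and the near-zero 1971-1975 values in the other three data sets, without touching any other box.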
Regarding the weighting issue...
Given that the grid doesn't have equal-area boxes, there are always going to be compromises
with weighting. Even if you do something to sort out the problem at the S. Pole, what about
the isolated boxes around the coast of Antarctica, which will be given much less weight
than an isolated box in the tropics that might also contain only 1 station? This is
partly reasonable because of differences in spatial correlation of temperatures between
tropics and high latitudes, but I'm sure that they don't compensate exactly.
Specifically for the poles...
Putting the temperature data into a single box will clearly underweight its contribution in
area averages (is it significant from a practical point of view once you get to hemispheric
or global scales though?).
Replicating it into all boxes in the bottom row will, on the other hand, give it too much
weight. If the area weighting is calculated simply as cos(latitude) then the South Pole
data will be given this weighting:
72*cos(87.5) = 3.14
whereas one box on the equator (or just off) will be given this weighting:
1*cos(2.5) = 1.00
so, if replicated around all boxes at 87.5 S, the South Pole would have three times the
weight of a single tropical box (compared with 23 times less weight if South Pole data
appears in only one box).
Perhaps put it in every fourth box, giving a weighting of 0.79 (bit less than tropical,
which is reasonable for spatial correlation reasons)?
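The arithmetic above can be checked with a short sketch, assuming the weight of each box is simply cos(latitude of the box centre), in degrees. The function and variable names are illustrative only.

```python
import math


def row_weight(lat_deg, n_boxes):
    """Total cos(latitude) weight of n_boxes boxes centred at lat_deg degrees."""
    return n_boxes * math.cos(math.radians(lat_deg))


pole_all_boxes = row_weight(87.5, 72)     # replicated into all 72 boxes: about 3.14
pole_one_box = row_weight(87.5, 1)        # single box, as in the current grid
pole_every_fourth = row_weight(87.5, 18)  # every fourth box: about 0.79
tropical_one_box = row_weight(2.5, 1)     # one box just off the equator: about 1.00

print(f"all 72 boxes at 87.5S:   {pole_all_boxes:.2f}")
print(f"every fourth box:        {pole_every_fourth:.2f}")
print(f"one tropical box:        {tropical_one_box:.2f}")
print(f"tropical / single polar: {tropical_one_box / pole_one_box:.0f}")
```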
At 04:11 13/12/2005, P.Jones@uea.ac.uk wrote:
In NZ at the IPCC meeting. Will be here until Dec 17.
When I get back I'm off to Switzerland for Christmas on
The South Pole shouldn't be missing. I have all the
data for Amundsen-Scott from 1957. I put the data in at
one 5 degree grid box, so it doesn't get overweighted.
The South Pole should be at the last grid box (2592)
in the 72 by 36 array. Putting the data in all 87.5-90S
boxes would overweight the S.Pole stations.
There isn't any data at the N. Pole.
Maybe Tim could check on the missing S.Pole data.
I reckon it should be there in all the datasets CRUTEM2
and HadCRUT2 and the v versions.
> Why is there so much missing data for the South Pole? The period Jan 75 to
> Dec 90 is all missing except Dec 81, July & Dec 85, Apr 87, Apr & Sept 88,
> Apr 89. Also, everything from Aug 2003 onwards is missing.
> Also -- more seriously but correctable. The S Pole is just represented by a
> single box at 87.5S (N Pole ditto I suspect). This screws up area averaging.
> It would be better to put the S Pole value in ALL boxes at 87.5S.
> I have had to do this in my code -- but you really should fix the 'raw'
> gridded data.
> For area averages, the difference is between having the S Pole represent the
> whole region south of 85S, and having (as now) it represent one 72nd of this
> region. It is pretty obvious to me what is better.
> This affects the impression of missing data too of course.