Eat that GW believers!
-
- Posts: 526
- Joined: Sun Aug 31, 2008 7:19 am
The modelers paid for by the US public are a few dozen people, and their models are open source and completely transparent. Firing them would result in one less model being open source and transparent. That would be bad for climate science across the board.
What you seem to be missing here is that sitting down and looking at stations individually, case by case, will still end up reproducing the GHCN analysis methodology.
Instead of Peterson and his team grabbing the data off of the NCDC database and using software to compile it into something useful, you want people to do it by hand, individually clicking "convert NCDC file to GHCN file" one by one, then taking the output and confirming it mathematically with proof solvers or something.
It is a ridiculous premise. I have parsed huge datasets before: you build error checking that spits out a signal if the output or input is suspect. And *that* is what the GHCN guys go over with a fine-tooth comb (because, while it is easy to discard a station, it's better to include it, since more data is always better).
This is specifically why GHCN asks any scientists who work on the datasets to *please update them* if they find discarded sets that actually do fit with reality (such as a newspaper article saying it really was 80 degrees in Fairbanks in the middle of winter).
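For anyone who hasn't built that kind of check, here is a minimal sketch in Python. The function name `flag_suspect`, the thresholds, and the sample readings are all made up for illustration; this is not GHCN's actual quality control, just the general idea of flagging values for a human to look at instead of silently dropping them.

```python
from statistics import median

MISSING = -9999  # assumed sentinel for a missing reading

def flag_suspect(readings, hard_min=-90.0, hard_max=60.0, limit=5.0):
    """Return (index, value, reason) for readings that deserve a human look.

    hard_min, hard_max (deg C) and limit are illustrative thresholds only.
    """
    in_range = [r for r in readings
                if r != MISSING and hard_min <= r <= hard_max]
    if in_range:
        med = median(in_range)
        # Median absolute deviation: a spread estimate that one wild
        # value cannot inflate the way it inflates a standard deviation.
        mad = median([abs(r - med) for r in in_range])
    else:
        med = mad = 0.0
    flags = []
    for i, r in enumerate(readings):
        if r == MISSING:
            continue  # missing is absent, not suspect
        if not hard_min <= r <= hard_max:
            flags.append((i, r, "outside physical range"))
        elif mad > 0 and abs(r - med) / (1.4826 * mad) > limit:
            # 1.4826 scales the MAD to be comparable to a standard deviation
            flags.append((i, r, "far from the station's typical value"))
    return flags

# Made-up daily means for a cold station: one warm spike, one keying error.
january = [-25.1, -24.3, -26.0, -23.8, 26.7, -25.5, MISSING, -24.9, 245.0, -26.2]
for idx, value, reason in flag_suspect(january):
    print(idx, value, reason)
```

The warm spike gets flagged, not deleted, which is the point: someone can then check it against an outside source and put it back in if it turns out to be real.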
Science is what we have learned about how not to fool ourselves about the way the world is.
Josh Cryer wrote:
> The modelers paid for by the US public are a few dozen people, and their models are open source and completely transparent. Firing them would result in one less model being open source and transparent. That would be bad for climate science across the board.
> What you seem to be missing here is that sitting down and looking at stations individually, case by case, will still end up reproducing the GHCN analysis methodology.
> Instead of Peterson and his team grabbing the data off of the NCDC database and using software to compile it into something useful, you want people to do it by hand, individually clicking "convert NCDC file to GHCN file" one by one, then taking the output and confirming it mathematically with proof solvers or something.
> It is a ridiculous premise. I have parsed huge datasets before: you build error checking that spits out a signal if the output or input is suspect. And *that* is what the GHCN guys go over with a fine-tooth comb (because, while it is easy to discard a station, it's better to include it, since more data is always better).
> This is specifically why GHCN asks any scientists who work on the datasets to *please update them* if they find discarded sets that actually do fit with reality (such as a newspaper article saying it really was 80 degrees in Fairbanks in the middle of winter).
We already have rough error checking with site ratings. Only about 20% of the sites are any good: 20% of 1221 (or so) is about 250 sites, give or take.
Random sample those and check 30. See what that looks like. Then decide if more work is needed.
But without a QC gauge calibration you don't know what you have got.
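The spot check itself is a few lines of Python. This is only a sketch: the station IDs are placeholders, and `pick_audit_sample` is a made-up helper name, not part of any NCDC tool.

```python
import random

def pick_audit_sample(station_ids, k=30, seed=None):
    """Pick k stations uniformly at random, without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return sorted(rng.sample(list(station_ids), k))

# Placeholder IDs standing in for the roughly 250 well-rated sites.
good_sites = [f"site{n:04d}" for n in range(250)]
audit = pick_audit_sample(good_sites, k=30, seed=2010)
print(len(audit), audit[:3])
```

Publishing the seed along with the list lets anyone else redraw the same 30 and confirm the sample wasn't hand-picked.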
Engineering is the art of making what you want from what you can get at a profit.
MSimon wrote:
> We already have rough error checking with site ratings. Only about 20% of the sites are any good: 20% of 1221 (or so) is about 250 sites, give or take.
All 1219 are good enough.
ftp://ftp.ncdc.noaa.gov/pub/data/papers ... tpfree.pdf
MSimon wrote:
> Random sample those and check 30. See what that looks like. Then decide if more work is needed.
I'd suggest random sampling those that were removed when going from the full NCDC database to USHCN.
From 19,000 (nineteen thousand) to 1219, why? Possibly the records weren't long enough, possibly they failed the quality control checks, who knows. It would be cool to reproduce the USHCN methods.
Note that GHCN does include discarded stations in their raw files, but a good part of their database comes from USHCN.
MSimon wrote:
> But without a QC gauge calibration you don't know what you have got.
You can still tell if people are manipulating the data (by tweaking the raw values or not following the methodology).
Science is what we have learned about how not to fool ourselves about the way the world is.
I wonder if I, erm, I mean, anyone would get in terrible trouble if they made available all NCDC data. The fact that I have to use this slow ass VPN to access it really grinds my gears!
(Check my IP MSimon, posting from UCCS, heheh.)
Science is what we have learned about how not to fool ourselves about the way the world is.
Notes on ROSEBURG KQEN:
Data starts 01/10/1965. The original TOD (time of observation) was 0600; it moved to 1800 on 01/11/1965. It wasn't until 01/01/1982 that TOD moved to 2200, but then on 01/10/1982 it moved to 0800, where it remains to this day.
TOD is relatively unchanged throughout the station's history (1800 vs 0800, similar temp curve), so I don't think TOD adjustments result in the normalized graph we saw on the previous page of this thread. It's more likely purely due to UHI (urban heat island) adjustments.
I have all of the NCDC data, but it's all in files with random numeric names (since the NCDC database gives you a randomized link). I need to figure out how to sort it out.
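One way to sort it out, sketched in Python. Big assumption: that each file carries its 6-digit station ID somewhere on its first line. The actual layout of these downloads isn't described here, so `STATION_RE` and `index_by_station` are hypothetical and would need adjusting to the real files.

```python
import os
import re
import tempfile
from collections import defaultdict

# Assumed: a standalone 6-digit station ID appears on the first line.
STATION_RE = re.compile(r"\b(\d{6})\b")

def index_by_station(folder):
    """Map station ID -> list of file names, read from each file's first line."""
    index = defaultdict(list)
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue
        with open(path, "r", errors="replace") as fh:
            first_line = fh.readline()
        match = STATION_RE.search(first_line)
        index[match.group(1) if match else "unknown"].append(name)
    return dict(index)

# Tiny demo on throwaway files with made-up contents.
with tempfile.TemporaryDirectory() as demo_dir:
    samples = {"1234.txt": "305801 1965 01 10", "98.txt": "305801 1982 01 01",
               "777.txt": "no id here"}
    for fname, first in samples.items():
        with open(os.path.join(demo_dir, fname), "w") as fh:
            fh.write(first + "\n")
    demo_index = index_by_station(demo_dir)
print(demo_index)
```

Building an index first, without renaming anything, keeps the original downloads untouched in case the ID guess turns out to be wrong.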
Science is what we have learned about how not to fool ourselves about the way the world is.
All 1219 are good enough?
Fine. That gives an error band of about 2C. We still know nothing.
In QC, to detect a signal it has to be 2 SDs outside the MOE. Let us be generous and say the MOE is 2C. So any rise/fall under 4C could very well be noise (random variation). So what do we know FOR SURE?
Not much. And that is just the temp record.
Proof is still required to show that it is not something besides CO2 (water vapor, ocean currents, galactic cosmic rays) doing the warming.
Now I do admit that the science is somewhat more cautious than the summary for policy makers.
In any case we have 20 or 30 years of peer reviewed papers to check to see if personal biases have slanted the results.
At this point, due to the corruption of the whole field (papers suppressed, data fudged), we know nothing. Certainly not enough for carbon taxes.
Engineering is the art of making what you want from what you can get at a profit.
Josh Cryer wrote:
> I wonder if I, erm, I mean, anyone would get in terrible trouble if they made available all NCDC data. The fact that I have to use this slow ass VPN to access it really grinds my gears!
> (Check my IP MSimon, posting from UCCS, heheh.)
I believe you about the IP (I could check). I believe you are mistaken about some things (it happens), but I have never known you to be intentionally false.
Engineering is the art of making what you want from what you can get at a profit.
It seems that the more we discover about the climate cabal the less we know about climate:
http://planetgore.nationalreview.com/po ... YzODgzMmM=
I knew things were bad before climategate. I just never imagined how bad things were.
Burt Rutan On Global Warming:
http://wattsupwiththat.com/2010/01/03/a ... l-warming/
Strangely, he ended up in the same place I did, before climategate.
Jccarlton wrote:
> Burt Rutan On Global Warming:
> http://wattsupwiththat.com/2010/01/03/a ... l-warming/
> Strangely, he ended up in the same place I did, before climategate.
An engineer disbelieving in AGW, shock.
Science is what we have learned about how not to fool ourselves about the way the world is.
MSimon, the climate record doesn't rely on GHCN alone; the error bars are shrunk with proxies. But I know you guys don't trust any of the records. Without reason, imo, as we see here in D'Alio's case (disseminating mistrust of a record that is in fact being analyzed wrong).
I do have a problem, though. I noticed that most of the NCDC data is in F, across thousands of stations. Converting F to C (and rounding / dropping a decimal place) could, I believe, introduce some bit of error. That's just my intuition speaking, though; I haven't really tested it. If I'm right, then they can reduce their errors somewhat by not doing that and sticking with F for the data that are in F.
It looks like D'Alio's "Central Park raw" data is homogenized in F. If GISS converts to C and then homogenizes, it could introduce the slight discrepancy that we see in the comparison (not enough to indicate that it isn't homogenized, though).
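A quick way to put a number on that intuition, as a Python sketch. It assumes readings are stored to one decimal place in both units, which may not match what any particular pipeline does; it only measures the rounding step, nothing else.

```python
def f_to_c(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32.0) * 5.0 / 9.0

# Every reading from 0.0 F to 99.9 F in steps of 0.1 F.
readings_f = [n / 10.0 for n in range(1000)]

errors = []
for f in readings_f:
    exact_c = f_to_c(f)
    stored_c = round(exact_c, 1)  # what a file kept in tenths of deg C holds
    errors.append(stored_c - exact_c)

worst = max(abs(e) for e in errors)  # largest error on any single value
bias = sum(errors) / len(errors)     # average error over the whole range
print(f"worst single-value error: {worst:.3f} C")
print(f"mean error over the range: {bias:.4f} C")
```

Rounding to tenths can never move a single value by more than half a tenth, so the per-reading error is bounded at 0.05 C. Whether it leaves any systematic bias depends on how the readings are distributed, which is what the mean error is there to check.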
Finding the software algorithm for the homogenization would be very useful. The papers exist, and I believe with about 60-80 hours of programming I could reproduce them (this doesn't include research, which could be easily another 100 hours), but it'd be nice not to have to do that work myself. That's more than a month of my free time on one thing.
Fortunately all the stations have their GPS coordinates, so the station-to-station homogenization process should be extremely easy (distance on a quasi-sphere, very simple stuff). If I could reproduce it I think it might make more people willing to accept it as a legitimate process, plus we could always remove it from the data (it doesn't actually have a significant effect on global data, I think less than 1% total), and even adjust it with various parameters to see how it behaves.
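The distance part really is simple. Here is a Python sketch using the haversine formula on a perfect sphere. The station names and coordinates are rough placeholders, `neighbors_within` is a made-up helper, and none of this is the homogenization algorithm itself, only the neighbor lookup it would need.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean radius; treats the Earth as a sphere

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def neighbors_within(target, stations, max_km):
    """Return (name, distance_km) for stations within max_km, nearest first."""
    lat0, lon0 = stations[target]
    found = []
    for name, (lat, lon) in stations.items():
        if name == target:
            continue
        d = great_circle_km(lat0, lon0, lat, lon)
        if d <= max_km:
            found.append((name, d))
    return sorted(found, key=lambda item: item[1])

# Approximate, illustrative coordinates only.
stations = {
    "central_park": (40.78, -73.97),
    "nearby": (40.70, -74.17),
    "far": (42.65, -73.75),
}
close = neighbors_within("central_park", stations, max_km=100)
print(close)
```

The cutoff radius is exactly the kind of parameter that could be varied to see how the adjustment behaves.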
Science is what we have learned about how not to fool ourselves about the way the world is.
BTW, NCDC looks like it's where all of the weather companies get their historical data: Accuweather, Weather Underground, The Weather Channel, etc. So no wonder they make you pay for it! Good for them, though, as it probably does help pay their meteorologists and saves the taxpayers some money. The data technically is "free for non-commercial use"; they just make sure you use a .gov or .edu or .k12 address, so that if you do use it commercially you are breaking some rule of your domain (I'm sure the UCCS VPN policy says you can't use it for commercial purposes, etc).
Science is what we have learned about how not to fool ourselves about the way the world is.
NCDC raw data for Central Park fits exactly with USHCN v2 raw. But something is off. GHCN v2 raw goes from 1941 to 2009 for Central Park. None of the temperatures match NCDC raw.
305801 is the relevant station ID number.
Fortunately, GHCN does not include Central Park in their calculations because it failed quality control. Since GISS incorporates USHCN things seem to be OK for *this* station.
But it makes me wonder wtf is going on, why is their raw data so off. I will get to the bottom of this.
Several options:
1) Someone screwed up the conversion.
2) GHCN got lazy and started putting in random stuff for USHCN stations (since they incorporate USHCN anyway), this way they didn't have to significantly alter their parser (let quality control check it).
3) Someone intentionally put arbitrary values (all of which are cold) to mess with the data, quality control caught it.
Look at what GHCN Central Park looks like:
The first number is the station ID number (it's actually several stuck together, but you know what I mean), the rest are monthly averages (the last digit in each field is the tenths place, so 233 means 23.3). As you can see, values in the twenties all year long are highly unlikely. It completely doesn't fit with USHCN or NCDC in any way. (So no conspiracies that this is really the temperature record for Central Park, please!)
Here's the USHCN/NCDC record (as I said, USHCN matches):
Note that the station ID number is the same, USHCN just removes the WBAN number from their field. This matches NCDC raw data which you can ask me to email if you want. It's 12 MB, hourly temperature measurements. Oddly the monthly averages only go back to 1941. The hourly data actually includes monthly averages, though, so it's weird.
And just so we're clear, GHCN v2 does have that station in v2.mean.failed.qc. Those arbitrarily cold values were not included in the data.
Oh, and, BTW.
9641C_200907_F52.avg.gz from the USHCN v2 ftp ( ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/v2/monthly/ ) matches “raw GHCN data + USHCN corrections” on data.giss.nasa.gov ( http://data.giss.nasa.gov/gistemp/station_data/ ).
"raw GHCN data" is a misnomer (if that isn't an understatement). It's no more raw than processed chili beans.
Look at what GHCN Central Park looks like:
Code:
3058011000001941 233 236 243 228 237 232 229 232 224 222 220 220
3058011000001942 223 228 227 224 219 225 223 226 221 210 213 212
3058011000001943 214 217 208 216 221 222 230 226 230 214 218 220
3058011000001944 224 221 229 232 210 212 226 223 216 215 216 216
3058011000001945 226 229 233 215 208 228 227 224 220 206 210 216
3058011000001946 219 222 223 221 218 235 232 232 227 213 210 216
3058011000001947 219 228 230 222 215 225 221 222 214 205 211 217
3058011000001948 226 229 230 215 224 228 225 227 224 216 216 216
3058011000001949 222 219 227 222 221 220 217 216 214 208 210 215
3058011000001950 219 215 214 219 213 214 227 213 223 206 206 212
3058011000001951 211 217 227 222 225 235 224 233 220 212 214 222
3058011000001952 229 234 225 218 222 224 224 226 217 218 210 217
3058011000001953 221 232 227 221 218 227 230 235 215 212 213 220
3058011000001954 229 220 228 216 221 220 220 223 224 204 213 216
3058011000001955 221 228 218 217 223 217 218 222 215 204 210 213
3058011000001956 207 210 223 221 216 215 222 221 215 202 212 215
3058011000001957 222 224 224 220 220 228 231 230 218 210 221 225
3058011000001958 239-9999 247 243 229 236-9999 237 238-9999 221 227
3058011000001959 232 232 228 221 213 213 220 219 219 203 207 213
3058011000001960 214 220 215 212 216 214 216 216 215 209 210 210
3058011000001961 218 218 219 215 226 218 215 222 212 207 202 215
3058011000001962 215 221 211 212 210 201 225 215 216 204 203 212
3058011000001963 218 208 217 208 213 225 218 225 219 208 203 219
3058011000001964 224 227 230 208 220 205 212 211 211 204 201 203
3058011000001965 207 213 218 205 207 219 226 218 214 206 202 210
3058011000001966 222 227 222 218 217 212 218 216 214 202 199 206
3058011000001967 212 207 209 212 212 214 214 222 212 205 204 212
3058011000001968 210 210 213 210 206 208 218 219 206 204 206 210
3058011000001969 209 224 238 218 222 220 229 216 214 206 210 213
3058011000001970 218 221 226 218 212 220 213 215 208 204 202 219
3058011000001971 201 205 201 209 203 212 216 208 206 202 202 208
3058011000001972 210 214 213 214 213 216 228 222 224 213 213 218
3058011000001973 231 234 229 225 219 214 217 212 202 207 206 199
3058011000001974 203 206 204 213 213 214 219 220 204 206 208 210
3058011000001975 219 208 213 211 207 216 203 212 202 199 199 192
3058011000001976 209 205 210 215 212 215 229 228 227 206 212 212
3058011000001977 219 221 231 218 217 220 223 222 217 209 209 215
3058011000001978 219 229 216 212 217 218 221 227 216 208 211 210
3058011000001979 215 223 220 220 217 223 225 217 211 213 210 214
3058011000001980 220 219 236 226 224 229 231 225 223 215 206 207
3058011000001981 215 224 220 215 211 212 217 214 214 204 211 213
3058011000001982 212 209 214 210 209 227 228 237 220 204 212 216
3058011000001983 234 237 227 221 230 228 232 231 224 212 212 206
3058011000001984-9999 215 224 211 215 218 216 218 206 209 211 216
3058011000001985 214 226 224 220 220 230 226 221 224 212 209 217
3058011000001987-9999-9999-9999-9999-9999-9999 234 238 225 218-9999 225
3058011000001988 234 231 240 222 229 218 223 213 216-9999 210 209
3058011000001989 217 221 213 227 221-9999-9999 229-9999-9999 222-9999
3058011000001990-9999-9999 227-9999-9999-9999-9999 241-9999-9999-9
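For anyone who wants to read that listing programmatically, a Python sketch follows. It goes only by how the lines appear in this post: the first 12 characters are taken as the station field, the next 4 as the year, and the rest as twelve monthly values in tenths of a degree with -9999 meaning missing. `parse_ghcn_line` is a made-up name, and the column split is an assumption, not the official file spec.

```python
import re

MISSING = -9999

def parse_ghcn_line(line):
    """Split one line of the listing above into (station, year, values).

    Assumed layout: 12-character station field, 4-digit year, then monthly
    values in tenths of a degree. Missing months come back as None.
    """
    station, year = line[:12], int(line[12:16])
    # Values can run together ("239-9999"), so match signed integers
    # and do not rely on whitespace between fields.
    raw = [int(tok) for tok in re.findall(r"-?\d+", line[16:])]
    values = [None if v == MISSING else v / 10.0 for v in raw]
    return station, year, values

line_1958 = "3058011000001958 239-9999 247 243 229 236-9999 237 238-9999 221 227"
station, year, temps = parse_ghcn_line(line_1958)
print(station, year, temps)
```

A line that was cut off, like the last one in the listing, will come back with fewer than twelve values, so it is worth checking `len(values)` before using a row.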
Here's the USHCN/NCDC record (as I said, USHCN matches):
Code:
30580131941 294 311 359 569 648 719 758 740 700 607 500 383 549
30580131942 307 297 430 539 653 710 763 736 683 588 470 311 541
30580131943 308 346 402 461 626 762 769 759 674 558 454 330 537
30580131944 339 332 377 489 670 723 794 778 701 571 460 324 547
30580131945 252 339 511 556 592 709 746 734 703 562 476 310 541
30580131946 341 317 498 504 619 697 754 709 698 617 505 379 553
30580131947 372 293 378 505 599 684 755 760 686 637 442 340 537
30580131948 253 307 421 510 602 694 772 749 703 567 524 383 540
30580131949 386 386 429 537 631 743 796 767 662 631 463 394 569
30580131950 414 316 364 485 588 703 750 731 646 600 484 349 536
30580131951 365 362 415 530 633 698 768 746 682 586 435 377 550
30580131952 362 362 402 550 607 736 803 747 699 557 486 384 558
30580131953 376 384 433 523 634 735 778 758 704 607 497 413 570
30580131954 308 401 416 538 598 716 767 727 674 617 464 359 549
30580131955 310 350 417 535 654 689 809 781 678 598 443 297 547
30580131956 320 366 374 482 587 714 729 742 648 581 467 409 535
30580131957 285 373 419 532 631 743 777 736 697 562 494 402 555
30580131958 319 274 403 529 591 672 761 752 676 555 479 294 525
30580131959 311 321 401 538 664 712 763 775 723 598 458 384 554
30580131960 339 363 333 541 626 718 746 749 680 581 497 309 540
30580131961 277 367 415 490 599 724 781 764 736 611 488 355 551
30580131962 326 318 431 533 644 725 740 724 649 574 432 315 534
30580131963 301 283 437 537 611 709 764 721 631 618 504 312 536
30580131964 357 329 431 497 654 716 754 729 672 550 494 364 546
30580131965 297 339 400 506 664 701 743 732 675 573 468 405 542
30580131966 322 351 427 497 616 754 797 769 665 562 489 357 551
30580131967 374 292 376 496 552 728 753 739 667 572 425 382 530
30580131968 267 289 433 550 596 697 773 760 706 605 469 343 541
30580131969 318 326 401 559 653 731 748 774 690 577 464 334 548
30580131970 251 330 387 521 640 709 771 776 708 589 485 344 543
30580131971 270 351 401 508 614 742 778 759 716 627 451 408 552
30580131972 351 314 398 501 633 679 772 756 695 535 444 385 538
30580131973 355 325 464 534 595 734 774 776 695 602 483 390 561
30580131974 353 317 421 552 610 690 772 764 667 541 482 394 547
30580131975 373 358 402 479 658 705 758 744 642 592 523 359 549
30580131976 274 399 444 550 602 732 748 743 666 529 417 299 534
30580131977 221 335 468 537 650 702 790 757 682 549 473 357 543
30580131978 280 272 390 516 615 713 744 760 650 549 478 389 530
30580131979 336 255 469 526 653 692 769 768 705 573 525 411 557
30580131980 337 314 412 545 656 703 793 803 708 552 446 325 550
30580131981 263 393 423 562 648 730 785 760 676 544 477 365 552
30580131982 261 353 418 512 641 686 779 732 683 585 504 428 549
30580131983 345 364 440 523 602 734 795 777 718 579 489 352 560
30580131984 299 406 367 519 616 745 747 767 659 618 473 438 555
30580131985 288 366 458 555 653 686 762 754 705 595 500 342 555
30580131986 341 320 451 545 660 716 760 731 679 580 457 390 552
30580131987 323 332 452 534 636 728 780 742 677 538 477 395 551
30580131988 295 350 436 512 627 718 793 788 674 528 494 359 548
30580131989 374 345 424 522 621 720 750 740 681 582 457 259 539
30580131990 414 398 451 535 602 721 768 753 675 619 504 426 572
30580131991 349 400 446 557 687 741 777 771 675 584 483 396 572
30580131992 357 364 400 505 610 703 742 730 672 545 465 379 540
30580131993 363 308 397 533 657 733 802 772 673 560 488 373 555
30580131994 256 306 407 556 618 752 794 740 676 580 520 422 552
30580131995 375 316 450 519 619 718 792 786 683 616 436 324 553
30580131996 305 339 389 522 611 714 734 745 680 564 430 413 537
30580131997 322 400 419 517 594 709 758 733 670 567 445 383 543
30580131998 400 406 454 540 643 692 765 767 702 576 481 432 572
30580131999 339 389 425 535 631 732 814 755 692 560 508 400 565
30580132000 313 373 472 510 635 713 723 725 660 570 453 311 538
30580132001 337 359 396 540 636 730 732 787 677 585 527 441 562
30580132002 400 406 442 561 607 715 788 778 703 552 460 360 564
30580132003 275 301 431 498 587 684 758 767 680 551 500 376 534
30580132004 248 350 436 536 652 713 745 743 694 560 482 384 545
30580132005 313 366 395 552 589 740 776 797 733 579 497 353 557
30580132006 409 358 431 557 631 710 780 758 666 563 519 437 568
30580132007 375 282 422 504 652 713 750 740 704 636 454 370 550
30580132008 365 358 428 550 601 741 784 738 688 552 459 382 554
30580132009 280 368 424 546 625 675 727 -9999 -9999 -9999 -9999 -9999 -9999
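The same kind of sketch for this listing, again going only by the lines as shown. Assumed layout: a 6-digit station ID, one extra digit, a 4-digit year, then twelve monthly values and a thirteenth column, all in tenths of a degree. The thirteenth column looks like the annual mean (it matches for the 1941 row), but that is an inference from the numbers, and `parse_ushcn_line` is a made-up name.

```python
MISSING = -9999

def parse_ushcn_line(line):
    """Split one line of the listing above into its parts.

    Assumed layout: 6-digit station ID, 1-digit code, 4-digit year, then
    twelve monthly values and one final column, in tenths of a degree.
    """
    head, *fields = line.split()
    station, code, year = head[:6], head[6], int(head[7:11])
    values = [None if int(f) == MISSING else int(f) / 10.0 for f in fields]
    return station, code, year, values[:12], values[12]

line_1941 = "30580131941 294 311 359 569 648 719 758 740 700 607 500 383 549"
station, code, year, monthly, annual = parse_ushcn_line(line_1941)

# Check the guess that the last column is the mean of the twelve months.
computed = round(sum(monthly) / 12, 1)
print(station, year, annual, computed)
```

With both listings parsed, the mismatch can be checked row by row, and no change of units turns 29.4 into 23.3 for January 1941.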
Science is what we have learned about how not to fool ourselves about the way the world is.
Josh Cryer wrote:
> NCDC raw data for Central Park fits exactly with USHCN v2 raw. But something is off. GHCN v2 raw goes from 1941 to 2009 for Central Park. None of the temperatures match NCDC raw.
Something rotten here perhaps? That's sort of what we have been saying all along. That's what a lot of people have been saying.