|
Post by curiousgeorge on Apr 5, 2010 11:54:49 GMT
|
|
|
Post by socold on Apr 5, 2010 13:39:06 GMT
I bet climate models are better tested and quality controlled compared to most scientific software. There are also different teams implementing the same systems independently - not many software systems can claim to enjoy that level of redundancy and independent validation!
Climate models can be compared against each other and against hundreds of aspects of observed climate. This represents a massive bank of regression tests that helps prevent bugs being introduced. There are also independently organized test sets which climate model developers apply to take part in, the objective being open competition - to see which models better represent various aspects of observed climate.
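To make the regression-test idea concrete, here is a minimal toy sketch in C of the sort of check that could sit in a model's test suite, comparing a new run against a stored reference run. The file names, grid size and tolerance are all invented for illustration; I'm not claiming this is how any particular modelling centre actually does it.

```c
/* Illustrative sketch only: compare a candidate model run against a
 * stored reference run, point by point, within a tolerance.
 * "reference.bin", "candidate.bin", NPOINTS and TOLERANCE are all
 * made up for this example. */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

#define NPOINTS   8192        /* hypothetical number of grid points */
#define TOLERANCE 1.0e-6      /* hypothetical acceptable difference */

static int read_field(const char *path, double *buf, size_t n)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;
    size_t got = fread(buf, sizeof(double), n, f);
    fclose(f);
    return (got == n) ? 0 : -1;
}

int main(void)
{
    double *reference = malloc(NPOINTS * sizeof *reference);
    double *candidate = malloc(NPOINTS * sizeof *candidate);
    if (!reference || !candidate)
        return EXIT_FAILURE;

    /* reference.bin: output of the last accepted model version;
     * candidate.bin: output of the build under test.             */
    if (read_field("reference.bin", reference, NPOINTS) != 0 ||
        read_field("candidate.bin", candidate, NPOINTS) != 0) {
        fprintf(stderr, "could not read model output files\n");
        return EXIT_FAILURE;
    }

    size_t failures = 0;
    for (size_t i = 0; i < NPOINTS; i++)
        if (fabs(reference[i] - candidate[i]) > TOLERANCE)
            failures++;

    printf("%zu of %d grid points differ beyond tolerance\n",
           failures, NPOINTS);

    free(reference);
    free(candidate);
    return (failures == 0) ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

A real harness would of course cover many fields, timesteps and configurations, but the principle is the same: a changed answer has to be explained before it is accepted.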
Climate models are also heavily developed over time; they are extensible, with extra components added and even separate models merged together (e.g. a carbon cycle model being inserted into a GCM). Given all the above, any climate model that was insufficiently software engineered would fall by the wayside and be overtaken by those that are better developed. So there is an active pressure for good software engineering practices in the field.
|
|
|
Post by curiousgeorge on Apr 6, 2010 1:52:33 GMT
Steve Easterbrook has a list of climate models here: www.easterbrook.ca/steve/?p=667 . I checked a few of the links, and found not much info about the software engineering/reliability/quality control of these models. Mostly focused on advertising new features, which as any software engineer knows, tends to result in new problems for a variety of reasons. CCSM4 www.ccsm.ucar.edu/csm/models/ccsm4.0/ at least provided release notes and a test record - www.ccsm.ucar.edu/models/ccsm4.0/tags/ccsm4_0_rel/ - kudos for that much anyway. The others, nothing that I could find; some went to a 404 error page.

Overall, I'm not reassured that the many different models and the updates to them (which happen on a variety of schedules) are properly managed (configuration control, etc.) in a way that would provide confidence in their outputs. There are statements on some of the model web pages that "plug-in" modules have been incorporated in the model software. I'm very uneasy with this, since potential problems may have been introduced via modules of unknown pedigree or from unknown sources. Problems could manifest themselves in any number of ways, similar to the sort of issues a browser can have with plugins such as Flash, or others. Although it is currently fashionable to sing the praises of Open-Source software, this practice, when applied to this level of software, does not inspire confidence, especially given the downstream social, political and economic consequences. Some will allow download of source code, some won't. Enjoy.

PS: It's understandable that these models are still "in-work" in many cases, but as a corollary: who among us would fly in a plane in which the flight control software was still in beta-test?
|
|
|
Post by scpg02 on Apr 6, 2010 4:57:01 GMT
Mostly focused on advertising new features, which as any software engineer knows, tends to result in new problems for a variety of reasons. LOL from a joke about engineers:
|
|
|
Post by poitsplace on Apr 6, 2010 5:48:56 GMT
I bet climate models are better tested and quality controlled compared to most scientific software. There are also different teams implementing the same systems independently - not many software systems can claim to enjoy that level of redundancy and independent validation! Climate models are among the LEAST verified. Do not ever presume that just because there aren't a lot of errors in the code that cause crashing or oddball behavior...that the underlying concepts of the models are intact. A video game might not crash, but the world it models has nothing at all to do with the REAL world (at least, I haven't seen any zombies about).

Again you mix up software errors with errors in the basic concepts behind the models. You cannot verify a model with other models...I'm sorry but you're just an idiot if you believe this. They just released my favorite video game on the Mac...and STILL there are no zombies in the real world. By your logic, if there are two or more versions of the same simulation there MUST be zombies (if the simulation is of zombies).

Reality is simply...inconvenient. We had an underlying warming trend. Is our warming that continued warming? Is our warming entirely CO2? Is it a mix of the two? We had a change of the ocean currents right when there was cooling...as well as an increase in industrial output...both starting at the end of WWII. Was there suddenly a MASSIVE increase in aerosol output that caused cooling? Was it almost exclusively the PDO? Was it a mix of the two? Then...darn it...we had ANOTHER change in the PDO right around the time we pushed newer clean air standards. Was the warming from the PDO and other currents? Was it from the loss of aerosol cooling? Was it some of both? What about that pre-existing warming trend?

There is just no way the basic premise of the models can be verified. That any "scientist" would continue to assert their validity should be a warning flag.
|
|
|
Post by steve on Apr 6, 2010 10:40:29 GMT
Steve Easterbrook's preliminary validation of one major climate model identified 0.3 errors per 10,000 lines of code. Space Shuttle software has around 3 times as many errors per 10,000 lines of code according to Easterbrook. He reported that here, but was sent away with a flea in his ear.
Obviously, given the complexity of the problem, identifying that a model is doing what you intended it to do is just a first step. But surely even a headline figure that suggests such a low error level should undermine any claim that the software standards are uniformly bad.
Poitsplace claims it is impossible to verify a model. I don't know what he means by that since most models in many areas of science are currently unverifiable yet produce successful predictions. Demanding that GCMs accurately forecast a certain weather phenomenon when we know that the supercomputers are not powerful enough to reach the required resolution and that the observations are not good enough to initialise such a model is simply an attempt to set the bar impossibly high.
The converse argument is not to allow any major changes to atmospheric content until an engineering quality study is complete. I think I will stay away from mining analogies today, but hopefully you will get my drift.
If we could produce a model that was as good (or bad) at simulating the weather and climate as the current crop, yet demonstrated feedbacks substantially less than those existing GCMs, then that would be interesting.
|
|
|
Post by poitsplace on Apr 6, 2010 11:03:21 GMT
I said "You cannot verify a model with other models...I'm sorry but you're just an idiot if you believe this."
If you write two models based on the same hypothesis...they may not be identically coded but they should get the same results (or fairly close). But if your hypothesis is crap...then both of the models will be wrong in spite of the fact that they were in agreement. I used an example of a video game written for two different platforms. The fact that both model an imaginary world does not mean their results (which match extremely well) apply to the REAL world.
ALSO: Verify (verb): to confirm the truth of. Validate (verb): to prove valid; to show or confirm the validity of something.
The models have not been verified or validated. They do terribly in the short term, and in the medium to long term the predictions have not come to pass. The models are NOT validated or verified, and I'll thank you not to make this assertion again, as you are now aware that it is, without any shadow of a doubt...a lie.
|
|
|
Post by curiousgeorge on Apr 6, 2010 12:14:11 GMT
It seems that folks tend to focus on coding errors. Coding errors are only one of several issues, and can be equated to typos. Buffer overflows, logic and syntax errors, security holes, and relational anomalies in databases are much more difficult to track down and eliminate. Proper design and documentation (including version control) can prevent much of this, and that is what seems to be lacking. A software development process that would provide confidence in that software would follow IEEE/EIA 12207 or similar standards. An overview of this standard, which replaced the earlier MIL-STD-498, can be found here: sepo.spawar.navy.mil/SW_Standards.html . As can be seen, it encompasses far more than a simple "bug hunt".
|
|
|
Post by steve on Apr 6, 2010 12:20:32 GMT
True. But you have ignored the fact that the models *have* been validated against the real world (which is not the same as saying that they are perfect representations of the real world). As you said in your follow-up post, you are unhappy with the level of validation. But you seem not to understand that 98% of model validation is done before CO2 levels in the model are increased. The models are validated against climatology, not against global warming. Good examples of the 2% of validation done after changes in atmospheric components would be projections of warming done in the 1970s and 1980s that were followed by sequentially warmest decades, cooling following Pinatubo, stratospheric cooling and increases in humidity. Though for many phenomena, the climatological changes are hard to determine due to poorer observations in the past.
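To illustrate what "validated against climatology" can mean in practice, here is a toy sketch in C of one standard skill measure: an area-weighted root-mean-square error between a model's long-term mean field and an observed one. The grid size and the synthetic data in main() are assumptions made for this sketch, not anything taken from a real model or modelling centre.

```c
/* Illustrative sketch only: area-weighted RMSE between a modelled and
 * an observed climatological mean field on a regular lat-lon grid.
 * Grid dimensions and the synthetic fields below are invented. */
#include <math.h>
#include <stdio.h>

#define NLAT 64
#define NLON 128
#define PI   3.14159265358979323846

double weighted_rmse(const double model[NLAT][NLON],
                     const double obs[NLAT][NLON])
{
    double sum = 0.0, wsum = 0.0;
    for (int j = 0; j < NLAT; j++) {
        /* cos(latitude) weight approximates the relative area of each
         * grid cell on the sphere */
        double lat = (-90.0 + (j + 0.5) * 180.0 / NLAT) * PI / 180.0;
        double w = cos(lat);
        for (int i = 0; i < NLON; i++) {
            double d = model[j][i] - obs[j][i];
            sum  += w * d * d;
            wsum += w;
        }
    }
    return sqrt(sum / wsum);
}

int main(void)
{
    static double model[NLAT][NLON], obs[NLAT][NLON];

    /* synthetic stand-in fields: a simple pole-to-equator temperature
     * gradient, with the "model" biased warm by half a degree */
    for (int j = 0; j < NLAT; j++)
        for (int i = 0; i < NLON; i++) {
            double lat = -90.0 + (j + 0.5) * 180.0 / NLAT;
            obs[j][i]   = 300.0 - 40.0 * fabs(lat) / 90.0;
            model[j][i] = obs[j][i] + 0.5;
        }

    printf("area-weighted RMSE: %.3f K\n", weighted_rmse(model, obs));
    return 0;
}
```

The point of the sketch is only that the comparison is against observed climatology, field by field and season by season, not against a warming trend.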
Technically, verification is shorthand for proving that you've followed good procedures to convert the plan or theory into good quality code. The error levels in the code measured by Steve Easterbrook suggest that the models are well-verified - the projected warming might be incomplete science but is not a memory leak.
|
|
|
Post by steve on Apr 6, 2010 12:39:17 GMT
It seems that folks tend to focus on coding errors. Coding errors are only one of several issues, and can be equated to typos. Buffer overflows, logic and syntax errors, security holes, and relational anomalies in databases are much more difficult to track down and eliminate. Proper design and documentation (including version control) can prevent much of this, and that is what seems to be lacking.

Errors in climate models can be and are detected by running them repeatedly, varying input parameters and varying the computing platform (e.g. the climateprediction.net project involved running a climate model on a PC, which I expect has a different configuration to many supercomputers!). My own experience of building models (not climate models) is that you rely as much on a physics-based understanding of what should or should not happen, which gives you lots of ideas of how to test the model to make it show its bugs. Most of the current crop of models have evolved over many years and many generations of supercomputer technology. This sort of testing would identify buffer overflows, logic and syntax errors, and so forth, which I assume would be included in the 0.3 errors per 10,000 LOC figure that was given. www.cs.toronto.edu/~sme/papers/2008/Easterbrook-Johns-2008.pdf
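As a toy example of the physics-based testing I mean, here is a sketch in C of a conservation check: if a quantity the governing equations say should be conserved (total atmospheric mass, say) drifts over a run, something in the code is wrong. The diagnostic series and the threshold below are invented for the sketch; a real model would produce these diagnostics itself.

```c
/* Illustrative sketch only: flag a run whose globally integrated mass
 * drifts beyond a relative threshold. The diagnostic values in main()
 * and the threshold are invented for this example. */
#include <math.h>
#include <stdio.h>

int check_mass_conservation(const double *global_mass, int nsteps,
                            double max_relative_drift)
{
    for (int t = 1; t < nsteps; t++) {
        double drift = fabs(global_mass[t] - global_mass[0]) / global_mass[0];
        if (drift > max_relative_drift) {
            fprintf(stderr,
                    "mass conservation violated at step %d "
                    "(relative drift %.3e)\n", t, drift);
            return 1;   /* a leak of mass rather than memory, but a bug either way */
        }
    }
    return 0;
}

int main(void)
{
    /* synthetic diagnostic series standing in for model output (kg) */
    double mass[5] = { 5.14e18, 5.14e18, 5.14e18, 5.14e18, 5.12e18 };

    return check_mass_conservation(mass, 5, 1.0e-4);
}
```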
|
|
|
Post by curiousgeorge on Apr 6, 2010 13:00:32 GMT
It seems that folks tend to focus on coding errors. Coding errors are only one of several issues, and can be equated to typos. Buffer overflows, logic and syntax errors, security holes, and relational anomalies in databases are much more difficult to track down and eliminate. Proper design and documentation (including version control) can prevent much of this, and that is what seems to be lacking. Errors in climate models can be and are detected by running them repeatedly, varying input parameters and varying the computing platform (e.g. the climateprediction.net project involved running a climate model on a PC, which I expect has a different configuration to many supercomputers!). My own experience of building models (not climate models) is that you rely as much on a physics-based understanding of what should or should not happen, which gives you lots of ideas of how to test the model to make it show its bugs. Most of the current crop of models have evolved over many years and many generations of supercomputer technology. This sort of testing would identify buffer overflows, logic and syntax errors, and so forth, which I assume would be included in the 0.3 errors per 10,000 LOC figure that was given. www.cs.toronto.edu/~sme/papers/2008/Easterbrook-Johns-2008.pdf

I should have waited a few minutes before editing my last, so I'll repeat it here as a courtesy: "A software development process that would provide confidence in that software would follow IEEE/EIA 12207 or similar standards. An overview of this standard, which replaced the earlier MIL-STD-498, can be found here: sepo.spawar.navy.mil/SW_Standards.html . As can be seen, it encompasses far more than a simple "bug hunt"."

Simply because a system has been in development for many years is not evidence of its quality - MS Windows, for example. If those models have been developed with the above standard or similar adhered to, then it would be to the developers' benefit to say so, and to provide documentation of same. I haven't been able to find such documentation (of adherence to best practices/standards), so if you have, please share.
|
|
|
Post by steve on Apr 6, 2010 13:50:03 GMT
curiousgeorge, I did not reference the longevity to validate the development process; I referenced it to point out that the error count of 0.3 errors per 10,000 lines of code more than likely would have included the sorts of errors you referred to, as they are the sorts of errors that get picked up when you move from platform to platform and from compiler to compiler. For example, a buffer overflow on one machine will silently overwrite another array with garbage, and on another will cause a fatal error.
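To show the kind of behaviour I mean, here is a deliberately buggy C fragment (my own toy example, nothing to do with any actual model code). The loop writes one element past the end of its array; that is undefined behaviour, so one compiler or platform may silently corrupt a neighbouring variable while another crashes immediately, and moving to a new machine is often what exposes it.

```c
/* Illustrative sketch only: an off-by-one buffer overflow.
 * Writing temperature[4] is undefined behaviour in C: on one platform
 * it may silently scribble over nearby memory, on another it may
 * abort the run, which is why porting code between machines and
 * compilers tends to flush such bugs out. */
#include <stdio.h>

int main(void)
{
    double temperature[4] = { 288.0, 287.5, 286.9, 286.2 };
    double pressure[4]    = { 1013.2, 1012.8, 1011.9, 1010.5 };

    /* buggy loop bound: i <= 4 writes the non-existent temperature[4] */
    for (int i = 0; i <= 4; i++)
        temperature[i] = 273.15 + 0.1 * i;

    /* depending on how the compiler laid out memory, this may now print
     * garbage instead of 1013.2 - or the program may already have crashed */
    printf("pressure[0] = %f\n", pressure[0]);
    return 0;
}
```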
There are many standards around described by acronym+number, but essentially they amount to procedures for ensuring that code does what is expected, usually by ensuring, through a process of recording issues, reviewing changes and doing appropriate testing, that it has an acceptable design and contains few errors. What is your evidence that there is no acceptable procedure for climate models?
The Easterbrook paper says in its conclusions:
|
|
|
Post by curiousgeorge on Apr 6, 2010 14:29:53 GMT
curiousgeorge, I did not reference the longevity to validate the development process; I referenced it to point out that the error count of 0.3 errors per 10,000 lines of code more than likely would have included the sorts of errors you referred to, as they are the sorts of errors that get picked up when you move from platform to platform and from compiler to compiler. For example, a buffer overflow on one machine will silently overwrite another array with garbage, and on another will cause a fatal error. There are many standards around described by acronym+number, but essentially they amount to procedures for ensuring that code does what is expected, usually by ensuring, through a process of recording issues, reviewing changes and doing appropriate testing, that it has an acceptable design and contains few errors. What is your evidence that there is no acceptable procedure for climate models? The Easterbrook paper says in its conclusions:

I didn't say there was no evidence of acceptable procedure; I said I couldn't find any documentation to support adherence to recognized standards, which I think would be of value in making the case for believing the predicted outcomes of these models, which are being used to justify enormous expenditure and redistribution of wealth on a global scale, radical modification of living standards, population reduction, abandonment of fossil fuels, etc., etc. If I had some model which I expected to use as justification to literally change the world, you can bet I would have every single duck I could find lined up and marching in step, and there would be a brass band advertising it. That isn't happening, from what I can tell.
|
|
|
Post by steve on Apr 6, 2010 15:37:01 GMT
curiousgeorge,
OK. Well whether or not you could find any documentation, it appears that an external expert with experience with space flight software is more than happy with the procedures.
But given the way that models are validated (through comparison with elements of the real world), arguably the most important, and certainly the most interesting, documentation is the validation documentation, which is the results published in scientific papers. The most perfectly designed and structured model is uninteresting if it predicts an ice age next Christmas. The worst-designed piece of code, written with lots of GOTO statements and recursive loops in one giant subroutine, that manages to predict weather and climate for the next 2 months would be very interesting, though its design would probably make adding new science very hard.
Also, do you believe the results from interplanetary space missions even though you have never seen the documentation for the space craft or their instrumentation?
|
|
|
Post by poitsplace on Apr 6, 2010 17:04:25 GMT
Good examples of the 2% of validation done after changes in atmospheric components would be projections of warming done in the 1970s and 1980s that were followed by sequentially warmest decades, cooling following Pinatubo, stratospheric cooling and increases in humidity. Though for many phenomena, the climatological changes are hard to determine due to poorer observations in the past.

It is always amazing to me...even though I know what deficiency causes the problem...when people do things like this. Yes, they made a prediction for the 80s and 90s...and then at the end of the 90s they suddenly discovered the PDO, and the temperature increase leveled off NOT where the CO2 forcing hypothesis said...but where an ocean-current dominated model said. Before this period you people literally have nothing. There are literally no explanations of the numerous flip-flops of the Holocene. Also, the behavior relative to CO2 during the glacial stages simply doesn't rule out CO2 forcing with absolute certainty. Most importantly, it sure as heck doesn't support it in any way.
|
|