
Reproducibility problems in the natural sciences

After reading my news article on the replication crisis, Mikael Wolfe writes:

While I’m sure there is a serious issue about replication in social science experiments, what about the natural sciences? You use the term “science” even though you don’t include natural sciences in your piece. I fear that climate and other science deniers will use your piece as ammunition that peer-reviewed science is “junk” and therefore no action on climate change and other environmental problems is warranted.

My reply: In climate science it is difficult to do any real replication because we can’t rerun the global climate, so any replications will necessarily be very model based. Regarding the natural sciences more generally, there have been many high-profile replication failures in biology and medicine. In biology it can be notoriously difficult for people from one lab to replicate studies from other labs. Finally, regarding your last point: I don’t think that uncertainty should be a reason for doing nothing. After all, we are uncertain about what countries might attack us, but that does not stop us from spending money on national defense. We should be able to acknowledge uncertainty and still make decisions.


  1. Bob says:

    Who cares who uses what as ‘ammunition’?

    That’s a very revealing turn of phrase. For example, say you had a ‘fear’ that ‘science deniers’ (whatever they are) would misuse a piece of information. Would you be tempted to suppress that piece of information? Falsify a result? Take steps to stop it being ‘peer reviewed’ and hence capable of being dismissed?

    Put differently, given the problems with peer review, academic incentives, prevalence of activists and true believers vs truth seekers, what are the chances that ‘climate science’ (whatever that is) is immune? That the issues raised in this blog exist in every area of science except this one?

    If you discovered it was not immune, would you speak up for the truth or not?

  2. PG says:

    I am not sure if your analogy with national defense works.
    1. Wars have been happening since forever, we have countless cases to learn from. Human caused climate change never happened before.
    2. Wars are caused by humans only, while climate change has a significant natural component.
    3. Costs of war are well understood, costs of climate change are based on projections. There are hardly any benefits of war, while there might be benefits of climate change.

    In a nutshell these two cases are marked with significantly different uncertainty.

    • Andrew says:


      I’m not sure how much human caused climate change has happened before, but human caused environmental change has definitely happened, associated with deforestation, agriculture, etc. Also, I don’t know that the costs of war are so well understood. In any case, I agree that the two cases are different. The general point is that we need to make decisions under uncertainty.

  3. Matt Skaggs says:

    “Climate change” is actually an engineering problem known as a “control volume” (can also be considered a “control mass”) in that all inputs and outputs in and out of a black box are estimated. There are very well-established methods in engineering for evaluating control volumes.

    The model predictions could indeed be “replicated.” But there is only one way to do it, since as Andrew pointed out, we cannot re-run the climate, nor do we have a laboratory mock-up. Engineers usually have a mock-up – a test stand – that can be used for experimentation.

    Climate models could be validated with direct measurement data. The problem is that it is really hard to do and we are only beginning now to have enough useful data to even start doing anything. As a specific example, a key intermediate step in global warming theory is outward migration of the “CO2 Pause,” essentially the height above the earth where CO2 concentration goes to zero. That could be detected with satellites. One problem is that we don’t have data from before the onset, but even data showing the predicted change within the satellite era would help a lot. AGW theory also involves changes in magnitude of energy at specific wavelengths, which can be measured. “Capacitance,” the engineering term for the energy stored within the system, can be measured by estimating ocean heat content. If we had all this data, we could start to build the confidence that climate scientists claim to already have.

    • gec says:

      > the confidence that climate scientists claim to already have

      I think this is missing the point Andrew was trying to make—it is possible to have certainty about the best course of action to pursue separately from the degree of certainty associated with the components of evidence used to motivate that decision. We can be reasonably certain of needing at least X many troops, tanks, bombs, etc. not because we are certain we are going to have a battle that needs exactly that many to win, but because it is clear that the consequences of *not* having at least X many of those things are far worse than the consequences of having more than we needed.

      When you are using the term “confidence”, it is not that most scientists are supremely confident in the precision of their predictive models, it is that they are confident in the courses of action that should be taken since the consequences are so potentially dire. Decision making involves combining predictive uncertainty with expected payoffs, and sometimes those payoffs are enough to outweigh uncertainty.
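As a rough illustration of the point above about combining predictive uncertainty with payoffs, here is a minimal sketch of an expected-loss comparison. The function name `expected_loss` and every number in it are hypothetical, chosen only to show the arithmetic; none of them come from the discussion or from any climate or defense estimate.

```python
def expected_loss(p_bad, loss_if_unprepared, cost_of_preparing, residual_loss=0.0):
    """Return (expected loss if we act, expected loss if we wait).

    p_bad              -- assumed probability of the bad outcome
    loss_if_unprepared -- assumed loss if it happens and we did nothing
    cost_of_preparing  -- assumed up-front cost of acting
    residual_loss      -- assumed loss that remains even if we acted
    """
    act = cost_of_preparing + p_bad * residual_loss
    wait = p_bad * loss_if_unprepared
    return act, wait


# Hypothetical inputs: the probability is very uncertain, the payoffs are lopsided.
for p_bad in (0.05, 0.2, 0.5):
    act, wait = expected_loss(p_bad, loss_if_unprepared=100.0,
                              cost_of_preparing=4.0, residual_loss=10.0)
    choice = "act" if act < wait else "wait"
    print(f"p_bad={p_bad:.2f}  act={act:.1f}  wait={wait:.1f}  -> {choice}")
```

With these made-up payoffs the preferred choice is the same across a tenfold range of the probability, which is the sense in which one can be fairly sure of a decision without being sure of the prediction. Different payoffs could of course reverse the conclusion.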

    • Adrian says:

      Isn’t this more of a Decision Theory problem than one related to replication? I don’t think the data can provide a clear answer on the possible causal mechanisms related to climate change (particularly since we can’t rerun the climate) but we can try to see if the past predictions have been accurate and, if not, how have climate models been wrong in the past.

      My understanding so far (this is far from being my expertise) is that, if anything, their past predictions regarding climate-related outcomes have been more optimistic than what’s actually happened in these last few years (Although I’m not sure about their predictions on other outcomes such as mortality, migration and GDP growth).

      • Matt Skaggs says:

        “I don’t think the data can provide a clear answer on the possible causal mechanisms related to climate change (particularly since we can’t rerun the climate)”

        No, that is just not true. If there were sufficient data, it absolutely could be used to determine whether the surmised processes are actually operative.

        You can’t rerun a Concorde crash either, but there are powerful analytical approaches that can determine why it crashed beyond any reasonable doubt.

        To wade into Decision Theory when you can just go get the data…not my idea of good science practice.

        • Martha (Smith) says:

          Matt said,
          ” If there were sufficient data, it absolutely could be used to determine whether the surmised processes are actually operative….
          To wade into Decision Theory when you can just go get the data…not my idea of good science practice.”

          Please elaborate on what you think the needed data are, and how they could be obtained. Your statement “when you can just go get the data” sounds like you think that there are not great obstacles to getting the data. Is that indeed the case? If that is not what you intend, please clarify what you were trying to say.

        • Adrian says:

          Hi Matt,

          I don’t think I agree you can’t rerun the Concorde crash all over again. Every launch that was carried out under comparable conditions and after carrying out the changes suggested by the inquiry on the crash could be regarded as a “rerun”, or at least a test. I don’t see how the same can plausibly hold for climate change research.

          I also don’t understand why it would be an issue to use Decision Theory (or perhaps another optimization approach) to make decisions on how to set policy to deal with climate change. Ultimately, there’s an optimization involved here. I also think one should distinguish between setting policy and doing science: any climate change policy will (obviously) be grounded on science, but the mere exercise of choosing a policy doesn’t involve using the scientific method (it’s not an experiment after all, and furthermore, while it may be grounded on scientific evidence, this doesn’t mean that evidence will be the only criterion used to decide on a climate policy).

          • Mikhail Shubin says:

            I guess the better analogy would be analyzing a Concorde crash when this Concorde was the first and only heavier-than-air vehicle we had ever seen and it crashed on its first flight. This is a much harder problem, even assuming you have all the data, know all the relevant laws of physics, and have the technical documentation.

  4. Martha (Smith) says:

    “Finally, regarding your last point: I don’t think that uncertainty should be a reason for doing nothing.”

    Agreed. Uncertainty is a fact of life. We can either accept it and take it into account when making decisions, or we can ignore it and give up on trying to make the world a better place. I advocate the former.

  5. yyw says:

    I don’t think any reasonably informed person advocates doing nothing. Rather it’s disagreement on how much resource as well as how, when, and where to allocate the resource given the uncertainty. Using national defense as an analogy, the Soviet Union likely would have had a much better outcome by shifting most of its military spending towards economic growth. Arguably, the US today could benefit from less military spending too.

    • Joshua Brooks says:

      “Rather it’s disagreement on how much resource as well as how, when, and where to allocate the resource given the uncertainty.”

      Which necessarily moves into another area of uncertainty. We can’t really evaluate (IMO) resource allocation without deep knowledge about the balance of negative to positive externalities w/r/t differential allocation of resources for different pathways forward. For example, if we devote resources to renewable generation instead of protecting fossil fuel access (i.e., spending on wars in the ME to keep oil flowing), what will the relative benefits be?

      IOW, the disagreement about resource allocation is imbedded in deep uncertainty. Yet that deep uncertainty doesn’t prevent people from having high levels of confidence when they determine where resources should be allocated.

      Which suggests that differences about resource allocation are more about “who you are” than “what you know”.

  6. yyw says:

    It’s no coincidence that the problems of replication have mostly come from biology and medicine, the two fields with the most measurement issues in natural science. There is quite a bit of junky research even in hard STEM, but most of it seems to be just useless (except for advancing careers) rather than harmful.

  7. Michael Nelson says:

    If someone isn’t persuaded to believe in climate change by hundreds of scientists and thousands of studies, is an oblique reference in the NYT really going to be what pushes them over the edge to becoming a denier? “You know, that graph of global temperatures against historical carbon emissions almost had me, but then someone showed me this article about the replication crisis and now I realize global warming is just a giant hoax.” Not to diminish Andrew’s rhetorical skills. :)

    • bop says:

      Really? That was the bullet point you came up with to summarize the supposedly indisputable evidence for human-induced climate change?

      Do you also think vaccines cause autism?

  8. jim says:

    I read a lot of natural science literature (geology / biology / paleo sciences / environmental issues). There isn’t a reproducibility problem in the natural sciences.

    The approach is quite a bit different. Acquiring data can be very expensive and time consuming and usually there is a very strong rationale and methodology at the proposal stage. Frequentist statistics aren’t as frequent :) and r² < 0.8 for regression is considered weak. There's lots of mathematical talent in fields like atmospheric sciences and geophysics, so there are fewer people torturing data with procedures they don't understand. Also, there are many different approaches to backing out an answer to a problem, so there's not all that much in the way of "let's redo this experiment" kind of work. Even when the original work is viewed as suspect, often the challenge to it will be from a different angle, rather than an effort at an exact reproduction of a previous experiment.

    In climate science, however, a group of people outside academia frequently challenge work by leading research groups by trying to exactly reproduce results and, in the process, have uncovered a number of significant errors. It doesn't appear that there's a general sloppiness problem: a lot of this work requires tremendous expertise and is very difficult to do well in the first place, let alone to reproduce on the basis of a description in a paper.

    There are other problems though, some similar to and some different than social science. Like social sciences, work in the natural sciences frequently requires a variety of assumptions. This is a key issue in long term forecasts, since the assumptions basically determine the result.

    And while there is often lots of data in the natural sciences – apparently contrary to popular belief – there is also an incredible amount of noise and variation in many phenomena (ENSO for example), making them difficult to both identify and predict, even on a short-term basis, and generating a fair amount of controversy.

    Because there is usually more than one approach to a problem, methods are frequently a source of controversy.

    All in all, there is still a fair amount of researcher degrees of freedom.

    • Stevec says:

      The actual problem is more complicated. Peer reviewed climate science demonstrates:
      a) AGW (anthropogenic global warming) is real, so therefore if we keep burning fossil fuels the world will continue to get warmer, and
      b) everything else following is unclear

      For example, have droughts increased globally in the last 50 years? Sheffield & Wood (2008) say they have decreased. Aiguo Dai (2010) says they have increased.

      How about tropical storms in the last 100 years? The Special Report on Extremes (SREX) by the IPCC (2012) examines the claims – there are authors saying increase, and authors saying decrease. SREX sides against an increase.

      What about the future for storms with increasing temperatures? Kerry Emanuel says they will increase. Thomas Knutson and a cast of luminaries say there is no theory of tropical cyclone genesis in different climates, and explain the flaws in Emanuel’s theory.

      And so on.

      Obviously (almost) no one reads the important reports; people instead rely on journalists who haven’t read the reports either.

      The most bizarre part of commentary on “climate change” is the unquestioned idea that all propositions can be lumped together with “good people” on one side and “deniers” on the other side.

      • The main takeaway from climate studies is that the future can be very different from the past, and the uncertainty about what will happen increases with more energy trapped in the environment.

        Inherently, predicting which things will or will not happen is extremely difficult. Nevertheless, we have to give more weight to the possibilities that were previously unlikely. This widening of the distribution of possibilities affects decision making even if the average thing that happens doesn’t change at all. For example the average earthquake is so small we can’t feel it, there are hundreds of magnitude 2 or less earthquakes daily. Nevertheless, we design buildings to protect lives during quakes like Northridge even though they happen less than once every few decades.
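The claim above, that a wider distribution changes decisions even when the average stays put, can be checked with a few lines of arithmetic. This is only a sketch: it assumes a normal distribution, and the threshold of 3 and the standard deviations of 1.0 and 1.5 are hypothetical values, not estimates of any climate quantity. The helper `tail_prob` is likewise an illustrative name.

```python
import math


def tail_prob(threshold, mean, sd):
    """P(X > threshold) for X ~ Normal(mean, sd), via the complementary error function."""
    z = (threshold - mean) / (sd * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


# Same mean in both cases; only the spread differs (hypothetical values).
narrow = tail_prob(threshold=3.0, mean=0.0, sd=1.0)
wide = tail_prob(threshold=3.0, mean=0.0, sd=1.5)

print(f"narrow: {narrow:.5f}")
print(f"wide:   {wide:.5f}")
print(f"ratio:  {wide / narrow:.1f}")
```

Under these assumptions, raising the standard deviation by half makes the extreme event more than ten times as likely while the mean is unchanged, which is why designing for rare events depends on the spread and not just the average.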

      • Bob says:

        Absolutely right. I’d also echo what jim said above: “…assumptions basically determine the result.” The simulations which make the news are begging the question: they assume there is a ‘tipping point’ that determines the behavior of the system, then trumpet the fact that the system has changed.

        I just find it strange that “peer review” is invoked as validation of a position when there are so many discussions here about how the peer review process is broken.

        Also, a lot of climate change activism is a vehicle for anti-capitalism. If there were no such thing as climate change a lot of activists would be agitating for the very same policies, just with a different justification.

  9. jim says:

    “I fear that climate and other science deniers will use your piece as ammunition that peer-reviewed science is “junk” and therefore no action on climate change and other environmental problems is warranted.”

    Incidentally, it doesn’t appear that anyone needs Andrew’s piece to prevent action on climate change. Here in WA, “Sunny Jay”, the self-proclaimed Climate Governor and Climate Presidential Candidate, couldn’t even pass a climate bill with a favorable legislature and billionaire backing (note his position in the Presidential polls as a measure of public enthusiasm for his Climate-First agenda). The same thing happened in Oregon in much more dramatic fashion.

  10. Peter Dorman says:

    Since the topic of climate change has come up again, I will make the same two points I have made in the past:

    1. The fundamental science of climate change follows directly from the well-established knowledge we have of the relevant physics, geology, chemistry and biology. Arrhenius outlined the problem correctly at the turn of the 20th century with almost no empirical evidence. It is a logical inference, and if anthropogenic climate change were to be a mistake we would need strong arguments for why the logic doesn’t apply. To be blunt: the onus is on those who doubt ACC to explain why the biogeochemical story shouldn’t be accepted.

    2. Individual studies on which the pertinent earth history and parameterization of climate models are based are often open to dispute, but what is striking in most of the literature is the extent to which different types of evidence converge. I don’t know if there is a formal epistemological theory that endorses this, but I believe that the variety of evidence has great independent importance as justification for belief. The few areas of climate science where this convergence is lacking, e.g. the role of methane in the carbon cycle (such as how much of it is stored in marine deposits and what proportion has been or could be mobilized), demonstrate how convergent research is in the rest of it.

    • Joshua says:

      “To be blunt: the onus is on those who doubt ACC to explain why the biogeochemical story shouldn’t be accepted.”

      How do we determine who has what obligation? Those who doubt the GHE, or whether it poses a serious long term risk, don’t need to do anything to keep resisting policies to mitigate any risks. They don’t need to convince anyone else that their interpretation of the evidence is correct, to remain steadfast in their opposition to mitigation. And as long as they can continue to block the implementation of substantial mitigation policies, they don’t particularly care which arguments other people do and don’t accept.

      The onus, if there is one, is on those who disagree with them, to either convince them that they’re wrong (which won’t happen, because people don’t decide on this issue based on evaluation of the evidence – this is about “who you are,” not “what you know”), or to win the ability to create policies through superior political power, or to address the polarization by other means (e.g., stakeholder dialogue, sociocratic decision-making, participatory democracy). Otherwise, AGW will not be addressed until such time that the signal of climate change against the noise of variability in weather is so unambiguous that the impact can’t be avoided, even by people in relatively rich countries. (Of course, given the warming in the pipeline, by that point it may well be too late to do anything about it.)

    • jim says:

      ” Arrhenius outlined the problem correctly at the turn of the 20th century with almost no empirical evidence. ”

      This statement isn’t accurate.

      “In his calculation Arrhenius included the feedback from changes in water vapor as well as latitudinal effects, but he omitted clouds, convection of heat upward in the atmosphere, and other essential factors.” (Wikipedia).

      The modern question of *how much* global warming hinges *entirely* on the amplification of the effects of CO2 by feedback effects such as changes in water vapor (which Arrhenius did include in his temp estimates), albedo, plant response, arctic methane release etc. Some of these are well constrained (water vapor), some are almost completely unconstrained (methane hydrates).

      • Peter Dorman says:

        We seem to disagree on what “outlined” means. It is quite correct that Arrhenius ignored many pertinent factors back around 1900; how could it have been otherwise? It’s coincidental that his calculations are roughly on the mark (so far); his downside errors approximately cancel out his upside errors.

        But we certainly agree on the issue of how much global temperatures are likely to increase dependent on carbon loading and other factors. And yes to methane hydrates.

        My comment was directed against those who dispute a detailed study or two and think, by doing so, they are making a credible argument against AGW. They aren’t. I’m also trying to push a different framing of why the majority of us who aren’t specialists should regard AGW as real. What we usually see are statements about the percentage of climate scientists who agree with it (the “scientific consensus”), but I’m not comfortable with that approach. Sometimes the consensus really is wrong. I think a better framing is in terms of (a) the logic of the biogeochemical carbon cycle and its role in earth history, and (b) the tremendous diversity of evidence for the fundamentals of climate change, if not the specifics of all its drivers.

        (Galileo once said, if science were like hauling grain, a hundred horses would always be better than one. But science, he said, is like a race, and one fast steed is superior to a hundred work horses. We know who he thought the steed was, and in his case he was right.)
