Minds open brains not falling out?

First of a sequence of comments on Lomax’s recent blog here on Shanahan’s review of Storms posted in LENR Forum.

Lomax writes:

Ah, Shanahan, obsessed with proof, lost science somewhere back. Science is about evidence, and testing evidence, not proof, and when our personal reactions colour how we weigh evidence, we can find ourselves way out on a limb. I’m interested in evidence supporting funding for research, and it is not necessary that anything be “proven,” but we do look at game theory and probabilities, etc.

I agree with Lomax’s second statement here. Science is exactly about weighing evidence. And I understand the explicitly acknowledged bias: Lomax wants more research in this area. I disagree with the statement that “Shanahan is obsessed with proof”. It would be more accurate to say that Shanahan, both implicitly and explicitly, is looking for a much higher standard of evidence than Lomax. There is no proof in science, but when evidence accumulates to the point that it overwhelms prior probabilities, we think something is probably true. At 99.99% we call it proof. The numbers are arbitrary – some would set the bar at 99.9999% – but this does not matter much because of the exponential way that probabilities combine.

Let us see in detail how this works.

There are, in an exemplary case, three types of error:

  1. Experimental noise: provides a Gaussian probability distribution for results. Marginal results keep the false-positive probability high; for non-marginal results the false-positive probability falls exponentially to a negligible value. The Gaussian distributions from experiments are such that 0.01% or even 0.0001% is easily reached in non-marginal results.
  2. Deliberate error: people are people, and judgments of uprightness don’t help in this: people can do weird things. Say 1%.
  3. Protocol error: something unexpected and unrecognised that makes the assumptions on which the experiment is based break down. This may or may not be obvious to another observer reading the write-up. CCS would be a putative example. Let us say 10%. This error can be either one-off (cat got in at night and did something) or systematic (the batch of fuel we all know to use because it works well is contaminated in a way that breaks the thermocouples).
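As a concrete illustration of how these hand-waving numbers behave, here is a minimal Python sketch. The 1% and 10% figures are the illustrative values from the list above; treating the errors as statistically independent is the key assumption being demonstrated:

```python
# Illustrative sketch (numbers are the hypothetical ones from the text):
# independent error probabilities combine multiplicatively, which is why
# stacked independent checks drive residual doubt down exponentially.
p_deliberate = 0.01  # deliberate error: "say 1%"
p_protocol = 0.10    # protocol error: "let us say 10%"

# Chance that a single result is corrupted by BOTH errors at once:
p_both = p_deliberate * p_protocol  # about 0.001, already below either alone

# Two fully independent replications, each carrying its own 10% protocol
# risk, both going wrong by coincidence:
p_two_protocol = p_protocol ** 2  # about 0.01
```

This is why the benchmark of 0.01% is reachable in principle: each genuinely independent check multiplies in, rather than adding.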

Even a single independent validation by a reputable and completely independent group using a different protocol will reduce the probability of deliberate error to our benchmark 0.01%, but the 10% probability of an unexpected and unknown protocol error gets cut, at best, to 1%. At worst, the systematic error is shared by the validator and the original. In this example the same fuel might be used, for obvious reasons. The same TCs might be used because they are what would normally be used and no-one thought to change this aspect of the protocol. More independent validation helps, but does not deal with the systematic error if the experiments remain very similar. However, confirmatory evidence from completely different experiments or observations, if of similar quality, would get this down to 0.01%.

The numbers here are all hand-waving. And the 99.99% benchmark is arbitrary. If the proposition is:

We live in a simulation and the aliens controlling it have switched the revolution of the earth so now the sun rises in the West and sets in the East

then you’d need better than 99.99% before you felt it was likely (at least I would – I’d go for other things first even though the hypothesis can never be ruled out).

Shanahan has different hand-waving numbers here from Lomax. Personally, I’d agree with Shanahan. But claiming that one of these “how likely is it a priori” estimates is better than another is very difficult. We all have different judgments, and they can in good faith, and without any irrational clinging to bias, lead to radically different views.

Shanahan would require a much higher level of evidence here than Lomax to think the hypothesis probable. There is no inherent merit in requiring a high or low level of evidence, even though one or the other may in fact be accurate. Recalibrating this level is very difficult because the information is so complex, and human feelings and unconscious bias affect judgement. That must be true of Lomax, Shanahan, and anyone else.

Shanahan might also claim, more strongly, that Lomax’s evaluation of the existing evidence, irrespective of his a priori probabilities, is incorrect, or vice versa. That is more amenable to resolution, since the reasons for either claim can be stated and tested.


9 thoughts on “Minds open brains not falling out?”

  1. To a large extent, these questions should be settled by Plan B. Where it’s a matter of proposing that Miles made some errors that Shanahan pointed out, but Miles didn’t accept that those errors were possible, I’d go with Miles. I’m not a specialist in calorimetry or electrochemistry, but the degree of care Miles took to avoid errors and to quantify what errors were left was pretty monumental.

    There’s a story about an Oxford professor who was known for being pedantic, so one of his students asked him what colour was the cow in the field opposite. The professor said “Brown, on the side I can see”. I figure Miles may be a bit that way….

    Our theories are based on observations and the (normally hidden) assumption that if you do exactly the same thing, you’ll get exactly the same result every single time you do it. LENR is somewhat different, in that when people do what they think is exactly the same thing, they get different results. That may mean that it’s all a big error and that people are fooling themselves (see N-rays) or that we aren’t doing exactly the same thing each time. For Miles, all the cathodes were cut from the same block and the cells were as equal as was possible, and the results were nevertheless variable between cells. Despite that variability of performance, the heat was correlated with the Helium produced. Plan B will likely have this variability of performance between cells, yet should produce a better correlation of heat/Helium providing LENR is in fact real.

    The question then will be “what is the variable we didn’t spot”, since at heart we still assume that if we do the same thing we’ll get the same result (is this a valid assumption?). Maybe analysis of the results and samples will provide some answers, but it’s hard to predict here simply because we don’t know what we don’t know. That means that with the Plan B results we still won’t know what it is we are supposed to be looking for as the hidden variable, and though it’s possible that a long contemplation of all the data may provide that insight, it’s also possible (and more likely) that we won’t see what was different between the cells. My personal hunch is that there will be a subtle (and critical) level of impurity that will make the difference, but then I’ve spent a long time with semiconductors where a few ppb can affect the results. Impurities are more mobile than you’d think, too, and atoms can be shifted around by electrical currents – when you look at things on an atomic level they aren’t as stable as we think from everyday experience. Dislocations and atoms move with stresses, and the conditioning process of the cathodes is intended to put them under a lot of stress. Are we looking for a geometrical variable in the lattice or specific impurities, or a combination of those, or something else?

    As I see it, Plan B should start a whole new phase, and we can largely dismiss the arguments about what errors did or did not happen in earlier experiments. Since there seems to be consensus amongst the successful LENR researchers as to how to get a good proportion of working cathodes, there should be no problem such as no excess heat at all (so perfect correlation between heat and Helium…) and we should have some good data to chew over. Since Plan B is financed and going to run, maybe chewing over the old experiments is a bit like picking old chewing-gum off a school desk of uncertain vintage. Some things just weren’t documented since people didn’t know what was or might be critical. Where P+F or Miles maybe didn’t state certain things since everyone in the business knew it was SOP, I expect Plan B to document everything they can think of.

    I recall Dirac’s comment about quantum theory – ” it’s crazy, but is it crazy enough?”. Get the rules sorted out though, and after a while we’ll make some sense of them and why the rules are what they are (though that still hasn’t happened with QM – it’s still crazy). Still, with LENR the first task seems to be to prove that it happens to a high-enough probability that most people accept that it is real.

    1. The major variable is reasonably well-known, for PdD experiments. Surface structure. Not only do various batches of palladium behave differently, as to the evolution of surface structure, but that structure does change with loading and deloading. The history of the metal matters. An electrochemical cathode will attract every anion in the solution, so it’s true that impurities may loom large, but the most likely source of Nuclear Active Environment is structure that grows — and can be lost — during the conditioning of the cathode, which can be a long process. It is with this part of his theory that I think Storms is likely right. It’s surface cracks. However, other similar structures might work, cracks are merely ubiquitous, and if a particular size crack is needed, perhaps for some resonance, it’s plausible.

      The helium work shows that the reaction is at the surface: if it were taking place in the bulk from high loading (the original Pons and Fleischmann idea), the helium would be trapped. Helium is only mobile in palladium within a grain boundary, so it can (partly) escape from the surface, but it is trapped along grain boundaries within the metal and will stay there for a very long time; that’s been shown by loading palladium with tritium and then observing the behavior of the generated 3He as the tritium decays. Helium also will not enter palladium (that’s how it’s trapped in grain boundaries), though obviously it could enter a large crack.

      The first part of Plan B is to nail the science, the reality of the effect, by increasing precision on heat and helium measurements and generating many more samples from controlled experiments (as many as practical being identical).

      1. Thanks Abd and Simon.

        The thrust of this post is: how can we avoid bad judgement through wrong calibration (in either direction) of our willingness to listen to anomaly-driven hypotheses such as LENR?

        The thrust of the comment on Plan B here is easy to agree with. And I also think it probable that if LENR is a real nuclear effect, both excess heat vs 4He and excess heat on its own, measured carefully, would give incontrovertible results. If it is some weird larger-than-expected chemical effect then excess heat, measured correctly, would do this.

        We are all here mostly agreed about CCS, but I’ll summarise the issue as I see it from the point of view of someone with a relatively skeptical filter on anomaly-driven hypotheses. You might view this as a relatively unskeptical attitude towards undiscovered new anomaly mechanisms.

        The issue with excess heat and CCS is easily resolved (but it is not clear that this has been done in past experiments). I’m going to define CCS here in its most general form, as Shanahan did, although this is often forgotten. CCS supposes some difference in cell temperature profile caused by conditions changing in the cells. The different temperature profile inevitably modulates heat loss paths and therefore changes the cell characteristics from those found when calibrating.

        The key insight here is that such an error (as a percentage) scales with the total cell power, not the excess power. Since in typical good-quality LENR results total power is much larger than excess power, this is significant. Also worth noting is that temperature is not heat. It requires nothing other than some change in physical conditions between active and control cases to make CCS happen. It is very difficult to rule that out.
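This scaling argument can be put in numbers. A hedged sketch follows, with hypothetical figures (no particular experiment is being modelled):

```python
# Hypothetical illustration: a calibration constant shift (CCS) is a
# percentage error on the TOTAL cell power, so a small shift can mimic
# a large fraction of a small claimed excess.
total_power_w = 20.0   # total electrical power into the cell (assumed)
excess_power_w = 1.0   # claimed anomalous excess (assumed)
ccs_shift = 0.02       # a 2% calibration constant shift (assumed)

# The spurious "excess" produced by the shift:
apparent_excess_w = ccs_shift * total_power_w            # 0.4 W
# As a fraction of the claimed excess signal:
fraction_of_signal = apparent_excess_w / excess_power_w  # 40% of the signal
```

So a 2% calibration error, negligible against total power, can account for nearly half of a 5% excess-heat claim; this is why the total-versus-excess distinction matters.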

        There is then the possible issue of ATER as a mechanism: this would be a way to change temperature profile in the cells (in all cases), or a way to get absolute changes in heat in an open cell (if not measured). The merit of ATER is that it is unexpected, so might be overlooked, and could have unusual surface-condition triggers that match observations. Other mechanisms could change temperature profiles: for example bubble formation. These have less merit because they would be commonplace and less likely to break normal assumptions of electrochemists doing calorimetry. But, because we are in anomaly territory, the best way to eliminate CCS as a hypothetical explanatory anomaly is to kill it at root. It is inherently weak to say that CCS cannot work here because we have not thought of a precise mechanism, even though we cannot rule it out. It is also unnecessary. CCS can, relatively easily, be ruled out. In which case, if this can be done, it strengthens the evidence for LENR.

        In line with this post I’d say that strengthening the evidence for a hypothesis must always be useful until that hypothesis is clearly accepted by everyone. Arguing that evidence is strong enough already and those who disagree are biased is not helpful.

        With a firm and skeptical (in the sense of examining every assumption carefully) approach these matters can be put to bed. When anomalies are reported I always try to simplify things, and the best simplification here is to make calorimetry provably independent of in-cell temperature profile to within the claimed calorimetry error. That means looking carefully at the integrity of isothermal surfaces and thermal gaps and bounding the irregularities (if isoperibolic) or simply minimising the total heat loss (if mass flow). As Marwan et al. pointed out, a low-loss mass-flow calorimetry setup can make measurements immune to any possible CCS, since CCS cannot possibly have a larger effect than the total heat loss.

        To my knowledge this has not been done, but it could be done. Since 10% heat signals are easy to obtain, and <1% total-heat-loss mass-flow calorimetry is entirely possible, that as Plan B would put to rest the question of whether a well-above-chemical-level anomaly exists.

        There is then the more complex matter of 4He measurement and what excess heat 4He correlations bring to the party.

        1. Tom – the thing about the calibration constant shift is that it can be tested for by putting a heat-source in various locations inside the cell and seeing whether the calibration constant does shift. I do not understand, though, why the CCS applies to the total power put into the cell when most of that power will be dissipated in exactly the same place (resistances, electrolyte) and any anomalous power generation in some other location will be only a proportion of the total power. IIRC, in any case Miles checked the calibration constant using a resistor heater, and checked that in different locations.

          An anomaly that could give a false signal would be something like a phase-change (or re-ordering of the crystal lattice) but that would give a one-off gain or loss of heat and would be limited to chemical energy levels. A change of the specific heat for the electrolyte would also produce that, if we postulate some previously-unknown equivalent of a phase-change. Again it would be limited to chemical energy levels.

          ATER (at the electrode recombination) does seem to be reaching for any explanation other than nuclear. Covered by Abd in http://coldfusioncommunity.net/if-im-stupid-its-your-fault/ section 7, and the reason why it would be noticed is stated there. However, bubbles rise vertically in the electrolyte, and the only way I can see those bubbles of D2 and O2 recombining without going pop is if a bit of the anode (Platinum) breaks off and floats on the surface of the electrolyte. Somewhat noticeable for a start, and again the dearth of gas would be noticed as well as the little blue light dancing around on the surface.

          Since however we’re in the region of removing as much as possible of the ground behind those goalposts so they can’t be moved further back, I would suggest using a bubble-barrier between the anode and cathode. Two concentric tubes of tissue-paper (or preferably glass fibre) should suffice. The gases can’t then touch each other while they are bubbles in the electrolyte, and the space between the two tubes will be a no-bubble zone. This will have some effect on the circulation of the electrolyte and may itself have an effect on the calibration.

          Similarly, using an array of resistive heaters inside the cell will allow the experimenter to move the heat-source around in the cell without needing to dismantle or move anything, and it should be able to be shown that the location of the heat has no measurable effect. Of course you won’t convince someone who says “if you’d moved that resistor half an inch to the left you’d have seen the shift”, but Oh Well….

          Although I think the evidence is strong enough, this is obviously not true for a lot of people, thus making the evidence stronger is important if we want to convince more people. Including you, Tom…. As such, though it’s impossible to remove all possible causes of error (almost by definition we won’t know all of them), we should try to remove all the causes that real sceptics can think up.

          The 4He measurements should be handled blind in the same way Miles did it, so the people doing the analyses don’t know which flask to mark as high or low Helium content.

          After that, if the excess heat and Helium correlate as we hope, and the heat is greater than chemical, we’re back to where we were after Miles’ experiment except that this time far fewer people will be able to discount it as experimental error.

  2. Thanks for the tags Abd.

    It will help discussion if we can disentangle two distinct things: How strong is the evidence? And what strength of evidence is needed?

    In this post I’m not addressing How strong is the evidence? although as I indicated in the first sentence I will want to say some things about that later. I am suggesting that people of good will and acting scientifically can reach very different conclusions as to the strength of evidence necessary for any specific hypothesis to be thought likely.

    I also understand the point that here we have differential hypotheses: any hypothesis must be supported if there is no alternative that fits the facts. Of course, in my example case here, there are alternatives.

    I don’t see, from Shanahan in his arguments that I have read, any problematic lack of care in examining arguments. I could criticise such examination from anyone, including myself: we just don’t have time, and we all cherry-pick the things that stand out as important. I don’t see an inability to engage seriously with contrary arguments.

    I do see that anyone looking at these questions will start with a level of evidence they need to entertain specific hypotheses, and for some it will be lower than others. There is no easy way to work out what is the right level, and clearly we must implicitly have such a level in order to judge the evidence.

    I think some of the arguments about LENR come from people with different levels of evidence required looking in a broadly similar way at evidence but judging it differently.

    This point is the least interesting thing you can say about this topic. Maybe it is also the most important, if the unscientific accusations of bad faith and bad science are to be reduced.

    1. It will help discussion if we can disentangle two distinct things: How strong is the evidence? And what strength of evidence is needed?

      Indeed. However, those are both secondary questions. The first thing is always “What is the evidence?” Yes, in considering evidence, we will consider “strength” but this can tend to be more subjective. The strong tendency is to deprecate evidence we don’t like, reasoning from conclusions; and then the next question has a lost performative: “needed for what.” Below, you write, “for any specific hypothesis to be thought likely.” Likely for whom? What I have seen, over and over, are arguments that assume that “likely” is some kind of physical fact, when it’s obviously a human judgment, which would vary with the knowledge and assumptions of the judge.

      In this post I’m not addressing How strong is the evidence? although as I indicated in the first sentence I will want to say some things about that later. I am suggesting that people of good will and acting scientifically can reach very different conclusions as to the strength of evidence necessary for any specific hypothesis to be thought likely.

      While I appreciate your intention, “acting scientifically” is undefined. I would take this issue to another level: People of good will may differ, period. However, something else is possible, with adequate good will: we can explore the foundations of our “conclusions.” In interreligious dialog, years ago, I learned a principle: learn enough about a differing point of view, sufficiently to be able to explain it — and arguments for it — well enough that the other person will say, “Yes, that’s what I believe or think.” This does not require agreeing with that explanation as complete or agreeing with the conclusion, but what is seen, over and over, in these seemingly interminable arguments, is that this isn’t done.

      Much “scientific debate” isn’t scientific, it can be more like bulls butting horns.

      I also understand the point that here we have differential hypotheses: any hypothesis must be supported if there is no alternative that fits the facts. Of course, in my example case here, there are alternatives.

      Much of the debate is ontologically unsophisticated. To be sure, training in ontology is not particularly common. You use the words “must be.” According to what? What is compelling that? There is a mode of thought, not uncommon, that purports to require that everything be “logical.” Yet we are animals. We have the capacity to use logic, but are not, in fact, compelled to do so. Logic works well for some purposes and not for others. And we imagine that if we can show a logical flaw in the belief-structure of another, why, of course, they will be convinced! This almost never works, so … is it logical if it rarely works? Something is obviously missing!

      I don’t see, from Shanahan in his arguments that I have read, any problematic lack of care in examining arguments. I could criticise such examination from anyone, including myself: we just don’t have time, and we all cherry-pick the things that stand out as important. I don’t see an inability to engage seriously with contrary arguments.

      I have differing impressions, because I have, I suspect, a longer and deeper history of interaction with Shanahan. It’s not terribly important here, because Shanahan’s possible personal defects are not actually relevant to our main purpose, as I’m seeing it here, to examine the evidence and ideas. They are relevant in a different way: Shanahan is obviously frustrated that few are listening to him, irritated that LENR researchers ignore his ideas, yet he is quite ineffective in communicating with them. I see skeptical researchers participating in the CMNS mailing list, communicating far more effectively. Shanahan is shut out. Why? He seems to imagine that it’s because he’s right, and they don’t like it. That expresses a stand, a position, and it is one that is inherently disempowering if the goal is communication (and then collaboration and cooperation — and progress). There is a path forward, and you may be able to help with it, THH.

      I do see that anyone looking at these questions will start with a level of evidence they need to entertain specific hypotheses, and for some it will be lower than others. There is no easy way to work out what is the right level, and clearly we must implicitly have such a level in order to judge the evidence.

      To attempt to assess the necessary “level of evidence” to entertain a hypothesis, in advance of knowing what evidence exists, seems doomed to likely error, to me. Again, with no clear definition of the standard of “right,” how could agreement be found? The right level to convince you of something is whatever level it takes; isn’t that obvious?

      However, even the word “convince” takes us away from my own understanding of genuine science. It has an operational meaning, not an absolute one, and there are levels of conviction. With cold fusion, our level of conviction has effects on our behavior. If, for example, I am “convinced” that there is no decent evidence for any LENR effect, I’m not at all likely to make the effort to look for what is claimed as evidence. That is, in fact, still within ordinary skepticism, even if it is “wrong,” i.e., that there is evidence that, if I knew it, I might agree is “decent.” I won’t know it, I won’t get that far unless somehow I’m dragged through it.

      It is pseudoskepticism when the “conviction” becomes an offensive certainty that others are wrong.

      I think some of the arguments about LENR come from people with different levels of evidence required looking in a broadly similar way at evidence but judging it differently.

      Not sure I’m parsing that correctly. There are stages in discussion. Often arguments begin over conclusions, not so much over evidence. It is progress in discussion if the parties are looking at the same evidence, but then there is assessment of evidence, comparison, etc., and that can require even deeper work. Here, you have mentioned the strength of evidence. What are objective measures of this? There are some. For an academic discussion, those would best be on the table, I’d think.

      This point is the least interesting thing you can say about this topic. Maybe it is also the most important, if the unscientific accusations of bad faith and bad science are to be reduced.

      You use “unscientific” as if it were a synonym for “bad.” It’s obvious that those accusations are generally not “scientific.” They are human-social and psychological. However, what are the intentions of the participants? Those may dominate the interaction, we can expect.

      1. I don’t think I was adding moral freight. I’m saying what is helpful to facilitate scientific examination of these issues, which is what I’m personally interested in.

        Defining the word scientific is a big issue, which, as you allude, many philosophers have argued. I have very specific personal views on the epistemology here and the framework within which I can understand what counts as scientific argument. No time for that now. The part I expect to be uncontentious is that in science it is helpful never to argue bad faith or intellectual deficiency on the part of the arguer, except in very rare and extreme cases of dishonesty (wholesale fabrication of results). Better to treat all arguments on their merit, without personalisation or assumptions about the quality of the arguer. That is what peer review – blunt instrument though it is – tries to do. And as with democracy, though it is manifestly highly imperfect, we have no better way to do things.

  3. I edited this post slightly to add a More tag (as well as categories noted in my first comment). To improve overall blog presentation, all blog posts should have introductory material to provide a relatively quick sense of what the post is about — or a teaser, sometimes — and then a More tag, available from the edit menu. THH may change or reverse what I’ve done, particularly to edit the introductory material. This creation of hierarchy for readers is important to my concept of where CFC will go.

  4. Thanks, THH, and welcome. I have added categories. I think you can change those if you don’t like them.

    I stated that the chances of the heat and helium results matching by chance were less than one in a million. Miles calculated, for his data, one in 750,000, but there has been confirmation. I would less conservatively estimate the probability of error, at this point, on “correlation” (the value of the ratio is not so clear) at something approaching one in a billion. You wanted 99.99%? Perhaps we have it.
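For what it’s worth, the quoted odds translate directly into the post’s earlier 99.99% benchmark. Only the 1-in-750,000 figure is from the comment above; the rest is arithmetic:

```python
# Converting Miles's stated odds of a chance heat/helium match into a
# confidence level, to compare against the 99.99% benchmark in the post.
p_chance_match = 1 / 750_000      # Miles's calculated odds for his data
confidence = 1 - p_chance_match   # about 0.9999987, i.e. 99.99987%

# The single-study figure alone already clears the 99.99% bar:
assert confidence > 0.9999
```

Whether the confirmations then justify the less conservative one-in-a-billion estimate is a separate judgment, but the direction of the arithmetic is clear: independent confirmations multiply the odds against chance.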

    The big worry with cold fusion is the file drawer effect. I am not aware of any helium studies where decent data is available that show evidence contrary to there being a correlation. We could go over all the studies we have.

    There is the Morrey collaboration. It had problems. Pons and Fleischmann gave them a poor sample, with much less XE than they had been reporting, and their as-received material had substantial surface helium, as if it had accidentally been implanted along with the deliberately ion-implanted material.

    Miles remains the best data set for studying correlation, because it is a reasonably large set of samples, with mostly the same protocol used. There are obvious ways to improve that, but we are, in these studies, considering what already exists. It’s an exercise. I have little doubt that we will know better soon.

    As to Shanahan and “proof,” Shanahan is holding the idea that LENR is extremely unlikely, and therefore “proof” — i.e., very strong evidence — is needed. Needed for what?

    I’m looking at a BBC video on cold fusion from 1994. It misstates — as is quite common — what Pons and Fleischmann found and ultimately claimed. By asserting that the claim is fusion as found in the Sun, it is dead from the beginning. That’s obviously false!

    Instead, there are two claims. The first is a claim of anomalous heat. That is not a question for nuclear physics, and the absence of major neutron radiation was never evidence of the absence of heat. Pons and Fleischmann actually claimed an “unknown nuclear reaction,” and “nuclear” was rooted in their chemistry expertise: to them, this was not chemistry. Fleischmann later said he regretted the nuclear claim: it was premature, speculative.

    But so are Shanahan’s theories. This is what we have: many, many reports of anomalous heat, hundreds of papers covering this in mainstream peer-reviewed journals. Shanahan is claiming systematic artifact, but, if we look more closely, he is claiming a systematic anomaly, generally unexpected recombination. Then he uses other, sometimes Rube Goldberg, ad hoc explanations for other observed effects. I agree that recombination should remain on the table for some work, but for some work, it’s weak, inadequate.

    Then there is helium. Helium was recognized as very important, early on. The main and most cogent theoretical argument against the heat being more than artifact was the missing ash. Helium was very much not expected. (As Huizenga emphasized, no gammas.) After noting the significance — superlatively — Huizenga then expected that Miles would not be confirmed, like much cold fusion work.

    But Miles was confirmed. To me, it is quite obvious where the preponderance of the evidence lies, and, THH, I don’t see that you have done more at this point than read Shanahan and consider him reasonable.

    This is an example of the long-time CF researcher’s complaint of moving goalposts. I do understand the skeptical effort to rule out all competing hypotheses, but I’m not seeing any competing hypothesis for the correlation, only for heat by itself, generally of the nature of “they must be making some mistake, i.e., Garwin theory,” and with no experimental evidence backing it, or the obvious for helium itself, leakage from ambient, which also doesn’t match the experimental results. Unless we can figure out how a hydrogen cell will not leak helium while a deuterium cell will. And how, in some studies, helium can rise rapidly above ambient. The nice little series of three experiments reported by Apicella et al. made no effort to exclude ambient helium, but measured rise above it. (Krivit, not realizing this, went completely bananas re-analyzing that data. It made no sense, so he accused Violante of altering data and generally demonstrated his own incompetence.)
