Ah, Shanahan, obsessed with proof, lost science somewhere back. Science is about evidence, and testing evidence, not proof, and when our personal reactions colour how we weigh evidence, we can find ourselves way out on a limb. I’m interested in evidence supporting funding for research, and it is not necessary that anything be “proven,” but we do look at game theory and probabilities, etc.
I agree with Lomax’s second statement here. Science is exactly about weighing evidence. And I understand the explicitly acknowledged bias: Lomax wants more research in this area. I disagree with the statement that “Shanahan is obsessed with proof”. It would be accurate to say that Shanahan, both implicitly and explicitly, is looking for a much higher standard of evidence than Lomax. There is no proof in science, but when the evidence is strong enough to overwhelm the prior probabilities we think something is probably true. At 99.99% we call it proof. The numbers are arbitrary – some would set the bar at 99.9999% – but this does not matter much because of the exponential way that probabilities combine.
Let us see in detail how this works.
There are, in an exemplary case, three types of error:
- Experimental noise: gives a Gaussian probability distribution for results. With marginal results the false positive probability stays high. With non-marginal results it falls off exponentially to a negligible value: the Gaussian distributions from experiments are such that 0.01% or 0.0001% is easily reached.
- Deliberate error: people are people, and judgments of uprightness don’t help in this: people can do weird things. Say 1%.
- Protocol error: something unexpected and unrecognised that makes the assumptions on which the experiment is based break down. This may or may not be obvious to another observer reading the write-up. CCS would be a putative example. Let us say 10%. This error can be either one-off (cat got in at night and did something) or systematic (the batch of fuel we all know to use because it works well is contaminated in a way that breaks the thermocouples).
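The exponential fall-off for experimental noise can be sketched numerically. This is only an illustration, assuming the noise really is Gaussian and that a "result" means a signal some number of standard deviations above zero; the function name `gaussian_tail` is my own.

```python
import math

def gaussian_tail(z: float) -> float:
    """One-sided probability that pure Gaussian noise exceeds z standard deviations."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Marginal results (1-2 sigma) leave a large false positive probability;
# non-marginal results (4-5 sigma) push it below the 0.01% and 0.0001% benchmarks.
for z in (1.0, 2.0, 3.0, 4.0, 5.0):
    print(f"{z:.0f} sigma: false positive probability {gaussian_tail(z):.2e}")
```

The other two error types do not shrink this way with a cleaner signal, which is why they end up dominating.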
Even a single independent validation, by a reputable and completely independent group using a different protocol, will reduce the probability of deliberate error to our benchmark 0.01%, but the 10% probability of an unexpected and unknown protocol error is cut, best case, to 1%. Worst case, the systematic error is shared by the validator and the original. In this example the same fuel might be used, for obvious reasons. The same thermocouples might be used because they are what would normally be used and no one thought to change this aspect of the protocol. More independent validation helps, but does not deal with the systematic error if the experiments remain very similar. However, confirmatory evidence from completely different experiments or observations, if of similar quality, would get this down to 0.01%.
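To see how the three error types combine, here is a rough sketch using the hand-waving figures above. It assumes the error sources are independent of each other and that experimental noise stays at 0.01% throughout; both are my simplifications, and the scenario labels are only for illustration.

```python
import math

def any_error(probabilities):
    """Probability that at least one error source is present, assuming independence."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)

NOISE = 1e-4  # assumed: a non-marginal result, so noise is at the 0.01% benchmark

# Each tuple is (noise, deliberate error, protocol error).
scenarios = {
    "single experiment": (NOISE, 0.01, 0.10),
    "validated, best case": (NOISE, 1e-4, 0.01),
    "validated, shared systematic error": (NOISE, 1e-4, 0.10),
    "completely different experiments": (NOISE, 1e-4, 1e-4),
}

for name, probs in scenarios.items():
    print(f"{name}: {any_error(probs):.4%}")
```

On these numbers the protocol error dominates until the confirming evidence comes from a genuinely different kind of experiment.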
The numbers here are all hand-waving. And the 99.99% benchmark is arbitrary. If the proposition is:
We live in a simulation and the aliens controlling it have switched the revolution of the earth so now the sun rises in the West and sets in the East
then you’d need better than 99.99% before you felt it was likely (at least I would – I’d go for other things first even though the hypothesis can never be ruled out).
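The dependence on the prior can be made concrete with Bayes' rule in odds form. This is a sketch under a strong assumption: that a true hypothesis would always produce the observed evidence, so the likelihood ratio is simply one over the false positive probability. The priors are invented for illustration.

```python
def posterior(prior: float, false_positive: float) -> float:
    """Posterior probability of a hypothesis after evidence with the given
    false positive probability, assuming a true hypothesis always yields it."""
    likelihood_ratio = 1.0 / false_positive
    odds = prior / (1.0 - prior) * likelihood_ratio
    return odds / (1.0 + odds)

# The same "99.99%" evidence (false positive probability 0.01%) under different priors.
for prior in (0.5, 1e-3, 1e-6):
    print(f"prior {prior:g}: posterior {posterior(prior, 1e-4):.4f}")
```

With an even prior the evidence is close to decisive. With a one-in-a-million prior the same evidence leaves the hypothesis at roughly 1%.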
Shanahan has different hand-waving numbers here from Lomax. Personally, I’d agree with Shanahan. But claiming that one of these “how likely is it a priori” estimates is better than another is very difficult. We all have different judgments, and they can, in good faith and without any irrational clinging to bias, lead to radically different views.
Shanahan would require a much higher level of evidence here than Lomax to think the hypothesis probable. There is no inherent merit in requiring a high or low level of evidence, even though one or the other may in fact be accurate. Recalibrating this level is very difficult because the information is so complex, and human feelings and unconscious bias affect judgement. That must be true of Lomax, Shanahan, and anyone else.
Shanahan might also claim, more strongly, that Lomax’s evaluation of the existing evidence, irrespective of his a priori probabilities, is incorrect, or vice versa. That is more amenable to resolution, since the reasons for either claim can be stated and tested.