Random news

The preprint on subcritical reactors I had written about two months ago has finally appeared on arxiv. Apparently it was “on hold” for that long because the primary category (applied physics) was not the right one, and the moderators thought “instrumentation and detectors” was better. Honestly, I am not sure they are right, given how theoretical the paper is, but I am just glad the paper appeared (I would have just preferred it to be faster).

I also have one article written with Vincent Vennin and one interview by Philippe Pajot in the latest edition of La Recherche. Both are, of course, in French. This edition, with a new format, features many interesting articles, in particular an interview with the president of the Max Planck Society.

The latest La Recherche on newsstands

Sustaining the chain reaction without criticality

I have found a neat nuclear physics problem, and have written a draft article about it. Below I explain how I came to be interested in this problem, give some context most physicists may not be familiar with, and explain the result briefly.

A while back, I took a small part in the organization of the French public debate on radioactive waste management. My work was not very technical, but got me interested in the rich physics involved. The debate itself didn’t allow me to do real research on the subject: it was certainly not what was asked of me, and there was already so much to learn about the practical details.

After the debate ended, I kept on reading about the subject on the side. In particular, I read more on advanced reactor designs, where I found a neat theoretical question which I believe was unanswered.

The sound of quantum jumps

Before I finally go on holiday, I put on arxiv an essay on quantum jumps, or rather on collapse models, that I initially submitted to the FQXi essay contest.

In this essay/paper, I just make a simple point which I have made orally for years at conferences. Every time, people looked quite surprised and so I thought it made sense to write it down.

The argument is simple enough that I can try to reproduce it here. Collapse models are stochastic non-linear modifications of the Schrödinger equation. The modification is meant to solve the measurement problem. The measurement problem is the fact that in ordinary quantum mechanics, what happens in measurement situations is postulated rather than derived from the dynamics. This is a real problem (contrary to what some may say): the dynamics should say what can be measured and how, and it makes no sense to have an independent axiom (which could also bring contradictions). Decoherence explains why the measurement problem does not bring contradictions for all practical purposes but, again contrary to what some may say, it certainly does not solve the measurement problem. So the measurement problem is a real problem and collapse models provide a solution that works.

The stochastic non-linearity brought by collapse models creates minor deviations from the standard quantum mechanical predictions (which makes sense, since the dynamics has been modified). This is often seen, paradoxically perhaps, as a good thing, because it makes the approach falsifiable. It is true that collapse models are falsifiable. What is not true is that collapse models modify the predictions of quantum mechanics understood broadly. This is what is more surprising, sometimes seems to contradict the previous point, and is the subject of my essay.

How is it possible? Collapse models are non-linear and stochastic, so surely ordinary quantum mechanics cannot reproduce that? But in fact it can. As was understood when collapse models were constructed in the eighties, the non-linearity of collapse models, which is useful to solve the measurement problem, has to vanish upon averaging the randomness away. Since we have no a priori access to this randomness, all the things we can measure in practice can be deduced from a linear equation, even in the context of collapse models. This linear equation is not the Schrödinger equation, but one that does not preserve purity, the Lindblad equation. However, it is also known that by enlarging the Hilbert space (essentially assuming hidden particles), Lindblad dynamics can be reproduced by Schrödinger dynamics. Hence, the predictions of collapse models can always be reproduced exactly by a purely quantum theory (linear and deterministic) at the price of enlarging the Hilbert space with extra degrees of freedom. Collapse models do not deviate from quantum theory, they deviate from the Standard Model of particle physics, which is an instantiation of quantum theory. Even if experiments showed precisely the kind of deviations predicted by collapse models, one could still defend orthodox quantum mechanics (not that it would necessarily be advisable to do so).
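To make the averaging step concrete, here is a minimal sketch for a generic collapse-type equation. The single Hermitian collapse operator A, the rate \gamma, the environment state |e_0\rangle and the choice \hbar = 1 are illustrative assumptions, not the specific model discussed in the essay.

```latex
% Stochastic, non-linear evolution of the state (Ito form), with W_t a Wiener process:
\mathrm{d}|\psi_t\rangle = \Big[ -iH\,\mathrm{d}t
  + \sqrt{\gamma}\,\big(A - \langle A\rangle_t\big)\,\mathrm{d}W_t
  - \frac{\gamma}{2}\big(A - \langle A\rangle_t\big)^2\,\mathrm{d}t \Big]|\psi_t\rangle,
\qquad \langle A\rangle_t = \langle\psi_t|A|\psi_t\rangle .

% Averaging over the noise, \rho_t = \mathbb{E}\big[\,|\psi_t\rangle\langle\psi_t|\,\big],
% the terms containing \langle A\rangle_t cancel and a linear (Lindblad) equation remains:
\frac{\mathrm{d}\rho_t}{\mathrm{d}t} = -i[H,\rho_t]
  + \gamma\Big( A\rho_t A - \tfrac{1}{2}\{A^2,\rho_t\} \Big).

% The same \rho_t can be obtained from unitary dynamics U_t on an enlarged Hilbert space:
\rho_t = \mathrm{Tr}_E\Big[ U_t \big(\rho_0 \otimes |e_0\rangle\langle e_0|\big) U_t^\dagger \Big].
```

The first line is non-linear through \langle A\rangle_t, but any statistics accessible without knowing the noise realization depend only on \rho_t, which obeys the linear second line.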

Collapse models are still useful in that they solve the measurement problem, which is an ontological problem (what the theory says the world is like or what the world is made of). However, the empirical content of collapse models (what the theory predicts) is less singular than one might think. In the essay, I essentially make this point in a more precise way, and illustrate it on what I believe is the most shocking example, the sound of quantum jumps, borrowed from a paper by Feldmann and Tumulka. It doesn’t make sense to write more here since I will end up paraphrasing the essay, but I encourage whoever is interested to read it here.

The skyscraper and pile of dirt approaches to QFT

Quantum field theory (QFT) is the main tool we use to understand the fundamental particles and their interactions. It also appears in the context of condensed matter physics, as an effective description. But it is unfortunately also a notoriously difficult subject: first because it is tricky to define non-trivial instances rigorously (this has not been done for any QFT realized in Nature), and also because, even assuming it can be done, it is then very difficult to solve them and extract accurate predictions.

There is a subset of QFTs where there is no difficulty: free QFTs. Free QFTs are easy because one can essentially define them in a non-rigorous way first, physicist style, then “solve” them exactly, and finally take the solution itself as a rigorous definition of what we actually meant in the first place. Then, to define the interacting theories, the historical solution has been to see them as perturbations of the free ones. This comes with well-known problems: interacting theories are not as close to free ones as one would naively think, so the expansions one obtains are weird. They diverge term by term, and if the divergences are subtracted in a smart way (renormalization), the expansions still diverge as a whole.
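The divergence "as a whole" can be seen in a standard zero-dimensional toy model, which is an assumption of this sketch and not a real QFT: the ordinary integral Z(g) = ∫ dx/√(2π) exp(-x²/2 - g x⁴). Expanding in g gives coefficients that grow factorially, so the partial sums first approach the exact value and then run away. The function names below are illustrative.

```python
import math

def series_term(n, g):
    """n-th term of the expansion of Z(g) in powers of g.

    Uses E[x^(4n)] = (4n-1)!! for a standard Gaussian, so the term is
    (-g)^n (4n-1)!! / n!.  Intended for moderate n (here n <= 40) so that
    the ratio still fits in a float.
    """
    double_factorial = math.factorial(4 * n) // (2 ** (2 * n) * math.factorial(2 * n))
    return (-g) ** n * (double_factorial / math.factorial(n))

def partial_sum(g, order):
    """Perturbative approximation of Z(g) truncated at the given order."""
    return sum(series_term(n, g) for n in range(order + 1))

def exact_integral(g, half_width=10.0, n_steps=20_000):
    """Z(g) by the trapezoidal rule; the integrand is negligible beyond |x| = 10."""
    h = 2 * half_width / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * math.exp(-0.5 * x * x - g * x ** 4)
    return total * h / math.sqrt(2 * math.pi)

if __name__ == "__main__":
    g = 0.01
    exact = exact_integral(g)
    for order in (1, 3, 7, 15, 25, 40):
        error = abs(partial_sum(g, order) - exact)
        print(f"order {order:2d}: error = {error:.3e}")
```

For g = 0.01 the ratio of consecutive terms is g(4n+1)(4n+3)/(n+1), which exceeds 1 from n = 7 onwards: truncating around that order is optimal, and adding more terms only makes things worse.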

Far more than you need to know about the nuclear fuel cycle and the associated recycling rate

Spent nuclear fuel from French power reactors is recycled. Some argue that this recycling recovers 96% of the fuel: the industry is virtuous and exemplary! Others say, on the contrary, that recycling only allows 1% to be reused: recycling is bullshit! Who is right?

I think this controversy is a good pretext to explain the associated physics, which is interesting. My goal is to explain in detail the back end of the fuel cycle for its own sake, the resolution of the controversy then being a trivial corollary. Along the way, it is an opportunity to learn a bit more about the civil and military history of nuclear power, the principle of a nuclear reactor, and the subtleties of the different isotopes of uranium and plutonium.

Nice quantum field theory videos

These days I am trying to improve my understanding of quantum field theory with as little perturbation theory as possible. I came across videos from a workshop at IHES in Bures-sur-Yvette on Hamiltonian methods for QFT and videos from a semester at the Newton Institute in Cambridge, which both happened about a year ago. Both events are quite well filmed (especially at IHES), most presentations are made on a blackboard, and most talks I checked were well explained and interesting, so I definitely recommend them.

The workshop at IHES could have been called 50 shades of \phi^4_2, since many talks try to find the critical point of the theory with more or less elaborate methods (8 loop perturbation theory, and various non-perturbative Hamiltonian methods). I recommend in particular the talk of Joan Elias Miró on renormalized Hamiltonian truncation methods, which I found very clear and interesting. There are also nice tensor network talks by the usual suspects (Mari Carmen Bañuls, Frank Pollmann, Guifré Vidal, Karel Van Acoleyen, Philippe Corboz). Finally there is an intriguing talk by Giuseppe Mussardo on the sinh-Gordon model.

The semester at the Newton Institute was clearly geared more towards mathematics, with important emphasis on modern probabilistic approaches, starting from the stochastic quantization of Euclidean field theories. The semester opens with 4 really amazing lectures by Antti Kupiainen on the renormalization group (supplemented by lecture notes). He works with Euclidean \phi^4 in all dimensions, on the lattice and in the continuum limit, and explains everything that can happen. He distinguishes very well the IR scaling limit and UV continuum limit problems, the various fixed point structures, the easy and hard problems, many issues which had always been quite confused in my mind. It’s a pleasure to listen to people who understand what they are doing. There is another talk, more like a work in progress, where Martin Hairer attempts the stochastic quantization of Yang-Mills (which starts from a quite original explanation of what a gauge theory is!). I have not had much time to check the other talks, but the whole program looks really interesting (with a lot of different ways to define rigorously \phi^4_3). I watch these while ironing my shirts, so I will know more at the next laundry.

Great work from friends

My smart friends have been doing great work recently and I think it deserves attention.

I Understanding deep neural networks theoretically

Jonathan Donier, who now works for Spotify in London after a PhD in applied maths in Paris, has put a series of 3 fundamental articles on theoretical machine learning:

1) Capacity allocation analysis of neural networks: A tool for principled architecture design arXiv:1902.04485
2) Capacity allocation through neural network layers arXiv:1902.08572
3) Scaling up deep neural networks: a capacity allocation perspective arXiv:1903.04455

In these papers, Jonathan defines and explores the notion of capacity allocation of a neural network, which formalizes the intuitive idea that some parts of a network encode more information about certain parts of the input space. The objective is to understand how a given architecture of network manages to capture the structure of correlations in the input. Ultimately, this should allow one to go beyond fuzzily grounded heuristics and expensive trial and error in order to design networks with a topology adapted to the problem right from the start.

Jonathan very progressively builds up the theory from basic definitions to non-trivial scaling prescriptions for deep networks. The first paper defines the capacity rigorously in the simplest settings and deals mostly with the linear case. The second one considers special non-linear settings where the capacity analysis can still be carried out exactly and where one gets insights about the decoupling role of non-linearity. The final one puts all the pieces together and, among other things, allows one to rigorously recover many initialization prescriptions for deep networks that were known only from heuristics. This super quick summary does not do justice to the content: this series of papers is, in my opinion, a major advance in the theoretical understanding of deep neural networks.

II Making measurements crystal clear in Bohmian mechanics

Dustin Lazarovici, who is now a philosopher of physics in Lausanne after a PhD in mathematical physics in Munich, has put online a very clear paper explaining how position measurements work in Bohmian mechanics and what their relation with particle positions is.

Position Measurements and the Empirical Status of Particles in Bohmian Mechanics arXiv:1903.04555

Dustin is perhaps one of the people with the clearest mind on foundations, and on Bohmian mechanics in particular. The notion of measurement in Bohmian mechanics is usually so deeply misunderstood that Dustin’s concise explanation is a great reference for anyone interested in these questions. I particularly enjoyed the very end, where the link (or lack of link) with consciousness is precisely discussed. I think it exemplifies what useful work by philosophers of physics can be like: not muddying the waters (as physicists usually think philosophers do) but sharpening the reasoning to save physicists from their own confusion.

III Popularizing tricky mathematical notions

Antoine Bourget, who is now a postdoc at Imperial College in London, after a postdoc in Oviedo and a PhD at ENS in Paris (in the same office as me), has put a series of pedagogical videos on Youtube, through his account Scientia Egregia.

The videos are in French, and I recommend in particular the dictionnaire entre algèbre et géométrie. Antoine starts with many simple examples to show the subtleties and motivate the definitions. He explains very well how one constructs mathematical notions to fit a certain intuition, a certain purpose, and thereby manages to make “obvious” really non-trivial concepts. Go check his videos so that he gets pressure to make more.

Heisenberg’s final theory

Last month, I was at Foundations 2018 in Utrecht. It is one of the biggest conferences on the foundations of physics, bringing together physicists, philosophers, and historians of science. A talk I found particularly interesting was that of Alexander Blum, from the Max Planck Institute for the History of Science, entitled Heisenberg’s 1958 Weltformel & the roots of post-empirical physics. Let me briefly summarize Blum’s fascinating story.

In 1958, Werner Heisenberg put forward a new theory of matter that, according to his peers (and to every physicist today), could not possibly be correct, failing to reproduce most known microscopic phenomena. Yet he firmly believed in it, worked on it restlessly (at least for a while), and presented it to the public as a major breakthrough. How was such an embarrassment possible given that Heisenberg was one of the brightest physicists of the time? One could try to find the answer in Heisenberg’s personal shortcomings, in his psychology, in his age, perhaps even in his deluded attempt at making a comeback after the sad episode of his work on the Nazi Uranprojekt during World War II. Blum’s point is that the answer lies, rather, in the very peculiar nature of modern physical theories, where mathematical constraints strongly guide theory building.

Heisenberg’s theorizing was made possible by the strong constraints that quantum field theory (QFT) puts on consistency. His goal was to find the ultimate theory not with the help of empirical results (like those coming from early colliders), but from pure theory, with one principle in addition to those of QFT. His idea was to ask for radical monism: deep down, there has to be just one fundamental featureless particle. It has to be spin 1/2 so that integer spin particles can be effectively obtained as bound states. The only non-trivial option is then to add a non-renormalizable quartic interaction term to the free Dirac Lagrangian.


Heisenberg’s Weltformel at the 1958 Planck Centenary in West Berlin. Source: DPA

With only a single fundamental self-interacting spin 1/2 particle, the theory seems far removed from the physics we know. Sure, it could be that all the physics we know, with leptons, hadrons, and electromagnetic forces, could be obtained effectively, from non-trivial bound states made from this fundamental particle. It could be, but most likely it is not the case and so Heisenberg’s crazy conjectures should be easy to disprove. But here comes the catch: the theory is non-renormalizable, and there existed no reliable way to extract predictions from it at the time. It is impossible to falsify something that is not even consistent in the weakest sense available. Heisenberg could argue: maybe the theory is just non-renormalizable at the perturbative level, maybe the singular behavior of the propagator is just a feature of the free theory… Heisenberg could exploit the fact that there were strong doubts about the consistency of QFT anyway, with the Landau pole, and Dyson’s argument about the necessary divergence of perturbative expansions.

Interestingly, it is partly to conclusively disprove Heisenberg’s proposal that rigorous approaches to quantum field theory were developed. Working at the same institute as Heisenberg but deeply skeptical of his theory, Harry Lehmann, Kurt Symanzik, and Wolfhart Zimmermann laid the basis of axiomatic field theory. The Källén-Lehmann (K-L) spectral representation theorem, showing as a corollary that an interacting propagator cannot be more regular than a free propagator, provided a no-go theorem disproving Heisenberg’s speculations.
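For reference, here is the textbook form of the K-L representation for a scalar field. The scalar case and the normalization are assumptions made for simplicity; Heisenberg’s field was a spinor, for which the structure is analogous but carries extra matrix structure.

```latex
% Interacting two-point function as a positive superposition of free propagators of mass \mu:
\Delta(p^2) = \int_0^\infty \mathrm{d}\mu^2\; \rho(\mu^2)\, \frac{i}{p^2 - \mu^2 + i\epsilon},
\qquad \rho(\mu^2) \ge 0 .
```

The positivity of the spectral density \rho comes from the positivity of the norm on the Hilbert space. Because all contributions enter with the same sign, they cannot cancel each other at large momentum, so the interacting propagator cannot fall off faster than the free one, which behaves as 1/p^2.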

But Heisenberg could fight back. It was understood at the time that Quantum Electrodynamics contained (at least in some formulations) quantum states with negative norm, the so-called “ghosts”. Maybe such ghosts could be exploited to bypass the K-L theorem, yielding cancellations of divergences in the expression of the interacting propagator. This speculation led to an intense fight with Wolfgang Pauli in 1957, the “battle of Ascona”. Pauli argued that ghosts, if exploited in this fashion, would never “stay in the bottle”, and would necessarily make the theory inconsistent. After 6 weeks of intense work, Heisenberg came up with a toy model that contained ghosts and yet had a unitary S-matrix (hence consistent in the sense required).

So Heisenberg’s theory was not easy to unequivocally kill, which of course does not make it correct. Heisenberg tried extracting predictions from his theory using new (unreliable) approximation methods, giving essentially random results. Hence he had no option but to fall back on beauty, the only justification for his theory being its radically simple starting point. Nothing ever came from his line of research, which no one pursued after him. Blum ended his talk with a timely warning: One still needs to beware of falling into Heisenberg’s trap.

In a previous post, I made the simple point that theoretical physicists put too much trust in notions of beauty and mathematical simplicity because of survival bias: we remember the few instances in which it worked, but forget the endless list of neat constructions by excellent physicists that eventually proved empirically inadequate. I did not know of Heisenberg’s theory, but I gladly add it to the list.

Blum’s talk was a teaser for an article he told me is about to be finished. More generally, his study of Heisenberg’s Weltformel is the first step in an inquiry into theorists’ attempts at coming up with a theory of everything from post-empirical arguments (see a well explained description of his group program). This is a timely research program.

One does not need to think too hard to see the obvious parallel between Heisenberg’s story and current attempts at coming up with a theory of everything (or of quantum gravity). One easily finds popular theories that are not manifestly fitting known physics but also not obviously not fitting known physics. They could be correct, but we cannot know for lack of proper non-perturbative tools. Should we trust them only because they are so hard to conclusively disprove and obey some (quite subjectively) appealing principles?

Update 30/05/2019: There is now a book about this story.

Through two doors at once

I have really enjoyed Anil Ananthaswamy’s latest book: Through two doors at once: The Elegant Experiment That Captures the Enigma of Our Quantum Reality. It is very well written and one reads through it like a novel. But, most importantly, it gets the physics right, and the subtleties are not washed away with metaphors. Accurate and captivating, the book strikes a balance rarely reached in popular science books.

The foundations of quantum mechanics is a difficult branch of physics, and almost every narrative shortcut that was invented to convey its subtlety is, strictly speaking, a bit wrong. Further, foundations is an unfinished branch of physics: different groups of experts disagree about what the main message of quantum mechanics is and what should be done to make progress in understanding. This makes it hard to popularize the subject without writing incorrect platitudes or pushing one orthodoxy.

Anil’s strategy is to use the simplest experiment illustrating quantum phenomena: the double slit experiment. He discusses the results and shows why they are so counter-intuitive. However, the simple double slit experiment is not enough to get to the bottom of the mystery. Anil thus very progressively refines the experimental setup to gradually add the subtleties that prevent naive stories from explaining away the weirdness of quantum theory. As in a police investigation, Anil interviews the experts of the main interpretations of quantum mechanics, and guides the reader through the explanations they give for each setup. The reader can then decide for herself which story she finds most appealing.

Crucially, I think the different interpretations are presented fairly. Anil does not take a side. I personally much prefer “non-romantic and realist” interpretations of quantum theory: I find accounts of the world where stuff simply moves, be it with non-local laws of motion, far more convincing than alternatives (where there are infinitely many worlds, or where “reality” has a subjective nature). The “realist” view is well represented in the book (which is rare, because it is not “hype”), but I was not annoyed by the thorough discussion of the other possibilities. More radical proponents of one or the other interpretation may however be annoyed by this attempted neutrality.

Anil’s writing style is very enjoyable. He does not make the all too common mistake of using cheap metaphors, which are dangerous in the context of quantum mechanics, where they provide a deceptive impression of depth and understanding. In this book, you actually learn something. Sure, you do not become an expert in foundations, but you get an accurate sense of what motivates researchers in the field. This is nice in itself, and also a good starting point if you want to keep on digging with a more specialized book. Even though I already knew the technical content of the book, I found the inquiry captivating. I definitely recommend Through two doors at once, especially to my friends and family who want to quickly yet genuinely understand the sorts of questions that drive me.

Disclaimer: I have provided minor help for the rereading of an almost finished draft of the book.

Survival bias and the non-empirical confirmation of physical theories

Survival Bias


drawing by McGeddon

During World War II, the US military did statistics to see where its bombers got primarily damaged. The pattern looked like the picture on the right. The first intuition of the engineers was to reinforce the parts that were hit the most. Abraham Wald, a statistician, realized that the distribution of impacts was observed only for the airplanes that actually came back from combat. Those that were hit somewhere else could probably not even make it back home. Hence it is precisely where the planes seemed to be the least damaged that adding reinforcements was the most useful! This famous story illustrates the problem of survival bias (or survivorship bias).

Definition (Wikipedia): Survival bias is the logical error of concentrating on the people or things that made it past some selection process and overlooking those that did not.

Survival bias is the reason why we tend to believe in planned obsolescence and more generally why we sometimes feel nostalgia for a golden age that never existed. “Back in the days, cars and refrigerators were reliable, unlike today! And back then, buildings were beautiful and lasted forever unlike the crap they construct today!”

But actually none of this is true. Most refrigerators from the sixties stopped working in a few years and the very few that still function today are just in the 0.1% that made it. The same goes for cars, which are more reliable than they used to be: the vintage cars we see around show an impressive number of kilometers, but only because they are part of the infinitesimal fraction that miraculously survived. Finally, most buildings in earlier centuries were poorly constructed, lacking both taste and resistance. Most of them collapsed or got destroyed and this is why new buildings now stand in place of them. The few old monuments that remain are still there precisely because they were particularly beautiful and well constructed for the time. More generally, the remnants of the past we see in our everyday life are not a fair sample of what life used to be. They are, with rare exceptions, the only things that were good enough to not be replaced.


from the great xkcd

Survival bias can explain an impressively wide range of phenomena. For example, most hedge funds show stellar historical returns (even after fees) while investing in hedge funds is not profitable on average. This is easy to understand if hedge funds simply have random returns: the hedge funds that lose money after a period go bankrupt or have to downsize for lack of investors, while hedge funds that made money survive and increase in size. The same bias explains why the tech success stories are often overrated and why it seems cats do not get more injured when they fall from a higher altitude (Wikipedia).
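The hedge fund argument is easy to check numerically. The sketch below is a toy model under loud assumptions: yearly returns are independent Gaussians with zero mean, and a fund closes as soon as its value falls below 90% of its starting capital. The function name, the closure rule and all parameter values are hypothetical, chosen only for illustration.

```python
import random
from statistics import mean

def simulate_funds(n_funds=10_000, n_years=10, vol=0.15, closure_level=0.9, seed=0):
    """Compare the average yearly return of all funds with that of survivors.

    Every fund draws zero-mean random returns, so no fund has any skill.
    Returns are drawn for the full horizon even for funds that close, only
    so that the unbiased benchmark over all funds can be computed.
    """
    rng = random.Random(seed)
    all_averages, survivor_averages = [], []
    for _ in range(n_funds):
        returns = [rng.gauss(0.0, vol) for _ in range(n_years)]
        value, survived = 1.0, True
        for r in returns:
            value *= 1.0 + r
            if value < closure_level:  # hypothetical closure rule
                survived = False
                break
        average = mean(returns)
        all_averages.append(average)
        if survived:
            survivor_averages.append(average)
    survival_rate = len(survivor_averages) / n_funds
    return mean(all_averages), mean(survivor_averages), survival_rate

if __name__ == "__main__":
    all_avg, survivor_avg, rate = simulate_funds()
    print(f"all funds:  {all_avg:+.2%} per year")
    print(f"survivors:  {survivor_avg:+.2%} per year ({rate:.0%} of funds)")
```

An investor who only sees the track records of funds still in business observes the second number, although the expected return of every fund in this model is zero.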

This bias very often misleads us in our daily life. My worry is that it may also mislead us in our assessment of physical theories, especially when we lack experimental data. To understand why, I need to discuss the problem of the “non-empirical confirmation” of physical theories.

Non-empirical confirmation of physical theories

Physicists always use some form of non-empirical assessment of physical theories. Most theories never get the chance to be explicitly falsified experimentally and are just abandoned for non-empirical reasons: it is just impossible to make computations with them or they turn out to violate principles we thought should be universal. As the time between the invention of new physical theories and their possible experimental test widens, it becomes important to know more precisely what non-empirical reasons we use to temporarily trust theories. The current situation of String Theory, which predicts new physics that seems untestable in the foreseeable future, is a prime example of this need.

This is a legitimate question that motivated a conference in Munich about two years ago, “Why trust a theory? Reconsidering scientific methodology in light of modern physics”, which was then actively discussed and reported on online (see e.g. Massimo Pigliucci, Peter Woit and Sabine Hossenfelder). Among the speakers was philosopher Richard Dawid, who has come up with a theory (or formalization) of non-empirical confirmation in Physics, notably in the book String Theory and the Scientific Method.

Dawid contends that physicists so far use primarily the following criteria to assess physical theories in the absence of empirical confirmation:

  • Elegance and beauty,
  • Gut feelings (or the gut feelings of famous people),
  • Mathematical fertility.

I think Dawid is unfortunately correct in this first analysis. The reasons why physicists momentarily trust theories before they can be empirically probed are largely subjective and sociological. This anecdote recalled by Alain Connes in an interview about 10 years ago is quite telling:

“How can it be that you attended the same talk in Chicago and you left before the end and now you really liked it. The guy was not a beginner and was in his forties, his answer was ‘Witten was seen reading your book in the library in Princeton’.”

Note that this does not mean that science is a mere social construct: this subjectivity only affects the transient regime when “anything goes”, before theories can be practically killed or vindicated by facts. Yet, it means there is at least room for improvement in this transient theory building phase.

Dawid puts forward 3 principles, which I will detail below, to more rigorously ground the assessment of physical theories in the absence of experimental data. Before going any further I have to clarify what we may expect from non-empirical confirmation. There is a weak form: we mean by non-empirical confirmation simply a small improvement in the fuzzily defined Bayesian prior we have that a theory will turn out to be correct. This is the uncontroversial understanding of non-empirical confirmation, but one that Dawid deems too weak. There is also a strong form, where “confirmation” is understood in its non-technical sense, that of definitely validating the theory without even requiring experimental evidence. This one, which some high energy theorists might sometimes foolishly defend, is manifestly too strong. Part of the controversy around non-empirical confirmation is that Dawid wants something stronger than the weak form (which would be trivial in his opinion) but weaker than the strong form (which would be simply wrong). However, because it is quite difficult to understand precisely where this sweet spot would lie, Dawid has often been caricatured as defending an unacceptably strong form of his theory.

What we may expect from non-empirical hints is an important question and I will come back to it later. Right now, I ask: can we find good guides to increase our chances to stay on the right track whilst experiments are still out of reach?

Dawid’s principles

  1. No Alternative Argument (aka “only game in town”):
    Physicists tried hard to find alternatives, they did not succeed.
  2. Meta-Inductive Argument:
    Theories with the same characteristics (obeying the same heuristic principles) proved successful in the past.
  3. Unexpected Explanatory Interconnections:
    The theory was developed to solve a problem but surprisingly solves other problems it was not meant to.

These principles are manifestly crafted with String Theory in mind for which they seem to fit perfectly. String Theory is not the only game in town but it is arguably more developed than the alternatives (and arguably more developed than some alternatives I find interesting). String Theory also fares well on the Meta Inductive Argument: it uses extensively the ideas and principles that made the success of previous theories, especially those of quantum field theory. In the course of the development of String Theory, a lot of unexpected interconnections also emerged. Many of them are internal to the theory: different formulations of String Theory actually seem to describe different limits of the same thing. But there are also unexpected byproducts: e.g. a theory constructed to deal with the strong nuclear force ends up containing gravitational physics.

At this stage, one may be tempted to nitpick and find good reasons why String Theory does not actually satisfy Dawid’s principles, possibly to defend one’s alternative theory. However, I am not sure this is a good line of defense and think it draws the attention away from the interesting question: independently of String Theory, are Dawid’s principles a good way to get closer to the truth?

Naive meta check

We may do a first meta check of Dawid’s principles, i.e. ask the question:

Would these principles have worked in the past?
Would they have guided us to what we now know are viable theories?

We may carry out this meta check on the Standard Model of particle physics (an instance of Quantum Field Theory) and General Relativity.

At first sight, both theories fare pretty well. It seems that quantum field theory quickly became the main tool to describe fundamental particles while being the simplest extension of the principles that were previously successful (quantum mechanics and special relativity). Further, quantum field theoretic techniques unexpectedly applied to a wide range of problems including classical statistical mechanics. General relativity also seemed like it was the only game in town, minimally extending the earlier principles of special relativity introduced by Einstein. The question of the origins of the universe, which General Relativity did not primarily aim to answer, was also unexpectedly brought from metaphysics to the realm of physics. I chose simple examples, but it seems that for these two theories, there are plenty of details which fit into the 3 guiding principles proposed by Dawid. The latter look almost tailored to get the maximum number of points in the meta check game.

Fooled by survival bias

As convincing as it may seem, the previous meta check is essentially useless. It shows that successful theories indeed fit Dawid’s principles. But we have looked only at the very small subset of successful theories. It therefore does not tell us that following the principles would have led us to successful theories rather than unsuccessful ones. In the previous assessment, we were being dangerously fooled by survival bias. We looked at the path ultimately taken in the tree of possibilities, focusing on its characteristics, but forgetting that what matters is how it differs from the other possible paths.
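The argument can be phrased with Bayes’ rule. The sketch below is only an illustration: the function name and all the numbers are hypothetical, chosen to show that the probability of success given that a theory fits the principles depends crucially on how often failed theories fit them too, which is precisely what a look at survivors cannot tell us.

```python
def posterior_success(prior, p_fit_given_success, p_fit_given_failure):
    """Bayes' rule: P(theory succeeds | theory fits the principles)."""
    fit_and_success = prior * p_fit_given_success
    fit_and_failure = (1.0 - prior) * p_fit_given_failure
    return fit_and_success / (fit_and_success + fit_and_failure)


# Hypothetical numbers, for illustration only.
PRIOR = 0.05          # assumed fraction of promising theories that turn out right
P_FIT_SUCCESS = 0.9   # assumed: successful theories almost always fit the principles

# Case 1: failed theories fit the principles nearly as often as successful ones.
weak_filter = posterior_success(PRIOR, P_FIT_SUCCESS, 0.8)

# Case 2: failed theories rarely fit the principles.
strong_filter = posterior_success(PRIOR, P_FIT_SUCCESS, 0.1)

print(f"failed theories often fit:  P(success | fit) = {weak_filter:.3f}")
print(f"failed theories rarely fit: P(success | fit) = {strong_filter:.3f}")
```

In the first case the posterior barely moves from the prior, even though 90% of successful theories fit the principles. Only the rate among failures, the quantity the naive meta check never looked at, distinguishes the two cases.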

To really meta check Dawid’s principles, it is important to study failures as well: the theories that looked promising but then were disproved and ultimately forgotten. For obvious reasons, such theories are no longer taught and are thus all too easy to overlook.

A brief History of failures

Let us start our brief History of promising-theories-that-failed with Nordström’s gravity. This theory slightly predates General Relativity and was proposed by Gunnar Nordström in 1913 (with crucial improvements by Einstein, Laue, and others; see Norton for its fascinating history). It is built upon the same fundamental principles as General Relativity and differs only subtly in its field equations. Mainly, General Relativity is a tensor theory of gravity, in that the Einstein tensor G_{\mu\nu}= R_{\mu\nu} - \frac{1}{2} R g_{\mu\nu} is proportional to the matter stress-energy tensor T_{\mu\nu}:

R_{\mu\nu} - \frac{1}{2} R g_{\mu\nu} = \frac{8 \pi G}{c^4} T_{\mu\nu}

Nordström’s theory is a simpler scalar theory of gravity. A single scalar field equation is insufficient to fully fix the metric, so one adds the constraint that the Weyl tensor C_{abcd} is zero. The scalar curvature R is then sourced by the trace T:=T_{\mu}^\mu of the stress-energy tensor:

R = \frac{24  \pi G}{c^4} T

This makes Nordström’s theory arguably mathematically neater than Einstein’s theory. Further, while it brings all the modern features of metric theories of gravity, its predictions are in many cases quantitatively closer to the predictions of Newton’s theory. Finally, for two years, it was the only game in town as Einstein’s tensor theory was not yet finished.

But Nordström’s theory predicts no light deflection by gravitational fields and the wrong value (by a factor -\frac{1}{6}) for the advance of the perihelion of Mercury. The deflection of light had not been measured in 1913 (it was first observed in 1919), and the anomalous advance of Mercury’s perihelion, though already known to astronomers, had no accepted explanation. If we had had to compare Nordström’s and Einstein’s theories with Dawid’s principles, I think we would have hastily given Nordström the win.
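To get a feeling for the size of the discrepancy, here is a rough numerical sketch. It is not taken from the references above: it assumes the standard leading-order General Relativity formula for the perihelion advance per orbit, 6 \pi G M / (c^2 a (1-e^2)), approximate orbital parameters for Mercury, and it simply applies the factor -\frac{1}{6} quoted above to obtain the Nordström value.

```python
import math

# Approximate physical and orbital values (assumptions, rounded).
GM_SUN = 1.32712440018e20   # gravitational parameter of the Sun, m^3/s^2
C = 299_792_458.0           # speed of light, m/s
A_MERCURY = 5.7909e10       # semi-major axis of Mercury's orbit, m
E_MERCURY = 0.2056          # eccentricity of Mercury's orbit
PERIOD_DAYS = 87.969        # orbital period of Mercury, days


def gr_advance_per_orbit(gm, a, e):
    """Leading-order GR perihelion advance per orbit, in radians."""
    return 6.0 * math.pi * gm / (C**2 * a * (1.0 - e**2))


orbits_per_century = 36525.0 / PERIOD_DAYS      # Julian century = 36525 days
rad_to_arcsec = math.degrees(1.0) * 3600.0      # arcseconds per radian

gr_arcsec_per_century = (
    gr_advance_per_orbit(GM_SUN, A_MERCURY, E_MERCURY)
    * orbits_per_century
    * rad_to_arcsec
)

# Factor -1/6 as quoted in the text for Nordström's theory.
nordstrom_arcsec_per_century = -gr_arcsec_per_century / 6.0

print(f"General Relativity: {gr_arcsec_per_century:.1f} arcsec/century")
print(f"Nordström:          {nordstrom_arcsec_per_century:.1f} arcsec/century")
```

With these inputs, the General Relativity value comes out close to the observed anomalous advance of about 43 arcseconds per century, while the Nordström value is smaller in magnitude and has the opposite sign.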

Another example of a promising theory that was ultimately falsified is the SU(5) Grand Unified Theory, proposed by Georgi and Glashow in 1974. The idea is to embed the gauge groups U(1)\times SU(2) \times SU(3) of the Standard Model into the simple gauge group SU(5). In this theory, the 3 (non-gravitational) forces are the low-energy manifestations of a single force. Going towards greater unification had been a successful way to proceed, from Maxwell’s fusion of electric and magnetic phenomena to Glashow-Salam-Weinberg’s electroweak unification. Further, the introduction of a simple gauge group mimics earlier approaches successfully applied to quarks and the strong interaction. The theory of Georgi and Glashow seems to leverage the unreasonable effectiveness of mathematics (a phrase coined by Wigner) in its purest form.

The SU(5) Grand Unified Theory predicts that protons can decay and have a lifetime of ~10^{31} years. The Super-Kamiokande detector in Japan has looked for such events, without success: if protons actually decay, they do so at least a thousand times too rarely to be compatible with SU(5) theory. Despite the early enthusiasm and its high score at the non-empirical confirmation game, this theory is now falsified.
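A back-of-the-envelope estimate shows why such a long lifetime is testable at all. The sketch below is an illustration under stated assumptions: it takes a fiducial mass of 22.5 kilotons of water (an assumed round figure for Super-Kamiokande, not given in the text above), counts 10 protons per water molecule, and ignores detection efficiency and branching ratios, so it gives the number of decays occurring in the tank rather than the number actually observed.

```python
AVOGADRO = 6.02214076e23        # molecules per mole
WATER_MOLAR_MASS_G = 18.015     # grams per mole of H2O
PROTONS_PER_WATER = 10          # 2 (hydrogen) + 8 (oxygen nucleus)
FIDUCIAL_MASS_KTON = 22.5       # assumed fiducial water mass, kilotons


def protons_in_water(mass_kton):
    """Number of protons (free and bound) in a given mass of water."""
    grams = mass_kton * 1e9     # 1 kiloton = 1e9 grams
    return grams / WATER_MOLAR_MASS_G * AVOGADRO * PROTONS_PER_WATER


def expected_decays_per_year(n_protons, lifetime_years):
    """Decay rate N / tau, valid when the lifetime far exceeds one year."""
    return n_protons / lifetime_years


n_protons = protons_in_water(FIDUCIAL_MASS_KTON)
rate_su5 = expected_decays_per_year(n_protons, 1e31)     # lifetime quoted in the text
rate_bound = expected_decays_per_year(n_protons, 1e34)   # a thousand times longer

print(f"protons in the tank:       {n_protons:.2e}")
print(f"decays/year, tau = 1e31 y: {rate_su5:.0f}")
print(f"decays/year, tau = 1e34 y: {rate_bound:.2f}")
```

Under these assumptions, a lifetime of 10^{31} years would mean hundreds of decays per year in the tank, whereas a lifetime a thousand times longer gives less than one per year, which is why years of null results are enough to rule out the original prediction.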

Physics is full of such examples of theoretically appealing yet empirically inadequate ideas. We may also mention Kaluza-Klein type theories unifying gauge theories and gravity, S-matrix approaches to the understanding of fundamental interactions, and Einstein’s and Schrödinger’s attempts at unified theories. We can probably add many supersymmetric extensions of the Standard Model to this list given the recent LHC null results. In many cases, we have theories that fit Dawid’s principles even better than our currently accepted theories, but that nonetheless fail experimental tests. The Standard Model and General Relativity do pretty well in the non-empirical confirmation game, but they would have been beaten by many alternative proposals. Only experiments allowed us to choose the right yet not-so-beautiful path.


Looking at failed theories makes Dawid’s principles seem less powerful than a test on the surviving subset alone suggests. But I do not have a proposal to improve on them. It may very well be that they are the best one can get: perhaps we just cannot expect too much from non-empirical principles. In the end, I am not sure we can defend more than the weakest meaning of non-empirical confirmation: a slight improvement of an anyway fuzzily defined Bayesian prior.

Looking at modern physics, we see an extremely biased sample of theories: they are the old fridge that is still miraculously working. Their success may very well be more contingent than we think.

I think this calls for more modesty and open-mindedness from theorists. In light of the historically mixed record of non-empirical confirmation principles, we should be careful not to put too much trust in neat but untested constructions and remain open to alternatives.

Theorists often behave like deluded zealots, putting an absurdly high level of trust in their models and the principles on which they are built. While this may be an efficient way to obtain funding, it is a suboptimal way to understand Nature. Theoretical physicists too can be fooled by survival bias.

This post is a write-up of a talk I gave for an informal seminar at MPQ a few months ago. As my main reference, I have used Dawid’s article The Significance of Non-Empirical Confirmation in Fundamental Physics (arXiv:1702.01133), which is Dawid’s contribution to the “Why trust a theory” conference.