soft question – Why are mathematical proofs that rely on computers controversial?

The Question :

106 people think this question is useful

There are many theorems in mathematics that have been proved with the assistance of computers, take the famous four color theorem for example. Such proofs are often controversial among some mathematicians. Why is it so?

I my opinion, shifting from manual proofs to computer-assisted proofs is a giant leap forward for mathematics. Other fields of science rely on it heavily. Physics experiments are simulated in computers. Chemical reactions are simulated in supercomputers. Even evolution can be simulated in an advanced enough computer. All of this can help us understand these phenomena better.

But why are mathematicians so reluctant?

The Question Comments :
  • A proof is a social process.
  • The part of a proof you let the computer do is extremely hard to check for correctness (except in simple cases). Software has a pronounced tendency to have bugs. You have to check the programme for correctness, and you have to check the compiler for correctness. And the hardware too, it is not unheard-of that hardware is buggy.
  • The veracity of a conjecture is often less important than the techniques required to demonstrate it. We could, for instance, happily assume the Riemann Hypothesis is true (many already believe it is true on faith alone). Yet doing so does nothing to enhance our understanding of it.
  • @DanielFischer Our heads have a pronounced tendency to have bugs too.
  • @DanielFischer “The part of a proof you let the computer do is extremely hard to check for correctness”. That depends. E.g., Metamath and HOL-Light have a really simple easy to understand trusted kernel with multiple cleanroom implementations in different programming languages, some with independent correctness proofs, which leaves negligible room for incorrectly-approved theorems. See for example section 4.1 in, and, and

The Answer 1

111 people think this answer is useful

What is mathematics? One answer is that mathematics is a collection of definitions, theorems, and proofs of them. But the more realistic answer is that mathematics is what mathematicians do. (And partly, that’s a social activity.) Progress in mathematics consists of advancing human understanding of mathematics.

What is a proof for? Often we pretend that the reason for a proof is so that we can be sure that the result is true. But actually what mathematicians are looking for is understanding.

I encourage everyone to read the article On Proof and Progress in Mathematics by the Fields Medalist William Thurston. He says (on page 2):

The rapid advance of computers has helped dramatize this point, because computers and people are very different. For instance, when Appel and Haken completed a proof of the 4-color map theorem using a massive automatic computation, it evoked much controversy. I interpret the controversy as having little to do with doubt people had as to the veracity of the theorem or the correctness of the proof. Rather, it reflected a continuing desire for human understanding of a proof, in addition to knowledge that the theorem is true.

On a more everyday level, it is common for people first starting to grapple
with computers to make large-scale computations of things they might have
done on a smaller scale by hand. They might print out a table of the first
10,000 primes, only to find that their printout isn’t something they really
wanted after all. They discover by this kind of experience that what they
really want is usually not some collection of “answers”—what they want is

Some people may claim that there is doubt about a proof when it has been proved by a computer, but I think human proofs have more room for error. The real issue is that (long) computer proofs (as opposed to, something simple like checking a numerical value by calculator) are hard to keep in your head.

Compare these quotes from Gian-Carlo Rota’s Indiscrete Thoughts, where he describes the mathematicians’ quest for understanding:

“eventually every mathematical problem is proved trivial. The quest for ultimate triviality is characteristic of the mathematical enterprise.” (p.93)

“Every mathematical theorem is eventually proved trivial. The mathematician’s ideal of truth is triviality, and the community of mathematicians will not cease its beaver-like work on a newly discovered result until it has shown to everyone’s satisfaction that all difficulties in the early proofs were spurious, and only an analytic triviality is to be found at the end of the road.” (p. 118, in The Phenomenology of Mathematical Truth)

Are there definitive proofs?
It is an article of faith among mathematicians that after a new theorem is discovered, other simpler proofs of it will be given until a definitive one is found. A cursory inspection of the history of mathematics seems to confirm the mathematician’s faith. The first proof of a great many theorems is needlessly complicated. “Nobody blames a mathematician if the first proof of a new theorem is clumsy”, said Paul Erdős. It takes a long time, from a few decades to centuries, before the facts that are hidden in the first proof are understood, as mathematicians informally say. This gradual bringing out of the significance of a new discovery takes the appearance of a succession of proofs, each one simpler than the preceding. New and simpler versions of a theorem will stop appearing when the facts are finally understood. (p.146, in The Phenomenology of Mathematical Proof).

In my opinion, there is nothing wrong with, or doubtful about, a proof that relies on computer. However, such a proof is in the intermediate stage described above, that has not yet been rendered trivial enough to be held in a mathematician’s head, and thus the theorem being proved is to be considered still work in progress.

The Answer 2

22 people think this answer is useful

This answer began as a sequence of comments to a great article linked by Joseph Geipel in his answer. Eventually I decided to leave it as an answer in its own right.

It is a great article: I found it to be very balanced and accurate, and they did a good job of finding the right people to interview: thoughtful, veteran mathematicians at several points on the computers/no-computers spectrum. (Well, to be fair the “no computers guy” was rather gentle in his expression of his views. If they had looked hard enough they could probably have found some eminent mathematician to take a more extreme “computers are gross” stance. I didn’t miss hearing that viewpoint!) But I didn’t see anything in the article defending the position that computer-assisted proofs are inherently controversial. I think that is in fact a backward position that very few contemporary mathematicians take and that the profession as a whole has moved past.

My former colleague Jon Hanke makes a great point in the article about not blindly trusting computer calculations. But (as Jon well knows) that point can and should be made much more widely: it applies to any long, difficult calculation or intricate casewise argument for which the full details have not been provided or with which the community of referees chooses not to fully engage.

Let me respond briefly to some of the arguments expressed here (and certainly elsewhere) against computer assisted proofs.

The goal of mathematical proof is increased insight and understanding, and computer-assisted proofs just give us the answer.

There is something to this sentiment, but it is aiming at the wrong target. Mathematics has a venerable tradition of long, intricate computational arguments. Would it be better to have shorter, more conceptual, more insightful proofs instead? Of course it would! Everyone wants that. But proving theorems is really, really hard, and the enterprise of mathematics should be viewed as an accretion of knowledge over time rather than a sequence of unrelated solutions to problems.

“Pioneer work is clumsy” someone famous said: the first proof of an important result is very often not the best one in any way: it is more computational, or longer, or harder, or uses extraneous ideas,…,than the proofs that the community eventually finds. But finding the first proof — even a very clumsy one indeed — is clearly one of the most important steps in the accretion process. Gauss’s first proof of quadratic reciprocity was by induction! But he proved it, and in so doing it opened the door to later and better proofs (including another six or so of his own). Gauss’s Disquisitiones Arithmeticae is full of what to contemporary eyes looks like long, unmotivated calculations: nowadays we can give proofs which are shorter and less computational (most of the time: Gauss was such a blazing pioneer that he found many results that he had “no right to find”, and his work retains a distinctly nontrivial aura to this day).

People who argue against computer assisted proof for this reason seem to have the wrong idea about the culture of mathematics. One might naively worry that the Appel-Haken long computational proof of the Four Color Theorem would have stopped people from thinking about this problem and that for ever after we would be stuck with this long, uninsightful (apparently; I am not an expert here) argument. But of course that’s not what happened at all: their work increased the interest in the Four Color Theorem, and by now we have a still-computer-assisted but much simpler proof by Robertson, Sanders, Seymour, and Thomas, as well as a fully formalized proof by Gonthier. (One of the main differences in their proof from the original proof — which I did not fully appreciate until I got some helpful comments on this answer — is that the Appel-Haken proof also had extensive and poorly documented hand computations. This then becomes an instance of “doing one’s computational business properly” as discussed elsewhere in my answer.)

A key point is that you don’t need a computer to give a long, uninsightful, calculational proof: lots of mathematicians, including some of the very best, have offered such proofs using hand calculation. Because computers are so much better at calculations than human beings, we can use computers to calculate in different and deeper ways than is possible by hand.

Doron Zeilberger likes to exhibit single screens of code that prove more general versions of theorems that appear in published papers, often with lengthy (and certainly computational) proofs. Sometimes though Zeilberger (whom I have corresponded with and I hold to be a truly great mathematician; he has, though, decided to take the extreme stance in this particular debate, which I feel is not always to the credit of his cause) can be disingenuous in describing the origin of this code: he looks forward to the day when his computer will itself read and respond to the work of these human colleagues and figure out for itself what calculations to do. It is reasonable to expect that something like this will occur at some point in the future. But he sometimes likes to facetiously pretend that that’s what’s happening in the present. Of course it isn’t: Zeilberger is brilliantly insightful on how to find exactly the right routine calculation which, if it works, will essentially immediately prove a big theorem. What he’s doing is actually much more impressive than he makes it look: he’s a little like Harry Potter using real magic to do what appears to you to be a card trick. But he’s making a very important point: what you can do with computers is absolutely confluent with the goals of proof and progress in mathematics, especially increased conceptual understanding.

Computer proofs are less trustworthy than human proofs, because we cannot check what the computers are doing.

This argument really is twaddle, it seems to me. First of all, computers are as good as the people who program them, and there are good and bad ways to do one’s programming business (again, this is Jon Hanke’s point). What is frustrating (to Jon, and also to me) is that the mathematical community has been slow to regulate this programming business, to the extent that it is possible to publish a paper saying “Proof: I got my computer to do this calculation. (You can write to me if you want to see the details.)” Is that good practice? Of course not. But it’s not particular to computers. It is indicative of a larger standard of careless and disdain that we have for calculations over (easier, in many cases) big ideas. Many of us when refereeing a paper — even an important one — are too lazy or too snobbish to actually roll up our sleeves and check a calculation. (I have been a referee for a paper which turned out to have a significant error, coming from a calculation that I was too snobbish to check. At first I was actually relieved to hear that the mistake was “only in a calculation”: in some sense it hadn’t really gotten by me. And then I realized how thoroughly stupid that was, and that the error “hadn’t really gotten by me” in the same sense that a ball doesn’t really get by the shortstop who has decided that he won’t even make a play for it: in some narrow technical sense it may not be an error on my part, but the outcome is the same as if it were. Defensive indifference should not be a proud mathematical position.)

If you do your computer proof well it becomes so much more reliable than a long hand calculation that probably no human being will even seriously claim to have checked. Just ask Thomas Hales, who proved the Kepler Conjecture and published it in the Annals…but the referees admitted that they couldn’t check all the details, so eventually they gave up and published it anyway. He has spent a lot of time since then working on a proof that can be checked by a computer proof assistant in the technical sense. Of course this type of proof is more reliable than a messy hand calculation: isn’t it just silly to think otherwise?

The flip side of this coin is that many results which are held up as being pillars of conceptual mathematics have proofs that are not perfectly reliable. In mathematics we have a tradition of not speaking badly on each other’s work in a subjective way. I can’t publish a paper saying that I find your paper to be fishy and I encourage others not to accept it. I need to actually say “Theorem X.Y is faulty”. That gentility has negative consequences when you combine it with other bad practices; nevertheless overall I really like it. But for instance when in the 1980’s people talked about the grand completion of the classification of finite simple groups, they were being very gentle indeed. There was a key part of the proof that was never published but appeared only privately in an incomplete manuscript of about 200 pages in length. In an interview about ten years later (I believe), Serre made the point — nicely, of course — that a 3000 page proof in which 200 pages are missing [and missing because there may be a snag somewhere!] is really not a complete proof, and he speculated rather wryly that the question was not whether there was a gap in the proof — of course there was — but whether the gap was large enough to allow a further finite simple group to pass through it! Things were later patched up better in the “second generation proof”, and there is a better — shorter, more conceptual, more powerful — “third generation proof” coming. Pioneer work is clumsy.

I honestly think that if circa 1982 you believed more in the classification of finite simple groups than the four color theorem then you would simply be showing religious prejudice. There is no rational justification.

Again, the key point here is that the dichotomy between computers / not computers is a very false one: if we do our business correctly, we can use computers to make many (I won’t say “most” or “all”, but who knows what the future will bring) of our arguments more reliable.

Added: Thanks to some comments from Will Orrick, I looked (for the first time) at the RSST paper. Their introduction contains some issues that I was unaware of. In particular they say that they were motivated by doubts on the validity of the Appel-Haken proof. For this they cite two reasons (I may as well quote):

(i) Part of the A&H proof uses a computer and cannot be verified by hand, and
(ii) even the part of the proof that is supposed to be checked be hand is extraordinarily complicated and tedious, and as far as we know, no one has made a complete independent check of it.
Reason (i) may be a necessary evil, but reason (ii) is more disturbing…

I find it remarkable how closely these comments parallel the points I made in my answer. As for (i): sure, yes, we would like a pure thought proof. As yet we don’t have one, and a computationally intensive proof is much better than no proof at all. As for (ii): that is exactly the point that Jon Hanke is still trying to make today! In fact, the real issue with their proof is that given that it is highly computational, they did not use computers as much as they should have. (I hope it is clear that I am not really criticizing Appel-Haken. As I said several times, pioneer work is clumsy, and their work was pioneer work in a very strong sense.) The latter computer assisted proofs really do let the computers do the computation, which especially in the case of Gonthier’s proof is a big improvement.

The Answer 3

14 people think this answer is useful

Here’s a great article on why.

There are several reasons. One of the biggest ones is that how one solves the problem is frequently more interesting and useful than the actual result. If that how is “we did an exhaustive search of all the possibilities,” you don’t really get much other than the result. There’s also the risk that relying on computers removes the need to improvise with novel ideas.

Another reason is that bugs and faults in the computing are sometimes considered too much of a risk to mathematical certainty.

The Answer 4

11 people think this answer is useful

See also this article:

Ugly Mathematics: Why Do Mathematicians Dislike Computer-Assisted Proofs?
The Mathematical Intelligencer, December 2012, Volume 34, Issue 4, pp 21-28

Here is an abstract found here:

The author discusses an analogy between narratives and mathematical proofs that tries to account in a simple manner for the ugliness of computer-assisted proofs. He mentions that the ugliness is not essentially associated to methodological or epistemic problems with the evidence. He states that nonbeautiful proof may just be an uninspiring toward where mathematicians reveal indifference.

For another summary, see this ZMATH review.

The Answer 5

5 people think this answer is useful

I think the big issue here is not whether computer assistance is used, but whether the resulting proof is human-comprehensible: it is quite unsatisfying to have a proof that you cannot actually wrap your head around, and can certainly leave one wondering if there aren’t bugs in the software.

I don’t think anybody really minds the use of a proof assistant where it can generate reasonably-comprehensible proofs, and is just used out of laziness and/or in order to prevent stupid mistakes.

(And of course a proof which cannot itself be comprehended by humans really shouldn’t be trusted at all unless the code to generate it is available so that the output can be checked using various proof-checking tools, like the Coq kernel.)

Note that the 4-colour theorem has now been proven using Coq, not just with that implausibly-large “hand-checked” computer output.

The Answer 6

3 people think this answer is useful

here is an interesting/deeper perspective/angle in addition to those worthwhile/more standard/surface answers so far.

there is known to be a strong correspondence between proofs and programs/algorithms. this was formalized decades ago in the Curry-Howard correspondence.

a proof in many ways is similar to “an algorithm that runs in a mathematicians head”. they both have divide-and-conquer aspects (where theorems/lemmas are similar to subroutines), iteration/loops, etcetera. unfortunately some proofs are so complex that they span thousands of pages, and cannot be checked/conceptualized by single mathematicians, very much like complex computer programs. so there is an aversion in this case that may be related to the large complexity of the proofs.

in other words if a “proof is like a program that runs in a mathematicians head”, unfortunately human brains are limited in range/span of what they can conceptualize. the recognized psychological concept of chunking is highly leveraged in mathematics but has limitations. so any aversion may be related to the natural human tendency to try to conceptualize as much as possible and running into limitations.

mathematics has been carried out for thousands of years by mathematicians, and by that measure, computers are an upstart.

another possible angle is that there is somewhat a split in mathematics between continuous/topological type approaches and combinatorics, the latter being more prone to automated theorem proving (and eg the area where major theorem proving proofs have occurred such as the 4 color conjecture/theorem). this split is characterized in a famous quote attributed to Whitehead, quoted by a student Higman[1]:

Combinatorics is the slums of topology. —Whitehead

the controversiality of automated theorem proving is somewhat lessening in recent years due to major advances and in many ways CS/mathematics fields are becoming more interlinked, and its plausible to expect that trend to continue and heighten this century.

an excellent ref envisioning the future of man/machine interaction in this area is [2] where a hypothetical conversation is given of the mathematician engaging with a Turing-test like dialogue with a computer to derive new results, in nothing less than a collaboration. the author Gowers a Fields medalist has also recently been researching this area (automated theorem proving) and collaborating with a computer scientist
Ganesalingam in a way hinting/indicative of a paradigm shift.[3]

[1] Combinatorics entering the third millennium, Peter J. Cameron

[2] Rough structure and classification Gowers

[3] A fully automatic problem solver with human-style output M. Ganesalingam, W. T. Gowers

The Answer 7

2 people think this answer is useful

I think a computer-assisted or computer-generated proof can be less convincing if it consists of a lot of dense information that all has to be correct. A proof should have one important property: a person reading it must be convinced that the proof really proves the proposition it is supposed to prove.

Now with most computer generated proofs that in fact is not the case. What often happens is that the proof is output of an algorithm that for instance checks all possibilities in a huge state space. And then we say the computer proved it since all states has been checked and they were all ok.

An important question here is: can there be a simple, elegant proof if we have to check a million cases separately and cannot use an elegant form of induction or other technique to reduce the number of logical cases?

I would say no, so in this case it does not matter if the output is from a computer or not. However, most mathematicians would look for a more elegant solution where the number of distinct cases to check fits one page of paper, and would only consider a proof of that size a nice proof.

The Answer 8

1 people think this answer is useful

Proofs are supposed to be understood, which is impossible if they are non repetitious and extremely long, as is possible for computers but not as much for humans. I think a big source of controversy also is that people fear that computers will take over at some point, forcing them to give up maths. Also computers give connected people advantage, while without computers math is very open.

The Answer 9

1 people think this answer is useful

I am going to give a contrary answer.

There have been some misguided points that computers can’t do certain proofs beyond brute force. Those are obviously wrong, for the same reasons that it is often incorrect to say that computers cannot represent $\sqrt{2}$ exactly. Computers are great at symbol manipulation.

But there have been several comments on how mathematicians like beauty and simplicity, which try to explain some other dimension that mathematicians look for. Unfortunately, these are also horribly misguided answers. There is no reason that computers cannot find elegant answers among the many they find. There are already a number of good metrics and heuristics computers can use, including:

  • proof size (computers already can find shorter proofs than literature on a number of areas and clearly can search faster for reductions)
  • use of ideas from other areas (computers are good both at classification and ignoring irrelevant classification in choosing candidate proof “moves” – so valuation of a proof may be done by metrics of distance between fields used in the steps)
  • metrics of consonance with other results
  • metrics of generality or counts of earlier results generalized

and so on.

I fear the real answer for this question is that there is no valid reason some mathematicians distrust the general notion of computer proofs. There are reasons why particular proofs may be disliked (no matter their origin), but when you see mathematicians straining to give reasons that are so clearly false, it is almost certain that they are really hiding very human and natural fears that they may one day become less relevant. The same reaction can be seen in manufacturing workers in areas that are moving towards greater robotic manufacture and generally, people want to believe that there is something uniquely great about humanity that their field will always require the actions they have trained to.

I personally think we will always have a fundamentally important relationship with the automation we create, but I think the capabilities we give that automation will change the nature of that relationship over time. That kind of speculation, though, is more off-topic to the question.

The Answer 10

1 people think this answer is useful

While I totally agree with ShreevatsaR and other answers and comments in the same school of thought, I wanted to avert your attention to the fact that the phrases “proofs relying on computers” or “computer assisted proofs” should be used with care.

There are two ways that computers help in proving mathematical theorem. On is (as discussions above allude to) having a computer consider all different possible states checked to affirm or reject certain property (or theorem). In this case I agree that proofs are more like “phone book” than a “poem” (as said by Kenneth Appel and referred to by K. Rmth on a comment on the question itself). These kind of proofs are only worth as a mere assurance that the theorem is indeed true and mathematicians should not consider refuting that theorem and work towards finding a well formalized and understandable proof.

On the other hand, we can have computers help in the process of proving mathematical theorem in a much different way, i.e., as proof checkers, e.g., Coq, Isabelle, etc. These proof checkers are mere tools where a mathematician can formally write their proofs and have them checked by the computer to be valid proofs. The proof checking mechanism used in these proof checkers is based on curry-howard correspondence and therefore, would require a high level of rigour and guarantee the correctness of proofs. The correctness of proofs is a consequence of De Bruijn’s criterion which requires them to have a tiny core (the correctness of which is proven by hand) to which all proofs are sent for being checked.

In the latter case the computer assisted proofs are not only not controversial but rather much desired as they leave no room for any kind of human error in proving mathematical theorems.

The Answer 11

0 people think this answer is useful

“The part of a proof you let the computer do is extremely hard to check for correctness (except in simple cases). Software has a pronounced tendency to have bugs. You have to check the programme for correctness, and you have to check the compiler for correctness. And the hardware too, it is not unheard-of that hardware is buggy”

Today’s Microsoft .NET software is maintained and updated constantly by thousands of people worldwide. It is common practice for well designed code to build in testing facilities as you write the code, or Built In Self Test(BIST). These facilities are built in to run on PCs, run on embedded micro-controller targets. As you develop the code, side by side with the test code, numerical calculations and their results can use the BIST to ensure that i.e. windowed values, tolerances, floating point calculations, and other parameters are what they should be. Of course all of this is quantified based on the design specification as this drives the design of the application and electronic hardware itself.

Provided that the design has been implemented according to design specification, and design itself has been correctly defined to do the intended job, then expectation is that the developed system will work. Of course there are differences between hardware and software design practices. As signals travel between hardware performing data acquisition and convert them into software quantities, there is ‘loss’ in the conversion. not just quantization in the analog to digital conversion but also passive filtering, active filtering, DSP processing, firmware processing and software post processing to name a few.

The Answer 12

0 people think this answer is useful

Mathematicians seek not only the answer to a question, but the reason for that answer. Human limitations, and to a lesser extent computer limitations. require us to reduce, and reduction is achieved by the discovery of structure that characterizes large or infinite numbers of arbitrary cases. The discovery of the structure providing the reason for the yes or no answer is often more important than the answer itself.

Much as a constructive proof is more valuable than one merely of existence, a method of reduction that manages to reduce a problem to human scale is more valuable than one that merely manages to reduce a problem to computer scale.

So computer proofs are valuable, but the greater prestige and utility comes from human proofs.

Add a Comment