Friday 9 January 2009

Justifying schemata

An old and by now familiar adage of Kreisel's has it that our justification for the induction schema in PA, the replacement schema in ZFC, and so on, must necessarily come from acceptance of the corresponding second-order principles. There is something obviously right about this observation: it seems difficult to imagine any natural way of justifying the instances of the schemata, involving arbitrarily complex formulas in this or that formal language, without going through the second-order principle and noting that the instances of the first-order schema all trivially follow. In such a justification we would presumably have to use, in some ingenious way, the syntactic structure of the formulas, and so on. Setting such imaginings aside, it may simply be observed that there is not, at the moment, anything resembling such a justification. (We can of course imagine a justification of sorts, e.g. by means of some restricted form of transfinite induction, as in Gentzen's proof of the (1-)consistency of Peano arithmetic -- but it is difficult to imagine such a justification for the truth of all the instances.)

Many have observed, though, that it is not obvious in just what sense the induction principle, replacement, and so on, are "second-order principles". For, in their natural formulations in ordinary mathematical English
Whenever P is a well-defined determinate property of naturals, if it holds of 0, and whenever it holds of n, of n + 1 also, then it holds of all naturals.

Whenever R is a well-defined determinate functional binary relation between sets, for any set A, the image of A under R, i.e. the set {y | (Ex in A) R(x,y)}, exists.
they involve a notion, of a property being "well-defined" and "determinate", that has no apparent mathematical definition. So, we may say they are faithfully captured by their second-order formulations if we stipulate that the higher-order quantifiers range over "well-defined and determinate" properties (and relations, functions and so on). But since this is not a mathematical notion we have achieved nothing, and it is not at all obvious what these principles, so formulated, have to do with their second-order formulations on the more usual set-theoretic semantics, at least in so far as the epistemological use we make of them is concerned. Perhaps a more natural approach is to regard these principles not as formal principles at all, but as informal principles that have, strictly speaking, no definite set of consequences, that are not necessarily captured by any formalisation at all, but rather have an indefinite range of mathematical applications we regard as correct. This range is indefinite, since we can't beforehand restrict the totality of properties we recognise as determinate and well-defined to any mathematically defined definite totality -- it depends on our decisions as to what properties we regard as determinate and well-defined, and there is no reason to think that there must always be some matter of fact on which such decisions turn. (And even if there are in fact always such matters of fact, and some determinate totality of properties that are, in some idealised sense, recognisable by us as determinate and well-defined, for Gödelian reasons we can never recognise this totality as such.)
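For reference, the second-order formulations in question can be written out in standard notation (with the second-order quantifiers read, per the stipulation just mentioned, as ranging over well-defined and determinate properties and relations):

\[ \forall P\,\bigl[\,P(0) \land \forall n\,(P(n) \to P(n+1)) \to \forall n\,P(n)\,\bigr] \]

\[ \forall R\,\bigl[\,\forall x\,\exists! y\,R(x,y) \to \forall a\,\exists b\,\forall z\,\bigl(z \in b \leftrightarrow \exists x\,(x \in a \land R(x,z))\bigr)\,\bigr] \]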

So, setting aside the idea that the "informal" induction principle or replacement principle are "second-order principles", we may describe the resulting picture in the following not very precise terms. Whenever, for whatever reason, we recognise some language defined with mathematical precision, a formal language, as meaningful, so that formulas in the language define well-defined and determinate properties, and sentences express meaningful statements that are either true or false as a matter of mathematics, we get from the "informal" principles a mathematically precise principle, by restricting "well-defined and determinate property" in these "informal" principles to range over properties definable in the formal language. Of course, the resulting mathematical principle does not cover all the applications of the informal principle recognisable by us as valid: given that we recognise the formal language as meaningful, we must surely recognise the notion of a sentence in the language being true, and of a formula being true of given objects, as determinate and well-defined (on the picture painted here, we regard this observation not as something to be argued for, but as a triviality, since it is not obvious what sort of meaningfulness of a formal language would license the applicability of the induction principle to the language without involving acceptance of the determinateness and well-definedness of these notions); but, by well-known results in mathematical logic, these notions are not expressible in the formal language itself.
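To make the restriction concrete: if L is a formal language we recognise as meaningful, the informal induction principle yields, for each formula \varphi(x) of L (ignoring for the moment formulas with further free variables; this wrinkle is taken up below), the instance

\[ \varphi(0) \land \forall n\,(\varphi(n) \to \varphi(n+1)) \to \forall n\,\varphi(n) \]

which is just the corresponding instance of the first-order induction schema for L.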

So far, so standard. There is, however, a small glitch in our development. Recall the replacement schema as formalised in ZFC
(*) (x)(E!y)R(x,y) implies (u)(Ev)(z)(z in v iff (Ew in u)R(w,z))
On the face of it, it's easy to see that every instance of the schema is true given the informal replacement principle and the meaningfulness of the language of set theory. Alas, the instances of the schema (*) are in fact universal closures of formulas obtained by replacing R with formulas possibly having more free variables than just two. That is, they in fact have the form
(*') (a1)(a2) ... (an)[(x)(E!y)R(x,y,a1, ..., an) implies (u)(Ev)(z)(z in v iff (Ew in u)R(w,z,a1, ..., an))]
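In more standard notation, with the universal closure over the parameters a1, ..., an made explicit, an instance of (*') for a formula \varphi with the displayed free variables reads

\[ \forall a_1 \cdots \forall a_n\,\Bigl[\,\forall x\,\exists! y\,\varphi(x,y,a_1,\dots,a_n) \to \forall u\,\exists v\,\forall z\,\bigl(z \in v \leftrightarrow \exists w\,(w \in u \land \varphi(w,z,a_1,\dots,a_n))\bigr)\,\Bigr] \]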
The same of course applies to the induction schema, its instances, and their justification given the informal induction principle (except that the language of arithmetic has a name for every natural, so in so far as we're dealing with naturals only the problem has a rather trivial solution, using substitutional quantification). The problem here is that a formula R(x, y, z1, ..., zn) in the language of set theory does not define a well-defined determinate functional binary relation between sets -- it does not define a binary relation at all. We need some further argument to justify the truth of (*') given the informal replacement principle and the meaningfulness of the language of set theory. One way would be to argue that the language of set theory augmented with a constant for every set is meaningful, and that formulas in it express properties. For this language, there is no difference between the schemata (*) and (*'). Alas, if we go this way, we must explain what it means for a language the size of a proper class to be meaningful, why we should recognise the extended language of set theory as such given the usual conception of the world of sets (the cumulative hierarchy, sets as arbitrary extensional collections, and so on), why the meaningfulness of such a language guarantees that all its formulas define determinate well-defined properties (is there a proper class of properties?), and so on and so forth. Fortunately, we may shift the burden directly to the informal principles themselves, so we get
Whenever P is a well-defined determinate m-ary relation between a natural and j = m - 1 objects of any sort, then given any objects a1, ..., aj, if P holds of 0 and a1, ..., aj, and whenever it holds of n and a1, ..., aj, of n + 1 and a1, ..., aj also, then it holds of every natural n and a1, ..., aj.

Whenever R is a well-defined determinate (2+n)-ary relation between two sets and n further objects, functional in its first two arguments, then given any objects a1, ..., an, for any set A, the image of A under R(x, y, a1, ..., an), i.e. the set {y | (Ex in A) R(x, y, a1, ..., an)}, exists.
These principles are just as evident (on the relevant mathematical picture) as the originals, and have the nice property that they in fact directly yield the corresponding formal schemata. (We have used the notion of a natural in the formulation of the induction principle, in referring to relations of arbitrary arity. This is not in itself worrying, since there is no pretence that the induction principle serves as a definition of what a natural is in any non-circular sense. Rather, it functions as part of an explanation or explication of this notion, and figures in our conceptual and philosophical analysis of the justification of various mathematical principles. Such an analysis need not itself establish the correctness of the justification under analysis, or refrain from making use of concepts justified or explained by what is being analysed.)

Monday 5 January 2009

On the force of "in principle"

"But even if -- imagining a man quite exempt from all influences, examining only his momentary action in the present unevoked by any cause -- we were to admit so infinitely small a remainder of inevitability as equalled zero, we should even then not have arrived at the conception of complete freedom in man, for a being uninfluenced by the external world, standing outside time, and independent of cause, is no longer a man."
Count Leo Tolstoy
Second epilogue to War and Peace


It strikes some people as perverse that some, such as Edward Nelson, seriously doubt the consistency of Peano arithmetic (and even of weaker theories such as EFA). Such doubts (and Nelson not only doubts the consistency of EFA but is serious about finding a contradiction) are indeed perverse from any ordinary point of view, but they are not unprincipled; that is, they are not examples of arbitrary skepticism. It is an instructive philosophical exercise to make sense of these doubts, not to address or refute them, but rather to articulate the basic preconceptions inherent in our thinking, to see what rests on what. To that end, I'll say a few words about the force of "in principle" in such turns of phrase as "in principle, we can find a counter-example to Goldbach's conjecture if there is one", "if ZFC is inconsistent we can in principle find this out", and so on.

We are in principle capable of wonderful feats. Perusing texts in logic and the philosophy of mathematics, we learn, for example, that we are in principle capable of verifying the correctness of arbitrarily complex formal proofs, of finding a counter-example to Goldbach's conjecture if there is one, of finding a contradiction in ZFC should it be inconsistent, of multiplying arbitrarily large natural numbers, of normalising arbitrarily complex intuitionistic proofs to extract witnesses for existential statements, and so on. It is not a particularly original observation that even though we are in principle able to do all these things, we are, in reality, quite incapable of doing so. It is nevertheless an instructive exercise to examine this apparent tension, and to ask just what the force of "in principle" is, and how it is that we are, so it seems, in principle capable of doing all sorts of things it is utterly fantastic to suppose lie within our actual abilities.

Let's call abilities such as our ability to factor arbitrarily large numbers ("in principle") counterfactual abilities. Such abilities are counterfactual in being counter to actual facts about our abilities. In contrast, I have the real ability to factor any number less than 1000, in that, should I so choose, I could in fact, with some patience or with the aid of a computer or a calculator, go through all numbers less than 1000 and check whether any of them divides the given number. In the case of arbitrarily large naturals, my counterfactual ability is obviously not grounded in any such facts about my real abilities. So, what is it grounded in? The answer is rather trivial: we are in principle capable of factoring arbitrarily large naturals in the sense that we are in possession of an algorithm for doing so, an algorithm which we can mathematically prove returns the factors of any given number, together with a model of mechanical computation on which it is obvious that at no step of a computation are we required to do anything beyond our actual capabilities. That is, even though we will never find ourselves in the position of having carried out 2^65536 long divisions, what is required of us in the hypothetical situation is something we can actually do, such as writing down a single digit, striking out a symbol, and so on. (In contrast, with such notions as "a mathematical principle in principle acceptable to us" or "unassailably true arithmetical statement" we have no idea what the relevant counterfactual situations are, or what sort of idealisations regarding our abilities are in question.)
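To make the contrast concrete, here is a minimal sketch of such an algorithm (in Python, purely for illustration; the trial-division method and the function name are my choices, and nothing above turns on them). It provably returns the prime factors of any natural greater than 1, and every individual step of its execution is something we can actually perform, even though running it on a number with tens of thousands of digits is utterly beyond us.

def factor(n):
    """Return the list of prime factors of n (for n > 1), by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:      # d divides n: record it and divide it out
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                  # whatever remains is itself prime
        factors.append(n)
    return factors

print(factor(9991))            # [97, 103]; feasible in practice
# factor(2**65536 + 1) terminates by the same mathematical argument,
# but no one will ever see it finish.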

There is a strong intuition that the truth or falsity of Goldbach's conjecture is determined as a matter of mathematical fact, that either there is a natural with these or those properties or there is not (and, for the intuitionistically minded reader, that either there is a canonical proof of this or that or there is not -- Dummett is one of the few thinkers who have seriously rejected an implicit sort of realism about intuitionistic proofs, leading in the good philosophical tradition to many very obscure musings he blithely ignores when going on about other matters). One source of this intuition is our image of ourselves going through the naturals, carrying out computations and so on. It seems almost unintelligible to suppose there is no fact to unearth, since we can picture ourselves having found out, by calculation, that some even number greater than two is not the sum of two primes. (I don't recall who it was, but it has been observed that talk about "metaphors" and "pictures" in philosophy is usually metaphorical...) But, since we are not dealing with our actual abilities, all of our reasoning about arbitrarily complex calculations and the like is necessarily theoretical, grounded in some conception of what it would be like if we were not limited the way we are. On closer inspection, it turns out that this conception does not in fact turn on our nature, that is, it is not based on any physiological, psychological, or neurological analysis. Rather, it is based on a mathematical image. In other words, the claim that if Goldbach's conjecture is false we can in principle find a counter-example is not really about us at all, but is merely a restatement of the fact that being a counter-example to Goldbach's conjecture is a decidable property.
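That decidability can be displayed directly: for any given number, whether it is a counter-example to Goldbach's conjecture is settled by a finite, entirely mechanical check. A small sketch (Python again; the function names are mine and merely illustrative):

def is_prime(m):
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def is_goldbach_counterexample(n):
    """True iff n is an even number > 2 that is not the sum of two primes."""
    if n <= 2 or n % 2 != 0:
        return False
    # Check every decomposition n = p + (n - p) with 2 <= p <= n/2.
    return not any(is_prime(p) and is_prime(n - p) for p in range(2, n // 2 + 1))

print(is_goldbach_counterexample(28))   # False: 28 = 5 + 23 = 11 + 17

The claim that we can "in principle" find a counter-example if there is one is, on the analysis above, little more than a restatement of the fact that this predicate is decidable.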

Let us return to radical finitism, or ultra-finitism. The rejection of the "in principle" is nicely captured by Gandy in "Limitations of Mathematical Knowledge":
In discussions about the foundations of mathematics it's usual to suppose that the inscriptions which are used to communicate mathematics (e.g. numerals, formulae, proofs) form a potentially infinite totality. Despite the fact that this supposition runs directly counter to everyday experience, no effort is made to justify it. It is of course admitted that in practice concrete inscriptions are limited in length; but an author who has made this assertion is likely to claim in the very next sentence that in principle arbitrarily long inscriptions may be written and the operations and tests of elementary syntax can in principle be effectively performed on them. This seems to me as objectionable as to say that in principle pigs can fly while admitting that in practice they have no wings.
The rejection of the "in principle" is not nonsensical or unprincipled. As noted, our abilities "in principle" to do this or that in fact depend on mathematical theorems, on mathematical pictures. We can't address the ultra-finitist doubts by appeal to these "in principle" abilities without running afoul of circularity. Of course, this is not to say there is something objectionable or dubious in claims about our counterfactual abilities of this sort -- as noted, it is obvious in what sense we have the ability to factor arbitrarily large naturals, and so on. The point is rather to make the ultra-finitist position understandable, and to note that our reasoning about counterfactual abilities presupposes a substantial amount of mathematics. It therefore cannot serve as a justification of our mathematical thinking, even though it is undoubtedly very useful in explaining our thinking, and in making it palatable. We should be thankful to such people as Wittgenstein and Nelson for providing us with actual examples of real anti-realism, of really taking seriously the notion that our understanding of mathematics should be grounded in our actual practice. They serve as a healthy corrective to such thinkers as Dummett, who happily go on about "our" abilities and "our" reasoning while obviously describing some unknown species of immortal and indefatigable demi-gods.

Incidentally, Peter Smith now has a link to this blog in his. Peter's blog is recommended to everyone interested in logic and philosophy.

Consistency and existence, with some random ranting

The idea that the completeness theorem vindicates the very natural notion that in mathematics consistency implies existence, that mathematics is a free creation, subject only to the requirement of consistency, and so on, crops up now and then. Some time ago it did just that in sci.logic, but I failed to subject the poor poster bringing it up to my characteristic tedious pedantry and obnoxiousness. Happily, I can do that here in this blog.

The completeness theorem, proven in the usual Henkin style, establishes that if a (countable, first-order) theory is consistent it has a model the domain of which is the set of naturals, with the constant symbols naming particular naturals, and the interpretations of the relation and function symbols being horribly complex random configurations of naturals. Consider now the claim that infinite sets exist if the story we tell about them, as formalised in e.g. ZFC, is consistent. As noted, this is a very natural idea. Alas, the completeness theorem does nothing to support it, because even though it does guarantee the existence of something for a consistent theory, this something is not what we'd like to have. For certainly the claim that infinite sets exist does not mean that there is some horribly complex configuration of naturals under which the axioms of ZFC come out true. It is also apparent that, for example, inaccessible cardinals do not exist merely because ZFI = ZFC + "there are inaccessible cardinals" is consistent, if ZFI also proves "ZFI is inconsistent". If we wish to take the natural idea seriously and flesh it out in some philosophically fruitful way, we must explain just what sort of consistency is at issue here, as it is apparent, at least in light of rather trivial observations such as the above, that it is not consistency of formal theories in the technical sense.

The confused idea at issue is a perfect illustration of a very real danger in the philosophy of mathematics: that of reading philosophical significance into technical results without special argument. Before we get worked up over, and excited about, this or that piece of mathematics in a philosophical context, we must carefully figure out just how it is related to the notions under philosophical scrutiny. Usually the answer is that, despite superficial similarities -- sometimes on the terminological level! -- the technical result is just that, a technicality. Another aspect the confused idea nicely illustrates is a sort of intellectual equivocation very common in thinking about mathematics: we wish to view claims about sets and what not from a distance, instead of taking them at face value, as something requiring explanation, philosophical justification and so on, while simultaneously interpreting a technical mathematical result -- in this case, the completeness theorem -- in the usual straightforward manner. This leads to nothing but confusion and bad philosophy, but has its appeal for those of a certain bent, in that it seemingly allows us to wax philosophical with mathematical precision.

This is not to say that the completeness theorem is devoid of philosophical significance, or that mathematical results never have philosophical relevance. Perhaps the most famous example of "informal rigour" (if we don't count Turing's analysis of mechanical computability) is Kreisel's proof (in the sense of a compelling piece of reasoning, not in the mathematical sense -- hence the "informal" in "informal rigour") that logical validity in the "informal" sense is coextensive with validity in the technical sense. (There's a bit about this in my sci.logic post on formalisation.) In the case of completeness and the idea that consistency implies existence, what is missing is just such a proof, a piece of compelling reasoning, a careful explanation and analysis. What we have instead is simply a wonderful philosophical revelation, of mathematical clarity, that unfortunately evaporates when subjected to scrutiny. Beware of revelations! Most real insights turn out to be rather dull and not at all exciting, the way things we are intimately familiar and comfortable with are. I suggest that instead of seeking exciting revelations we seek dullness of this kind, that is, dullness that is the opposite of bewilderment; this without any suggestion that philosophy should be dull the way a phonebook is.
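For the record, the shape of Kreisel's piece of informal rigour (nowadays often called the "squeezing argument") can be indicated in a few lines. Write Val(A) for the intuitive, informal logical validity of a first-order sentence A, V(A) for truth in all set-sized structures, and D(A) for derivability in a standard first-order calculus. Then, for any first-order sentence A,

\[ D(A) \;\Rightarrow\; \mathrm{Val}(A) \;\Rightarrow\; V(A) \;\Rightarrow\; D(A), \]

the first implication holding because the rules of the calculus are intuitively sound, the second because a set-sized counter-model already witnesses the failure of intuitive validity, and the third by the completeness theorem. The three notions thus coincide in extension, the informal one being squeezed between two mathematically precise ones.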

Wise words from yesteryear

Peter J. Ross, a notorious poet and troll, once claimed that my Usenet posts are almost always either enjoyable or informative, and often both. I wouldn't know about such matters, but it is certainly true that Torkel Franzén's contributions, in the newsgroups and on various mailing lists, were virtually always enjoyable and informative. I've been scrounging the archives of FOM for nuggets of insight and wisdom, and stumbled upon an old message from Torkel. Neil Tennant repeated the standard line, that in order for a theory to be epistemologically relevant axiomhood must be decidable, which prompted the following reflections from Torkel:
  Although I've said similar things myself often enough, it
only now occurs to me that this isn't really convincing, if
one takes the epistemological motivation seriously.

The basic premise is that we need to be able to decide whether
a proposed proof is a proof. I will accept this basic premise,
although it is not an immediate consequence of the necessity
of avoiding an infinite regress in proving theorems. (Merely
being able to verify that a correct proposed proof is a proof,
while coming to no conclusion in the case when it is not a
proof, would be sufficient to avoid such a regress.)

But then it's not enough that the set of axioms *is* recursive.
To be able to check a proposed proof we also need to know that
the set of axioms is recursive, and in fact we need to have an
algorithm for deciding whether or not a formula is an axiom.
Such an algorithm, then, is an implicit part or parameter of
the proof, from an epistemological point of view.
To understand what Torkel's on about, consider the following example, which he himself provided in a later post. Let T be the theory whose axioms are all arithmetical truths with fewer than k symbols, for some fixed k. Since there are only finitely many such truths, the set of axioms is obviously recursive, and hence axiomhood is decidable. This fact is, however, of no help whatsoever in deciding whether a given sentence is an axiom or not, and consequently whether a sequence (or a labeled tree or whatever) p is a proof or not: for a sentence with fewer than k symbols, deciding whether it is an axiom is equivalent to finding out whether it is true.

The upshot of this is that the usual epistemological argument for requiring theories to be recursively axiomatisable is in fact wanting; by the argument we are only warranted in requiring recursive enumerability of the axioms. That is, in order to recognise a formal proof p as a proof in a recursively axiomatised theory T, we must have at hand an algorithm for deciding axiomhood (and apply that algorithm to the premises of p, verifying that they are indeed axioms); the recursiveness of the set of axioms of T, together with p itself, is insufficient. Similarly, in order to recognise a formal proof p in a theory T with a recursively enumerable set of axioms, we must have at hand an algorithm for listing the axioms, and a computation that in fact yields the axioms used in p. There is no apparent epistemologically relevant distinction between these two cases. As Torkel writes:
  If the set of axioms is not recursive, but recursively
enumerable, the property of being a proof will still be
decidable, if we regard as part of the proof the generation,
using an algorithm for enumerating axioms, of the axioms
actually used in the proof. The basic premise, then, is
satisfied even if we only require the axioms to be
recursively enumerable.

I therefore think that what the epistemological argument
supports is only the need for the set of axioms to be
recursively enumerable. A slightly different formal model
of proofs is presupposed - one in which a proof incorporates
an algorithm for generating axioms together with computations
generating the axioms used, rather than just incorporate an
algorithm for deciding axiomhood - but I see no good reason
why the latter model should be preferred.
Of course, as everyone knows, we can always replace a recursively enumerable set of axioms with a (very unnatural) recursive set with the same deductive closure, by Craig's trick. It is still of independent interest whether epistemological considerations alone justify restricting our attention to recursively axiomatised theories.
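Craig's trick itself is easily illustrated: given an effective enumeration of the axioms, replace the n-th axiom enumerated by, say, its (n+1)-fold conjunction with itself. Whether a formula belongs to the new axiom set is then decidable: read off n from its shape and run the enumeration for n steps. A toy sketch (Python; the string representation of formulas and the particular tagging scheme are of course just illustrative choices of mine):

def craig_axioms(enumerate_axiom):
    """Given a computable enumeration n |-> n-th axiom, return a decision
    procedure for the Craig-style recursive axiom set consisting of the
    (n+1)-fold conjunction of the n-th axiom with itself, for n = 0, 1, 2, ..."""
    def is_axiom(formula):
        parts = formula.split(" & ")
        n = len(parts) - 1
        # A Craig axiom is an (n+1)-fold conjunction of one and the same
        # formula, namely the n-th formula in the enumeration.
        return len(set(parts)) == 1 and parts[0] == enumerate_axiom(n)
    return is_axiom

# A toy enumeration standing in for an effective enumeration of an r.e. axiom set.
axioms = ["0 = 0", "0 + 1 = 1", "1 + 1 = 2"]
is_axiom = craig_axioms(lambda n: axioms[n % len(axioms)])

print(is_axiom("0 + 1 = 1 & 0 + 1 = 1"))   # True: axiom number 1, taken twice
print(is_axiom("1 + 1 = 2"))                # False: as axiom number 2 it would
                                            # have to appear as a threefold conjunction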

Many an interesting observation may be found buried in the archives of FOM and those of Usenet newsgroups. (Alas, it seems Google has decided to utterly wreck its Usenet search function, rendering it almost unusable.)