FotFS VI

FOUNDATIONS OF THE FORMAL SCIENCES VI

Reasoning about Probabilities and Probabilistic Reasoning

Universiteit van Amsterdam

Institute for Logic, Language and Computation

May 2-5, 2007

Abstracts


A Case Where Chance and Credence Coincide

David Atkinson (Groningen, The Netherlands)

Attempts have been made to provide a bridge between chance and credence, such as David Lewis's Principal Principle and Howson and Urbach's transmogrification of von Mises' frequency theory into a subjectivist account. I propose a contribution to the general project of connecting chance and credence by showing that Reichenbach's objectivistic approach is intimately linked to subjectivistic Jeffrey conditionalization.

Reichenbach defended his method of the "appraised posit" in his frequentistic theory of chance, according to which a chain of objective probabilities was provisionally grounded on a posited chance that could be partly determined by empirical success. Jeffrey's technique of updating credences by means of incoming evidence whose reliability remains uncertain resembles Reichenbach's method in purpose if not in form.

I will first derive a compact formula for the probability of an event that is conditioned by an arbitrary, finite chain of events, given the relevant conditional probabilities and a posit. Next I will show that, in the event that the posit is "ideal", in the sense that the probability it generates is exactly right, the Jeffrey update of the posit would be "invariant" (i.e. unchanged). The converse is also proved, namely that an invariant Jeffrey update implies an ideal Reichenbach posit.
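To make the setting concrete, here is a toy numerical sketch (not Atkinson's own formula; the chain, the conditional probabilities and the Markov assumption are all illustrative): a posit for the first event is propagated along a chain of binary events, and a Jeffrey update on uncertain evidence about the final event leaves the posit unchanged exactly when the evidence weight equals the probability the posit generates.

```python
# Toy illustration (not the paper's formula): a posit q = P(E1) propagated
# through a Markov chain of binary events, followed by a Jeffrey update.
import itertools

q = 0.3                                          # posited chance of E1 (illustrative)
cond = [(0.8, 0.1), (0.7, 0.2), (0.9, 0.4)]      # (P(E_{k+1}|E_k), P(E_{k+1}|not E_k))

n = len(cond) + 1
# Joint distribution over the chain E1, ..., En under the Markov assumption.
joint = {}
for outcome in itertools.product([True, False], repeat=n):
    p = q if outcome[0] else 1 - q
    for k, (a, b) in enumerate(cond):
        p_next = a if outcome[k] else b
        p *= p_next if outcome[k + 1] else 1 - p_next
    joint[outcome] = p

p_En = sum(p for o, p in joint.items() if o[-1])          # probability the posit generates
p_E1_given_En = sum(p for o, p in joint.items() if o[0] and o[-1]) / p_En
p_E1_given_notEn = sum(p for o, p in joint.items() if o[0] and not o[-1]) / (1 - p_En)

# Jeffrey update of the posit on uncertain evidence about En with new weight Q(En).
def jeffrey_update(Q_En):
    return p_E1_given_En * Q_En + p_E1_given_notEn * (1 - Q_En)

print(p_En)                     # probability of the final event given the posit
print(jeffrey_update(p_En))     # "ideal" posit: the update returns q = 0.3 (invariance)
print(jeffrey_update(0.9))      # non-ideal evidence weight: the posit shifts away from 0.3
```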


Certainly Probable: Belief and Aleatory Probability in Hume's Theory of Induction

Conor Barry (Paris, France)

The emergence of the mathematical concept of probability in the decades before the publication of both the 'Treatise of Human Nature' and the 'Enquiry Concerning Human Understanding' had, as Ian Hacking has most perceptively noted, a critical influence upon the development of Hume's conception of induction. Although pure mathematics had traditionally been regarded by philosophers as the touchstone of epistemic certitude, probability constituted a sub-branch of mathematics that was, by nature, indeterminate.

Inspired by Hacking's researches and the critical responses of Larry Laudan and Robert Brown in philosophy of science, as well as comparative researches of ancient and modern skepticism undertaken by Malcolm Schofield, Hume scholars such as Dorothy Coleman and Kevin Meeker have drawn attention to the ambiguities and inconsistencies in Hume's use of the term "probable". Derived from the Latin probare, meaning to approve or believe, the word "probable", for Hume's contemporaries, was simply a non-mathematical synonym for "credible". Certain passages of Hume make definite use of the term "probable" in the strictly mathematical sense. However, other passages waver between the two meanings of the term. This ambiguity may be resolved, I maintain, if we recognize the equivalence of the two meanings.

This essay, thus, unites two uncontroversial points concerning Hume's epistemology into a potentially controversial one. The first is that, according to Hume, the aleatory probability or chance that an event will occur in the future is the ratio of positive outcomes with which the event has occurred in the past. The second is that, according to Hume, induction is the process by which a belief forms, involuntarily, from the repeated impressions of sense. In other words, the belief that arises in a human subject is an expression of qualitative, sense vivacity rather than quantitative, reflective ratiocination. The potentially controversial point is as follows. According to Hume, the positive frequency with which an effect follows a cause determines the qualitative vivacity with which we hold belief in that specific cause-effect relationship. Therefore, Hume's notion of induction is an epistemic formulation of mathematically probabilistic justification. Hume regards an inductively arrived at belief as the credence in the future occurrence of a particular event based on the ratio of positive outcomes with which that particular event has previously occurred.

Employing the theory of chance, Hume manages to codify, rationally, indefinite qualitative belief. In ordinary life, the human subject may not, of course, take stock, enumerating, precisely, the proportion of confirming and disconfirming instances with which a specific effect follows from a given cause. However, the force and vivacity of sense impressions repeated with a high positive frequency involuntarily generates, within him or her, a corresponding degree of qualitative belief. Probability is, therefore, the operation of subjective credence based on the ratio of past positive outcomes of an event. Hume thinks of probability as a rigorous codification of sense belief: probability defines, with quantitative precision, a cognitive process that is, in ordinary experience, purely qualitative.


Dutch Books, Group Decision-Making, the Tragedy of the Commons and Strategic Jury Voting

Luc Bovens (London, United Kingdom), Wlodek Rabinowicz (Lund, Sweden)

Distribute white and black hats in a dark room to a group of three rational players, with each player having a fifty-fifty chance of receiving a hat of one colour or the other. Clearly, the chance that, as a result of this distribution, (A) "Not all hats are of the same colour" is 3/4. The light is switched on and all players can see the hats of the other persons, but not the colour of their own hats. Then no matter what combination of hats was assigned, at least one player will see two hats of the same colour. For her, whether all hats are of the same colour depends solely on the colour of her own hat, and hence the chance of (A) equals 1/2.

On Lewis's Principal Principle, a rational player will let her degrees of belief be determined by these chances. So before the light is switched on, all players will assign degree of belief of 3/4 to (A) and after the light is turned on, at least one player will assign degree of belief of 1/2 to (A). Suppose a bookie offers to sell a single bet on (A) with stakes $4 at a price of $3 before the light is turned on and subsequently offers to buy a single bet on (A) with stakes $4 at a price of $2 after the light is turned on. If, following Ramsey, the degree of belief equals the betting rate at which the player is willing to buy and to sell a bet on a given proposition, then any of the players would be willing to buy the first bet and at least one player would be willing to sell the second bet. Whether all hats are of the same colour or not, the bookie can make a Dutch book - she has a guaranteed profit of $1.
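The arithmetic behind the Dutch book can be checked by brute force; the following sketch (stakes and prices as in the abstract, labels invented) enumerates all eight hat assignments.

```python
# Enumerate all hat assignments to check the chances and the bookie's sure profit.
from itertools import product

assignments = list(product('WB', repeat=3))          # 8 equally likely assignments
A = lambda hats: len(set(hats)) > 1                  # (A): not all hats same colour

print(sum(A(h) for h in assignments) / 8)            # P(A) = 0.75

# A player who sees two same-coloured hats: condition on the other two matching.
same_others = [h for h in assignments if h[1] == h[2]]    # say, player 0 sees players 1, 2
print(sum(A(h) for h in same_others) / len(same_others))  # P(A | others match) = 0.5

# Bookie's net payoff, whatever the assignment:
# she sells a $4-stake bet on A for $3, then buys a $4-stake bet on A for $2.
for h in assignments:
    payoff = 3 - 4 * A(h) - 2 + 4 * A(h)             # = $1 in every case
    assert payoff == 1
print("bookie wins $1 on every assignment")
```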

However, it can be shown that a rational player whose degree of belief in (A) equals 1/2 would not volunteer to sell the second bet on (A), whether her aim is to maximise her own payoffs or to maximise the payoffs of the group. The argument to this effect shares a common structure with models (i) for the tragedy of the commons and (ii) for strategic voting in juries.


Non-pragmatic arguments for Bayesianism

Timothy Childers (Prague, Czech Republic)

Dutch Book arguments purport to establish that degrees of belief should correspond to probabilities. However, they are, at least in their traditional form, linked to notions of actual behaviour. Since betting behaviour is influenced by myriad factors, including willingness to bet, the capacity to understand odds, and risk aversion, the arguments become untenable.

Recent attempts to remove these links have been provided by Colin Howson (2003 and in Howson and Urbach 2006). We explore Howson's account of Dutch Book arguments as soundness and completeness proofs, contrasting his "Fregean" account with the traditional "pragmatist" account. We also note that similar contrasting accounts are offered in other areas, namely current philosophy of language and logic.

We conclude that there are good reasons to accept Howson's account of the purpose of Dutch Book arguments, but that there are better arguments which will serve that purpose. In particular we employ a representation theorem developed by DeGroot (1970) and French (1982) on the basis of some ideas found in de Finetti. The theorem does not rely on a preference ordering, but does rely on the notion of a flat distribution, assuming directly a likelihood ordering. We argue that this assumption is not "too strong", and we further argue that such theorems provide a better framework for non-pragmatic accounts of subjective probability.

DeGroot, Morris H., 1970, Optimal Statistical Decisions (New York: McGraw-Hill).
French, Simon, 1982, "On the axiomatisation of subjective probabilities", Theory and Decision 14, 19-33.
Howson, Colin, 2003, "Probability and Logic", Journal of Applied Logic 1, 151-165.
Howson, Colin and Peter Urbach, 2006, Scientific Reasoning: The Bayesian Approach, 3rd ed. (La Salle, IL: Open Court).


What's Happening in Machine Learning Today

David Corfield (Tübingen, Germany)

I shall describe what I see as the major developments in contemporary machine learning, and discuss their philosophical significance. Most noticeable is the use of nonparametric, discriminative models, apparently rendering model selection criteria, such as the Akaike and Bayesian ones, nugatory. In their place we have results which place bounds on the generalisation error rates of models, in terms of their training error rates and the selection process. Some researchers still hope that their discoveries will illuminate human learning; others are happy to devise machines that work. The Bayesian/frequentist divide is still alive and well: what to one side is a prior looks to the other like a means of regularization to avoid ill-posedness. Many machine learning algorithms can be cast in the maximum entropy framework.


The role of intuitive probabilistic reasoning in scientific theory formation

Helen De Cruz (Brussels, Belgium)

Experimental evidence from cognitive science has revealed a discrepancy between intuitive and formal modes of probabilistic reasoning. When asked to make a probabilistic judgement, laypeople fail to think in statistical terms. Instead, they seem to rely on mental modelling of concrete situations (e.g., Johnson-Laird et al., 1999), or on fast and shallow heuristics (e.g., Kahneman & Tversky, 1996). These strategies often lead to errors in probability estimates; e.g., smokers tend to believe that smoking does not significantly increase their chance of getting cancer, because they extrapolate from a very limited data set (i.e., the covariance of cancer and smoking in family members and friends). The cognitive science literature has hitherto mainly focused on the 'negative' aspects of intuitive probabilistic reasoning, in particular its inability to predict the likelihood of single events. However, despite these epistemological limitations, I will defend the view that intuitive probabilistic reasoning plays an important role in scientific practice. I illustrate this with historical examples, including Darwin's Origin of Species (1859), early evolutionary psychology (1970s to early 1990s), and Zahavi's handicap theory (1975). Intuitive probabilistic reasoning plays a legitimate role in guiding scientific creativity, as it can help develop novel ideas in the absence of adequate mathematical models in which such ideas can be phrased more precisely. This is especially important in the early creative stages of theory formation. In the examples mentioned above, the theories were developed well in advance of their mathematical description, and all made extensive use of intuitive probabilistic arguments to bolster claims. For example, statistical methods in population genetics became available more than 50 years after the publication of the Origin of Species, only then enabling evolutionary biologists to assess the influence of natural selection on organic evolution.

References
Darwin, C. (1859). On the origin of species by means of natural selection or the preservation of favoured races in the struggle for life. London: John Murray.
Johnson-Laird, P.N., Legrenzi, P., Girotto, V., & Legrenzi, M.S. (1999). Naive probability: A mental model theory of extensional reasoning. Psychological Review, 106, 62-88.
Kahneman, D. & Tversky, A. (1996). On the reality of cognitive illusions. Psychological Review, 103, 582-591.
Zahavi, A. (1975). Mate selection: A selection for a handicap. Journal of Theoretical Biology, 53, 205-214.


Model Checking Knowledge in Probabilistic Systems

Carla A.D.M. Delgado (Rio de Janeiro, Brazil) and
Mario Benevides (Rio de Janeiro, Brazil)

Our aim is to develop models, languages and algorithms to formally verify knowledge properties of concurrent Multi-Agent Systems. Two models are proposed, and their adequacy is discussed with respect to the representation and verification of knowledge properties.

First, we address the issue of model checking knowledge in concurrent systems. The work benefits from many recent results on model checking and combined logics for time and knowledge, and focuses on the way knowledge relations can be captured from automata-based system specifications. We present a formal language with compositional semantics and the corresponding model checking algorithms to model and verify Multi-Agent Systems (MAS) at the knowledge level, together with a process for obtaining the global automaton for the concurrent system and the knowledge relations for each agent from a set of local automata that represent the behaviour of the individual agents. Our aim is to describe a model suitable for model checking knowledge in a pre-defined way, but with the advantage that the knowledge relations are extracted directly from the automata-based model.

Finally, we extend the previous approach to a probabilistic and non-deterministic model of discrete time. Instead of automata we use Markov decision processes. Probability measures are defined, and non-determinism is dealt with by means of adversaries; thus the global MDP becomes probabilistic. In each state of the system it is possible to verify probabilistic temporal formulas of PCTL. We extend the language of PCTL with knowledge operators, one for each agent, and provide algorithms for model checking formulas of this language. The semantics associates a probability distribution with each set of indistinguishable states.
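As a minimal illustration of one ingredient of such model checking -- computing, in each state of a small Markov chain, the probability that a target proposition is eventually reached, as needed for a PCTL "eventually" formula -- consider the following sketch. Knowledge operators, adversaries and the full MDP setting are omitted, and the transition matrix is invented for illustration.

```python
# Minimal sketch: P(eventually reach a target set) in each state of a Markov chain,
# the core computation behind PCTL "F target" queries (no adversaries, no knowledge).
import numpy as np

P = np.array([[0.5, 0.3, 0.2],     # illustrative 3-state transition matrix
              [0.0, 0.4, 0.6],
              [0.0, 0.0, 1.0]])
target = {2}                        # states satisfying the atomic proposition

states = range(len(P))
non_target = [s for s in states if s not in target]

# Solve x_s = sum_t P[s,t] * x_t for non-target states, with x_t = 1 on the target.
# (Assumes the target is reachable from every state; otherwise prune such states first.)
A = np.eye(len(non_target)) - P[np.ix_(non_target, non_target)]
b = P[np.ix_(non_target, list(target))].sum(axis=1)
x = np.linalg.solve(A, b)

prob = {s: 1.0 for s in target}
prob.update(dict(zip(non_target, x)))
print(prob)   # e.g. check a formula like "P>=0.9 [F target]" against these values
```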


Can there be a propensity interpretation of conditional probabilities?

Isabelle Drouet (Paris, France)

The propensity interpretation of probability provides us with our sole concept of objective singular probability. To this extent it seems theoretically indispensable. At the same time, it is the only interpretation that is available for the probabilities attached to quite a wide range of natural phenomena -- especially phenomena falling within the scope of fundamental physics.

Yet, the propensity interpretation has attracted many criticisms since it was introduced by Popper in the 1950s. Amongst those criticisms, the most serious is probably the one known as "Humphreys's Paradox". The paradox states that it is impossible to give a propensity interpretation of conditional probabilities. Humphreys's argument in favour of this statement is roughly as follows: propensity candidates for the interpretation of conditional probabilities -- what Humphreys calls "conditional propensities" -- provably do not behave as conditional probabilities do.

The first aim of the present talk is to call Humphreys's analysis into question. More precisely, my criticism targets the idea that the propensity interpretation of conditional probabilities is analytically contained in the propensity proposal for the interpretation of absolute probabilities. Against this idea, I claim that conditionalisation needs an interpretation of its own, and that this interpretation can in no way be reduced to the interpretation of absolute probabilities. This claim is based on an analysis of the concept of conditional probability, leading to a general account of what it is to interpret conditional probabilities -- and especially of how the interpretation of conditional probabilities is to be articulated with an interpretation of absolute probabilities. The account is shown to be corroborated by the subjectivist and frequentist interpretations of the probability calculus -- considered as a calculus of both conditional and absolute probabilities. It leads us to reject Humphreys's argument as relying on a misconception as to what an interpretation of conditional probabilities is.

Now, giving a general account of what it is to interpret conditional probabilities does not serve only as a tool to criticize Humphreys's argument. Positively, it allows us to reformulate the question of the propensity interpretation of conditional probabilities. As an answer to the question thus reformulated, I propose a propensity interpretation of conditional probabilities. According to this proposal, relative to a system S, the conditional probability p(A | B) is to be interpreted as the propensity for A in the system which is the most similar to S amongst those for which the propensity for B has value 1. Obvious possible difficulties with this proposal are envisaged and discussed. It is concluded that none of them is fatal to it. Therefore my conclusion is that the proposal may be accepted -- at least for the time being.

Bibliographical indications:
Humphreys (1985): "Why propensities cannot be probabilities", The Philosophical Review 94.
Humphreys (2004): "Some considerations on conditional chances", BJPS 55.
McCurdy (1996): "Humphreys's paradox and the interpretation of inverse conditional propensities", Synthese 108.
Popper (1959): "The propensity interpretation of probabilities", BJPS 10.


Epistemological Critiques of "Classical" Logic: Two Case Studies

Branden Fitelson (Berkeley CA, United States of America)

I will begin the talk by discussing a (naive) "relevantist" argument against classical deductive logic. I will explain (Harman-style) why this argument fails, and why no "nearby" argument will succeed. Then, by analogy, I argue that Goodman's "grue" argument against "classical" inductive logic (Hempelian and Carnapian flavors) fails for analogous reasons. Indeed, I argue that Goodman's "grue" argument against "classical" inductive logic is even less compelling than the relevantist's argument against classical deductive logic. The analogy with deductive logic also reveals several other interesting features of Goodman's "New Riddle".


Probability: One or many?

Maria Carla Galavotti (Bologna, Italy)

More than three and a half centuries after the "official" birth of probability with the work of Blaise Pascal and Pierre Fermat, and about two centuries after Pierre Simon de Laplace endowed probability with a univocal meaning and started a tradition in this sense, the dispute on the interpretation of probability is far from being settled. While the classical interpretation forged by Laplace is outdated, there are at least four interpretations still on the market: frequentism, propensionism, logicism and subjectivism, each of which admits of a number of variants. Upholders of one or the other of these interpretations have traditionally been quarrelling over the "true" meaning of probability and the "right" method for calculating initial probabilities.

According to a widespread opinion, the natural sciences call for an objective notion of probability and the subjective interpretation is better suited to the social sciences. This viewpoint is shared by authors of different inspiration - not just frequentists but also logicists like Harold Jeffreys. Even an upholder of a pluralistic tendency like Donald Gillies maintains that the natural sciences - physics in the first place - and the social sciences call for different notions of probability. Gillies makes the further claim that the propensity theory is apt to represent physical probabilities, while probabilities encountered in the social sciences are better interpreted along subjectivist lines.

As a matter of fact, subjectivism seems to be surrounded by a halo of arbitrariness that makes it unpalatable even to those operating in various areas of the social sciences; for instance, legal scholars tend to discard it in favour of the logical interpretation, reassured by the promise of objectivity involved in the term "logical".

By contrast, the paper will argue that the subjective interpretation has the resources to account for all applications of probability, in the natural as well as the social sciences.


From the doctrine of probability to the theory of probabilities: the emergence of modern probability calculus

Anne-Sophie Godfroy-Genin (Paris, France)

In the period 1654-1713, modern probability emerged simultaneously from the calculations on games of chance and their applications to business and law, from the calculations and tables concerning large collections of data such as mortality tables, and from the philosophical and theological concept of qualitative probability inherited from Aristotle and Aquinas and revised by the Jesuit casuists. The standard solution of the division problem discovered in 1654 by Pascal and Fermat made it possible to quantify uncertainty and, after fifty years of conceptual difficulties, led from a theological doctrine of probability to a mathematical theory of probabilities. At first an instrument to support rational decision under uncertainty, it progressively became a tool for the epistemic measurement of belief, an objective measure of uncertainty and a new logic against the progress of scepticism. This sheds light on modern discussions of the nature of probability.


The fundamental shift in the metaphor through which philosophers conceive of learning

Norma B. Goethe (Cordoba, Argentina)

Inductive inference has always been a major concern in philosophy. This paper considers two different ways of thinking about induction: the classical way and the new, learning-theoretic way. After a brief introduction, I discuss the classical style of thinking about induction, which conceives of inductive inference as an extension of deductive argumentation. Secondly, I shall focus on the new, alternative learning-theoretic approach which sees induction as a type of computational process that converges to the truth. I conclude the paper by considering some of the important philosophical consequences of this fundamental shift in perspective.


Statistics without Stochastics

Peter Grünwald (Amsterdam, The Netherlands)

Consider a set of experts that sequentially predict the future given the past and given some side information. For example, each expert may be a weather(wo)man who, each day, predicts the probability that it will rain the next day.

We describe a method for combining the experts' predictions that performs well *on every possible sequence of data*. In marked contrast, classical statistical methods only work well under stochastic assumptions ("the data are drawn from some distribution P") that are often violated in practice.
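One textbook way to make this concrete (a standard construction offered here for illustration, not necessarily the method of the talk) is the Bayesian mixture of expert forecasts under logarithmic loss: on every sequence, its cumulative loss exceeds that of the best expert by at most the logarithm of the number of experts.

```python
# Sketch of combining expert forecasts so that, on *every* binary sequence,
# the mixture's cumulative log loss is within log(K) of the best expert's.
import math

def log_loss(p, outcome):              # p = predicted probability of rain
    return -math.log(p if outcome else 1 - p)

experts = [lambda history: 0.8,                                      # always optimistic
           lambda history: 0.2,                                      # always pessimistic
           lambda history: (sum(history) + 1) / (len(history) + 2)]  # Laplace rule

data = [1, 1, 0, 1, 1, 1, 0, 1]        # an arbitrary rain/no-rain sequence

weights = [1.0 / len(experts)] * len(experts)          # uniform prior over experts
mix_loss, expert_loss = 0.0, [0.0] * len(experts)
history = []
for outcome in data:
    preds = [e(history) for e in experts]
    p_mix = sum(w * p for w, p in zip(weights, preds))  # mixture forecast
    mix_loss += log_loss(p_mix, outcome)
    for k, p in enumerate(preds):
        expert_loss[k] += log_loss(p, outcome)
    # Bayes update of the weights by each expert's likelihood for this outcome
    likes = [p if outcome else 1 - p for p in preds]
    total = sum(w * l for w, l in zip(weights, likes))
    weights = [w * l / total for w, l in zip(weights, likes)]
    history.append(outcome)

print(mix_loss, min(expert_loss), math.log(len(experts)))
assert mix_loss <= min(expert_loss) + math.log(len(experts))   # holds for every sequence
```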

Nonstochastic prediction schemes can be used as a basis for robust, nonstochastic versions of standard statistical problems such as parameter estimation and *model selection*, a central issue in essentially all applied sciences: given two structurally different models that fit a set of experimental data about equally well, how should we choose between them?

The resulting theory is closely related to Bayesian statistics, but avoids some of its conceptual problems, essentially by replacing "prior distributions" by "luckiness functions".

This work is based on Dawid's Prequential Analysis, Vovk's work on universal prediction, and Rissanen and Barron's work on the Minimum Description Length Principle.


The Halting Problem is decidable on a set of asymptotic probability one

Joel Hamkins (New York NY, United States of America; Amsterdam, The Netherlands)

Recent research in complexity theory has uncovered the "Black Hole" phenomenon, by which the difficulty of an undecidable or infeasible problem is concentrated on a very small region, a black hole, outside of which it is easy. It was natural to inquire whether the Halting Problem itself exhibited a black hole. Our main theorem shows that indeed it does: the halting problem for Turing machines is polynomial time decidable on a set of probability one with respect to the natural asymptotic measure. The proof is sensitive to details in the particular computational models. This is joint work with Alexei Miasnikov.


Redoing the Foundations of Decision Theory

Joe Halpern (Ithaca NY, United States of America)

In almost all current approaches to decision making, it is assumed that a decision problem is described by a set of states and a set of outcomes, and the decision maker (DM) has preferences over a rather rich set of acts, which are functions from states to outcomes. However, most interesting decision problems do not come with a state space and an outcome space. Indeed, in complex problems it is often far from clear what the state and outcome spaces would be. We present an alternate foundation for decision making, in which the primitive objects of choice are not defined on a state space. A representation theorem is proved that generalizes standard representation theorems in the literature, showing that if the DM's preference relation on her choice set satisfies appropriate axioms, then there exist a set S of states, a set O of outcomes, a way of viewing choices as functions from S to O, a probability on S, and a utility function on O, such that the DM prefers choice a to choice b if and only if the expected utility of a is higher than that of b. Thus, the state space and outcome space are subjective, just like the probability and utility; they are not part of the description of the problem. In principle a modeller can test for SEU (subjective expected utility) behavior without having access to states or outcomes. A number of benefits of this generalization are discussed.


Indeterminacy in the Combining of Attributes

Jeffrey Helzner (New York NY, United States of America)

One point of entry into certain theoretical studies of decision making begins with a study of set-valued choice functions. Let X be a set of alternatives. Let S be a collection of subsets of X, e.g. the finite subsets. If C is a function on S such that, for all Y in S, C(Y) is a subset of Y and C(Y) is nonempty whenever Y is nonempty, then C is a choice function on S. The term "choice function" is, though standard, a bit misleading since C may be set-valued while in the intended interpretation no more than one alternative can actually be selected from a given menu of alternatives. By way of avoiding this tension it is sometimes convenient to say that an element y in C(Y) is "admissible" in Y; one possible reading of this term takes the admissible elements of Y to be the set of alternatives that the agent would be willing to choose if offered the opportunity to make a selection from Y.

Let X be the product X1 × ... × Xn. One familiar interpretation takes (x1, ... , xn) in X to be the "act" that would have outcome xi if the ith state were to obtain. Normative studies concerning this interpretation often begin by equating admissibility with maximizing expectation against numerically precise probabilities and utilities. Though less familiar than these traditional expected utility models, there is now an active study of choice functions that allow for indeterminacy in the underlying probabilities or utilities that inform expected utility maximization. While some of these models remain committed to the usual reduction of choice to a complete ranking of the alternatives -- or, more suggestively, a reduction of choice to preference -- some who have allowed for such indeterminacy, Isaac Levi most notably, have abandoned this reduction. Abandoning this traditional reduction forces one to rethink the role of preference in choice and the manner in which preferences are to be measured within the context of such an account.

A second important interpretation of choice in product sets takes (x1, ..., xn) to be an alternative that has value xi with respect to the ith attribute. Applications of this sort of interpretation range from psychological models of judgment and decision to prescriptive techniques in decision analysis. The most well-studied models in this setting assume that admissibility amounts to maximizing a real-valued function that is additive over the attributes; that is, the relevant index is computed by summing the alternative's performance over each of the attributes. As in the act-state interpretation one may allow for indeterminacy, e.g. there might be a family of appropriate additive models. Again, such indeterminacy can require us to rethink the manner in which preferences are to be measured. Techniques that have been developed for the analogous problem within the act-state framework are not appropriate here since they impose structural requirements (e.g. convex combinations) that are either unfounded or undesirable in the multiattribute interpretation. We will consider the possibility of relaxing assumptions in the theory of additive conjoint measurement in a way that allows for indeterminacy in the combining of attributes.
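One simple way to model such indeterminacy, offered purely for illustration and in the spirit of Levi's E-admissibility, is to call an alternative admissible if it maximizes the additive index for at least one weight vector in a family of additive models; the alternatives and weights below are invented.

```python
# Sketch: admissibility under a *family* of additive multiattribute models.
# An alternative is admissible if some weight vector in the family makes it optimal.

alternatives = {            # performance on two attributes (illustrative numbers)
    'a': (0.9, 0.2),
    'b': (0.5, 0.6),
    'c': (0.1, 0.9),
    'd': (0.3, 0.3),        # dominated: never optimal
}

weight_family = [(0.8, 0.2), (0.45, 0.55), (0.3, 0.7)]   # indeterminate trade-offs

def additive_value(x, w):
    return sum(wi * xi for wi, xi in zip(w, x))

admissible = set()
for w in weight_family:
    best = max(alternatives, key=lambda name: additive_value(alternatives[name], w))
    admissible.add(best)

print(admissible)   # a, b and c are each optimal for some admissible weighting; d never is
```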


Beliefs: identification and change

Brian Hill (Paris, France)

Beliefs play an important role in human decision. Beliefs change. These banalities conceal the two most important questions regarding beliefs. The first is the question of the identification of the role of beliefs in decision; the second is the question of how they change, of their dynamics. Both these questions have been posed, and answers have been proposed, in terms of probabilities, the assumption being that probabilities can be considered a faithful enough representation of beliefs. This assumption is retained for the purposes of this presentation.

The question of the role of beliefs in decision proves difficult because, as well as the agent's beliefs about the world (or probabilities), his preferences over the consequences of his actions (or utilities) are involved in his choice. The problem is to separate the agent's probabilities from his utilities. Answers have normally come in the form of representation theorems stating that, for any agent whose preference relation (over acts) satisfies particular conditions, there is a unique probability function and an essentially unique utility function --intuitively the agent's beliefs and preferences over consequences-- such that the agent prefers acts which have a larger expected utility.

On the other hand, answers to the question of belief change have generally come in the form of mechanisms (Bayesian update, imaging, Jeffrey conditionalisation etc.) which, given the prior belief state and a piece of new information, yield a posterior belief state.

These two questions have generally been dealt with separately. The assumption underlying this practice is that they can be treated separately. But what happens in situations where there is a change in the agent's behaviour, but it is an open question whether this change is to be understood as belief change or as utility change? Such questions will have to be faced soon, given the current growing interest in preference change. An interesting example, important in Social Theory, is the phenomenon of alleged "adaptive preferences" or "sour grapes": is it really the agent's utilities (preferences over consequences) which change, or rather his beliefs?

In the presentation, it will be argued that the divide-and-conquer strategy is inadequate for phenomena where both the distinction between beliefs and preferences and the change of one or the other are at issue, on two counts. Firstly, the conditions demanded by representation theorems are particularly implausible when they are applied to the agent's preferences over the acts available in particular situations. Secondly, this strategy risks making the changes of attitudes between situations more obscure, if not totally incomprehensible. An alternative strategy for tackling these questions will be presented, which treats the two questions simultaneously, using information about the relationship between different situations to inform the elicitation of beliefs and preferences in individual situations. Such a strategy places different constraints on models of decision and change, in so far as these models should be able to deal fruitfully with both the behaviour in particular decision situations and the change between situations. Aspects of the application of this strategy will be considered, and illustrated on the case of sour grapes phenomena.


Dynamic Update with Probabilities

Barteld Kooi (Groningen, The Netherlands)
(with Johan van Benthem & Jelle Gerbrandy)

Current dynamic-epistemic logics model different types of information change in multi-agent scenarios. We propose a way to generalize these logics to a probabilistic setting, obtaining a calculus for multi-agent update with different slots for probability, and a matching dynamic logic of information change that may itself have a probabilistic character. In addition to our proposed update rule per se, we present a general completeness result for a large class of dynamic probabilistic logics. Finally, we also discuss how our basic update rule can be parametrized for different 'update policies'.


Empirical Progress and Truth Approximation by the "Hypothetical Probabilistic (HP-)method"

Theo Kuipers (Groningen, The Netherlands)

Probabilistic approaches to confirmation and testing are usually not seen as concretizations of deductive approaches. However, as argued elsewhere (1), if one uses 'positive relevance', i.e. p(E/H)>p(E), as the basic criterion of probabilistic confirmation of a hypothesis H by evidence E, 'deductive confirmation' appears as the common idealization, in the minimal sense of an extreme special case (2) , in the rich landscape of probabilistic confirmation notions.

Now, similarly, if 'p(E/H)>p(E)' is taken as the basic definition of 'E is a probabilistic consequence of H', deductive consequences are idealizations of probabilistic consequences. From this perspective, the HD-method of testing, based on 'H entails E', resulting either in deductive confirmation or falsification, depending on whether the predicted E turns out to be true or false, is an idealization of the Hypothetico-Probabilistic (HP-)method.

According to the HP-method, testing a hypothesis H amounts to 'deriving' probabilistic consequences E, in the sense defined above. When such probabilistic predictions come true, H is probabilistically confirmed, otherwise, that is, when the produced evidence is not-E, and hence is such that p(not-E/H) < p(not-E), H is (said to be) probabilistically disconfirmed. All this was already anticipated by Ilkka Niiniluoto in 1973 (3) .

In my "From Instrumentalism to Constructive Realism. On some relations between confirmation, empirical progress and truth approximation" (4) I have argued that the comparative evaluation of two (deterministic) theories, based on repeated application of the HD-method, to be continued if one or both have already been falsified, results in a plausible explication of the notion of (deductivistic) empirical progress, with articulated perspectives on (deductivistic) truth approximation.

Hence, it is plausible to try to argue along similar lines that the comparative evaluation of theories based on the repeated application of the HP-method, results in concretized versions of empirical progress and perspectives on truth approximation. The repeated application amounts to the systematic comparison of the likelihoods, p(E/X) and p(E/Y), of two theories X and Y relative to the successive experimental results E, generally called the L(ikelihood)C(omparison)-method. It was already shown (5) that the HP-/LC-method suggests plausible probabilistic concretizations up to and including 'being piecemeal more successful'.

This paper deals with the perspectives of the HP-/LC-method on "empirical progress" and "truth approximation" of successive deterministic theories, assuming that the truth is also a deterministic theory. The proposed definition of "Y is probabilistically closer to the truth than X" is intuitively rather plausible and turns out to be a concretization of 'deductively closer to' and 'quantitatively closer to'. Moreover, it entails that, in the long run, Y will show empirical progress relative to X, that is, Y will become irreversibly more successful than X in the cumulative sense, i.e., the likelihood of Y for the total evidence exceeds that of X to a given degree (not < 1). This 'threshold success theorem' directly supports the claim that the HP-/LC-method is functional for probabilistic truth approximation.

(1) Kuipers, T., Structures in Science, Synthese Library 301, Kluwer AP, Dordrecht, 2001, Ch. 7.1.2.
(2) For, if H entails E, p(E/H)=1 or, if undefined because p(H)=0, it is plausible to define it as 1.
(3) See (mainly) his "Towards a non-inductivist logic of induction" in I. Niiniluoto and R. Tuomela: 1973, Theoretical Concepts and Hypothetico-Inductive Inference, Synthese Library, Vol. 53, Reidel, Dordrecht.
(4) Synthese Library 287, Kluwer AP, Dordrecht, 2000.
(5) Kuipers, T. (to appear), "The hypothetico-probabilistic (HP-)method as a concretization of the HD-method", to appear in Festschrift in honour of Ilkka Niiniluoto
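Returning to the LC-method itself, its bookkeeping can be sketched as follows (likelihoods and threshold are purely illustrative): for a stream of results E_1, E_2, ..., the cumulative likelihoods p(E/X) and p(E/Y) are compared, and Y counts as cumulatively more successful once their ratio exceeds a chosen degree.

```python
# Schematic bookkeeping for the L(ikelihood)C(omparison)-method: compare the
# cumulative likelihoods of two theories X and Y on successive results E_1, E_2, ...
import math

# Illustrative likelihoods p(E_i / theory) for each successive experimental result.
likelihood_X = [0.5, 0.4, 0.6, 0.3, 0.5, 0.4]
likelihood_Y = [0.7, 0.6, 0.8, 0.5, 0.7, 0.6]

threshold = 5.0           # the "given degree" by which Y must out-perform X
log_ratio = 0.0
for i, (px, py) in enumerate(zip(likelihood_X, likelihood_Y), start=1):
    log_ratio += math.log(py) - math.log(px)
    ratio = math.exp(log_ratio)     # p(E_1..E_i / Y) / p(E_1..E_i / X)
    print(f"after E_{i}: cumulative likelihood ratio = {ratio:.2f}",
          "-> Y cumulatively more successful" if ratio >= threshold else "")
```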


The Principle of Conformity and Spectrum Exchangeability

Jürgen Landes (Manchester, United Kingdom), Jeff Paris (Manchester, United Kingdom) and Alena Vencovska (Manchester, United Kingdom)

We show that if a probability function on the sentences of a finite predicate language satisfies Spectrum Exchangeability then it must satisfy a very strong conformity principle when restricted to the sentences of certain derived languages.


Combining probability and logic: Embedding causal discovery in a logical framework

Bert Leuridan (Ghent, Belgium)

The meaning and `logic' of the concepts of causation and of causal discovery have long escaped precise treatment. Since the 1980s, however, this has changed. By combining probability and graph theory, Pearl (2000), Spirtes et al. (2000) and others were able to (partly) overcome these problems. They have developed algorithms for causal discovery, consisting of syntactic inference rules that allow one to derive causal structure from purely observational data (e.g. Pearl's IC or Spirtes' and Glymour's PC algorithm).

However valuable and fruitful these algorithms are, from a logical point of view they face several problems. Firstly, while explicitly incorporating probabilistic and graph theoretic reasoning, they also make use of classical logic. But these classical inferences remain implicit. Secondly, although the algorithms are backed with a semantics (which is formulated graph theoretically), the formulation of this semantics strongly deviates from those encountered in logic (for instance, the notion of truth in a model is absent). Thirdly and most importantly, causal discovery is a form of non-monotonic reasoning. When confronted with new observational data, a non-omniscient agent may have to drop causal beliefs previously derived. Neither PC nor IC is designed to handle such cases. They presuppose that the agent has full knowledge (they take a full, stable distribution over a set of variables as their input).

The aim of this paper is to show how probabilistic theories and algorithms for causal discovery can be embedded in a logical framework and to discuss the advantages of doing so.

More specifically, I will embed (a revised version of) the IC algorithm within a logical framework and show how the resulting logic ALIC solves the problems in question. Well-formed formulas (wffs) in ALIC are either atomic probabilistic sentences, or atomic causal statements, or complex sentences built from such atomic sentences by means of logical connectives (negation, conjunction, etc.). ALIC is formulated within the framework of adaptive logics. Adaptive logics are a class of non-monotonic logics that all have a dynamic proof theory and a (sound and complete) semantics (Batens, 2004, 2007). ALIC-models assign truth values (0, 1) to all wffs and represent faithful distributions and graphs. The proof theory contains non-conditional axioms and inference rules. These include the rules of classical propositional logic, plus probabilistic and causal axioms and inference rules. Most importantly, it also contains a conditional rule to infer causal relations from correlations. If X and Y are correlated, then one may infer they are adjacent on the condition that no non-empty set of variables screens them off (cf. Van Dyck, 2004, where an adaptive logic for causal discovery is formulated; this logic has no semantics, however, and is not formulated within the standard format for adaptive logics). A conditional line in a proof may later need to be marked if its condition is violated, e.g. in view of newly added premises. Together, the conditional rule and the marking definition (which governs the marking process) provide the logic with a dynamics which makes it very suited for formally handling the non-monotonic characteristics of causal discovery.
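The defeasible step that this conditional rule formalizes can be sketched in its PC/IC-skeleton form: keep correlated variables adjacent unless some set of other variables screens them off, retracting the adjacency when such a set is found. The conditional-independence test is treated as an oracle, and the example model is invented.

```python
# Sketch of the defeasible step formalized by ALIC's conditional rule, in its
# PC/IC-skeleton form: X and Y are kept adjacent unless some set Z screens them off.
from itertools import combinations

def skeleton(variables, independent):
    """independent(X, Y, Z) is an oracle: True iff X and Y are independent given the set Z."""
    adjacent = {frozenset(p) for p in combinations(variables, 2)
                if not independent(*p, frozenset())}          # start from correlations
    for X, Y in list(map(tuple, adjacent)):
        others = [v for v in variables if v not in (X, Y)]
        for size in range(1, len(others) + 1):
            if any(independent(X, Y, frozenset(Z)) for Z in combinations(others, size)):
                adjacent.discard(frozenset((X, Y)))           # condition violated: retract
                break
    return adjacent

# Illustrative oracle for the chain A -> B -> C: A and C are screened off by {B}.
def indep(X, Y, Z):
    return {X, Y} == {'A', 'C'} and 'B' in Z

print(skeleton(['A', 'B', 'C'], indep))   # edges A-B and B-C remain; the A-C edge is retracted
```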

I shall proceed as follows. Firstly, I will introduce the reader to the PC and IC algorithms. I will discuss their value and their main shortcomings, chiefly focussing on the non-monotonic nature of causal discovery. Then I will present ALIC's language and proof theory, focussing on the resemblances and differences with IC. Finally, I will present its semantics.

References Batens, D. (2004). The need for adaptive logics in epistemology. In Gabbay, D., Rahman, S., Symons, J., and Van Bendegem, J.-P., editors, Logic, Epistemology and the Unity of Science, pages 459-485. Kluwer, Dordrecht.
Batens, D. (2007). A universal logic approach to adaptive logics. Logica Universalis, 1:221-242.
Pearl, J. (2000). Causality. Models, Reasoning, and Inference. Cambridge University Press, Cambridge.
Spirtes, P., Glymour, C., and Scheines, R. (2000). Causation, Prediction, and Search. MIT Press, Cambridge, Massachusetts.
Van Dyck, M. (2004). Causal discovery using adaptive logics. Towards a more realistic heuristics for human causal learning. Logique et Analyse, 185-188:5-32.


Probability in Evolutionary Theory

Aidan Lyon (Canberra, Australia)

Evolutionary theory is up to its neck in probability. For example, probability can be found in our understanding of mutation events, drift, fitness, coalescence, and macroevolution.

Some authors have attempted to provide a unified realist interpretation of these probabilities. Or, when that has not worked, some have decided that this means there is no interpretation available at all, defending a "no theory" theory of probability in evolution. I will argue that when we look closely at the various probabilistic concepts in evolutionary theory, then attempts to provide a unified interpretation of all these applications of probability appear to be poorly motivated. As a consequence of this, we need to be more careful in drawing conclusions from the fact that each interpretation fails for just one particular area of evolutionary theory. I will also argue that a plurality of interpretations is much better than no interpretation at all.

On a more positive note, I will outline a particular way that a plurality of objective and subjective interpretations of probability can be strung together which provides a suitable understanding of the role probability plays in evolutionary theory.


Fuzzy logic and betting games

Ondrej Majer (Prague, Czech Republic)

Probability and fuzzy logic have for a long time been considered as two incompatible, or at least complementary, theories of uncertainty. The aim of the paper is to explore the common background of the rival theories. In particular we shall concentrate on interpreting fuzzy logics in a probabilistic framework using the method of betting games proposed by Giles and Fermueller.


Like-Minded Agents as a Foundation for Common Priors

Klaus Nehring (Davis CA, United States of America)

The Common Prior Assumption (CPA) plays a central role in information economics and game theory. However, in situations of incomplete information, that is: in situations in which agents are mutually uncertain about each others' beliefs, and without a preceding stage in which beliefs were commonly known, there is a significant gap between the formal statement of the CPA and its underlying intuitive content. Indeed, Gul (1998) has even questioned whether the CPA can be transparently interpreted at all in this context. This meaningfulness question has by now been successfully addressed in a number of papers in the literature.

Nonetheless, the extant results leave a major gap in the normative and positive foundations of the CPA in that all extant characterizations make assumptions directly on what is commonly known; that is to say, they make assumptions which, if true, must be commonly known. By contrast, any norm of "interactive rationality" must be "fully local", i.e. can only constrain the actual beliefs of actual agents; a normative account of the CPA must therefore be able to clearly accommodate patterns of interactive beliefs within which all agents are de facto (at the true state) interactively rational, but that put positive probability on other agents' irrationality, or on their belief in others' irrationality. In other words, interactive rationality must be defined in terms of fully local properties of beliefs.

The goal of this paper is therefore to derive the CPA from common knowledge of fully local properties of agents' beliefs, specifically from common knowledge of their "Like-Mindedness". We define like-mindedness of two agents formally as equality of their beliefs, conditional on knowledge of both agents' entire belief hierarchies; the conditioning ensures that the agents' beliefs are compared on the basis of the same (hypothetical) information. Like-mindedness can be viewed as an interpersonal generalization of the "reflection principle" studied in the philosophical literature.

We show that in the general case of many-sided incomplete information there is no fully local property whose being commonly known characterizes the existence of a common prior. Thus any foundation of the CPA must rely on the existence of auxiliary assumptions that are not entailed by the CPA itself. In the present paper, we assume the existence of an "uninformative" outsider to close this gap. Importantly and in line with the incomplete information setting, uninformativeness does not require that insiders know anything about the outsider's beliefs. The main formal result of the paper shows that common knowledge of like-mindedness with the outsider together with common knowledge of his uninformativeness yields a common prior among insiders.

In a concluding section, we also develop a richer framework in which agents' "epistemic attitudes" are introduced as independent primitives, and like-mindedness of beliefs is derived from the recognition of other agents as "equally rational". This framework allows one to formulate competing normative positions on the content of intersubjective rationality in terms of alternative restrictions on these equivalence relations. We distinguish in particular a rationalist, a pluralist, and a relativist position.


Measuring the Uncertain

Martin Neumann (Osnabrück, Germany)

Throughout its history, the probability calculus has been faced with the problem of its interpretation. Roughly speaking, two major approaches can be distinguished: a subjective and an objective interpretation. The latter can be further distinguished into theories of relative frequencies and propensity theories, most prominently advocated by Karl Popper. Propensity theories can be characterised as theories of objective single case probabilities.

In this talk a historical theory of such objective single case probabilities will be outlined: the so-called "Spielraum" (i.e. range or scope) theory of probability, which was developed in 1886 by the German physiologist Johannes von Kries. It will be shown that its style of probabilistic reasoning is fundamentally different from that of current propensity theories: on the one hand, propensities are measured by relative frequencies and on the other hand, they refer to an ontic indeterminism. Neither claim holds for the "Spielraum" theory. This will be shown against the backdrop of 19th century science.

The question of measuring objective single case probabilities will be investigated against the backdrop of the historical framework of Laplacian probability: the sorest point within Laplacian probability theory was the determination of cases of equal possibility. In the 19th century, theories of so-called objective possibility were of growing interest. Yet the question remained of how to identify cases of equal possibility in nature. In developing the "Spielraum" theory, von Kries provided criteria for measuring cases of equal possibility. These conditions are called indifference, comparability and originality. They will be outlined in detail in the talk. Thereby von Kries developed a sophisticated theory of measurement. In modern terms it can be characterised as a theory of extensive measurement. In particular, these conditions have to be fulfilled in any single case. Thus, a "Spielraum" is not measured by relative frequencies.

The question of (in)determinism will be investigated against the backdrop of developments in the kinetic theory of gases, in particular the 2nd law of thermodynamics. Originally, the kinetic theory of gases was introduced to reduce the phenomenological laws of thermodynamics to the laws of Newtonian mechanics. However, these are reversible while the 2nd law of thermodynamics is irreversible. Ludwig Boltzmann met this objection with probabilistic considerations. Yet this was in contradiction with the determinism of Newtonian mechanics. However, von Kries demonstrated that Boltzmann's considerations fulfil the measure-theoretical conditions of the "Spielraum" theory. This enabled a formalisation of randomness which made it possible to reconcile probabilistic reasoning with scientific determinism. Thus, contrary to modern propensities, the "Spielraum" theory does not call for an indeterministic world view.


Probabilistic Justification and the Regress Problem

Jeanne Peijnenburg (Groningen, The Netherlands)

Today epistemologists are usually probabilists: they hold that epistemic justification is mostly probabilistic in nature. If a person S is epistemically (rather than prudentially) justified in believing a proposition E0, and if this justification is inferential (rather than noninferential or immediate), then typically S believes a proposition E1 which makes E0 probable.

How to justify E1 epistemically? Again, if the justification is inferential, then there is a proposition E2 that makes E1 probable. Imagine that E2 is in turn made probable by E3, and that E3 is made probable by E4, and so on, ad infinitum. Is such a process possible? Does the "ad infinitum" make sense? The question is known as the Regress Problem and the reactions to it are fourfold. Skeptics have hailed it as another indication of the fact that we can never be justified in believing anything. Foundationalists famously argue that the process must come to an end in a proposition that is itself noninferentially justified. Coherentists, too, maintain that the infinite regress can be blocked, but unlike foundationalists they hold that the inferential justification need not be linear and may not terminate at a unique stopping point. Finally, infinitists have claimed that there is nothing troublesome about infinite regresses, the reason being that an infinite chain of reasoning need not be actually completed for a proposition or belief to be justified.

I will defend a view that is different from all four. Against skeptics, foundationalists and coherentists I will show that an infinite regress can make sense; against infinitists I will show that some beliefs are justified by an infinite chain of reasons that can be formally completed.
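A toy numerical illustration of the last claim (assuming, only for simplicity, the same conditional probabilities at every link of the chain): the probability of E0 is fixed by the infinite chain of probabilistic reasons and can be computed as a limit, without the chain ever terminating in a first, noninferentially justified member.

```python
# Toy illustration: an infinite chain E0 <- E1 <- E2 <- ... with, at every link,
# P(E_i | E_{i+1}) = alpha and P(E_i | not E_{i+1}) = beta.  Iterating the rule
# P(E_i) = beta + (alpha - beta) * P(E_{i+1}) from *any* starting value converges,
# so the regress fixes P(E0) without a first, unconditionally justified member.
alpha, beta = 0.9, 0.2

def prob_E0(depth, start):
    """Propagate a probability assigned at the given depth down the chain to E0."""
    p = start
    for _ in range(depth):
        p = beta + (alpha - beta) * p
    return p

for start in (0.0, 0.5, 1.0):                    # wildly different deep-level guesses
    print(start, prob_E0(50, start))             # all converge to the same value for E0

print(beta / (1 - alpha + beta))                 # the limit fixed by the infinite chain (~0.667)
```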


Probability Spaces for First-Order Logic

Kenneth Presting (Chapel Hill NC, United States of America)

Following the pioneering work of Haim Gaifman in 1964, the development of probability logic in the first-order setting has been advanced by Kyburg, Bacchus, Halpern, and many others. Most of this work has been in the context of probability logics or probability semantics. The present article develops a different approach which is closer to the traditional measure-theoretic foundation of probability due to Kolmogorov. Our approach is similar to that of Fenstad, in that we preserve the intuition that the probability of an expression is the measure of its extension, that is, the set where the expression is true. We construct probability spaces from models for the language, and from classes of models. This allows explicit calculation of objective probabilities and complements the more subjective approach of probability logic.

We consider a first-order language (FOL) L with countably many variables, interpreted in a given model M with a countable domain. Every FOL contains two varieties of expressions: open formulas and closed sentences. In the standard Tarski semantics, the extension of an open formula is the set of sequences s = (s(1), s(2), ...) of domain elements of M that satisfy the formula. The set of all these extensions, denoted [L], generates a sigma-algebra σ[L]. The Boolean operations in the sigma-algebra are isomorphic to logical conjunction and disjunction in the language. Furthermore, in countable models, existential and universal quantification are homomorphic to countable union and intersection, which is shown by embedding the cylindric algebra over the sequences into the infinite-dimensional product algebra M^N. Thus the set of all sequences is the domain of a measurable space (M^N, σ[L]). These sequences may be interpreted as the outcome space for an experiment of random sampling (with replacement) from the model domain. Assuming exchangeability of the domain members, the resulting probabilities for open formulas are identical to the probabilities calculated from traditional sampling models.

Closed sentences are satisfied either by every sequence or by no sequence. Thus they have an essential role in the space of sequences, although that space reveals little information to distinguish one generalization from any other. We construct a probability space in which generalizations have intermediate probability by appealing to the concept of sampling without replacement. Every sample from a population is a subset of that population, and corresponding to the model M there is a class of sub-models 2^M. Letting Lq stand for the set of closed sentences (i.e. quantified formulas) in L, we have another measurable space (2^M, σ[Lq]) generated by the quantified formulas Lq, such that the extension of each sentence is the set of sub-models where it is true. It is an open question whether the probability of a closed generalization obtained from the space of sub-models is computable from the probabilities of corresponding open formulas in the space of sequences. The sample space of sub-models can be given a Lebesgue measure using a construction due to Halmos, which corresponds to giving each member of the domain a 50% chance of inclusion in any random sample. This case has interesting properties, some of which are explored.
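The two constructions can be illustrated computationally on a toy model (domain and formulas invented): the probability of an open formula is the measure of its extension among variable assignments (sampling with replacement), while the probability of a closed generalization is its measure among sub-models, each domain element being included with probability 1/2.

```python
# Toy illustration of the two spaces: assignments of domain elements to variables
# (sampling with replacement) for open formulas, and random sub-models (each element
# kept with probability 1/2) for closed generalizations.  Model and formulas invented.
from itertools import product, chain, combinations

M = [0, 1, 2, 3]                                   # a small model domain

# Open formula phi(x1, x2): "x1 <= x2"; only finitely many variables matter, so we
# can compute the measure of its extension over assignments to (x1, x2) exactly.
open_extension = [s for s in product(M, repeat=2) if s[0] <= s[1]]
print(len(open_extension) / len(M) ** 2)           # 10/16 = 0.625

# Closed sentence: "every element of the sub-model is even".  Its probability is the
# fraction of sub-models (subsets of M, coin-flipping measure) in which it holds.
submodels = chain.from_iterable(combinations(M, r) for r in range(len(M) + 1))
holds = [sub for sub in submodels if all(x % 2 == 0 for x in sub)]
print(len(holds) / 2 ** len(M))                    # 4/16 = 0.25 (the empty sub-model counts)
```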


Special progicnet Presentation: Probabilistic Logics and Probabilistic Networks

Rolf Haenni (Bern, Switzerland),
Jan-Willem Romeijn (Amsterdam, The Netherlands), Greg Wheeler (Lisbon, Portugal), Jon Williamson (Canterbury, United Kingdom)

In classical logic the question of interest is whether a proposition ψ is logically implied by premiss propositions φ1, φ2, ..., φn. A probabilistic logic, or progic for short, differs in two respects. First, the propositions have probabilities attached to them. The premisses have the form φ^X, where φ is a classical proposition and X, a subset of [0,1], is a set of probabilities, and where the premiss is interpreted as a restriction of the probability of φ to X. Second, the question of interest is not the direct analogue of the classical question, namely whether some specific ψ^Y follows from a set of premisses φ1^{X1}, φ2^{X2}, ..., φn^{Xn}. Rather, the question of interest is the determination of the smallest possible, or most informative, Y:

(1) φ1^X1, φ2^X2, ..., φn^Xn |= ψ^?
That is, what minimal set Y of probabilities should attach to the conclusion sentence ψ, given the premisses φ1^X1, φ2^X2, ..., φn^Xn?
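
For instance, with invented interval assignments and using the standard device of optimizing over the probabilities of the atoms of the premiss language, the minimal Y can be computed by linear programming (numpy and scipy are assumed to be available):

    import numpy as np
    from scipy.optimize import linprog

    # Atoms over {phi1, psi}, in the order:
    # a0 = phi1 & psi, a1 = phi1 & ~psi, a2 = ~phi1 & psi, a3 = ~phi1 & ~psi
    # Premisses (illustrative):
    #   P(phi1)        in [0.7, 0.9]   i.e. a0 + a1      in [0.7, 0.9]
    #   P(phi1 -> psi) in [0.8, 1.0]   i.e. a0 + a2 + a3 in [0.8, 1.0]
    # Conclusion: the tightest Y with P(psi) = a0 + a2 in Y.
    A_ub = np.array([
        [ 1,  1,  0,  0],   #  P(phi1) <= 0.9
        [-1, -1,  0,  0],   # -P(phi1) <= -0.7
        [ 1,  0,  1,  1],   #  P(phi1 -> psi) <= 1.0
        [-1,  0, -1, -1],   # -P(phi1 -> psi) <= -0.8
    ])
    b_ub = np.array([0.9, -0.7, 1.0, -0.8])
    A_eq = np.array([[1, 1, 1, 1]])    # atom probabilities sum to 1
    b_eq = np.array([1.0])
    c = np.array([1, 0, 1, 0])         # objective: P(psi) = a0 + a2

    lo = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                 bounds=[(0, 1)] * 4).fun
    hi = -linprog(-c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, 1)] * 4).fun
    print(f"Y = [{lo:.2f}, {hi:.2f}]")   # here Y = [0.50, 1.00]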

The first part of the paper shows how a large range of systems for uncertain inference may be captured by this general question: the standard probabilistic inference of Ramsey and Jeffrey, probabilistic argumentation using degrees of support and possibility (or Dempster-Shafer belief functions), the partial entailment of Kyburgian evidential probability, classical and Bayesian statistical inference, and objective Bayesian inference. Each of these views effectively provides a different semantics for the terms in (1). It must be noted that several of these semantics distance themselves explicitly from standard Kolmogorov probability as the expression of uncertainty. Despite this disparity, however, they retain a formal connection to standard probability, and it is this connection that can be exploited in a syntactic procedure that yields, at least in part, an answer to the question in (1).

In the second part of the paper, we show that the different systems of uncertain inference can all be assisted by a procedure based on so-called credal networks. A credal network is a graphical representation of a convex set of probability functions. If the sets X_i appearing in the premisses are intervals, then credal networks are applicable, and under some further assumptions, which may be motivated by the respective semantics, credal networks prove to be computationally attractive tools for determining an interval for Y. Thus the progic proposed above performs a double function: it unifies a number of different views on uncertain inference by showing that they can be represented in the same format, and it offers each of these views a useful piece of inference machinery.
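
As a toy illustration of the computational side (the network, the intervals, and the vertex-enumeration shortcut are invented for the example and are not the general algorithm of the paper): in a two-node network A -> B with interval-valued local assessments, the bounds on P(B) are attained at extreme points of the local credal sets, so they can be found by enumeration.

    from itertools import product

    P_A    = (0.2, 0.4)   # interval for P(A)
    P_B_a  = (0.7, 0.9)   # interval for P(B | A)
    P_B_na = (0.1, 0.3)   # interval for P(B | ~A)

    # P(B) = P(A)P(B|A) + (1 - P(A))P(B|~A) is multilinear in the local
    # assessments, so its extrema over the box lie at the vertices.
    values = [a * b1 + (1 - a) * b0
              for a, b1, b0 in product(P_A, P_B_a, P_B_na)]
    print(min(values), max(values))   # P(B) in [0.22, 0.54]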


Eclectic Interpretation of the Probability: A Question of Convenience?

Paolo Rocchi (Roma, Italy)

Whereas advocates of the Bayesian and of the frequentist views on probability each reject the opposing theory as ill-grounded, a large group of experts is inclined to accept both interpretations of probability [1]. These authors claim that each approach should be used in accordance with the specific context of the problem at hand. It is often said that Bayesian and classical statistics may be employed independently of one's theoretical conviction, that the truth is messy and we have to make approximations.

In conclusion, the concept of probability ignites fierce debates in the scientific community on the one hand, and inspires conciliatory compromise on the other. This astonishing evolution of mathematical research raises a doubt: is the eclectic position a question of convenience, or may a superior logic be found to reconcile the irreconcilable opponents?

The pathway we follow does not provide a direct reply. We approach the problem through the argument of the probability, which is used as a key to screen the different theoretical interpretations. This approach is close to the Popperian school, which repeatedly considers the difference between the single event and the long-run event [2]. In detail, we examine the argument of the probability from a historical perspective [3] and then discuss the internal logic of the subjectivist and frequentist views [4].

[1] J. Berger, Could Fisher, Jeffreys and Neyman have agreed on testing?, Statistical Science 18 (2003), 1-32.
[2] D. Gillies, Varieties of propensity, British Journal for the Philosophy of Science 51 (2000), 807-835.
[3] P. Rocchi, De Pascal à nos jours: Quelques notes sur l'argument A de la probabilité P(A), Actes du Congrès Annuel de la Société Canadienne d'Histoire et de Philosophie des Mathématiques (CSHPM/SCHPM) (2006) [in press].
[4] P. Rocchi, The Structural Theory of Probability, Kluwer/Plenum, New York (2003).


On the Alleged Impossibility of Bayesian Coherentism

Jonah Schupbach (Pittsburgh PA, United States of America)

In several recent publications, Bovens and Hartmann present an "impossibility result" against Bayesian Coherentism. This result putatively shows that coherence can only be given a probabilistic, complete and transitive ordering relation if that ordering is not separable. Bovens and Hartmann intend their result to apply to any such ordering, and thus to any proposed order-inducing probabilistic measure of coherence. Underlying their notion of separability - and thus underlying their impossibility result - is Bovens and Hartmann's introduction and support of a set of specific ceteris paribus conditions. In this paper, I argue that these ceteris paribus conditions are not clearly appropriate. Certain proposed coherence measures not only call for different such conditions but also call for the rejection of at least one of Bovens and Hartmann's conditions. I show that there exist sets of ceteris paribus conditions which, at least prima facie, have the same intuitive advantages as Bovens and Hartmann's conditions but which also allow one to sidestep the impossibility result altogether. This shifts the debate from the merits of the impossibility result itself to the underlying choice of ceteris paribus conditions.


Concepts of Independence for Full Conditional Measures and Sets of Full Conditional Measures

Fabio Cozman (São Paulo, Brazil), Teddy Seidenfeld (Pittsburgh PA, United States of America)

This paper explores the definition and the properties of independence for events and variables in the context of full conditional measures -- that is, probability measures that allow conditioning on events of zero probability. We consider both the usual situation, where a single full conditional measure is employed, and the situation where a set of full conditional measures is present. Several such concepts of independence exist in the literature; some rely on convexity assumptions and decision-theoretic justification, while others abandon both convexity and decision-theoretic arguments. We use non-binary choices to define independence concepts that are grounded in decisions and yet do not demand convexity for sets of measures, and we extend these definitions to the lexicographic settings induced by full conditional measures. Graphoid properties of these concepts, and their implications for the theory of directed acyclic graph models, are investigated.
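
For concreteness, a full conditional measure on a finite space can be represented by a lexicographic family of probability measures with disjoint supports: conditioning on an event B uses the first layer that gives B positive mass, so P(. | B) is defined even when B has unconditional probability zero. The space, the layers, and the events in the following sketch are invented for illustration.

    omega = {"w1", "w2", "w3", "w4"}
    layers = [
        {"w1": 0.6, "w2": 0.4},   # primary layer: w3 and w4 get probability zero
        {"w3": 1.0},              # secondary layer, used when B misses w1 and w2
        {"w4": 1.0},              # tertiary layer
    ]

    def cond(A, B):
        """P(A | B) for any nonempty B, including B of unconditional probability zero."""
        for layer in layers:
            mass_B = sum(p for w, p in layer.items() if w in B)
            if mass_B > 0:
                mass_AB = sum(p for w, p in layer.items() if w in B and w in A)
                return mass_AB / mass_B
        raise ValueError("B must be nonempty")

    print(cond({"w1"}, omega))          # 0.6  (ordinary conditioning)
    print(cond({"w3"}, {"w3", "w4"}))   # 1.0, although P({w3, w4} | omega) = 0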


Surprise and Evidence in Model Checking

Jan Sprenger (Bonn, Germany; London, United Kingdom)

In various situations we evaluate data merely with respect to a null model or a family of null models, without specifying any alternative. Then p-values are commonly taken as measures of "evidence against the null". This interpretation is seriously flawed: evidence is a comparative concept and always relative to an alternative. To the extent that p-values do not implicitly assume an alternative model, they cannot be taken as a measure of evidence against the null model simpliciter. Furthermore, p-values typically depend on the probability of hypothetical, counterfactual outcomes. That might be useful in a genuine alternative testing problem, but it precludes their interpretation as a quantification of evidence.
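
The dependence on counterfactual outcomes is easy to see in the simplest case. The binomial null model and the numbers below are invented for illustration (scipy is assumed); the point is only that the p-value sums the null probabilities of outcomes that did not occur.

    from scipy.stats import binom

    n, theta0 = 20, 0.5          # null model: 20 tosses of a fair coin
    observed = 15                # observed number of heads

    # one-sided p-value: null probability of 15 or more heads
    p_value = sum(binom.pmf(k, n, theta0) for k in range(observed, n + 1))
    print(round(p_value, 4))     # about 0.0207; every term except k = 15 concerns
                                 # an outcome that was never observed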

Given the widespread use of p-values in statistical inference, it is nonetheless imperative to clarify their epistemic role. I argue that they are closely related to measures of surprise and that confounding surprise with evidence has led to serious misunderstandings about the significance of p-values. Quantifying surprise in the results is valuable at preliminary stages of model analysis, when a null model is only tentatively endorsed. At this early stage of the analysis, observed results are supposed either to license the adoption of the null model or to indicate the need for modification. In order to describe the relative expectedness of the actual result under the null model, a measure of surprise has to depend on the probability of counterfactual outcomes. This property clearly distinguishes measures of surprise from measures of evidence. Thus, the epistemic rationale of surprise in the results consists in guiding and driving the development of alternative models. Accounting for degrees of surprise and for the introduction of new models is especially challenging for a full Bayesian approach, in which all conceivable alternative models are part of a Bayesian supermodel.

This epistemological characterization of surprise enables us to evaluate various measures of surprise, with particular regard to the relevance of p-values. Unlike likelihood-based suggestions, p-values do not ignore the value of counterfactual considerations in measuring surprise. However, in the more general case of a whole family of null models, p-values based on maximum likelihood estimates tend to be overly conservative, concealing the need for a modification of the model. More refined techniques, e.g. Bayesian or resampling methods, are required to deal with such cases. Finally, I propose a measure of surprise which is tightly connected to classical p-values in the case of a single null model. This recognizes the significance of p-values but endows them with a more natural scaling and interpretation.

The preceding results are applied to a comparison of surprise and evidence. Measuring surprise is required in an exploratory statistical analysis, whereas the intrinsically comparative concept of evidence is crucial at later stages of the analysis, e.g. for model choice and validation. Hence, both surprise and evidence play important and distinct epistemic roles. I further argue that objections to the Likelihood Principle usually rest on confounding the two concepts. These objections can be convincingly rebutted once the distinction between surprise and evidence is clearly drawn.


Rationality Constraints on Credences

Weng Hong Tang (Canberra, Australia)

The probabilist subscribes to two doctrines: first, that there are such things as credences (or degrees of belief), and second, that rational credence functions ought to be probability functions. I subscribe to the first doctrine, but will not defend it in this paper. Instead, I will focus on the second doctrine. Given that credences exist, must rational credence functions be probability functions? I answer 'No', and propose some other rationality constraints on credences to take the place of those provided by the probabilist. My account is similar to Ian Hacking's 'Slightly More Realistic Personal Probability' (1967) and Haim Gaifman's 'Reasoning With Limited Resources and Assigning Probabilities to Arithmetical Statements'. However, there are important differences, which I shall highlight. I shall also defend my account against various objections, in particular against the charge that constraints on credences that are weaker than those provided by the probabilist lead to a loss of mathematical structure and block various important results in Bayesian epistemology.


Quality, Quantity, and Beyond: On Nonmonotonic Probabilistic Reasoning

Emil Weydert (Luxembourg, Luxembourg)

When artificial intelligence took off in the seventies and eighties, probabilistic reasoning was not particularly popular. In particular, quantitative approaches were thought to be too cumbersome for modeling commonsense reasoning, the holy grail of knowledge representation. This motivated a great deal of research on alternative qualitative approaches, known as default reasoning and, more generally, nonmonotonic logic.

However, in the last 15 years probabilistic methods have had their revenge, mainly through the success story of probabilistic graphical models. Furthermore, it has turned out that the intuitively most interesting default formalisms can actually be grounded in probabilistic considerations, for instance mediated by quotient structures over nonstandard models of probability theory. In particular, well-motivated probabilistic choice functions, like entropy maximization (ME), have inspired defeasible entailment notions for conditional defaults (e.g. System JLZ).
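
To give a flavour of the probabilistic side (the constraint, the toy language, and the use of a generic numerical optimizer are illustrative choices, not the System JLZ machinery): entropy maximization selects, among all distributions satisfying a conditional default such as "birds normally fly", the least biased one. numpy and scipy are assumed to be available.

    import numpy as np
    from scipy.optimize import minimize

    # atoms: a0 = bird & fly, a1 = bird & ~fly, a2 = ~bird & fly, a3 = ~bird & ~fly
    def neg_entropy(p):
        p = np.clip(p, 1e-12, 1.0)
        return float(np.sum(p * np.log(p)))

    constraints = [
        {"type": "eq", "fun": lambda p: np.sum(p) - 1.0},             # normalization
        {"type": "eq", "fun": lambda p: p[0] - 0.95 * (p[0] + p[1])}  # P(fly | bird) = 0.95
    ]
    res = minimize(neg_entropy, x0=np.full(4, 0.25),
                   bounds=[(0.0, 1.0)] * 4, constraints=constraints, method="SLSQP")
    print(res.x.round(3))   # the two ~bird atoms receive equal mass; within bird,
                            # the mass splits 19:1, as the default constraint requires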

Although the additional degrees of freedom and the slightly different purposes prevent straightforward one-to-one relationships, it seems natural to require that any reasonable qualitative approach to reasoning under uncertainty be anchored in the probabilistic framework, understood in a broad sense. This not only helps to provide justifications and tools for coarser-grained formalisms, but also makes it possible to exploit techniques and insights gained at the qualitative level for the finer-grained probabilistic level. More generally, we may note that probabilistic and statistical inference are, and always have been, instances of nonmonotonic reasoning, an area that is still hardly explored.

Given the complexity of real-world knowledge and of the corresponding uncertainty management tasks, as confronted e.g. by evolving information agents, it could therefore be interesting to try to merge the probabilistic and the default traditions in a more systematic way. This may result in more transparent and powerful representation and inference models, less hampered by a one-dimensional perspective. In the talk, we are going to discuss some problems and ideas about how to achieve this.