JOSÉ MIGUEL BERNARDO: “Traditional statistical methods are inefficient for the problems of modern science”

José Miguel Bernardo, Professor of Statistics at the University of Valencia, is lecturing at the XXVI Winter School of the Astrophysics Institute of the Canaries. Credit: Miguel Briganti, SMM (IAC).

Advertised on

11/07/2014

By ADELINA PASTOR

In elections, with barely 1% of the votes counted, we can predict the final result with considerable accuracy.

Bayesian inference is used to test the plausibility of different models of the origin of the universe and the date of the Big Bang.

Powerful statistical methods are able to analyze very quickly all the available relevant information to reach the conclusions we need.

José Miguel Bernardo, Professor of Statistics at the University of Valencia, is an international authority on Bayesian inference, a type of statistics so versatile that it can be used in widely different contexts: elections, the stock market, market research, medical diagnostics, and even the search for extrasolar planets. He is one of the lecturers at the Winter School of the Astrophysics Institute of the Canaries, which is taking place in La Laguna. He was a founding member of the International Society for Bayesian Analysis and the initiator of the Valencia meetings, a forum which brings together the experts in the subject every four years. “It is difficult to find a publication about Bayesian inference which does not cite one or other of the proceedings of these meetings,” says, with paternal pride, this mathematician, who has collaborated with political parties and with the media in Spain, South America, and South Africa, and who next Tuesday, November 11th, at 19:00 will give a lecture in the Museum of Science and the Cosmos.

Question: Your lecture on the 11th has the title “How to make sensible electoral predictions”. I am bound to ask: is this really possible?

Answer: There are two types of electoral predictions. Those made before voting takes place use opinion polls, which at times suffer from unrepresentative samples or from respondents who are not truthful, and whose biases are not always detectable. In the lecture I will talk mainly about predictions with no subjective element: those based on the sample of the first results declared in a few well-chosen polling stations. With these data, very few, barely 1% of the vote, it is possible to predict the final result with great accuracy almost as soon as the polling stations have closed, as well as to determine whether there has been some possibility of fraud.

Q: How are these polling stations chosen?

A: In any country with a democratic tradition, the results of each election are declared polling station by polling station. Given these past results, we can create an algorithm which extracts a small subsample (20 or 30 stations) which is a miniature representation of the overall result. It is possible to show that those 20 stations alone would have been enough to predict the result of the elections, so the prediction is reliable unless the population voting at a given station changes, which has to be checked every time.
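As an illustration only, the idea of extracting a small representative subsample from past station-by-station results can be sketched in Python. The greedy search, the function names and the synthetic data below are assumptions made for this example; the interview does not describe the algorithm Professor Bernardo actually uses.

```python
import random


def vote_shares(counts):
    """Convert per-party vote counts into shares that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def pooled_counts(stations, chosen):
    """Add up the per-party votes of the stations whose indices are in `chosen`."""
    n_parties = len(stations[0])
    return [sum(stations[i][p] for i in chosen) for p in range(n_parties)]


def max_gap(a, b):
    """Largest absolute difference between two vectors of vote shares."""
    return max(abs(x - y) for x, y in zip(a, b))


def select_stations(stations, k):
    """Greedily pick k stations whose pooled shares stay close to the overall shares."""
    overall = vote_shares(pooled_counts(stations, range(len(stations))))
    chosen = []
    for _ in range(k):
        remaining = [i for i in range(len(stations)) if i not in chosen]
        # Add the station that keeps the pooled subsample closest to the full result.
        best = min(
            remaining,
            key=lambda i: max_gap(
                vote_shares(pooled_counts(stations, chosen + [i])), overall
            ),
        )
        chosen.append(best)
    return chosen, overall


# Synthetic past election: 500 stations, three parties (invented numbers, not real data).
rng = random.Random(0)
stations = []
for _ in range(500):
    lean = rng.uniform(0.2, 0.6)      # local share of the first party
    size = rng.randint(300, 900)      # votes cast at this station
    a = int(size * lean)
    b = int(size * (0.9 - lean))
    stations.append([a, b, size - a - b])

chosen, overall = select_stations(stations, 20)
pooled = vote_shares(pooled_counts(stations, chosen))
print("overall shares:  ", [round(s, 4) for s in overall])
print("20-station shares:", [round(s, 4) for s in pooled])
```

In a real application the selected stations would also have to be re-checked before each election, since, as the answer notes, the population voting at a station can change.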

Q: But you are in the Canaries as an invited lecturer at the XXVI Winter School of Astrophysics. What have elections and astrophysics in common?

A: Very little, except the Bayesian method, which is becoming standard, bit by bit, in all the experimental sciences, including astrophysics.

In conventional statistics (also called frequentist), probability is understood as the relative frequency with which something occurs. In Bayesian statistics, however, probability is a measure of plausibility: how likely it is that an event will occur, or that a hypothesis is true, as a function of the information we have available. They are two different paradigms, rather like Freudianism and Jungianism in psychology. Exaggerating a bit, the difference is like that between working with a heliocentric or a geocentric paradigm of the Solar System.

Q: What are the advantages of Bayesian inference for the experimental sciences?

A: To deal with relative frequency it is necessary for something to be repeated again and again under the same conditions, and this does not always occur. For example, the probability of having a baby boy is 0.51, but this holds only in a very general way, with very limited information. If you want to know the sex of the next child of a couple who already have two boys, it is more complicated to work out the probability using frequency, because you need data from similar couples. As soon as you add more information you are faced with a single case, and the frequency disappears. And in experimental science there are more and more situations where it is not possible to talk about frequency. These include, for example, problems in genetics, in molecular biology, or in the development of new drugs: the combinations are so complex, and there are so many parameters to take into account, that conventional methods do not work. As the problems become more complicated, the traditional methods are not merely worse; they do not work at all.

Q: Give me a problem in the field of astronomy.

A: For example, trying to see which models of the universe are more likely to be true, and in particular estimating its age. Choosing between specific models is a problem of probability, and Bayesian statistics lets you assign posterior probabilities, that is to say, the probability, given the information we have, that the correct model is one or another. And that is something which cannot be done from the classical point of view.
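The posterior model probabilities mentioned in this answer follow from Bayes’ theorem: each model’s prior probability is multiplied by how well that model predicts the observed data, and the products are then normalized so that they sum to 1. A minimal sketch follows; the function name, the model labels and the numbers are hypothetical, chosen only to show the arithmetic, and are not cosmological results.

```python
def posterior_model_probabilities(priors, evidences):
    """Bayes' theorem over a finite set of competing models.

    priors[i]    : probability assigned to model i before seeing the data
    evidences[i] : probability (density) of the observed data under model i
    """
    joint = [p * e for p, e in zip(priors, evidences)]
    total = sum(joint)
    if total == 0:
        raise ValueError("the data are impossible under every model")
    return [j / total for j in joint]


# Two hypothetical models, equally plausible beforehand (illustrative numbers only).
models = ["model_A", "model_B"]
priors = [0.5, 0.5]
evidences = [0.03, 0.01]   # the data are three times more probable under model_A

posterior = posterior_model_probabilities(priors, evidences)
for name, prob in zip(models, posterior):
    print(name, round(prob, 3))
```

In realistic problems the hard part is computing each model’s evidence, which requires integrating over all of that model’s parameters.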

Q: But Bayesian inference is very old: it is based on Bayes’ theorem, which dates from the 18th century.

A: It is true that we use Bayes’ theorem, in a sophisticated version, but its application is very recent. Part of the theory was developed by Laplace in the 19th century, but it was valid only under very restricted conditions (with uniform initial distributions) and for that reason it was not used as an alternative. In the middle of the 20th century it became clear that its use could be generalized, but this needed integrals over many parameters, and the computers of the time were not sufficiently powerful, nor were Monte Carlo techniques known, as these were developed around the ’90s. This is what has given rise to really important applications, because although the theory was known previously, it was not known how to apply it in practice.

Q: Does this imply a change in experimental science?

A: There are many problems which are so complicated that the human mind cannot solve them without assistance. One straightforward example is automatic diagnosis in medicine, which is backed by databases about a given complaint containing the clinical histories of patients and their final diagnoses. Take problems of the liver, such as hepatitis or cirrhosis. If a new patient arrives with his or her own clinical history, of course different from previous histories, the database is used to obtain a probability distribution of what is wrong with him or her, given the historical data available. The result could be, for example, that the patient has cirrhosis with probability 0.8 and liver cancer with probability 0.1. A doctor with experience could probably guess this, but could not demonstrate it empirically. Powerful statistical techniques are able to analyze very quickly all the relevant information available to arrive at useful conclusions. And this is true in medicine, in physics, and in any discipline.
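To make the diagnostic example concrete, here is a minimal sketch of how a database of past cases can be turned into a probability distribution over diagnoses for a new patient. It uses a naive Bayes classifier with Laplace smoothing, which is an assumption of this example and not necessarily the method used in practice; the records, findings and function names are invented toy values, not clinical data.

```python
from collections import Counter, defaultdict


def train(records):
    """Count diagnoses and findings in a list of (findings, diagnosis) records."""
    class_counts = Counter(diagnosis for _, diagnosis in records)
    finding_counts = defaultdict(Counter)
    vocabulary = set()
    for findings, diagnosis in records:
        vocabulary |= set(findings)
        for f in findings:
            finding_counts[diagnosis][f] += 1
    return class_counts, finding_counts, sorted(vocabulary)


def diagnose(model, findings):
    """Return P(diagnosis | findings), treating findings as independent given the diagnosis."""
    class_counts, finding_counts, vocabulary = model
    n_records = sum(class_counts.values())
    scores = {}
    for diagnosis, n_d in class_counts.items():
        score = n_d / n_records  # prior: how common the diagnosis is in the database
        for f in vocabulary:
            # Laplace-smoothed probability that finding f is present given this diagnosis.
            p_present = (finding_counts[diagnosis][f] + 1) / (n_d + 2)
            score *= p_present if f in findings else (1 - p_present)
        scores[diagnosis] = score
    total = sum(scores.values())
    return {d: s / total for d, s in scores.items()}


# Invented toy database (assumption: not real clinical histories).
records = [
    ({"jaundice", "ascites"}, "cirrhosis"),
    ({"ascites"}, "cirrhosis"),
    ({"jaundice", "ascites", "weight_loss"}, "cirrhosis"),
    ({"jaundice", "fever"}, "hepatitis"),
    ({"fever"}, "hepatitis"),
    ({"weight_loss"}, "liver_cancer"),
    ({"weight_loss", "jaundice"}, "liver_cancer"),
]

model = train(records)
result = diagnose(model, {"ascites", "jaundice"})
for diagnosis, prob in sorted(result.items(), key=lambda kv: -kv[1]):
    print(diagnosis, round(prob, 3))
```

The output is a full distribution over the diagnoses in the database, so the uncertainty is reported along with the most probable diagnosis.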

Organizing Committee: Andrés Asensio Ramos, Íñigo Arregui, Antonio Aparicio and Rafael Rebolo.

Secretary: Lourdes González.

Contacts: Andrés Asensio Ramos (IAC): aasensio [at] iac.es and 922605238; Íñigo Arregui (IAC): iarregui [at] iac.es and 922605465

Press: Carmen del Puerto: prensa [at] iac.es and 922605208

Previous press releases:

http://www.iac.es/divulgacion.php?op1=16&id=898&lang=en

http://www.iac.es/divulgacion.php?op1=16&id=897&lang=en

Programme of the Winter School: http://www.iac.es/winterschool/2014/pages/about-the-school/timetable.php

Further information: http://www.iac.es/winterschool/2014/