Bernoulli and the Foundations of Statistics. Can you correct a 300-year-old error?

Author: Julian Champkin

Ars Conjectandi is not a book that non-statisticians will have heard of, nor one that many statisticians will have heard of either. The title means ‘The Art of Conjecturing’ – which in turn means roughly ‘What You Can Work Out From the Evidence.’ But it is worth statisticians celebrating, because it is the book that gave an adequate mathematical foundation to their discipline, and it was published 300 years ago this year.

More people will have heard of its author. Jacob Bernoulli was one of a huge mathematical family of Bernoullis. In physics, aircraft engineers base everything they do on Bernoulli’s principle. It explains how aircraft wings give lift, is the basis of fluid dynamics, and was discovered by Jacob’s nephew Daniel Bernoulli.

[Image: Jacob Bernoulli (1654-1705)]

Johann Bernoulli made important advances in mathematical calculus. He was Jacob’s younger brother – the two fell out bitterly. Johann fell out with his fluid-dynamics son Daniel, too, and even falsified the date on a book of his own to try to show that he had discovered the principle first.

But our statistical Bernoulli is Jacob. In the higher reaches of pure mathematics he is loved for Bernoulli numbers, which are fiendishly complicated things which I do not pretend to understand but which apparently underpin number theory. In statistics, his contribution was two-fold. The first is Bernoulli trials: essentially, coin flips repeated lots of times. Toss a fair coin ten times, and you might well get six heads and four tails rather than an exact 5/5 split. Toss 100 times and you are quite unlikely to get 60 heads and 40 tails. The more times you toss the coin, the closer the proportion of heads and tails will get to a 50-50 split.
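
How unlikely? A quick check in, say, Python (a minimal sketch for illustration, nothing Bernoulli himself could have run) gives the exact binomial probabilities:

    # Chance of straying 10 percentage points or more above one half
    # for a fair coin, computed exactly from the binomial distribution.
    from math import comb

    def prob_at_least(n, k):
        """Probability of k or more heads in n tosses of a fair coin."""
        return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n

    print(prob_at_least(10, 6))    # 6 or more heads in 10 tosses:   ~0.38
    print(prob_at_least(100, 60))  # 60 or more heads in 100 tosses: ~0.03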

His second statistical result was more fundamental, though related. Suppose you have an urn with 3000 white pebbles and 2000 black pebbles. You take out a pebble, look at its colour, and return it. Then do it again; and again; and again. After ten times you might guess that there were 2/3 as many black pebbles as white; after 1000 times you might feel a bit more sure of it. Can you do this so often that you become absolutely sure – morally certain, as Bernoulli put it – that the pebbles in the urn were actually in the ratio of 3 to 2? Or would that conclusion forever remain just a conjecture?
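
The flavour of the question can be captured in a small simulation (an illustrative sketch only; the true proportion of white pebbles is 3000 out of 5000, or 0.6):

    # Draw pebbles with replacement from an urn of 3000 white and 2000 black,
    # and watch the observed proportion of white settle towards the true 0.6.
    import random

    def observed_white_fraction(draws, p_white=3000 / 5000):
        whites = sum(random.random() < p_white for _ in range(draws))
        return whites / draws

    for n in (10, 1000, 100_000):
        print(n, observed_white_fraction(n))  # drifts towards 0.6 as n grows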

[Image: Ars Conjectandi, title page. Courtesy Gonville and Caius College, Cambridge.]


If it is just a conjecture, then all of statistics is built on sand. Happily, Bernoulli showed it was more than a conjecture; he spent years thinking about it and managed to prove it was true – and when he had done so he called it his Golden Theorem, as it was the crown of his life’s work. The more times you repeat a series of experiments like this, the closer your result will get to the true one. Statisticians are rather pleased that he proved it. If it had stayed a conjecture, there would have been no need to believe anything (statistical) that a statistician told you.

We shall have a major scholarly piece on Ars Conjectandi in our June issue, out on paper and on this site shortly. A challenge: can you correct something that Jacob Bernoulli got wrong? It stayed wrong for nearly 300 years until our author, Professor Antony Edwards, spotted it and corrected it.

Here is the problem. It is a simple exercise in schoolboy probability: Problem XVII in Part III of Bernoulli's book. For those who would like to try their hand, it runs as follows.

[Image: Bernoulli's payout table, from Ars Conjectandi.]

In a version of roulette, the wheel is surrounded by 32 equal pockets marked 1 to 8 four times over. Four balls are released and are flung at random into the pockets, no more than one in each. The sum of the numbers of the four occupied pockets determines the prize (in francs, say) according to a table which Bernoulli gives – it is reproduced above. The cost of a throw is 4 francs. What is the player’s expectation? That is, how much, in the long run, can he expect to walk away with per game?

The left-hand columns in the table are the total four-ball score; the centre columns are the paybacks for a four-franc stake; the right-hand columns are the number of combinations that could give rise to each score.
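
For readers who would rather check the counting by brute force than by Bernoulli's elegant table, here is a minimal Python sketch. It enumerates the equally likely four-pocket combinations and tallies how many give each total; the payout table itself is in the image above and is not reproduced here.

    # Count, for each possible four-ball total, the number of ways it can occur
    # on a wheel of 32 pockets labelled 1..8 four times over (one ball per pocket).
    from itertools import combinations
    from collections import Counter

    pockets = [n for n in range(1, 9) for _ in range(4)]        # the 32 pocket labels

    counts = Counter(sum(pockets[i] for i in chosen)
                     for chosen in combinations(range(32), 4))  # C(32,4) outcomes

    print(sum(counts.values()))   # 35960 equally likely combinations in all
    print(counts[4], counts[5])   # 1 way to score a total of 4, 16 ways to score 5
    # With the payout table in hand, the expectation would be
    #   sum(payout[total] * count for total, count in counts.items()) / 35960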

The answer Bernoulli gives in the book is 4 + 349/3596, which is 4.0971. Professor Edwards comes up with a different answer, which we shall give in his article in the magazine section of this site when the issue goes live in about a week. Which do you agree with?

And happy calculating…

Comments

Simon Bartlett

Not sure how old this is, but my results concur with James's, so Bernoulli's result still stands. I would also agree that the most likely explanation is the use of a payout of 1 instead of 2 for the sum-of-18 event, given that the sum totals of the payouts differ by 3184 between the two methods.

Someone also asked about order for this problem. If we think about the problem, our interest is in events summing to a number, so order would not seem to be important. If you calculate this using permutations, rather than combinations as Bernoulli has done, you get the same result, which proves that for this example order isn't important. I have a quick demo to help illustrate this.

For the sum-of-5 event (three 1's and a 2) there are 16 combinations (4C3 x 4C1) and 384 permutations (4P3 x 4P1 x [4C1 or 4C3, depending on whether you allocate the 2 or the 1's]). Now 16 x 24 = 384, and you find that there are 24 times as many permutations as combinations for every event. Since this factor of 24 is common to every event, it is cancelled out by the total number of permutations also being 24 times the total number of combinations (863,040 compared with 35,960), and so the expected values are equal.
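
A brute-force enumeration bears this out (a quick Python sketch over the same 32-pocket wheel):

    # Verify the 24-fold relationship between permutations and combinations.
    from itertools import combinations, permutations

    pockets = [n for n in range(1, 9) for _ in range(4)]   # labels 1..8, four times over

    n_comb = sum(1 for _ in combinations(range(32), 4))
    n_perm = sum(1 for _ in permutations(range(32), 4))
    print(n_comb, n_perm)   # 35960 and 863040 = 24 x 35960

    sum5_c = sum(1 for c in combinations(range(32), 4) if sum(pockets[i] for i in c) == 5)
    sum5_p = sum(1 for p in permutations(range(32), 4) if sum(pockets[i] for i in p) == 5)
    print(sum5_c, sum5_p)   # 16 and 384 = 24 x 16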

Hope that didn't come across as too teacherish :)

Karl

Does the shape of the roulette wheel matter at all in the calculations? The reason I ask is that roulette wheels are typically not fixed: they rotate, and in the counting there are fewer configurations if we take rotations into account.

The calculations that Bernoulli made assume that each of the 32 slots is unique and that the roulette wheel does not rotate.

For instance, looking at a sum of 5, Bernoulli counts 16 ways to sum to five if we treat each of the 32 slots as unique. But if we take rotations of the roulette wheel into account, then I have only been able to count 3 unique ways; this assumes that the slots on the roulette wheel are numbered 1,2,...,8,1,...,8,1,...,8,1,...,8.

Karl

Okay, now I see where I made an error. My previous reasoning was that there were only 4 ways to sum to 5, but it was incorrect because I did not take into account that each of the 32 slots is unique.

Karl

It looks to me like Bernoulli incorrectly calculated the frequencies of each sum. For instance, the table lists the sum 5 as having 16 different ways; I can't figure out why he came up with that number.

At most I can find only one way to partition 5 across 4 positive integers, that is 5 = 1+1+1+2,

and since the 4 balls can land in any of the numbered slots, there are 4!/(3!*1!) ways in which the sum of 5 is possible in the game.

Now, it can still be the case that Bernoulli's calculation of the expected value is correct, provided that the counting rule he used is consistent. I will come back to this board with an answer once I get some time.

James Hanley

At the end of the June 10 web item entitled 'Can you correct a 300-year-old error?', the Editor asked which expectation is correct, that of Bernoulli (B) or Edwards (E).
My earlier response suggested that the 300-year-old answer is correct. This response explains where the error in the 6-year-old answer may have occurred.
The item led me to understand that Edwards had disagreed with Bernoulli not on his arithmetic, but on his counting of the frequencies. But my enumeration and resulting expectation, and my direct simulations, all agreed with Bernoulli, so in my first response I posted them on the Significance site.
The expectation involves what the item calls 'a simple exercise in schoolboy probability', but the derivation of the probabilities/frequencies themselves is more challenging, at least for some of us who are no longer so combinatorially nimble.
I was anxious to know exactly what Edwards disagreed about. The item gave Bernoulli's 29 winning amounts and 29 corresponding frequencies. But it omitted the elegant derivation (in a pullout table after p. 172 of the original, which remained folded and was thus missed by the Google digitization) that showed how he arrived at the frequencies. Bernoulli described this source table, along with his notation and logic, on pages 169-173. No longer able to follow the Latin, I tracked down the 2006 translation by Sylla and then happened upon Edwards' 2007 review of it. In that review Edwards explains that "with the aid of a calculator, (he had) ploughed through Bernoulli's arithmetic only to disagree with his answer". Edwards found the expectation to be 4 + 153/17980 (4.0085) rather than the 4 + 349/3596 (4.0971) that Bernoulli had calculated. Scholar that he is, he added that he "should be glad to hear from any reader who disagrees with [his] result." In the intervening years nobody contacted him, so he repeated his answer and again asked "which is correct?" in the 2013 Significance Magazine article.
I believe that Edwards made the slip at the step where the winning amount of 2 francs in the bottom row of the 'nummi' in the June 10 Table should be multiplied by the frequency 3184. (Edwards has since told me that he expects that when he gets a moment to recheck his own arithmetic, he will agree with me).
At issue is the difference between 'B' = 147330/35960 = 4.0971 and 'E' = 4 + 153/17980 = 144146/35960 = 4.0085.
This is a difference of 3184/35960, and the missing 3184 points to exactly where the slip occurred. To obtain the sum of 29 products, Bernoulli told us to
"multiply
1 case by 120 and again by 180;
16 cases by 100 and also by 32;
52 cases by 30 and by 25; etc.,
or, more briefly [making it a sum of just 15 products],
1 by 300 = 120 + 180,
16 by 132 = 100 + 32,
52 by 55 = 30 + 25,
and so forth up to
3184 multiplied by 2,
and finally divide the sum of all these products by 35,960, the sum of all the cases. The quotient, 4 349/3596, is the expectation of the player."
It appears that while ploughing through Bernoulli's arithmetic, and probably doing it in 15 steps rather than 29, Edwards, in the last step, somehow took 3184 cases multiplied by 1 franc, rather than by the 2 francs.
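
That reading of the slip is easy to check with exact rational arithmetic (a quick sketch using Python's fractions module):

    # The gap between the two published expectations is exactly 3184/35960,
    # i.e. one franc on each of the 3184 cases that should pay 2 francs.
    from fractions import Fraction

    B = Fraction(4) + Fraction(349, 3596)     # Bernoulli's expectation, ~4.0971
    E = Fraction(4) + Fraction(153, 17980)    # Edwards' expectation,    ~4.0085

    print(B == Fraction(147330, 35960))       # True
    print(E == Fraction(144146, 35960))       # True
    print((B - E) * 35960)                    # 3184
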
Those who did not read beyond the headline, and tend to believe headlines, have been left with the impression that Bernoulli's expectation contains an error that went unnoticed for 300 years. As the full story illustrates, even in purely numerical matters one sometimes needs to read (or skip) to the end of the item, where it became clear that Edwards himself was not sure. One also learns that the Editor was not quite as sure as his headline suggested, and that he was asking us to act as 'fact-checkers' such as are employed by many magazines to check all factual assertions made in their articles (see http://en.wikipedia.org/wiki/Fact_checker).
I thank the Editor and Professor Edwards for taking us on this trip back in time, and for presenting the winning amounts and frequencies as they appeared in the 300-year-old book. I am quite confident that when I ask future students to try their hand at this simple exercise in schoolboy probability, there will be, just as there have been to date, several different answers.
J H . 2013.06.23

Hugh

In my earlier post, I had an error.  My revised answer is 4.0971, the same as Bernoulli's.

Does the supposed error lie in 4.0971 vs 0.0971, the expected value vs the expected profit? Other than that, I don't see any error in Bernoulli's calculation.

Hugh

My answer is 5+428/3596=5.1190

Gary

If Mark is correct, I have a problem with the correction. The question is how much the player expects to walk away with after each game. The player has to have 4 francs to play the game. After winning .0971, the player then walks away with 4.0971 in his or her pocket.

Mark

The numbers in Bernoulli's table are all correct (apologies to those who needed to write lengthy programs to verify this).

The problem lies in the question that is apparently being asked: "That is, how much, in the long run, can he expect to walk away with per game?"

The expected gain per game played is 349/3596, not 4+349/3596 (the 4 francs it costs to play the game isn't returned!).  [This assumes that the player always has at least 4 francs to play with (i.e. unlimited capital).  If not, this becomes a ruin problem.]

Graham Wheeler

Quote:

Assuming I've correctly amended Bernoulli's table, I find the answer to the problem is 4.006618.

Withdrawing my submission: my calculation is definitely incorrect! I incorrectly amended the number of possible combinations for each total, thinking there were, say, only 4 ways of getting a total of 5 (analogous to rolling four eight-sided dice numbered 1 to 8) – of course there are indeed 16. Will have another go at this.
