Okey dokey. Better reveal the solution to the stat-geek marginalisation quiz. There were sixty-two votes.

The popular winner was the economics/marginal profits idea, with 31 votes. Plausible but wrong.

The second most popular was the “marginal interest” idea. Well… this is what the term has more or less drifted into meaning, because (almost) everybody has forgotten the true origin. So… wrong.

Nobody voted for “lost in the mists of time”, which proves you all care. How nice.

Only two people voted for EB Margin being the pseudonym of WS Gosset. This disappointed me, both because it is funny and because it was supposed to be a cunning false trail. WS Gosset in fact published his papers under the name of “Student”, which is why we have “Student’s t test”.

So of course the correct answer was “other”. Sorry if that was an annoying tactic, but I think if I’d made the right answer one of the choices, it would have been too obvious. Amongst the six suggestions, two were for our amusement:

“I’d write the reason here but there’s not enough room in the margin”

“To marginalise those who don’t know”

and four were spot on or more or less right:

“Refers to margins of a contingency table”

“thought it was to do with averaging rows, with answer stuck in the margin”

“your are projecting the 2D pdf onto the “margin” of the plot”

“Sweeping the probability to the edge (=margin) of the paper?”

Sounds like the first two people knew, and the second two deduced the right answer. If you were one of those people, award yourself an extra biscuit at coffee time, and feel free to announce yourself.

Just to spell it out… As physicists, we nearly always think in abstract mathematical terms, so we think of “marginalisation” as a calculus problem – an integral. Even when thinking visually we picture a joint probability distribution as a smooth surface in three dimensions. But early statisticians were often concerned with tables of numbers, and worked on paper. Think of a joint frequency distribution as a grid of numbers in cells, with one row for each value of y and one column for each value of x. Then add up a row, and write the answer in the margin. When you have done this for all the rows, read down that margin, and – voilà – the marginal distribution for y.
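If you would rather see the pencil-and-paper version as a few lines of Python, here is a minimal sketch. The counts in the table are made up purely for illustration, and I am assuming the rows correspond to values of y and the columns to values of x.

```python
# Hypothetical joint frequency table (made-up counts, for illustration only).
# Assumption: each row is one value of y, each column is one value of x.
table = [
    [2, 5, 3],
    [4, 1, 6],
    [7, 0, 2],
]

# Add up each row and "write the answer in the margin" on the right.
row_margin = [sum(row) for row in table]

# Doing the same down each column gives the margin along the bottom.
col_margin = [sum(col) for col in zip(*table)]

# The corner cell, where the two margins meet, holds the grand total.
total = sum(row_margin)

# Dividing the right-hand margin by the grand total turns counts into
# probabilities: the marginal distribution for y.
p_y = [count / total for count in row_margin]

# Print the grid with its margins, roughly as it would look on paper.
for row, margin in zip(table, row_margin):
    print(*row, "|", margin)
print(*col_margin, "|", total)
```

Reading down the right-hand column of the printout is the "read down that margin" step. The physicist's integral over x is just the smooth-surface limit of those row sums.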

Don’t start me on regression…