
Now, Andy did you hear about this one?

We drank a toast to innocence, we drank a toast to now. We tried to reach beyond the emptiness but neither one knew how. – Kiki and Herb

Well I hope you all ended your 2017 with a bang. Mine went out on a long-haul flight crying so hard at a French AIDS drama that the flight attendant delivering my meal had to ask if I was ok. (Gay culture is keeping a running ranking of French AIDS dramas, so I can tell you that this one, BPM, was my second favourite.)

And I hope you spent your New Year’s Day well. Mine went on jet lag and watching I, Tonya in the cinema. (Gay culture is a lot to do with Tonya Harding especially after Sufjan Stevens chased his songs in Call Me By Your Name with the same song about Tonya Harding in two different keys.)

But enough frivolity and joy, we’re here to talk about statistics, which is anathema to both those things.

Andy Kaufman in the wrestling match

So what am I talking about today (other than thumbing my nose at a certain primary blogger who for some reason didn’t think I liked REM)?

Well if any of you are friends with academic statisticians, you know that we don’t let an argument go until it’s died, and been resurrected again and again, each time with more wisdom and social abilities (like Janet in The Good Place). So I guess a part of this is another aspect of a very boring argument: what is the role for statistics in “data science”.

Now, it has been pointed out to me by a very nice friend that academic statisticians have precisely no say in what is and isn’t data science. Industry has taken that choice out of our equivocating hands. But if I cared about what industry had to say, I would probably earn more money.

So what’s my view? Well it’s kinda boring. I think Data Science is like Mathematics: an encompassing field with a number of traditional sub-disciplines. There is a lot of interesting work to do within each individual sub-field, but increasingly there is bleeding-edge work to be done on the boundaries between fields. This means the boundaries between traditions and, inevitably, the boundaries between training methods. It also means that the Highlander-style quest for dominance among the various sub-disciplines of Data Science is a seriously counter-productive waste of time.

So far, so boring.

So why am I talking about this?  Because I want to seem mildly reasonable before I introduce our pantomime villain for the post (yes, it’s still Christmas until the Epiphany, so it’s still panto season. If you’re American or otherwise don’t know what panto season is, it’s waaaaaay too hard to explain, but it involves men in elaborate dresses and pays the rent for a lot of drag queens in the UK).

So who’s my villain? That would be none other than Yann LeCun. Why? Because he’s both the Director of AI Research at Facebook and a Professor at NYU, so he’s doing fine. (As a side note, I’d naively assume those were two fairly full-on full-time jobs, so all power to that weirdly American thing where people apparently do both.)

LeCun also has a slightly bonkers sense of humour, evidenced here by him saying he’s prepared to throw probability theory under a bus. (A sentiment I, as someone who’s also less good at probability theory than other people, can jokingly get behind.) Also a sentiment some people seriously got behind, because apparently if you stand for nothing you’ll fall for anything.

Hey Andy are you goofing on Elvis, hey baby, are we losing touch?

But what is the specific launching pad for my first unnecessarily wordy blog post of 2018? Well it’s a debate that he participated in at NIPS2017 on whether interpretability is necessary for machine learning. He was on the negative side. The debate was not, to my knowledge, recorded, but some record of it lives on in a blue-tick Twitter thread.

Ok, so the bit I screenshotted is kinda dumb. There’s really no way to water that down. It’s wrong. Actively and passively. It’s provably false. It’s knowably false from some sort of 101 in statistical design. It’s damaging and it’s incorrect.

So, does he know better? Honestly I don’t care. (But, like, probably, because you usually don’t get to hold two hard jobs at the same time without someone noticing you’re half-arsing at least one of them unless you’re pretty damn good at what you do. And if you’re pretty damn good at what you do, you know how clinical trials work, because you survived undergraduate.) Sometimes it’s fun to be annoyed by things, but that was like 3 weeks ago and a lot has been packed into that time. So forget it.

And instead look at the whole thread of the “best and brightest” in the category of “people who are invited to speak at NIPS”. Obviously this group includes no women, because 2017 can sink into the ******* ocean. Could we really not have invited a few women? I’m pretty sure there are a decent number who could’ve contributed knowledgeably and interestingly to this ALL MALE PANEL (FOR SHAME, NIPS. FOR SHAME).

Serious side note time: Enough. For the love of all gods past, future, and present, can every single serious scientific body please dis-endorse any meeting/workshop/conference that has all-male panels, all-male invited sessions, or all-male plenary sessions. It is not the 1950s. Women are not rare in our field. If you can’t think of a good one, put out a wide call and someone will probably find one. And no, it’s not the case that these 4 voices were the unique ones that had the only orthogonal set of views on this topic. I’m also not calling for tokenism. I reckon we could re-run this panel with no men and have a vital and interesting debate. Men are not magic. Not even these ones.

[And yes, this is two in a row. I am sorry. I do not want to talk about this, but it’s important that it doesn’t pass without mention. But the next post will be equality free.]

Other serious side note: So last night when I wrote this post I had a sanctimonious thing asking why men agree to be in all-male sessions. Obviously I checked this morning and it turns out I’m in one at ISBA, so that’s fun. And yes, there are definitely women who can contribute to a session (ironically) called The diversity of objective approaches at least as well as I can (albeit probably not talking about the same stuff).

Will I do it? Probably. I like my job, and speaking in sessions at major conferences is one of the things I need to do to keep it. Do I think ISBA (and other conferences) should reject out of hand every all-male session? No, because there are focussed but important subfields that just don’t have many women in them. Which is unfortunate, but we need to live in the world as it is. Do I think ISBA (and other conferences) should require a strong justification for a lack of diversity in proposed sessions? ABSOLUTELY.

Final serious side note: Why do I care about this? I care because people like Yann LeCun or Andrew Gelman or Kerrie Mengersen do not spring fully formed from the mind of Zeus. While they are extremely talented (and would be no matter what they did), they also benefitted from mentoring, guidance, opportunities, and most importantly a platform at all points during their careers. We are very bad at doing that for people in general, which means that a lot of talent slips through the cracks. Looking at gender diversity is an easy place to start fixing this problem – the fact that women make up about half of the population but only a tiny sliver of possible speaking slots at major conferences is an easy thing to see.

If we give space to women, ethnic and racial minorities, people from less prestigious institutions, people with non-traditional backgrounds etc, (and not just the same handful that everyone else invites to speak at their things) we will ensure that we don’t miss the next generation of game changers.

Anyway, long rant over. Let’s move onto content.

So first and foremost let’s talk about machine learning (where LeCun makes his bread). It’s a fairly uncontroversial view to say that when the signal-to-noise ratio is high, modern machine learning methods trounce classical statistical methods when it comes to prediction. (I’ve talked a little about the chicken-armageddon regime before.) The role of statistics in this case is really to boost the signal-to-noise ratio through the understanding of things like experimental design. [The Twitter summary of] LeCun’s argument suggests that this skill may not be being properly utilized in industrial data science practice.  This is the seat at the table we should be fighting hard for.

One easy thing statisticians could contribute could be built around Andrew’s (and collaborators’) ideas of Mister-P (multilevel regression and poststratification) and its obvious extensions to non-multilevel smoothing methods. Another would be a paper I love by Chen, Wakefield and Lumley, which is basically the reverse of Mister-P.
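For readers who haven’t met it, the post-stratification half of Mister-P is easy to sketch: fit a model that gives you an estimate within each demographic cell, then average those cell estimates weighted by the known population size of each cell. Here is a minimal sketch of that weighting step, with entirely made-up numbers (in a real application the cell estimates would come from a multilevel regression, not be handed to you):

```python
# Minimal sketch of the post-stratification step behind "Mister-P"
# (multilevel regression and poststratification). The cell estimates
# below are made up; normally they come from a fitted multilevel model.

def poststratify(cell_estimates, cell_populations):
    """Population estimate = population-weighted average of cell estimates."""
    total = sum(cell_populations.values())
    return sum(cell_estimates[c] * cell_populations[c] / total
               for c in cell_estimates)

# Hypothetical survey: estimated support within age cells,
# plus (hypothetical) census counts for each cell.
estimates = {"18-29": 0.60, "30-49": 0.55, "50-64": 0.45, "65+": 0.40}
populations = {"18-29": 200, "30-49": 300, "50-64": 300, "65+": 200}

print(poststratify(estimates, populations))  # population-level estimate
```

The point of the exercise is that the raw survey mean can be badly biased when some cells are over- or under-sampled; reweighting by the population cell sizes corrects for that.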

Broadly speaking, the Twitter thread summarizing the debate is worth reading. It’s necessarily a summary of what was said and does not get across tone or, possibly, context. But it does raise a pile of interesting questions.

I fall on the side that if you can’t interpret your model you have no chance of sense-checking your prediction. But it’s important to recognize that this view is based on my own interests and experiences (as is LeCun saying that he’d prefer good prediction to good interpretability). In the applied areas I’m interested in, new data comes either at a scientific or human cost (or is basically impossible to get), so this is hugely important. In other areas where you can get essentially infinite data for free (I’m looking at fun problems like solving Go), there is really no reason for interpretability.

What’s missing from the summary is the idea that Science is a vital part of data science, and that contextually appropriate loss functions are vitally important when making decisions about which modelling strategy (or model) will be most useful for the problem at hand. (Or: the philosophy, methodology, and prior specification should only be considered in the context of the likelihood.)


  1. Z says:

    “If we give space to women, ethnic and racial minorities, people from less prestigious institutions, people with non-traditional backgrounds etc, (and not just the same handful that everyone else invites to speak at their things) we will ensure that we don’t miss the next generation of game changers.”

    I’m 100% behind trying this trickle down theory because it aims to solve a real problem and the costs are so low, but “ensure” is a strong word.

  2. Xi'an says:

    Hey, Dan, to put the matter to a (temporary?) rest, when we [O’Bayes program chair and section chair] made this session for the next ISBA together, we contacted just as many women as we did men. It just happened that the women we contacted were already involved in other sessions. And some suggested male co-authors.

  3. someone says:

    Dan, why do you say this about the twitter comment?
    >> wrong. Actively and passively. It’s provably false. It’s knowably false from some sort of 101 in statistical design. It’s damaging and it’s incorrect.

    I think it’s true that people die from approved medical drugs. And that’s after approval. I don’t think that’s got anything to do with stats 101 or 401 or whatever. It’s just the way Big Pharma, the spineless FDA and drug-pushing society works.

    Maybe you feel differently about the whole drug industry and this has biased you into thinking it’s about the stats.

    I mean even the drug-pushing doctors don’t say stuff about proven statistical design. The drug-pushing doctors just say: ‘well that’s not on the official list of side-effects’, or ‘that’s 1 in 10000 event’ and ‘you’ll die if you don’t take that chance’, blah, blah, blah…

    We know you can’t prove zero risk. And if it’s a successful drug that 10% of the population takes for 4 decades, your trial sensitivity is never going to be enough to pick up small effects that have real consequences.
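    The sensitivity point can be made concrete with a back-of-envelope binomial calculation (the numbers here are mine and purely illustrative, not from the debate):

```python
# Illustrative only: how likely is a trial to observe even ONE adverse
# event that occurs at a rate of 1 in 10,000, as a function of trial size?

def prob_at_least_one_event(rate, n_patients):
    """P(at least one event) = 1 - P(no events) under a binomial model."""
    return 1.0 - (1.0 - rate) ** n_patients

rate = 1e-4  # hypothetical 1-in-10,000 adverse event
for n in (3_000, 30_000, 300_000):
    print(n, round(prob_at_least_one_event(rate, n), 3))
```

    A trial of a few thousand patients will usually see zero such events, so post-approval surveillance, not the trial itself, is where effects at that rate get picked up.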

    • someone says:

      Let me add that the reality is even worse than what I wrote above. The problems Andrew writes about on the blog, like p-hacking, publication bias, etc., arguably occur most in pharma/medical studies. That’s one of the things some commenters have raised on this blog. Power posing etc. probably haven’t caused anyone to die, but the drug/medical studies have, so perhaps statisticians should focus on raising the alarm for pharma/med studies more than social science etc. studies.

      • Andrew says:


        Statisticians and others have been sounding the alarm about problems in medical research, and I’d be happy if there were more action along these lines.

      • Dan Simpson says:

        Yes, people die. But to say you try things and hope too many people don’t die ignores everything about how these trials are designed and executed. (If too many people die, you stop the trial.)

        • someone says:

          Dan, I think Twitter is dangerous, now I know what LeCun said. He actually did say they do trials in stages, e.g. animal testing. He mentioned that trials are stopped if too many die or it’s too successful. To me, his main point was that in the end it’s testing that matters. I think it’s pretty hard to argue against that. I think Caruana’s main point was he thinks faults in the testing only show up if results are interpretable. Caruana mentioned the pneumonia/asthma thing. LeCun only conceded that you have to be careful in testing. LeCun mentioned Aspirin was used without anyone knowing how it worked before 1970.

  4. R says:

    Andrew, I don’t think this is a matter of knowing how clinical trials work – in reality, clinical trials can carry risks, especially for sick individuals. The counterfactual of what would have happened if an individual did_not/did take drug A is never observed, and it’s obvious that in some cases an individual who took/did_not_take drug A would have survived. In more extreme cases the actual death is observed, and that’s because a very extreme/high-risk drug is decided to be tested.

    I am by no means defending the AI non-interpretability argument and/or the all-male panel, but I also think that camouflaging the true risks of real-life decision making and clinical trials as a statistics 101 problem is not a good idea either. I also don’t agree that they work by killing people; that’s absurd (I don’t think LeCun implies this explicitly, but he probably does imply it figuratively in my sense). In summary, there are true risks in participating in clinical trials, and as much as we aim to minimize them they will never be 0, and in some cases they are very high. It also reminds me of Dallas Buyers Club, an excellent real-life movie about the implications of clinical trials in the lives of patients.

  5. Jonathan (another one) says:

    Please credit Dan Fogelberg, not Kiki and Herb. This is sort of a gay version of Stigler’s Law of Eponymy, no?

  6. Will says:

    I believe this is the recording of the Interpretability Panel/Debate from NIPS2017.
