The yin and yang of questions and answers

In a previous post, I wrote that asking questions is harder than answering them, although I qualified that in a big way with “answering [questions] involves going back over and over again and updating our hypotheses, which makes answering questions feel hard”.  I want to revisit this claim.

Some of you may be familiar with the “reproducibility crisis” happening in the sciences, where many popular and well-known results have failed to replicate.  But what does failure to replicate mean?

Maybe it means that there was something wrong with the original study.  Maybe it means that there was something wrong with the replication.  But those aren’t the only options.  As Nobel laureate psychologist Daniel Kahneman wrote in an open letter to the scientific community:

In the myth of perfect science, the method section of a research report always includes enough detail to permit a direct replication. Unfortunately, this seemingly reasonable demand is rarely satisfied in psychology, because behavior is easily affected by seemingly irrelevant factors.

Note that underspecification of methods is an issue in all sciences.  Psychology just has a particularly rough time of it because psychology itself, like other soft sciences, is so underspecified.  Behavior is affected by seemingly irrelevant factors that are in fact relevant; they simply haven’t been specified yet.

In a better world, replication would be a collegial and common process involving many back-and-forths between originators and replicators.  Each replication could help identify new factors that turn out to be surprisingly relevant.  Eventually the hypothesis and methodology would be specified enough to permit consistent replication, at which point we’d have both our question and our answer.

This example makes clear that asking and answering questions are not two separate activities.  They are intertwined, at least when the questions and answers are new.  So it makes no sense to say, “Asking questions is harder than answering them” or vice versa, because you can’t do one meaningfully without also doing the other.

FYI: to read more about replication, try this article I wrote back in 2014 on the Open Science Collaboration blog: What we talk about when we talk about replication.


Hard and soft sciences

Back when I was a research scientist, I straddled the boundary between “hard” and “soft” sciences.  I did social psychology, which is a pretty soft science as sciences go, but I paired it with biology and physiology in general and endocrinology in particular, which meant getting a taste for some of the harder stuff.

I have never particularly liked the terms “hard” and “soft”, though, because it’s too easy to conflate them with “hard” and “easy”.  There’s a saying that goes: the soft sciences are easy to do poorly and hard to do well.  They are easier to do poorly than the hard sciences, and harder to do well than the hard sciences.  Here, have a chart:

[Chart: how hard it is to do poor work vs. good work in the hard and soft sciences]

What’s going on here?  The hard sciences are better developed than the soft sciences, so it’s clearer when someone’s making obvious mistakes, cutting corners, or making under-supported claims.  That makes it difficult to do poor work.  It’s also difficult to do good work, of course.  The easiest thing to do in the hard sciences is to meet a minimum level of competency and do solid but uninspiring work.

Meanwhile, in the soft sciences there are open questions about even the field’s basics.  There’s still a minimum level of competency, but it’s much less stringent than in the hard sciences.  So sloppy researchers tend to end up in the soft sciences.

Here’s another way to approach the hard/soft distinction. What’s easier, formulating questions or answering them?  It’s almost always easier to do the latter, provided you’ve very clearly and specifically formulated your question.  Of course we seldom do get our questions right on the first try, and so answering them involves going back over and over again and updating our hypotheses, which makes answering questions feel hard.  But the hardest parts of answering questions are really secretly still about asking them.

In the hard sciences, it’s easier to clearly and specifically formulate questions because so much knowledge has already been established.  Isaac Newton famously said (paraphrased) ‘If I have seen further it is only by standing on the shoulders of giants.’  The hard sciences are full of giants, with shoulders for modern researchers to stand on.  The soft sciences are by and large still on the ground.

For this reason, I prefer the terms “developed” and “undeveloped” sciences.  I think that distinction comes closer to the essential difference.

Note: this post has an update/correction post.

Science vs Software

Andrew Gelman has a brief post up on his blog comparing the way bug reports in open source software are received to the way many researchers respond to criticisms of their work.  The comments there are good, and cover my first reaction, which was, “Developers respond well to bug reports?”  But that’s a bit tongue in cheek.  I do think that, overall, developers are a bit more responsive to bug reports than scientists are to published criticisms of their work.  Here are my theories as to why that is:

  • Bug reports are not analogous to published criticisms.  Bug reports are the primary way for people to give feedback to the maintainers of a project, while historically, much criticism of research has come through less formal channels: email, questions at conference or poster talks, and face-to-face conversations at lab meetings, conferences, etc.  I have never seen a scientist respond poorly to a critique at a lab meeting or a poster session, for instance.  If you factor in these other interactions, the average emotional response to criticism might be much more similar.
  • Research careers are measured in papers.  The academy clings to its traditions, and that’s one of the big ones.  Papers are difficult to amend in significant ways, and they take a lot of work to produce.  If software developers could only push to production once every year or so, I bet we’d find receiving bug reports much more stressful!
  • Most bug reports don’t existentially threaten software projects.  I mean, I’m sure it’s happened (and now I’m kind of curious when and how!).  But many critiques of published research suggest that the reported effects may not exist at all, which is a much bigger blow than “hey, your app keeps freezing”.  So long as researchers are lauded not for the quality of the work they do but for the significance of the results they achieve, this kind of critique is going to feel like a threat, especially to non-tenured researchers and scientists in other precarious positions.

If I have one critique of the open science movement, which I otherwise endorse and consider myself a part of, it’s that we focus too much on the behavior of individual researchers and not enough on the systems which motivate that behavior.  It’s not that developers are better people than scientists.  It’s that the systems developers operate within are set up to reward and punish different things.