What I learned today: original post

Sunday, July 6, 2014

Journal papers should not be seen as the source of truth

I don't understand our obsession with wanting journal papers to be a source of truth.

I read the Nature News article about the editorial in Science (which you need a subscription to access) announcing that the journal would conduct a statistical review, in addition to peer review, for new papers. While I think reviewing all aspects of a paper carefully is a good thing, I don't think that journal papers need to contain water-tight, fire-proof analyses.

Journal papers should be, and are, exploratory. It is perfectly fine, when you are doing a new experiment, to start without a strong experimental design and with just some ideas of what to look for. Once you get your data, it is perfectly fine to 'play around' with various ways of analyzing it until you find something interesting. And it is perfectly fine to publish that. In fact, in my experience, that's what most papers are.

The problem is that we pretend they are something else. Once they are published, we pretend that they are rigorous analyses of an a priori hypothesis and experimental design. We pretend that they are definitely correct. If a paper doesn't stand up to replication we might even retract it so that it doesn't sully the scientific record!

I find this maddening. We need exploratory, open-ended, ill-defined experiments to understand a system, what we are measuring and what limitations we have. You might say that you are supposed to do all that exploration before you design and run the actual experiment you report. To that I say: a) how often does that actually happen, b) is it realistic to expect and believe that everyone has done this before publishing their paper, and c) why shouldn't you share your preliminary, non-rigorous results, with the understanding of what they actually are?

The truth should come later, from replications, meta-analyses and reviews. It should come from the synthesis of many journal papers by different researchers. So instead of wasting time trying to ensure that journal papers are absolutely correct before they are published, why not spend some of those resources building more formalized infrastructure for that second level? Let's stop pretending journal papers are something that they are not.

Saturday, July 6, 2013

Two "poems"

I found these when I was going through some of my old notes.

1.
I set off through the landscape, searching for my peak, blind and alone.
Only gravity and memory could help me make a map, but it's changing over time.
As the memory develops, is interpreted, is molded into a different context
Bumps in the road are exciting, but not as exciting as hills, mountains, peaks.
I walked up one yesterday. I had my hiking boots on but I was a little out of shape.
It didn't really matter, the hill was short.
I felt around in the dark hoping there was a path I'd missed. Maybe there was but I never found it.

Thinking about life as an optimization problem is probably a mistake, but sometimes useful for illustrative purposes.

2.
I saw you earlier but you didn't say any of the things I wanted you to say.
You were at the airport but I don't know where you were going. I guess it doesn't matter.
From Manchester to Heidelberg, it's all the same.
There was a fire I heard. I knew you had kerosene and a box of matches with you.
I didn't ask you about it.
I wish you weren't so cold.

Friday, June 21, 2013

Stories don't tell the whole story

Here is my failed submission to the Nature career columnist competition:

Every time I sit down to write a paper or prepare a presentation I ponder the same question. How much, if at all, should I highlight and discuss my reservations or doubts about my own research? In his 1974 Caltech commencement address, Richard Feynman said that, when reporting an experiment, “Details that could throw doubt on your interpretation must be given, if you know them.” This idea appeals to me and fits with how we are taught that science moves forward. However, throughout my career, rarely has anyone encouraged me to express more doubts about my results or be more open about my reservations. In fact, it has been almost the opposite. Co-authors, colleagues and mentors might suggest I leave something out of a manuscript and wait to see if the reviewers question it, or they might encourage me to try to keep my story simple in a presentation, so I don't confuse people. Just to be clear, I am not discussing issues of fraud, hiding or 'fudging' the data here; I am focusing on the issue of how to craft an interesting and concise narrative while representing as fully as possible the complicated and messy set of results that have accumulated.

A simple example: Say I test a group of 10 subjects on a task, and find that they perform significantly better after intervention A. Maybe this is a result that I'm using to build up a theory about A. But what if the group improvement was mainly due to a large improvement in 3 of the subjects? There is no reason to exclude these subjects, they comprise 30% of my sample, and the difference is statistically significant. However, this seems relevant to my interpretation. What's happening with the other 7 subjects? Maybe, on average, they also show an improvement, but a much smaller and less clear one. Do I make a point of highlighting this question?
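
To make this concrete, here is a small sketch in Python with made-up numbers (none of this is real data). It assumes each subject has a single improvement score, tests the group mean against zero with a plain one-sample t statistic, and then repeats the test on the seven subjects left after setting aside the three largest improvers. The scores, the variable names and the choice of test are all my own assumptions for illustration.

```python
import math
import statistics

# Hypothetical improvement scores (after minus before) for 10 subjects.
# These numbers are invented for illustration only.
improvements = [8, 9, 10, 1, 0, 2, -1, 1, 0, 1]


def one_sample_t(values):
    """Return the t statistic for testing whether the mean differs from 0."""
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    return mean / standard_error


# Two-sided critical t values at alpha = 0.05, taken from standard t tables.
T_CRIT_DF9 = 2.262  # 10 subjects -> 9 degrees of freedom
T_CRIT_DF6 = 2.447  # 7 subjects -> 6 degrees of freedom

# Group-level test on all 10 subjects.
t_all = one_sample_t(improvements)

# Post hoc split: set aside the three largest improvers and test the rest.
ranked = sorted(improvements, reverse=True)
top_three, remaining_seven = ranked[:3], ranked[3:]
t_rest = one_sample_t(remaining_seven)

print(f"all 10:      mean = {statistics.mean(improvements):.2f}, "
      f"t = {t_all:.2f}, significant = {abs(t_all) > T_CRIT_DF9}")
print(f"remaining 7: mean = {statistics.mean(remaining_seven):.2f}, "
      f"t = {t_rest:.2f}, significant = {abs(t_rest) > T_CRIT_DF6}")
```

With these invented scores the group-level test clears the 5% threshold, while the seven remaining subjects show a small positive average that does not. Note that picking out the top three after seeing the data is itself an exploratory move, so the second test is a description of the sample rather than a confirmatory result.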

We are taught that doing science isn't about “selling” a story. But when it comes to communicating our data to our colleagues, we are encouraged to draft a simple, compelling, engaging, and clear story. When submitting a paper to a journal we have to argue for why our results are interesting and important, in order to convince the editors and the reviewers that our work should be published. When applying for a grant we want to convince the funder that our work should be funded. Even when giving a talk we want to convince the audience that attending was worth their time.

And what is more convincing and memorable than a neat story? We are scientists, but we are also people. We have limited memory for collections of data that aren't assembled into a narrative. A good story is much more compelling, effective, and memorable than a tangle of data with multiple potential interpretations. Communicating with our colleagues is one of the most important parts of our job: taking these big messy data sets and distilling meaning from them so that others can easily build on it. But every story is a choice that highlights certain aspects and downplays others. Different ways of looking at the data might lead you to tell slightly or radically different stories, each just as legitimate as the others.

I fear that raising doubts about my own work would make it less comprehensible, less believable and less likely to be cited. I also worry it makes me seem less competent. So how do I deal with this problem? How do you deal with this? So far, I have prioritized convincing over expressing my doubts, but to me that is an unsatisfying solution. I think a better solution could be found in changing the format in which we, as scientists, communicate with each other, but that is a whole topic in itself.