
This! I struggled with this topic in university. I was studying pulsar astronomy, and there were only one or two common tools used at the lower levels of data processing, and they had been the same tools for a couple of decades.

The software was "reproducible" in that the same starting conditions produced the same output, but that didn't mean the _science_ was reproducible, as every study used the same software.

I repeatedly brought it up, but I wasn't advanced enough in my studies to be able to do anything about it. By the time I felt comfortable with that, I was on my way out of the field and into a non-academic career.

I have kept up with the field to a certain extent, and there is now a project underway to create a fully independent replacement for that original code, which should help shed some light (it has been going for a few years now, and is still going strong).



> The software was "reproducible" in that the same starting conditions produced the same output, but that didn't mean the _science_ was reproducible, as every study used the same software.

This is the difference between reproducibility and replicability [1]. Reproducibility is the ability to run the same software on the same input data and get the same output; replicability is analyzing the same input data (or new data collected following the original protocol) with new, independently written software and getting the same result.

I've experienced the same lack of interest among established researchers in my field, but I can at least ensure that all my studies are both reproducible and replicable by sharing my code and data.

[1] Plesser HE. Reproducibility vs. Replicability: A Brief History of a Confused Terminology. Front Neuroinform. 2018;11:76.


This is almost an argument for not publishing code. If you publish only the equations, then everybody has to write their own implementation from them.

Something like this is the norm in some more mathematical fields, where only the polished final version is published, as if done by pure thought. To build that, first you have to reproduce it, invariably by building your own code -- perhaps equally awful, but independent.


Maybe gate release of the code by some number of attempted replications.


Should this be surprising? I'm not saying it is correct, but it is similar to the response many managers give concerning a badly needed rewrite of business software: doing so is very risky, and the benefits aren't always easy to quantify. Also, nobody wants to pay you to do it. Research is highly competitive, so no researcher wants to spend valuable time rebuilding a tool that already exists, however badly a rebuild is needed, when no other researchers are doing the same.



