I'm going to be teaching my first ever bootcamp in the coming days. My biggest fear (and there are many!) is that the participants will walk out of the room on the final afternoon and go straight back to their old habits, never taking the time to incorporate what they've learned into their daily workflow. In an attempt to avoid this eventuality, I've planned a rousing concluding address to explain why the content taught at a Software Carpentry bootcamp is so important. It goes something like this...
Consider the workflow of Nigel Nobootcamp, which is typical of many researchers today:
If you read any research-related blogs or scan the editorial sections of prominent journals like Science or Nature [e.g. 1, 2, 3], you'll know that "open" and "web" are two of the major buzzwords right now. This is because there is now a whole range of web-based tools out there that make it possible for this typical workflow to be more collaborative, transparent, reproducible and reliable. Most scientists would agree that these are desirable improvements, but many lack the requisite computing skills to fully participate in this open science revolution.
It is for precisely this reason that the Mozilla Science Lab, whose mission is to help researchers use the open web to shape science's future, is now the organisational home for Software Carpentry. They recognise that in order for these tools and practices to make it out of buzzword editorials and into the default workflow of everyday scientists like Nigel, the entire profession needs to upskill.
Graduates of a two-day Software Carpentry bootcamp, whether they immediately realise it or not, have all the basic skills and knowledge needed to transform the way they do research. In comparison to Nigel, consider the workflow of Betty Bootcamp:
At an individual level, it's clear that Betty's workflow is likely to produce higher-quality research. Her use of unit testing (as opposed to Nigel's "looks right" approach) has reduced the chance of errors, while the ease with which others could collaborate (GitHub) and provide commentary (arXiv) on her work means it's likely to have been exposed to a higher level of scrutiny prior to journal submission. The data analysis process was probably also far less painful for Betty. Aside from the best practices she learned to make her code more readable and easier to manage, her bootcamp experience has given her the confidence to interact with the programming community and further develop her skills (SciPy, mailing lists). When she runs into trouble, which still happens regularly, her support network goes far beyond random Google searches and friendly colleagues down the hall.
While improved research quality is a noble pursuit, in many cases the academic world places more emphasis on quantity. It's therefore noteworthy that Betty's workflow would probably also be more efficient. The key to this is that in both her programming (via unit testing and defensive programming) and her manuscript preparation (via reviews on GitHub and comments on the arXiv document) she is identifying errors earlier than Nigel. Anyone who has spent days tracking down a bug, only to eventually find it in a section of code written weeks or months ago, will know that catching errors early is the key to enhancing efficiency. Betty's programming skills are also superior to Nigel's (thanks to the bootcamp and regular interaction with the programming community), which means she performs simple tasks faster and doesn't reinvent as many wheels.
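To make "catching errors early" a little more concrete, here is a minimal sketch of what defensive programming and unit testing might look like in Betty's code. The function `celsius_to_kelvin` and its test are hypothetical examples invented for illustration; they are not taken from any real analysis.

```python
def celsius_to_kelvin(temp_c):
    """Convert a temperature from degrees Celsius to kelvin (hypothetical example)."""
    # Defensive programming: stop immediately on physically impossible input
    # rather than letting a bad value propagate into later analysis.
    assert temp_c >= -273.15, f"temperature below absolute zero: {temp_c}"
    return temp_c + 273.15


def test_celsius_to_kelvin():
    """Unit test: check known values instead of trusting that output 'looks right'."""
    assert celsius_to_kelvin(0.0) == 273.15
    # Compare floats with a tolerance rather than exact equality.
    assert abs(celsius_to_kelvin(-273.15)) < 1e-9
    assert abs(celsius_to_kelvin(100.0) - 373.15) < 1e-9


test_celsius_to_kelvin()
```

If a data file ever contained a nonsense value like -999, the assertion would fail on the spot, pointing straight at the problem instead of leaving it to surface weeks later in a strange-looking plot.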
Finally, at a broader community level it's important to note that Betty's research is more transparent than Nigel's. If other researchers want to reproduce and ultimately build on her work, they have open access to a description of her methodology (arXiv), her data (Dryad) and her code (GitHub). This will surely accelerate the discovery process in her field of research, and is probably pretty good for her citation statistics too!
Originally posted 2013-11-19 by Damien Irving in Bootcamps.