A freshwater ecologist who studies natural phytoplankton communities had been working on a project for several years when unforeseen trouble cropped up and the still-incomplete project came to a standstill. After days and weeks of pondering what to do about the obstacles that were hampering progress, she decided to put the project on hold for a while. Still experiencing the mix of relief and frustration that came with that decision, she turned her attention to another project, which investigated changes over time in lake temperatures and in the depths at which different types of lake phytoplankton are found. After some time, however, that project also ran into difficulties. Prolonged agonizing over these setbacks (and worrying about all of the time invested without any manuscripts to show for it) was wearing her down when suddenly a moment of clarity revealed an idea that was surely pure genius – the problem with these projects was that they were not big enough!
You may now be questioning the distressed ecologist’s sanity, but her moment of insight actually turned out to be a valuable one. Many of the problems that had cropped up in her two studies could be resolved if she expanded the study to involve more data and more organisms. Fortunately, she also knew another freshwater ecologist who was investigating yet another species of phytoplankton, and he, too, had not yet published his work. She discussed her dilemma and her brilliant idea with him, and he was completely on board, especially since he had been having similar difficulties. They were full of optimism and eager to embark on this new journey. However, after contemplating their next step, they realized they were in for a few storms before they could hit smooth sailing.
Between them, the two ecologists had three separate projects, each with separate folders containing multiple files and multiple versions of data and information. They also compared the analyses for all of the projects and came to the unwelcome realization that the analyses were in different formats. Two different programs had been used – an open-source statistical and graphics scripting program called R and a commercial point-and-click program called JMP. They were overwhelmed, to say the least! This is where the real trouble began. One of the researchers already had her data and documentation on a collaborative project management server, but this system was no longer supported at her institution, meaning that the files needed to be transferred somewhere else. Although her data were on a collaborative site, she wasn’t sure whether the files there were up to date, because another collaborator had been working on the files and might not have added the most recent versions. The other ecologist had his data and documentation on his personal computer, but his file system and file-naming conventions were not very systematic, which meant he would need to go through many of the files to refresh his memory about what each one contained. How were they going to merge everything from their separate projects into one clear, organized workflow and file system that both of them could use at the same time, and how were they going to keep adding data and information until they could complete the collaborative manuscript?
Both of them spent an entire week organizing their data into the new project folder and establishing the most up-to-date results of their combined projects. They began by setting up a shared Dropbox folder, which allowed them to upload and share their data files, the code associated with their individual analyses, and other project documentation. To make navigating the files easier, they agreed to always put the date before the file name when saving a file so they could quickly see which version of a file was the most recent. However, because they were constantly making changes, things very quickly became confusing, and it was nearly impossible to keep track of exactly what had been changed and when. Even though they had invested so much time in setting up a file organization system that they both understood, things were getting out of control! As the number and complexity of files grew, finding the most recent version of a file and understanding exactly how it differed from other versions became a frustrating experience.
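To make the date-first naming convention concrete, here is a minimal sketch in Python of how the newest version of each file could be picked out automatically. The story does not say what date format the ecologists used, so the `YYYY-MM-DD_name` pattern, the function name `latest_versions`, and the example file names are all assumptions made for illustration.

```python
import re
from datetime import date

# Assumed pattern (not from the story): ISO date, underscore, base file name.
DATED_NAME = re.compile(r"^(\d{4})-(\d{2})-(\d{2})_(.+)$")

def latest_versions(filenames):
    """Map each base file name to its most recently dated version."""
    latest = {}
    for name in filenames:
        match = DATED_NAME.match(name)
        if match is None:
            continue  # skip files that do not follow the convention
        year, month, day, base = match.groups()
        try:
            stamp = date(int(year), int(month), int(day))
        except ValueError:
            continue  # skip names whose prefix is not a real calendar date
        if base not in latest or stamp > latest[base][0]:
            latest[base] = (stamp, name)
    return {base: name for base, (stamp, name) in latest.items()}

# Hypothetical file names, for illustration only.
files = [
    "2013-03-01_lake_temps.csv",
    "2013-04-15_lake_temps.csv",
    "2013-02-10_depth_analysis.R",
    "notes.txt",
]
print(latest_versions(files))
```

A helper like this can say which copy is newest, but it cannot say what changed between two copies or who changed it, which is the gap the ecologists ran into.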
Despite their best efforts to keep things organized, their system wasn’t working. Frustration began to overtake their initial excitement about the possibilities the collaborative project offered. What to do, what to do? What would you do next?
To collaborate effectively, everything needed to be organized, and changes each collaborator made needed to be easy to identify and understand. During a meeting they organized to brainstorm a solution, one of the collaborators suggested a project management system she had used before – a web application called Redmine. Using specialized tools available for the Redmine environment, they would be able to post updated data files, results, and analyses, and the system would prompt them to create or update metadata (including information about changes that were made) for all of the files. The Redmine system also helps streamline communication between collaborators. If, for example, one researcher uploads files related to an analysis he is working on, he can use the issue-tracking functionality to enter information about how long he spent working on the analysis, the point at which he stopped, things he would like his collaborator to review or add, etc. The issue-tracking system then sends the collaborator an email notifying her that an issue has been posted for her to review. When she is done, she can use the same strategy to assign review or additional analyses to the other collaborator.
Switching to this project management system was a turning point for the project. As the confusion about what the files contained and who was working on what cleared up, the enthusiasm for the expected outcomes of the analysis came flowing back. They were relieved to find how easy this program made it to collaborate and to see the entire workflow of the project.
Although the two ecologists initially had some obstacles to overcome, their efficient communication, effective planning, and consistent adherence to their organization system set them on the right path to continue their project smoothly. After another year, they were ready to publish their results and archive their data in a repository! They were proud of the work they had produced and satisfied that they had been able to combine their data and keep them organized. When starting new projects, these ecologists now anticipate the kinds of problems that arise without a good data management system, and they proactively use the tools and practices that help to avoid such problems. No more weeks wasted on reorganizing poorly organized files!
Story contributed by Dr. Derek Gray with additional information from Kara Woo.