Week 9: Reproducibility of Script-Based Workflows – Wrap-up

This week, I kept working on reproducing the LIGO script in the Docker and ReproZip experiment by building a “LIGO script” container on top of the existing YW container. Additionally, I tried to understand the bridge between Docker and ReproZip. Both are tools for sharing dependencies, but they address orthogonal kinds. Docker is for software dependencies: e.g., for YW or the YW demos, the Docker image tells you what you need in terms of software, and the Dockerfile tells you what is required to build it (“docker history” also lets you see how your image was built). ReproZip, in contrast, captures and reproduces data dependencies: it traces which data files are read and written by a script, akin to YW + NW. A Docker container might contain all the relevant data, but you still need ReproZip to determine which files were actually used. A similar approach may be relevant to the Matlab DataONE Toolbox, which captures data provenance for Matlab scripts and console commands without the need to modify existing Matlab code.
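To make the distinction concrete, here is a minimal sketch in Python of the question that a data-dependency trace answers and a container image alone does not: of all the files available in the container, which ones did the script actually read or write? This is only an illustration under stated assumptions. The `FileAccess` record, the `summarize_usage` function, and all file paths are hypothetical, and none of it is ReproZip's or Docker's actual API or output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileAccess:
    """One traced file access (hypothetical stand-in for a real trace record)."""
    path: str
    mode: str  # either "read" or "write"


def summarize_usage(available_files, trace):
    """Split files into inputs, outputs, and files present but never touched."""
    reads = {a.path for a in trace if a.mode == "read"}
    writes = {a.path for a in trace if a.mode == "write"}
    # Files shipped in the container that the script never accessed.
    unused = set(available_files) - reads - writes
    return {
        "inputs": sorted(reads),
        "outputs": sorted(writes),
        "unused": sorted(unused),
    }


# Hypothetical contents of a container: everything that *could* be used.
available = [
    "/data/input_a.hdf5",
    "/data/input_b.hdf5",
    "/data/notes.txt",
]

# Hypothetical trace of one script run: what *was* used.
trace = [
    FileAccess("/data/input_a.hdf5", "read"),
    FileAccess("/output/figure.png", "write"),
]

summary = summarize_usage(available, trace)
for kind, paths in summary.items():
    print(kind, paths)
```

In this toy run the container ships three data files, but the trace shows that only one of them was an input, which is the kind of information the container by itself cannot provide.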

This week is also the last week of my internship, so I have started documenting my work: summarizing the questions, tasks, and findings on prospective and retrospective provenance, and creating online documentation for each major “work package” I have worked on and checked into GitHub. I would like to thank my mentors, Prof. Bertram, Tim, and Paolo, for giving me the amazing chance to work on this project, as well as for their constant guidance and informative feedback.
