May 28: Blog, etc.

Right, first post. Today I got this site properly up, formatted as nicely as I could given the boring Twenty-Ten theme, and there is now a public front page and a call for people to help me out. Hopefully, this will result in better research on all counts. I also did a few other things today. Let’s go through the agenda that was agreed upon last week during the Skype chat with Bill, Karthik, and Bertram.

  • Find more examples of Kepler, Taverna Workflows from the Depositories and myExperiment – Still working on this.
  • (Saturday) Set up the public blog on the DataOne Site
  • (Saturday) Set up a public blog post and webpage to direct scientists who might be keen to share workflows to, in order to provide information about the project.
  • Use the Kepler mailing list (and others) to contact people – waited for this page. Will do tomorrow.
  • Develop list of criteria for analysis of workflows; Brainstorming workflow usage. Still working on this – need more research, more background reading.
  • Make three Excel sheets: Users, Workflow languages (share with Karthik), Workflows themselves. Done, but now I need to fill them up a bit more. Should be a fun task for tomorrow morning.
  • (Thursday) Set up, become familiar with, and use Mendeley
  • (Thursday, Friday) Do the relevant reading loaded onto Mendeley
  • (Saturday, 2:00 am) Set up a DropBox for usage by each of the mentors on this project.
  • (Saturday 2:10 am) Email out to everyone the hours for next week (16:00 GMT/ 11:00 EST, 31-05-2011)
  • (Saturday 1:00 am) Fill out the mentor program for the next two weeks
  • (Saturday 2:18 am) 29th-30th in UC Davis (Buy Flights – 28th evening).

As you can see, I couldn’t sleep last night at 2, mostly because I got an email from another intern who is way ahead of me in productivity. So, I’m stepping it up and working overtime. I have quite a bit to get through tomorrow – but hopefully, that should be doable before I sign off for the evening at around 6pm.

Today I also had a chat for around thirty minutes with Bertram about a few things. First, I took out a book from the library that includes an article written by him: Workflows for e-Science. I’m hoping to peruse it tomorrow. He suggested a few papers:

  • Wei Tan, Jia Zhang, Ian Foster. Network Analysis of Scientific Workflows: a Gateway to Reuse. IEEE Computer, 43(9): 54-61, 2010. doi:10.1109/MC.2010.2622010
  • Norbert Podhorszki, Bertram Ludaescher, and Scott A. Klasky. 2007. Workflow automation for processing plasma fusion simulation data. In Proceedings of the 2nd workshop on Workflows in support of large-scale science (WORKS ’07). ACM, New York, NY, USA, 35-44. DOI=10.1145/1273360.1273368
  • Scientific Workflow Design 2.0: Demonstrating Streaming Data Collections in Kepler. Lei Dou, Daniel Zinn, Timothy McPhillips, Sven Köhler, Sean Riddle, Shawn Bowers, Bertram Ludäscher. 27th IEEE Int’l Conference on Data Engineering (ICDE 2011), Hannover, Germany, April 2011.
  • Scientific Workflow Design with Data Assembly Lines. Daniel Zinn, Shawn Bowers, Timothy McPhillips, Bertram Ludäscher. 4th Workshop on Workflows in Support of Large-Scale Science (WORKS 2009), Portland, OR, November 2009.
  • Timothy McPhillips, Shawn Bowers, Daniel Zinn, Bertram Ludaescher. “Scientific Workflow Design for Mere Mortals”. Future Generation Computer Systems, 25(5):541-551, May 2009.

He also suggested this YouTube video, which is interesting. (Due to the incredible brilliance of YouTube and WordPress, I was unable after five minutes to embed this thing in any way.)

Finally, I installed KNIME at his suggestion, which might have more workflows in it. I also installed RapidMiner, which has quite a few workflows on myExperiment, and emailed the guys over at SciencePipes to ask why there don’t seem to be any workflows on their site. I also started an email correspondence with the person in charge of myExperiment, which looks hopeful. He also suggested a paper after talking to Ulf Leser, who is based in Berlin. I was thinking about going there for the Open Knowledge Conference around the first of July. I’ll ask the mentors whether this might be possible, as my research is entirely based on public data and it would be good to talk with others about the possibilities of using sites like myExperiment and how to share workflows.

That’s all for now.
