Week 6 – Galaxy Zotero Analysis

Hello Guys,

This week was full of programming. Several interesting points came out of the analysis.

Tag Analysis

The “tags” column is especially interesting for us because it contains keywords related to provenance research, for example “reproducibility”. The objective of this project is to find out how provenance tools are currently used in academia, and this column is a good place to start. The column holds two kinds of tags: manually added ones, each with a meaning defined by the Galaxy Project group, and ones automatically generated by Zotero.

1.1 All Tags

According to the tag definitions given by the Galaxy Zotero group, tags fall into two groups: some are manually added by the Galaxy Project group and others are automatically generated by Zotero. As a result, this analysis is divided into two parts. Among the manually added tags, those starting with “+” are Galaxy-specific tags, each with its own definition, and those starting with “>” are named after public Galaxy platforms.

Manually added tags, Galaxy-specific (“+”): 20
Manually added tags, public platform (“>”): 168
Automatically generated by Zotero: 6381
Total number of unique tags: 6569
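
The split above can be reproduced with a short script. Here is a minimal sketch, assuming each paper is a dict with a "tags" list (the sample records and the function names are made up for illustration, not taken from our actual export). It also assumes every tag without a “+” or “>” prefix counts as automatically generated, which matches the table but may not hold for every library.

```python
from collections import Counter


def classify_tag(tag):
    """Assign a tag to a category based on its first character."""
    if tag.startswith("+"):
        return "galaxy_specific"
    if tag.startswith(">"):
        return "public_platform"
    # Assumption: anything without a prefix was generated by Zotero.
    return "zotero_automatic"


def count_unique_tags(records):
    """Count unique tags per category across all records."""
    unique_tags = set()
    for record in records:
        unique_tags.update(record["tags"])
    return Counter(classify_tag(tag) for tag in unique_tags)


# Hypothetical sample data, only to show the shape of the input.
sample_records = [
    {"title": "Paper A", "tags": ["+Methods", "Reproducibility", ">Huttenhower"]},
    {"title": "Paper B", "tags": ["Reproducibility", "Workflow"]},
    {"title": "Paper C", "tags": ["+Methods", ">RepeatExplorer"]},
]

counts = count_unique_tags(sample_records)
for category, number in sorted(counts.items()):
    print(category, number)
print("total unique tags", sum(counts.values()))
```

Counting over a set rather than over the raw column is what makes the total a count of unique tags, not of tag occurrences.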

1.2 Manually Added Tags Analysis

1.2.1. Analysis of Galaxy Specific Tags
1.2.2. Analysis of Public Platform Tags

Among the public platform tags, three platforms appear frequently: “>Huttenhower”, “>RepeatExplorer” and “>workflow4metabolomics”.

Huttenhower: Metagenomic and functional genomic analyses, intended for research and academic use.

RepeatExplorer: Graph-based clustering and characterization of repetitive sequences, and detection of transposable element protein coding domains.

Workflow4metabolomics: A collaborative portal dedicated to metabolomics data processing, analysis and annotation.
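
To see which public platforms show up most often, the “>” tags can be counted per paper. The following is a rough sketch under the same assumption as before, that each paper is a dict with a "tags" list; the sample data and the name `platform_tag_frequencies` are hypothetical.

```python
from collections import Counter


def platform_tag_frequencies(records):
    """Count how many papers carry each public platform tag (prefix '>')."""
    counts = Counter()
    for record in records:
        # set() so a tag repeated within one paper is counted only once
        counts.update(tag for tag in set(record["tags"]) if tag.startswith(">"))
    return counts


# Hypothetical sample data, only to show the shape of the input.
sample_records = [
    {"title": "Paper A", "tags": [">Huttenhower", "Metagenomics"]},
    {"title": "Paper B", "tags": [">Huttenhower", ">RepeatExplorer"]},
    {"title": "Paper C", "tags": [">workflow4metabolomics", "+Methods"]},
]

frequencies = platform_tag_frequencies(sample_records)
for tag, number in frequencies.most_common(3):
    print(tag, number)
```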

1.3 Automatically Generated Tags Analysis

Happy to see some provenance-related keywords among them: Reproducibility, Workflow.

Paper Reading

Papers tagged “reproducibility”: 316

Papers tagged “workflow”: 117

The number of papers under these chosen tags (‘+Methods’, ‘Reproducibility’) is: 5

The number of papers under these chosen tags (‘Reproducibility’, ‘Workflow’) is: 7
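
Counts like these come from keeping only the papers that carry every chosen tag. Below is a minimal sketch of that filter, again assuming each paper is a dict with a "tags" list; the sample records and the name `papers_with_all_tags` are made up for illustration. The match is exact and case-sensitive, so “reproducibility” and “Reproducibility” would count as different tags unless they are normalized first.

```python
def papers_with_all_tags(records, chosen_tags):
    """Return the records whose tag list contains every chosen tag."""
    wanted = set(chosen_tags)
    return [record for record in records if wanted.issubset(record["tags"])]


# Hypothetical sample data, only to show the shape of the input.
sample_records = [
    {"title": "Paper A", "tags": ["+Methods", "Reproducibility"]},
    {"title": "Paper B", "tags": ["Reproducibility", "Workflow"]},
    {"title": "Paper C", "tags": ["Workflow"]},
]

chosen = ("Reproducibility", "Workflow")
matches = papers_with_all_tags(sample_records, chosen)
print(f"The number of papers under these chosen tags {chosen} is: {len(matches)}")
```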

The next step in our research is to read the papers that carry a combination of certain tags. We also need to figure out how the Galaxy group collected and tagged papers, to ensure the reproducibility of our project.

Have a nice weekend.
