{"id":2785,"date":"2016-06-20T14:53:20","date_gmt":"2016-06-20T14:53:20","guid":{"rendered":"https:\/\/notebooks.dataone.org\/?p=2785"},"modified":"2016-06-22T22:33:47","modified_gmt":"2016-06-22T22:33:47","slug":"week-four-further-analyses-and-results","status":"publish","type":"post","link":"https:\/\/notebooks.dataone.org\/dataone-impact\/week-four-further-analyses-and-results\/","title":{"rendered":"Week Four: Further Analyses and Results"},"content":{"rendered":"
This past week was spent solely on the statistical analysis of the user data from 6 Member Nodes (MN). Due to the staggered start dates of the Member Nodes, we had to restrict our sample from all thirty MN to just the 6 that fit our sampling criterion: the MN needed download data for at least 1 year before and 1 year after its DataONE joining date. This meant checking not only that each MN fit the timeframe, but also that we actually had data for it. Some I was able to confirm from the website, but for each MN I also had to check the user data file I received and find the first date of recorded data (i.e. nonzero numbers).<\/p>\n
Most of my initial time in R was spent prepping the data for analysis; I’m hoping that over the next few weeks I’ll be able to speed up that process to leave more time for analysis. After the preparation, I ran individual t-tests comparing the data across the “before” and “after” timeframes, a rough t-test of the averages across the timeframes, and a t-test of the linear regression coefficients. I also ran a cumulative t-test of all downloads\/uploads regardless of MN. Results-wise, very few tests were statistically significant: only a few of the individual t-tests and the cumulative t-test. This is partly because I didn’t run pairwise t-tests (which I will be doing this week), and partly because differences between the MN may have hidden any before\/after differences. I will account for both in a repeated-measures analysis of variance model this week.<\/p>\n
This coming week will include the analysis mentioned above, as well as a new analysis with our control group of repositories that haven’t yet joined DataONE, to test whether factors other than DataONE may be influencing upload and download rates.<\/p>\n","protected":false},"excerpt":{"rendered":"
This past week was spent solely on the statistical analysis of the user data from 6 member nodes. Due\u00a0to the staggered start of all of the Member Nodes, we had to restrict our sample size from all thirty MN to just 6 that fit our sampling criteria – the criteria Continue reading Week Four: Further Analyses and Results<\/span>