Not every week produces a work of art, and while I tried, my code this week didn’t quite reach the level of poetry.
We’re in the midst of developing the final products for this project, which will consist of three items:
- Reproducible code
- Short summary report
- Network visualizations
Item 1: Reproducible code. An R Markdown document intended for the analyst who will run the network analysis in the future. It comes in six (possibly seven) parts:
- Load a data file that lists all the datasets and contributors in the archive
- Trim the data so that we only keep pairs of datasets connected by contributors
- Create an edge list from the trimmed data
- Make a network from the edge list
- Calculate, store, and report network statistics
- Create and store dataset groupings
- (possible) In-depth statistical analysis
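For the curious, here is a rough sketch of how the first six parts fit together. It is not the project's code: the real document is written in R Markdown, while this uses plain Python purely for illustration. The toy `records` list, the dataset and contributor names, and the choice of connected components as the "groupings" are all assumptions made up for the example.

```python
from collections import defaultdict
from itertools import combinations

# Part 1 (assumed input): one (dataset, contributor) row per contribution.
# These records are hypothetical stand-ins for the archive's data file.
records = [
    ("ds_A", "alice"),
    ("ds_B", "alice"),
    ("ds_B", "bob"),
    ("ds_C", "bob"),
    ("ds_D", "carol"),  # carol links no pair of datasets, so ds_D drops out
    ("ds_E", "dave"),
    ("ds_F", "dave"),
]

# Part 2: trim to contributors who connect at least two datasets.
by_contributor = defaultdict(set)
for dataset, contributor in records:
    by_contributor[contributor].add(dataset)
shared = {c: d for c, d in by_contributor.items() if len(d) > 1}

# Part 3: edge list, one undirected edge per pair of datasets sharing a contributor.
edges = sorted(
    {pair for datasets in shared.values()
     for pair in combinations(sorted(datasets), 2)}
)

# Part 4: network stored as an adjacency mapping.
network = defaultdict(set)
for a, b in edges:
    network[a].add(b)
    network[b].add(a)

# Part 5: a few basic network statistics.
n_nodes = len(network)
n_edges = len(edges)
density = 2 * n_edges / (n_nodes * (n_nodes - 1)) if n_nodes > 1 else 0.0
degree = {node: len(neighbors) for node, neighbors in network.items()}


# Part 6: dataset groupings, taken here to be connected components (an assumption).
def connected_components(graph):
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


groups = connected_components(network)

print(edges)
print(n_nodes, n_edges, density)
print(groups)
```

In the R version, a network package would presumably handle parts 4 through 6; the sketch spells them out by hand only so that every step is visible.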
Item 2: Short summary report. Intended for a non-technical audience, it describes the who, what, when, where, and why of this project. Mostly the what and why, with some vague gesturing at the how (cf. Item 1: Reproducible code, above).
Item 3: Network visualizations. We’ve got quite a few of these already. The trick is to figure out what we want to do for the grand finale. Static or dynamic? One network or several? So many options… Pulling this one together will be the focus of next week’s work.
And that’s about it for this week. I can feel the springs and gears winding down. Because the code I wrote this week is functional rather than lyrical, I’ll leave it to Mike Heaton to tickle your fancy: