{"id":2618,"date":"2015-06-26T19:10:38","date_gmt":"2015-06-26T19:10:38","guid":{"rendered":"https:\/\/notebooks.dataone.org\/?p=2618"},"modified":"2015-07-03T19:14:59","modified_gmt":"2015-07-03T19:14:59","slug":"week-five-yz","status":"publish","type":"post","link":"https:\/\/notebooks.dataone.org\/dataone-net\/week-five-yz\/","title":{"rendered":"Week Five"},"content":{"rendered":"
Last week I produced the initial visualization, but the integrated data seemed to contain some mistakes. One large cluster of four webinar followers was separated from the big central cluster, which would mean that none of the followers in that webinar cluster followed any other SNS account. However, it was confirmed that some users in the webinar cluster do follow other SNS accounts, yet they were not listed under those accounts. Therefore, my priority for this week was to find out what happened, generate new data, re-integrate the input data, and generate a new visualization.<\/p>\n
After going through the data several times, I found that the names of the subjects in the webinar data were reversed: for those subjects I had used “Last Name-First Name” as input. That is why they were not connected to the other SNS networks. I also had not used emails as identifiers, because some of the input data doesn\u2019t include them. However, I should have included them, which would make the result more accurate. Therefore, I wrote code to put the input names in the right order and to include email as an identifier when it is available. After that, I re-integrated the data, fed it into Gephi, and found that all clusters were now connected with each other. I adjusted the label size and checked some metrics of the new graph. Now the visualization looks more reasonable.<\/p>\n
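<p>The name fix and the email identifier described above could be sketched roughly as follows. This is only an illustration, not the code actually used: the function names, the record layout (account, name, email, last-name-first flag), and the assumption that a reversed name is whitespace-separated with the family name as its first token are all hypothetical.<\/p>\n
```python
def clean_name(name, last_name_first=False):
    # Lowercase the name and collapse extra whitespace.
    parts = name.lower().split()
    if last_name_first:
        # Hypothetical rule: the first token is the family name, so move it to the end.
        parts = parts[1:] + parts[:1]
    return ' '.join(parts)


def integrate(records):
    # records: list of (account, name, email, last_name_first) tuples (assumed layout).
    # First pass: remember the email of every cleaned name that has one.
    email_by_name = {}
    for account, name, email, last_name_first in records:
        if email:
            email_by_name[clean_name(name, last_name_first)] = email.strip().lower()

    # Second pass: key each subject by email when one is known, else by cleaned name.
    follows = {}
    for account, name, email, last_name_first in records:
        name_key = clean_name(name, last_name_first)
        if email:
            key = email.strip().lower()
        else:
            key = email_by_name.get(name_key, name_key)
        follows.setdefault(key, set()).add(account)
    return follows
```
<p>With a rule like this, a webinar record stored as Smith Jane and a record from another account stored as Jane Smith with an email end up under the same key, so the subject links the two accounts in the graph instead of appearing twice.<\/p>\n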
Also, it would be very nice to have a live version of the visualization. Unfortunately, I found that the only ways to show an animated Gephi graph are to open the file in Gephi or to record a screencast. Therefore, I plan to use a screencast to show the network connections once I have the final version of the visualization.<\/p>\n
Another task is to identify the most active subjects. I calculated the number of SNS accounts followed by each user, displayed the descriptive statistics, and printed the results. Although many subjects follow only one SNS account, some follow more than one. However, it will take some effort to extract all their information, since much of it has to be typed by hand. So I plan to discuss with my mentor what kind of information we want to extract and for how many subjects.<\/p>\n","protected":false},"excerpt":{"rendered":"
Last week I produced the initial visualization, but the integrated data seemed to contain some mistakes. One large cluster of four webinar followers was separated from the big central cluster, which would mean that none of the followers in that webinar cluster followed any other SNS account. However, it was confirmed that some users in the Continue reading Week Five<\/span>