Week Three

This week was full of fun. I started collecting different kinds of data and had the chance to learn from previous SNS analyses. SNS analysis is complex but interesting, which makes it very appealing to me.
The biggest problem I had was integrating people’s information from different social networks. Before collecting data, I thought email addresses would be a good way to identify subjects across networks. However, for privacy and safety reasons, some social networks, such as Facebook, don’t allow that information to be extracted. Therefore, I may need to change my approach and consider using multiple identifiers or a “soft” identifier, such as names, to label subjects.
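To make the “soft” identifier idea concrete, here is a minimal sketch of linking profiles by normalized name. It is only an illustration under my own assumptions: the function names (normalize_name, link_by_name), the two sample networks, and the profile ids are all hypothetical, and exact matching on a normalized name will both miss people and produce false matches for common names.

```python
import unicodedata


def normalize_name(name):
    """Lowercase a name, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop combining marks so that "é" compares equal to "e".
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace anything that is not a letter, digit, or space with a space.
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_accents)
    return " ".join(cleaned.lower().split())


def link_by_name(profiles_a, profiles_b):
    """Pair profile ids from two networks whose normalized names are equal."""
    index = {}
    for pid, name in profiles_b.items():
        index.setdefault(normalize_name(name), []).append(pid)
    links = []
    for pid, name in profiles_a.items():
        for other in index.get(normalize_name(name), []):
            links.append((pid, other))
    return links


# Hypothetical sample data: profile id -> display name.
network_a = {"a1": "José  García", "a2": "Ann O'Neil"}
network_b = {"b9": "jose garcia", "b7": "Bob Smith"}

matches = link_by_name(network_a, network_b)
print(matches)
```

Because a name alone is ambiguous, a sketch like this would probably work better combined with a second identifier (for example location or shared contacts), which is the “multiple identifiers” option mentioned above.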
I also learned and tried out Gephi, a powerful tool for SNS analysis and visualization. However, Gephi needs to import not only node data but also edge information, so it will be challenging to create links across different sources if we don’t have a valid identifier. There are still many things I need to learn and investigate. I will keep working with Gephi and think of a way to create a comprehensive network visualization with this tool.
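As a rough sketch of what the node and edge requirement means in practice, the code below writes two CSV tables in the layout Gephi’s spreadsheet import reads: a node table with Id and Label columns and an edge table with Source and Target columns. The helper name write_gephi_tables and the sample people are hypothetical, and treating every edge as undirected is my assumption. Edges that point to an unknown node id are skipped, which is exactly the problem that appears when no shared identifier exists across sources.

```python
import csv
import io


def write_gephi_tables(nodes, edges, node_file, edge_file):
    """Write a node table and an edge table as CSV for Gephi's spreadsheet import."""
    node_writer = csv.writer(node_file)
    node_writer.writerow(["Id", "Label"])
    for node_id, label in nodes:
        node_writer.writerow([node_id, label])

    known_ids = {node_id for node_id, _ in nodes}
    edge_writer = csv.writer(edge_file)
    edge_writer.writerow(["Source", "Target", "Type"])
    for source, target in edges:
        # Skip edges whose endpoints cannot be matched to a known node.
        if source in known_ids and target in known_ids:
            edge_writer.writerow([source, target, "Undirected"])


# Hypothetical sample data; "p3" has no node entry, so its edge is dropped.
nodes = [("p1", "Alice"), ("p2", "Bob")]
edges = [("p1", "p2"), ("p1", "p3")]

node_buf, edge_buf = io.StringIO(), io.StringIO()
write_gephi_tables(nodes, edges, node_buf, edge_buf)
print(node_buf.getvalue())
print(edge_buf.getvalue())
```

For real files, the in-memory buffers would be replaced with open("nodes.csv", "w", newline="") and open("edges.csv", "w", newline="") (file names are placeholders), and the two files imported into Gephi one after the other.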
For next week, data integration is my priority. I will extract the rest of the data and integrate it. Then I will try to synthesize the data by writing some code and experiment with clustering and visualization. It will be very exciting to see the overlap between different social networks.

