While implementing the code that traces retweets from user to user, we ran into a few problems. It turns out that the Tweet object returned by the Twitter API holds no data about which user a particular Tweet was retweeted from. A retweet only links back to the original author of the Tweet, regardless of any intermediary users. This is a serious problem for us, as a large part of the project relies on being able to trace a retweet's journey. Without the intermediate links showing who each user retweeted from, building a tree network of retweets seems impossible.
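To illustrate the limitation, here is a minimal sketch. It assumes the v1.1-style Tweet JSON, where a retweet carries the original Tweet in a `retweeted_status` field; the payload and the `retweet_source` helper are hypothetical and only show the shape of the problem.

```python
# Hypothetical, trimmed-down retweet payload (assumed v1.1-style Tweet JSON).
# Suppose carol saw the Tweet via bob's retweet: bob appears nowhere in it.
sample_retweet = {
    "id": 3,
    "user": {"screen_name": "carol"},
    "retweeted_status": {
        "id": 1,
        "user": {"screen_name": "brand"},
    },
}


def retweet_source(tweet):
    """Return (retweeter, original_author), or None if not a retweet."""
    original = tweet.get("retweeted_status")
    if original is None:
        return None
    return tweet["user"]["screen_name"], original["user"]["screen_name"]


print(retweet_source(sample_retweet))  # ('carol', 'brand')
```

Whoever carol actually retweeted from, the only link we can recover is the one back to the original author.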
Here are some suggested workarounds:
- Analysing retweets using social network measures such as node centrality, instead of the tree-based approach, to determine which users and retweets have the most impact on the spread of the brand campaign.
- Analysing the retweets in relation to other factors, such as when they were posted.
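As a rough sketch of both workarounds, the snippet below computes a simple in-degree centrality (how many distinct users retweeted each author, normalised by the number of other users) and counts retweets per hour. The records, the timestamp format, and both function names are assumptions made for illustration, not the project's actual data or code.

```python
from collections import Counter
from datetime import datetime

# Hypothetical (retweeter, original_author, created_at) records.
retweets = [
    ("alice", "brand", "2020-03-01T10:05:00"),
    ("bob", "brand", "2020-03-01T10:40:00"),
    ("carol", "alice", "2020-03-01T11:15:00"),
    ("dave", "brand", "2020-03-01T11:20:00"),
]


def in_degree_centrality(records):
    """Share of the other users who retweeted each user at least once."""
    users = {u for retweeter, author, _ in records for u in (retweeter, author)}
    retweeters = {}
    for retweeter, author, _ in records:
        retweeters.setdefault(author, set()).add(retweeter)
    denom = max(len(users) - 1, 1)  # avoid dividing by zero for one user
    return {u: len(retweeters.get(u, ())) / denom for u in users}


def retweets_per_hour(records):
    """Count retweets in each hour-long bucket."""
    return Counter(
        datetime.fromisoformat(ts).strftime("%Y-%m-%d %H:00")
        for _, _, ts in records
    )


print(in_degree_centrality(retweets))
print(retweets_per_hour(retweets))
```

Because every retweet points at the original author, this graph only becomes interesting when several authors or campaign Tweets are involved.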
In the meantime, we are focusing on tracing replies to the campaign tweets until we can run these alternatives, or any other fixes, past our supervisor.
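Replies, unlike retweets, do point at the Tweet they respond to, so a tree can be rebuilt from them. The sketch below assumes each Tweet carries an `in_reply_to_status_id` field, as in the v1.1-style Tweet JSON; the sample data is made up, `user` is simplified to a plain string, and the helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical, simplified Tweets: Tweet 1 is the campaign Tweet.
tweets = [
    {"id": 1, "user": "brand", "in_reply_to_status_id": None},
    {"id": 2, "user": "alice", "in_reply_to_status_id": 1},
    {"id": 3, "user": "bob", "in_reply_to_status_id": 2},
    {"id": 4, "user": "carol", "in_reply_to_status_id": 1},
]


def build_reply_tree(tweets):
    """Map each Tweet id to the ids of the Tweets that reply to it."""
    children = defaultdict(list)
    for tweet in tweets:
        parent = tweet["in_reply_to_status_id"]
        if parent is not None:
            children[parent].append(tweet["id"])
    return children


def reply_depths(root_id, children):
    """Walk the tree from the root and record how deep each reply sits."""
    depths = {root_id: 0}
    stack = [root_id]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            depths[child] = depths[node] + 1
            stack.append(child)
    return depths


tree = build_reply_tree(tweets)
print(reply_depths(1, tree))
```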
Rachel is working on the database setup and the functions that pass structured Tweet data into it so that we can use it for analysis. We ran into a few problems here as well: the Tweet structure is heavily nested, with objects inside objects, which makes it hard for us to read and, in turn, makes the database harder to query. We also realised that we are unsure whether we need to set up an actual MySQL server for the database or whether there is a way around that. We plan to discuss this with our supervisor at our next meeting.
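One option we could raise at that meeting is sketched below, under stated assumptions: flatten the nested Tweet into a single level of columns, and store it with Python's built-in `sqlite3` module, which keeps the database in a file (or in memory) and needs no separate server. The `flatten` helper, the sample Tweet, and the table layout are all hypothetical; this is not a decision we have made.

```python
import sqlite3


def flatten(obj, prefix=""):
    """Turn nested dicts into one flat dict, e.g. user.id -> user_id.

    Lists (such as hashtags) are left as-is and would need their own table.
    """
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


# Hypothetical, trimmed-down Tweet.
sample_tweet = {
    "id": 2,
    "text": "Love this campaign",
    "user": {"id": 42, "screen_name": "alice"},
}

row = flatten(sample_tweet)

# An in-memory database for the example; a file path would persist it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tweets ("
    "id INTEGER PRIMARY KEY, text TEXT, "
    "user_id INTEGER, user_screen_name TEXT)"
)
conn.execute(
    "INSERT INTO tweets VALUES (:id, :text, :user_id, :user_screen_name)",
    row,
)
result = conn.execute(
    "SELECT user_screen_name FROM tweets WHERE id = ?", (2,)
).fetchone()
print(result)  # ('alice',)
```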