Third Year Project: February 2017

Tuesday, 28 February 2017

27th February - 28th February 2017

This week we are working on the analysis phase of our project.
This includes:

Finding the most influential node in the tree of tweets

and seeing if a user's follower count impacts their level of influence
checking if there is a correlation

Finding the most common time tweets are sent

looking at time difference between tweets and replies

Conducting simple sentiment analysis to see if most replies to brand campaigns are positive or negative
Seeing if location has any influence

what countries tweet about a specific brand campaign the most.
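As a rough illustration of the timing items in the list above, here is a minimal sketch that finds the most common hour tweets are sent and the delay between a tweet and its replies. It assumes each tweet is a dict carrying the `id`, `created_at` and `in_reply_to_status_id` fields from the Twitter REST API; the function names and the sample data are made up for the example and are not our actual project code.

```python
from collections import Counter
from datetime import datetime

# Layout of the `created_at` field in Twitter REST API v1.1 payloads.
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value):
    """Turn a Twitter `created_at` string into a timezone-aware datetime."""
    return datetime.strptime(value, TWITTER_TIME_FORMAT)


def most_common_hour(tweets):
    """Return the hour of day (0-23) in which most tweets were sent."""
    hours = Counter(parse_created_at(t["created_at"]).hour for t in tweets)
    return hours.most_common(1)[0][0]


def reply_delays(tweets):
    """Return the seconds between each reply and the tweet it replies to.

    Replies whose parent is not in the batch are skipped.
    """
    by_id = {t["id"]: t for t in tweets}
    delays = []
    for t in tweets:
        parent = by_id.get(t.get("in_reply_to_status_id"))
        if parent is not None:
            delta = parse_created_at(t["created_at"]) - parse_created_at(parent["created_at"])
            delays.append(delta.total_seconds())
    return delays


# Made-up sample batch: one campaign tweet and two replies to it.
sample_tweets = [
    {"id": 1, "created_at": "Mon Feb 27 09:15:00 +0000 2017", "in_reply_to_status_id": None},
    {"id": 2, "created_at": "Mon Feb 27 09:45:30 +0000 2017", "in_reply_to_status_id": 1},
    {"id": 3, "created_at": "Mon Feb 27 14:05:00 +0000 2017", "in_reply_to_status_id": 1},
]

busiest_hour = most_common_hour(sample_tweets)
delays = reply_delays(sample_tweets)
```

The same grouping idea with `Counter` could be reused for the location question, counting a country field instead of the hour, provided the tweets actually carry location data.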

We have divided the tasks above equally between us and aim to have them completed for our meeting with our supervisor on Friday morning.

We will conduct testing and finish up the documentation and video walkthrough next week.

Sunday, 26 February 2017

23rd - 26th February 2017

This week Ina and I met up to discuss our progress. Following on from our last blog post, the database is now set up to receive tweets and the tree can now display tweets as branching nodes on the terminal, which we use for testing.

Plans: Django web application + server configuration
We are using Django, the Python web framework, to write the app interface. We plan to use the following server configuration setup:

the web client <-> the web server <-> the socket <-> uWSGI <-> Python

This is in addition to using Amazon Web Services to host the NGINX web server and uWSGI.
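To make the last link in that chain concrete: uWSGI talks to Python through a WSGI callable. The sketch below is a bare-bones callable written by hand purely for illustration; the response text is invented. In a Django project this callable is not written manually but is normally obtained from `django.core.wsgi.get_wsgi_application()` in the project's `wsgi.py`.

```python
def application(environ, start_response):
    """Minimal WSGI callable: the kind of entry point uWSGI invokes per request."""
    body = b"Hello from the analysis app\n"  # placeholder response body
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # WSGI expects an iterable of byte strings.
    return [body]
```

The web server passes each request over the socket to uWSGI, which calls `application` and sends the returned bytes back along the same path.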

Progress:
Started to implement simple analysis methods on the established tree structure. This includes:

Finding the average number of replies in a tree
Finding the longest reply-chain in the tree
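
A minimal sketch of these two measures, assuming the tree is held as a dict that maps each tweet id to the ids of its direct replies. That representation, the function names and the sample tree are assumptions for the example, not necessarily how our tree is stored. "Average number of replies" is taken here to mean the mean number of direct replies per tweet.

```python
def average_replies(tree):
    """Mean number of direct replies per tweet in the tree."""
    if not tree:
        return 0.0
    return sum(len(replies) for replies in tree.values()) / len(tree)


def longest_chain(tree, root):
    """Number of tweets on the longest path from `root` down to a leaf."""
    replies = tree.get(root, [])
    if not replies:
        return 1
    return 1 + max(longest_chain(tree, reply) for reply in replies)


# Made-up tree: tweet 1 has two replies, and one of them starts a longer chain.
sample_tree = {1: [2, 3], 2: [4], 3: [], 4: [5], 5: []}

avg = average_replies(sample_tree)
chain = longest_chain(sample_tree, 1)
```

The recursive version is fine for small test batches; a very deep reply chain would need an iterative traversal to stay under Python's recursion limit.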

We tried to contact our supervisor but, unfortunately, he was unavailable, so we will schedule another meeting with him next week.

Wednesday, 22 February 2017

18th - 22nd February 2017

While trying to implement the code for tracing retweets through the different users, we ran into a few problems. It turns out that the Twitter API does not hold any data in the Tweet object about which user a particular Tweet is retweeted from. It only links back to the original author of the Tweet, regardless of any intermediary links. This poses a great problem for us, as a large part of the project relies on being able to trace a retweet's journey. Without the links in between to show who a user retweets a Tweet from, building a tree network of retweets seems impossible.
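
The sketch below illustrates the problem with invented data. It assumes retweets arrive as dicts with the `user` and `retweeted_status` fields from the Twitter REST API; the helper name and the accounts are hypothetical. Even if bob actually saw the tweet through alice's retweet, his retweet still points straight at the brand's original, so grouping by source only ever gives a flat, one-level structure.

```python
from collections import defaultdict


def retweeters_by_original(retweets):
    """Group retweeting users under the author their retweet links back to."""
    groups = defaultdict(list)
    for retweet in retweets:
        original = retweet.get("retweeted_status")
        if original is not None:
            author = original["user"]["screen_name"]
            groups[author].append(retweet["user"]["screen_name"])
    return dict(groups)


# Invented example: the embedded original is identical in both retweets.
original_tweet = {"id": 100, "user": {"screen_name": "brand"}}
sample_retweets = [
    {"id": 101, "user": {"screen_name": "alice"}, "retweeted_status": original_tweet},
    # bob may have retweeted via alice, but the payload cannot show that.
    {"id": 102, "user": {"screen_name": "bob"}, "retweeted_status": original_tweet},
]

groups = retweeters_by_original(sample_retweets)
```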

Here are some suggested workarounds:

Analysing retweets based on social network theories such as node centrality etc. instead of the tree-based approach in order to determine which users/retweets have more of an impact on the dispersion of the brand campaign.
Focusing on analysing the retweets in relation to other factors, such as time.
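
As one possible reading of the node centrality idea, here is a minimal sketch of degree centrality in plain Python. It assumes we can obtain some list of links between users (for example from replies or mentions); where those links come from is an open question, and the edge list below is invented.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Degree centrality for an undirected graph given as (user, user) pairs.

    Each score is the number of distinct neighbours divided by (n - 1),
    where n is the number of users in the graph.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-links
            neighbours[a].add(b)
            neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {user: 0.0 for user in neighbours}
    return {user: len(adjacent) / (n - 1) for user, adjacent in neighbours.items()}


# Invented links between four accounts.
sample_edges = [
    ("brand", "alice"),
    ("brand", "bob"),
    ("alice", "carol"),
    ("bob", "carol"),
    ("brand", "carol"),
]

scores = degree_centrality(sample_edges)
```

Users with a score close to 1 are linked to nearly everyone else in the sample, which is one simple way of ranking who matters most to the spread of a campaign.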

In the meantime, we are focusing on tracing replies to the campaign tweets while we wait to run the alternatives, or any other fixes, past our supervisor.
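
Unlike retweets, replies do carry a direct link to the tweet they answer, through the `in_reply_to_status_id` field. A minimal sketch of building and printing a reply tree from that field is shown below; the function names and sample ids are made up, and this is not necessarily how our own tree code works.

```python
from collections import defaultdict


def build_reply_tree(tweets):
    """Return (roots, children) where children maps a tweet id to its reply ids.

    A tweet counts as a root if it is not a reply, or if the tweet it
    replies to is missing from the batch.
    """
    known_ids = {t["id"] for t in tweets}
    children = defaultdict(list)
    roots = []
    for t in tweets:
        parent = t.get("in_reply_to_status_id")
        if parent in known_ids:
            children[parent].append(t["id"])
        else:
            roots.append(t["id"])
    return roots, dict(children)


def render(nodes, children, depth=0):
    """Return indented text lines, one per tweet, for printing on the terminal."""
    lines = []
    for node in nodes:
        lines.append("  " * depth + str(node))
        lines.extend(render(children.get(node, []), children, depth + 1))
    return lines


# Made-up batch: tweet 14 replies to a tweet we never collected.
sample_tweets = [
    {"id": 10, "in_reply_to_status_id": None},
    {"id": 11, "in_reply_to_status_id": 10},
    {"id": 12, "in_reply_to_status_id": 11},
    {"id": 13, "in_reply_to_status_id": 10},
    {"id": 14, "in_reply_to_status_id": 99},
]

roots, children = build_reply_tree(sample_tweets)
print("\n".join(render(roots, children)))
```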

Meanwhile, Rachel is working on the database setup and the functions for passing in structured Tweet data so that we can use it for analysis. We also ran into a few problems in this area: the Tweet structure can get quite convoluted, with nested objects and elements that make it hard for us to read, which in turn makes querying the database a little hard to do. Along with this, we realised that we are unsure whether we have to create an actual MySQL server for the database or whether there is a way around that. We plan to discuss that with our supervisor in our next meeting.
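
One way to tame the nesting before the data reaches the database is to flatten each Tweet into a single level of column-like keys. The sketch below is only an illustration of that idea: the `flatten` helper, the underscore naming scheme and the sample tweet are assumptions, and lists (such as hashtags) are left as they are.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into one dict, joining nested keys with '_'."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, e.g. user -> user_screen_name.
            flat.update(flatten(value, name + "_"))
        else:
            flat[name] = value
    return flat


# Cut-down, made-up tweet with two nested objects.
sample_tweet = {
    "id": 1,
    "text": "hi",
    "user": {"screen_name": "alice", "followers_count": 42},
    "entities": {"hashtags": []},
}

row = flatten(sample_tweet)
```

Each flattened key could then map onto a table column, which keeps queries much simpler than digging through nested objects.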

Tuesday, 14 February 2017

13th-17th February 2017

After falling slightly behind our initial schedule, we met with our supervisor (Ray Walshe) last Friday to discuss how to proceed with our project. After a successful meeting, we managed to kick ourselves into gear and divided up some work to be done individually. As it stands, we have managed to access the Twitter API, gather a number of tweets from the public stream and make these tweets readable using Python's json package. Whilst this was being done, Ina was working off a controlled batch of tweets in order to try and create the basic tree-like structure which will be needed to represent our results. This is just our first blog post to try and get into the habit of keeping up with the progress that we make.
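
For illustration, here is a minimal sketch of the json step, assuming the stream has already been captured as raw text lines with one JSON message per line. The function name and the sample lines are invented; blank lines and messages without a `text` field are simply skipped.

```python
import json


def parse_stream_lines(lines):
    """Decode raw stream lines and keep only the messages that look like tweets."""
    tweets = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            data = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not valid JSON
        if isinstance(data, dict) and "text" in data:
            tweets.append(data)
    return tweets


# Invented raw lines: one tweet, one blank line, one non-tweet message, one broken line.
raw_lines = [
    '{"id": 1, "text": "Loving the new campaign!", "user": {"screen_name": "alice"}}',
    "",
    '{"delete": {"status": {"id": 5}}}',
    '{"id": 2, "text": "cut off',
]

tweets = parse_stream_lines(raw_lines)
```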