Raimon Bosch, Master thesis – Intelligent Interactive Systems, Universitat Pompeu Fabra (2013), Prof. Dr. Leo Wanner
Nowadays, social contacts are vital for finding relevant content. We need to connect with people who share our interests because they provide content that matters to us. It is increasingly clear that future document recommendation systems will need to combine traditional data with data obtained from social networks. For instance, to provide the best content available, we can use sentiment analysis techniques to prioritize content with good reviews. The aim of this project is to offer a better sentiment recognition strategy.
In this master's thesis, we analyze short messages about brands on Twitter and try to classify them as positive or negative using SentiWordNet. After several experiments, we found that a semi-supervised approach could improve the quality of the dictionary and adapt it to a specific domain. In the second part of the project, we go one step further and analyze the relevant content inside those tweets in order to also identify the reason why something is positive or negative. Because tweets lack strong grammatical structure, we had to adopt an approach based on structured N-grams. To that end, we model a new idea called the sentigram, which consists of the aggregation of several N-grams. This approach makes it possible to create models that are highly precise for specific domains while also capturing the relation between aspects and sentiment words.
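As a rough illustration of the dictionary-based classification step described above, the following Python sketch scores a tweet by summing per-word polarity values. It is not the thesis implementation: the tiny `TOY_LEXICON` and its scores are hypothetical stand-ins for SentiWordNet entries, the function names are invented for this example, and the "neutral" fallback for a zero score is an added assumption.

```python
import re

# Hypothetical toy lexicon: word -> polarity score in [-1, 1].
# It stands in for SentiWordNet; the values are illustrative only.
TOY_LEXICON = {
    "love": 0.75,
    "great": 0.625,
    "good": 0.5,
    "slow": -0.375,
    "bad": -0.625,
    "hate": -0.75,
}


def tokenize(tweet):
    """Lowercase the tweet and keep only alphabetic tokens."""
    return re.findall(r"[a-z]+", tweet.lower())


def polarity_score(tweet, lexicon=TOY_LEXICON):
    """Sum the lexicon scores of all tokens; unknown words contribute 0."""
    return sum(lexicon.get(token, 0.0) for token in tokenize(tweet))


def classify(tweet, lexicon=TOY_LEXICON):
    """Label a tweet by the sign of its summed polarity score."""
    score = polarity_score(tweet, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # assumption: no sentiment-bearing word was found


if __name__ == "__main__":
    for text in ["I love my new phone, great battery",
                 "I hate how slow this app is"]:
        print(text, "->", classify(text))
```

Passing a different dictionary through the `lexicon` parameter is one simple way to picture the domain adaptation mentioned above: a semi-supervised step would adjust or extend the word scores, and the classifier itself would stay unchanged.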
Keywords: sentiment analysis, natural language processing, opinion mining, Twitter, n-grams, sentigrams, aspect identification, social networks, machine learning.
Read the entire work at the following link: