Abstract
With the advent of Web 2.0, the world has witnessed immense growth in social interaction over the web. Social media allows online users to share their opinions and express their thoughts with others.
Users can post their views in the form of blog posts, tweets, product reviews, etc. With the exponential
growth in the use of social media, a need arose to automatically interpret and analyze this large
volume of data. Researchers have worked on analyzing user sentiments and summarizing users’ views and
opinions.
Online debate sites are social media platforms where users can take a stance and argue in
support of or in opposition to the debate topics. These platforms provide a rich collection of differing opinions on various topics. Online users’ views mainly consist of opinions and facts. Facts are objective, widely known pieces of information that strengthen a user’s stance. Objective information is a reliable means of persuading others. Opinions, on the other hand, express users’ sentiments towards a debate topic, that is, subjective remarks made towards either stance. A user’s post can contain a mixed set of emotions: some sentences might support one topic while others oppose that same topic. A post can contain
positive or negative remarks about both debate topics. Users structure their arguments in such a way that the overall debate post supports their chosen stance.
Given this rich collection of mixed sentiments provided by online users, we carried out
the task of identifying the stance that users support in these online debates. We identify sentence-level
user intentions by assigning the sentiment-bearing words in each sentence to the appropriate debate topic. For this subtask, we devised a domain-independent intention tagging schema. The argument structure of a debate post is then built from the intentions of its individual sentences, and the post’s stance is classified accordingly. An online debate is a multi-party conversation in which users can rebut other users’ opinions. We generate an inter-debate user interaction graph and use a gradient ascent strategy to
improve stance classification accuracy further.
Given the large number of opinions available on the debate topics, these debates need to be summarized
so that users can understand them more easily. Considering the current state of Natural
Language Generation systems, we opted for an extractive summarization approach in which debate
sentences are ranked using features such as sentiment richness, topic-directed sentiment strength, and topic relevance.
Our system significantly outperforms several baseline systems and shows improvements of 5.2% (ROUGE-1),
7.3% (ROUGE-2), and 5.5% (ROUGE-L) over the state-of-the-art opinion summarization
system. The results verify that topic-directed sentiment features are the most important for generating effective
debate summaries.
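As a rough illustration of feature-based extractive ranking, the following minimal sketch scores each sentence with a weighted sum of feature values and keeps the top-scoring sentences in their original order. It is not the system evaluated in this thesis: the names `ScoredSentence`, `WEIGHTS`, `score`, and `summarize`, the linear scoring rule, the weight values, and the assumption that feature scores are precomputed in [0, 1] are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ScoredSentence:
    """A debate sentence with precomputed feature scores (hypothetical container)."""
    text: str
    features: Dict[str, float]  # assumed to lie in [0, 1]; how they are computed is not shown


# Hypothetical weights chosen only for illustration; they are not values from the thesis.
WEIGHTS: Dict[str, float] = {
    "sentiment_richness": 1.0,
    "topic_directed_sentiment": 2.0,
    "topic_relevance": 1.0,
}


def score(sentence: ScoredSentence, weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of feature scores; a missing feature counts as 0."""
    return sum(w * sentence.features.get(name, 0.0) for name, w in weights.items())


def summarize(sentences: List[ScoredSentence], k: int,
              weights: Dict[str, float] = WEIGHTS) -> List[str]:
    """Return the k highest-scoring sentences, restored to their original order."""
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i], weights),
                    reverse=True)
    chosen = sorted(ranked[:k])  # restore document order for readability
    return [sentences[i].text for i in chosen]
```

Under these assumptions, giving the topic-directed sentiment feature a larger weight makes sentences that express sentiment towards a debate topic more likely to be selected than sentences that are merely on topic.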
Major contributions made in this thesis include:
• A novel domain-independent intention tagging schema to mark sentence-level user intentions.
• An argument structure for debate posts that aggregates the contributing sentence-level intentions.
• A gradient ascent approach to improve stance classification based on the inter-debate user interaction
structure.
• A novel debate summarization system for online debates.
Keywords: Topic Directed Sentiment Analysis, Argument Structure, Gradient Ascent approach,
Online debate summarization features