Abstract
The use of the Internet and social media has grown exponentially worldwide over the last few years, giving people the opportunity to interact with one another and share ideas, thoughts, and opinions. Large volumes of data are shared every day through social media platforms at enormous speed, reaching an incredibly large number of people. Additionally, the possibility of writing anonymous posts and comments makes it even easier to express and spread hate speech. Social media platforms, in order to improve the experience of their users, are trying to eliminate comments expressing hatred.
In this thesis, we mainly focus on developing automated techniques to identify hate speech. We start with traditional techniques used for hate-speech detection, followed by advanced techniques for identifying hate speech on social media. Lastly, we identify cases where these automated systems can fail miserably and propose techniques to make them robust. Traditionally, various social media platforms, such as Wikipedia, Facebook, and YouTube, employ hundreds of staff members as part of their human review teams, who manually read every reported post and decide whether it is inappropriate for users. Nowadays, having administrators detect which comments are offensive, or relying on user reports, is ineffective, not only because of the large scale of data produced through social media but also because hatred has been on the rise in modern societies in recent years. As of 2018, about 317,000 status updates were posted on Facebook every 60 seconds, making the task harder than finding a needle in a haystack every 60 seconds!
We commence our research with the problem of hate-speech detection on social media. Popular techniques such as handcrafted feature-based machine learning and rule-based approaches have their own disadvantages. The former is hard to scale and generalize, while the latter fails to provide good results, making it unreliable for use in production-level systems. Motivated by improvements in computational resources and the availability of large annotated datasets, we work towards building deep learning based approaches for hate-speech detection. We propose composite models that combine the capability of deep models to generate concise representations with the power of traditional machine learning classifiers.
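To make the idea concrete, the sketch below shows one minimal form such a composite model could take; the encoder, its dimensions, and the toy data are illustrative assumptions, not the exact architecture developed in the thesis.

```python
# Minimal sketch of a composite model: a small LSTM encoder produces
# fixed-size comment representations, which are then fed to a traditional
# classifier (here, sklearn's GradientBoostingClassifier).
import torch
import torch.nn as nn
from sklearn.ensemble import GradientBoostingClassifier

class CommentEncoder(nn.Module):
    """Maps a batch of token-id sequences to fixed-size vectors."""
    def __init__(self, vocab_size=10000, embed_dim=64, hidden_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):
        _, (h_n, _) = self.lstm(self.embed(token_ids))
        return h_n[-1]  # final hidden state as the comment representation

# Hypothetical toy data: 8 tokenized comments of length 12, binary labels.
token_ids = torch.randint(1, 10000, (8, 12))
labels = [0, 1, 0, 1, 1, 0, 0, 1]

encoder = CommentEncoder()
with torch.no_grad():
    features = encoder(token_ids).numpy()  # concise deep representations

clf = GradientBoostingClassifier().fit(features, labels)
print(clf.predict(features[:2]))
```

The division of labor is the point: the deep encoder handles representation learning, while the traditional classifier handles the final decision.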
Every word has at least one meaning when it stands alone, but that meaning can change depending on context, or even over time. A sentence full of neutral words can be hostile (“Only whites should have rights”), and a sentence packed with potentially hostile words (“F**k what, f**k whatever y’all been wearing”) can be neutral once you recognize it as a Kanye West lyric. Motivated by this idea, we then move towards building techniques that utilize contextual information instead of relying on the content alone to identify hate speech. We propose a context-based neural network architecture that models the relationship between the content and its context for identifying hateful text. Modeling the relationship between the content and the neighboring text (known as context) helps resolve ambiguity in cases where it is very difficult to determine from the content alone whether the text is abusive.
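A hedged sketch of such a context-aware classifier is given below; the two-encoder design, the choice of LSTMs, and the toy inputs are assumptions for illustration rather than the thesis's exact model.

```python
# Illustrative context-aware classifier: separate encoders for the comment
# and its surrounding text (e.g., the parent comment or thread), whose
# representations are concatenated before the final hate/neutral decision.
import torch
import torch.nn as nn

class ContextualHateClassifier(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=64, hidden_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.content_enc = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.context_enc = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(2 * hidden_dim, 2)  # joint content+context decision

    def forward(self, content_ids, context_ids):
        _, (hc, _) = self.content_enc(self.embed(content_ids))
        _, (hx, _) = self.context_enc(self.embed(context_ids))
        joint = torch.cat([hc[-1], hx[-1]], dim=-1)  # relate content to context
        return self.head(joint)

model = ContextualHateClassifier()
content = torch.randint(1, 10000, (4, 12))  # hypothetical tokenized comments
context = torch.randint(1, 10000, (4, 30))  # hypothetical neighboring text
print(model(content, context).shape)  # torch.Size([4, 2])
```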
Finally, we identify issues with automated hate-speech detection systems and propose novel techniques to build robust models. Using social media data to train automated systems poses the risk of the systems learning the biases that are abundant on social media. Almost every forum is biased in some form, whether towards women, particular genders, races, right-wing groups, gays and lesbians, or religions. Learning powerful systems that shape the user experience requires the system to be fair towards every religion, group, and gender, yet artificial intelligence cannot yet detect or understand these linguistic nuances. We propose techniques that identify one type of bias, stereotypical bias, in an arbitrary model, and then propose techniques for learning unbiased artificially intelligent systems from observations of biased social media.
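As an illustration of what probing an arbitrary model for stereotypical bias might look like, consider the minimal sketch below; the scoring function, template, and identity terms are hypothetical and stand in for whatever model and probe set one actually uses.

```python
# Minimal bias probe: score templated sentences that differ only in the
# identity term, and flag large score gaps as evidence of stereotypical bias.
from statistics import mean

def identity_score_gaps(score_fn, template, identity_terms):
    """score_fn maps text -> toxicity probability in [0, 1] (any model)."""
    scores = {t: score_fn(template.format(identity=t)) for t in identity_terms}
    avg = mean(scores.values())
    return {t: s - avg for t, s in scores.items()}  # per-identity deviation

# Hypothetical usage with any trained model's scoring function:
# gaps = identity_score_gaps(model.score_text,
#                            "I am a {identity} person.",
#                            ["christian", "muslim", "jewish", "atheist"])
# Identities with large positive deviations are scored as more toxic for
# otherwise-neutral text, signaling stereotypical bias.
```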
We carry out extensive experiments to establish the effectiveness of our methods. We also perform experiments on a real-world system, the Perspective API, and show the effectiveness of our approach in identifying bias in deployed systems.
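As a concrete example of probing such a deployed system, the sketch below queries Perspective's comments:analyze endpoint; the request fields follow the API's public documentation, and the API key is a placeholder.

```python
# Query the Perspective API for a toxicity score; this scoring function can
# be plugged into the bias probe sketched earlier, e.g.
# identity_score_gaps(lambda t: perspective_toxicity(t, API_KEY), ...).
import requests

def perspective_toxicity(text, api_key):
    url = ("https://commentanalyzer.googleapis.com/v1alpha1/"
           f"comments:analyze?key={api_key}")
    payload = {
        "comment": {"text": text},
        "languages": ["en"],
        "requestedAttributes": {"TOXICITY": {}},
    }
    resp = requests.post(url, json=payload)
    resp.raise_for_status()
    return resp.json()["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
```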