Abstract
In natural language, an anaphora is an expression that refers to another expression in its context. The referred expression, called the antecedent, provides the information needed to interpret the anaphora. In Natural Language Processing (NLP), anaphora resolution is the task of identifying the referent of the anaphora. Anaphora resolution is required in various NLP applications such as information
extraction, summarization, and machine translation. While anaphora resolution for English has been studied extensively and various techniques have been proposed for it,
research on anaphora resolution for Indian languages has been very limited. Hence, in this thesis, we aim to develop resources and explore features and techniques for anaphora resolution in Hindi. This thesis makes three main contributions. First, we developed an anaphora-annotated corpus, which is required for experiments in anaphora resolution. Towards the development of this corpus,
we proposed a scheme for anaphora annotation. Our goal with this scheme is to identify
and resolve various consistency issues associated with previous schemes and frameworks that are
used for anaphora and co-reference annotation in English and other languages. Using the proposed
scheme, we annotated anaphoric references over the Hindi Dependency Treebank. Second, we present a
hybrid approach to resolve entity-pronoun references in Hindi. Most existing approaches, syntactic
as well as data-driven, use phrase-structure syntax for anaphora resolution; we instead explore
the utility of dependency structures as a source of syntactic information. In our approach, dependency
structures are used by a rule-based module to resolve simple anaphoric references, while supervised
learning algorithms are used to resolve more ambiguous instances using grammatical and semantic features. Our results show that the use of dependency structures provides syntactic knowledge that helps
to resolve some specific types of references. Semantic information such as animacy and named entity categories further helps to improve resolution accuracy. Finally, we conduct some preliminary experiments in event anaphora and co-reference resolution. As with entity anaphora resolution, we explore the use of dependency structures in a rule-based setting for these tasks. Our experiments show that, even with limited data and simple syntactic and semantic constraints, reasonable resolution accuracy can be achieved for event anaphora and co-reference resolution.