Abstract
Legal contracts are formal agreements that create legally binding obligations and rights, typically between two parties involved in a transaction. Legal clauses are the fundamental units of discourse in a contract: individual provisions that set out specific terms or conditions of the agreement and that collectively make up the entire contract. Contracts are typically drafted by lawyers or other legal professionals with expertise in contract law, through a process designed to capture the interests of both parties while ensuring the legal correctness of the content. The domain-specific content of legal contracts is an interesting area for the application of NLP techniques, not only because of the peculiar nature of legalese, but also because of the challenges of processing such long documents. Compared to general-domain language such as news, encyclopedic content, stories, and social media, or to other domain-specific verticals such as scientific articles and judicial proceedings, the domain of legal contracts has seen much less research on the application of deep learning techniques in NLP, with most of it focusing on contract understanding and review. The study of generative methods in this domain, however, has been severely lacking. This thesis aims to establish a stepping stone towards the AI-aided generation of legal content by presenting a study of two generation problems. We also focus on user customizability, so that legal content can be easily tailored to the parties involved.
The thesis begins with a simple exploratory study focused on a small set of rental agreements. In this pilot study, we built and evaluated an agreement drafting tool that matches informal user intents to rental agreement clauses. Observations from this study highlighted the challenges involved in generating legal content, and the need for explicit finetuning in the face of scarce supervised data while leveraging large unlabeled corpora for content generation. Following this, we study and extend prior work on contract drafting, namely the recommendation of legal clauses. Here, we consider the addition of a new clause to an existing, incomplete contract draft, and evaluate several strategies to study the effect of different informative signals on clause recommendation. While we model the contexts for recommendation using mean-based pooling of representations from BERT-based architectures, we also explore better context modeling with long-range transformers and present the difficulties involved.
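As a minimal sketch of what mean-based pooling of BERT representations could look like in practice (the model checkpoint and the embed_context helper below are illustrative assumptions, not the exact setup used in this work):

    import torch
    from transformers import AutoTokenizer, AutoModel

    # Model choice and helper name are illustrative assumptions.
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    def embed_context(clauses):
        """Represent a partial contract draft by mean-pooling clause embeddings."""
        inputs = tokenizer(clauses, padding=True, truncation=True,
                           return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state  # (n_clauses, seq_len, 768)
        # Mean-pool over tokens, ignoring padding positions.
        mask = inputs["attention_mask"].unsqueeze(-1)
        clause_vecs = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
        # Average the clause vectors into a single context representation.
        return clause_vecs.mean(dim=0)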
In another major contribution of this thesis, we devise a pipeline for generating legal clauses from minimal, keyword-level information. We take inspiration from the content planning for story generation
paradigm, and study the application of coherence-based techniques in devising a content plan for the generation of legal clauses. Our approach centers on generating a legal clause given the topic (or type) of the clause and a few keywords for customization. The proposed pipeline feeds the topic and keyword input into a lightweight graph-based mechanism that creates a content plan to act as an outline for the clause to be generated. The content plan is then consumed by a transformer-based generative model, which produces an appropriate legal clause by interpolating through the content plan keywords.
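A minimal sketch of this plan-conditioned generation step, assuming a sequence-to-sequence model finetuned on clause data and an illustrative linearization format (both are assumptions, not the exact setup of this thesis):

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    # The base checkpoint and the "topic ... | plan ..." format are assumptions;
    # a model finetuned on legal clauses would be needed for meaningful output.
    tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
    model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

    def generate_clause(topic, plan_keywords):
        """Generate a clause conditioned on the topic and a linearized plan."""
        prompt = f"topic: {topic} | plan: {' ; '.join(plan_keywords)}"
        inputs = tokenizer(prompt, return_tensors="pt")
        output_ids = model.generate(**inputs, max_length=256, num_beams=4)
        return tokenizer.decode(output_ids[0], skip_special_tokens=True)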
We propose an ordered content plan consisting of clause keywords ranked by their genericness, with keywords more generic to the topic ranked higher. We show the benefit of this ordering over a natural, sequential order, and compare ordered plan-based generation against baselines involving prompt-based generation and generation from unordered content plans. We also demonstrate the robustness of our approach across a broad range of topics. Since our techniques are based entirely on pretrained transformer architectures, we additionally contribute a short chapter on the effectiveness of these architectures on four other tasks.
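To make the genericness-based ordering concrete, the following sketch ranks plan keywords by how often they occur across clauses of a topic, using topic-level frequency as a stand-in genericness score (the actual graph-based mechanism of the thesis may differ):

    from collections import Counter

    def order_content_plan(keywords, topic_clauses):
        """Rank keywords so that terms more generic to the topic come first."""
        freq = Counter()
        for clause in topic_clauses:
            tokens = set(clause.lower().split())
            for kw in keywords:
                if kw.lower() in tokens:
                    freq[kw] += 1
        # Higher topic-level frequency is treated as higher genericness.
        return sorted(keywords, key=lambda kw: freq[kw], reverse=True)

    clauses = [
        "each party shall indemnify the other party against losses",
        "the party shall not be liable for losses arising from negligence",
    ]
    print(order_content_plan(["negligence", "party", "losses"], clauses))
    # -> ['party', 'losses', 'negligence']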