Abstract
Bug report filing is a major part of software maintenance. Because of the large number of bugs filed every day in large software projects and the asynchronous nature of the bug report filing ecosystem, duplicate bug reports are filed. Detecting and tagging duplicate bug reports is crucial to avoid assigning the same bug to different developers. Past efforts to detect duplicate bug reports have used topic modelling [2], discriminative methods [5], and meta-attributes [6], among others. Recently, Yang et al. [8] proposed an approach that combines word embeddings, TF-IDF, and meta-attributes to compute the similarity between two bug reports.