“Machine-aided Back-of-the-book Indexing” is a tool that aids in preparing the index found at the back of a book. The main idea behind the system is to save the time and money involved in publishing a book by automating some of the manual tasks.
The system takes a document file as input and processes it to extract words along with information such as term frequency, document frequency, and corpus frequency. Custom data structures were created to organize the gathered information efficiently. The words are then fed to the matrix processor, which evaluates their weights and filters them using information retrieval techniques such as term frequency-inverse document frequency (TF-IDF) and singular value decomposition (SVD).
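The weighting step can be illustrated with a minimal TF-IDF sketch. It is not the system's actual matrix processor: it assumes each page is treated as one document, uses plain whitespace tokenization, and omits the singular value decomposition step. The function name `tfidf_weights` and the sample pages are hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(pages):
    """Return one {term: weight} dict per page, using tf * log(N / df).

    Assumption: each page is one "document"; tokens are split on whitespace.
    """
    n_pages = len(pages)
    # Raw term counts for every page.
    term_counts = [Counter(page.lower().split()) for page in pages]

    # Document frequency: the number of pages containing each term.
    doc_freq = Counter()
    for counts in term_counts:
        doc_freq.update(counts.keys())

    weights = []
    for counts in term_counts:
        total = sum(counts.values())
        weights.append({
            term: (count / total) * math.log(n_pages / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sample input: three short "pages".
pages = [
    "the index lists the keywords",
    "the matrix stores term weights",
    "keywords map to page numbers",
]
weights = tfidf_weights(pages)
```

In this sample, "index" occurs on only one page, so it receives a higher weight on the first page than "the", which occurs more often but on two of the three pages. Low-weight terms are the natural candidates for filtering.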
The output of the system is a structured list of keywords, each with the page numbers on which it appears. The system was evaluated against existing indexing tools, and the results supported our approach.
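The shape of such an output can be sketched as a mapping from each keyword to the sorted page numbers on which it occurs. This is an illustrative sketch only: the names `build_index` and `format_index`, the punctuation handling, and the assumption that the keyword list has already been selected are not taken from the system itself.

```python
from collections import defaultdict


def build_index(pages, keywords):
    """Map each keyword to the sorted list of 1-based page numbers where it occurs."""
    wanted = {keyword.lower() for keyword in keywords}
    found = defaultdict(set)
    for page_number, text in enumerate(pages, start=1):
        for word in text.lower().split():
            # Strip common punctuation so "keywords." matches "keywords".
            word = word.strip(".,;:!?()\"'")
            if word in wanted:
                found[word].add(page_number)
    # Alphabetical order, as in a printed index.
    return {term: sorted(numbers) for term, numbers in sorted(found.items())}


def format_index(index):
    """Render the index as one 'term, p1, p2' line per entry."""
    return "\n".join(
        f"{term}, {', '.join(str(n) for n in numbers)}"
        for term, numbers in index.items()
    )


# Hypothetical sample input.
sample_pages = [
    "Indexing saves time.",
    "An index lists keywords.",
    "Keywords and indexing.",
]
sample_index = build_index(sample_pages, ["indexing", "keywords"])
```

Calling `format_index(sample_index)` then yields one line per entry, with the keyword followed by its page numbers.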
The paper won the best paper award at the National Students Conference on Information Technology (NaSCoIT) 2013.