-
Osku Salerma. Design of a Full Text Search index for a database management system. Master's thesis, University of Helsinki, January 2006.
Abstract: Full Text Search (FTS) refers to technologies that allow efficient retrieval of relevant documents matching a given search query. Checking every document in a collection against the search query does not scale to large collections, so more efficient methods are needed.
We start by describing the technologies used in FTS implementations, concentrating specifically on inverted index techniques. Then we conduct a survey of six existing FTS implementations, of which three are embedded in database management systems and three are independent systems. Finally, we present our design for how to add FTS index support to the InnoDB database management system. The main difference compared to existing systems is the addition of a memory buffer that caches changes to the index before flushing them to disk, which gives us such benefits as real-time dynamic updates and less fragmentation in on-disk data structures.
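The buffering idea in the abstract can be illustrated with a small sketch. It is not taken from the thesis and does not reflect InnoDB's actual data structures: the class name, the `flush_threshold` parameter, and the dictionary standing in for the on-disk index are all assumptions made for illustration. It shows an inverted index (term to document ids) whose updates go to an in-memory buffer first, stay searchable immediately, and are merged into the "disk" structure in batches.

```python
from collections import defaultdict


class BufferedInvertedIndex:
    """Toy inverted index with an in-memory buffer in front of a 'disk' store.

    Hypothetical illustration only; not the design from the thesis.
    """

    def __init__(self, flush_threshold=4):
        # Stand-in for the on-disk index: term -> sorted list of document ids.
        self.on_disk = {}
        # In-memory buffer of pending additions: term -> set of document ids.
        self.buffer = defaultdict(set)
        # Rough count of postings added since the last flush.
        self.buffered_postings = 0
        # Hypothetical tuning knob: flush once this many postings are buffered.
        self.flush_threshold = flush_threshold

    def add_document(self, doc_id, text):
        # Naive tokenization: lowercase and split on whitespace.
        for term in set(text.lower().split()):
            self.buffer[term].add(doc_id)
            self.buffered_postings += 1
        if self.buffered_postings >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Merge each buffered posting set into the "disk" list in one batch,
        # so each term's list is rewritten once per flush, not once per update.
        for term, doc_ids in self.buffer.items():
            merged = set(self.on_disk.get(term, [])) | doc_ids
            self.on_disk[term] = sorted(merged)
        self.buffer.clear()
        self.buffered_postings = 0

    def search(self, term):
        # Consult both the flushed data and the buffer, so documents are
        # searchable as soon as they are added.
        term = term.lower()
        hits = set(self.on_disk.get(term, [])) | self.buffer.get(term, set())
        return sorted(hits)
```

In this sketch, a document added with `add_document` is returned by `search` right away, even though nothing has been written to `on_disk` until the threshold is reached or `flush` is called explicitly.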