Some Useful Full Text Index Stoplist Related Queries

Stoplists are used in SQL Server Full Text Indexes to exclude common “noise” words (such as to, the, etc.) from filling up the index with information that is not very useful for Full Text Searches. By default, a new Full Text Index in SQL Server 2008 will use the system stoplist, which has about 15K different words in it.

You can create your own stoplist, either from scratch or by copying the system stoplist into it. Then you can modify your stoplist as you see fit, by adding or removing words. You can then change a Full Text Index to use your custom stoplist. You will have to repopulate the full text index for the new stoplist to take full effect.

At any rate, here are some useful queries if you need to manipulate your stoplist(s).

-- Some Useful FT Related Stoplist Queries
-- Glenn Berry    7/21/2009

-- Get base table names, FTCatalogNames and change tracking type
SELECT t.name AS [TableName], c.name AS [FTCatalogName], c.fulltext_catalog_id, 
       i.change_tracking_state_desc
FROM sys.tables AS t 
INNER JOIN sys.fulltext_indexes AS i 
ON t.object_id = i.object_id 
INNER JOIN sys.fulltext_catalogs AS c 
ON i.fulltext_catalog_id = c.fulltext_catalog_id;

-- Get all the non-system stoplists
SELECT stoplist_id, name 
FROM sys.fulltext_stoplists;

-- Create a stoplist from the system stoplist
CREATE FULLTEXT STOPLIST NewsGatorSL 
FROM SYSTEM STOPLIST; 

-- Look for a term in the stoplist
SELECT stoplist_id, stopword, [language], language_id 
FROM sys.fulltext_stopwords 
WHERE stopword = 're'

-- Drop a stoplist
DROP FULLTEXT STOPLIST TestSL;

-- Drop a term from a stoplist
ALTER FULLTEXT STOPLIST NewsGatorSL 
DROP 're' LANGUAGE 1033; 

-- Change the stoplist for a fulltext index
ALTER FULLTEXT INDEX ON CurrentPostFullTextMonday  -- This is base table name
SET STOPLIST NewsGatorSL

-- Repopulate FT Index after changing stoplist
ALTER FULLTEXT INDEX ON dbo.CurrentPostFullTextMonday START UPDATE POPULATION
GO

-- Get document counts for a display term
SELECT display_term, column_id, document_count 
FROM sys.dm_fts_index_keywords(DB_ID('ngfulltext1'), OBJECT_ID('CurrentPostFullTextMonday'))
WHERE display_term = 'Re'

 

Technorati Tags:

This entry was posted in SQL Server 2008. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s