Gathering word statistics

Full-text search can handle a lot of data. To give end users more insight into their texts, PostgreSQL offers the ts_stat function, which takes a query returning a single tsvector column and returns one row of statistics for each distinct word found:

SELECT * FROM ts_stat('SELECT to_tsvector(''english'', comment) FROM pg_available_extensions') ORDER BY 2 DESC LIMIT 3;
   word   | ndoc | nentry 
----------+------+--------
 function |   10 |     10
 data     |   10 |     10
 type     |    7 |      7
(3 rows)

The word column contains the stemmed word, ndoc is the number of documents in which the word occurs, and nentry is the total number of times the word was found across all documents.
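
The difference between ndoc and nentry is easier to see when a word occurs more than once in the same document. The following is a minimal sketch, not taken from the book: the three sample sentences and the alias t(doc) are made up for illustration, and dollar quoting is used only to avoid doubling the single quotes inside the inner query.

```sql
-- Hypothetical sample data: three short "documents" supplied inline.
-- ts_stat expects a query that returns exactly one tsvector column.
SELECT word, ndoc, nentry
FROM ts_stat($$
    SELECT to_tsvector('english', doc)
    FROM (VALUES ('data types and more data'),
                 ('a function returning data'),
                 ('type casting')) AS t(doc)
$$)
ORDER BY nentry DESC, ndoc DESC, word;
```

With this input, the word data occurs twice in the first sentence and once in the second, so it should be reported with ndoc = 2 and nentry = 3. The exact set of other words depends on the stemmer and stop-word list of the english configuration.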
