Yandex Cloud
Search
Contact UsGet started
  • Blog
  • Pricing
  • Documentation
  • All Services
  • System Status
    • Featured
    • Infrastructure & Network
    • Data Platform
    • Containers
    • Developer tools
    • Serverless
    • Security
    • Monitoring & Resources
    • ML & AI
    • Business tools
  • All Solutions
    • By industry
    • By use case
    • Economics and Pricing
    • Security
    • Technical Support
    • Customer Stories
    • Cloud credits to scale your IT product
    • Gateway to Russia
    • Cloud for Startups
    • Education and Science
    • Yandex Cloud Partner program
  • Blog
  • Pricing
  • Documentation
© 2025 Direct Cursus Technology L.L.C.
Yandex Managed Service for PostgreSQL
  • Getting started
    • All guides
      • Managing extensions
      • pg_cron
      • pg_repack
      • pgaudit
      • pgcrypto
      • postgresql_anonymizer
      • Hunspell dictionaries for full-text search
  • Access management
  • Terraform reference
  • Monitoring metrics
  • Audit Trails events
  • Public materials
  • Release notes

In this article:

  • Adding a dictionary
  • Examples
  • Adding an English dictionary
  1. Step-by-step guides
  2. PostgreSQL extensions and dictionaries
  3. Hunspell dictionaries for full-text search

Hunspell dictionaries for full-text search

Written by
Yandex Cloud
Updated at May 15, 2024
  • Adding a dictionary
  • Examples
    • Adding an English dictionary

Dictionaries are used to configure full-text search in documents stored in PostgreSQL. Managed Service for PostgreSQL clusters have pre-installed Hunspell spell checker dictionaries supporting multiple languages.

Below, we provide instructions on how to add an English dictionary, as an example.

Adding a dictionaryAdding a dictionary

  1. Connect to the database using psql.

  2. To find out which languages are available, get a list of preset configurations for full-text search:

    SELECT cfgname FROM pg_catalog.pg_ts_config;
    

    The result includes the configuration names represented as language names.

  3. Create the public.my_config configuration for full-text search:

    CREATE TEXT SEARCH CONFIGURATION public.my_config ( COPY = pg_catalog.<configuration> );
    

    In your SQL query, specify the language configuration obtained in the previous step.

  4. Create a dictionary named my_dictionary in the DB:

    CREATE TEXT SEARCH DICTIONARY my_dictionary (
       TEMPLATE = ispell,
       DictFile = <words_in_dictionary>,
       AffFile = <affixes>,
       Stopwords = <stopwords>
    );
    

    SQL query properties:

    • TEMPLATE: Template used to create a dictionary. You can learn more about Ispell dictionaries here.
    • DictFile: Preset file with words organized in a dictionary.
    • AffFile: Preset file with affixes (prefixes, suffixes, and endings) you can add to words in the dictionary.
    • Stopwords: Stopwords to disregard in full-text searches. Such words may include articles, prepositions and interjections.

    Files with dictionaries and affixes preset in Managed Service for PostgreSQL clusters:

    Language DictFile dictionary AffFile affixes
    English en_gb.dict en_GB.affix
    Danish da_dk.dict da_DK.affix
    Spanish es_es.dict es_ES.affix
    Italian it_it.dict it_IT.affix
    German de_de_frami.dict de_de_frami.affix
    Polish pl_pl.dict pl_PL.affix
    Russian ru_ru.dict ru_RU.affix
    Ukrainian uk_ua.dict uk_UA.affix
    Czech cs_cz.dict cs_CZ.affix

    There are also examples of PostgreSQL dictionaries:

    DictFile dictionary AffFile affixes
    hunspell_sample_long.dict hunspell_sample_long.affix
    hunspell_sample_num.dict hunspell_sample_num.affix
    ispell_sample.dict ispell_sample.affix

    In the CREATE TEXT SEARCH DICTIONARY SQL query, do not use the .dict and .affix extensions in the file names.

  5. Link the dictionary named my_dictionary and other dictionaries with the word token type.

    Token is a word or phrase being searched for. It is specified in the search query and displayed in the full-text search results.

    ALTER TEXT SEARCH CONFIGURATION public.my_config
       ALTER MAPPING FOR word
       WITH my_dictionary,<list of_dictionaries>;
    

    In the WITH row, specify the publicly available dictionaries with a broader range of words, e.g., english_ispell or english_stem. The broader the dictionary, the further down it should be listed in the WITH row.

  6. Set public.my_config as the default configuration:

    SET default_text_search_config = 'public.my_config';
    
  7. Make sure the default configuration is set to public.my_config:

    SHOW default_text_search_config;
    

    Result:

     default_text_search_config
    ----------------------------
     public.my_config
    (1 row)
    
  8. Make sure the full-text search works:

    SELECT * FROM ts_debug('<token>');
    

    Specify a word that can be found in the DB documents as the token.

    Result:

     alias |    description    |  token  |       dictionaries       |           dictionary           +   lexemes
    -------+-------------------+---------+--------------------------+--------------------------------+------------
     word  | Word, all letters | <token> | {<used_dictionaries>} | <dictionary_with_found_lexeme> | {<lexemes>}
    (1 row)
    

    Here, a lexeme is a word that replaces cognate words in full-text searches. For example, if a DB document contains such words as cloud, clouds, and cloud's, PostgreSQL can recognize them as a single lexeme, "cloud".

ExamplesExamples

Adding an English dictionaryAdding an English dictionary

  1. Connect to the database using psql.

  2. Create the public.my_english_config configuration for full-text search:

    CREATE TEXT SEARCH CONFIGURATION public.my_english_config ( COPY = pg_catalog.english );
    
  3. Create a dictionary in the DB:

    CREATE TEXT SEARCH DICTIONARY english_hunspell (
       TEMPLATE = ispell,
       DictFile = en_gb,
       AffFile = en_GB,
       Stopwords = english
    );
    
  4. Link dictionaries named english_hunspell and english_stem with the word token type:

    ALTER TEXT SEARCH CONFIGURATION public.my_english_config
       ALTER MAPPING FOR word
       WITH english_hunspell, english_stem;
    
  5. Set public.my_english_config as the default configuration:

    SET default_text_search_config = 'public.my_english_config';
    
  6. Make sure the default configuration is set to public.my_english_config:

    SHOW default_text_search_config;
    

    Result:

     default_text_search_config
    ----------------------------
     public.my_english_config
    (1 row)
    
  7. Make sure the full-text search works:

    SELECT * FROM ts_debug('<token>');
    

    Specify a word that can be found in the DB documents as the token.

    Result:

     alias |    description    |  token  |          dictionaries           |    dictionary    +   lexemes
    -------+-------------------+---------+---------------------------------+------------------+------------
     word  | Word, all letters | <token> | {english_hunspell,english_stem} | english_hunspell | {<lexemes>}
    (1 row)
    

Was the article helpful?

Previous
postgresql_anonymizer
Next
Viewing cluster logs
© 2025 Direct Cursus Technology L.L.C.