I'm trying to add a Russian dictionary for full-text search in PostgreSQL. I converted the dictionary files to UTF-8:

iconv -f koi8-r -t utf-8 < ru_RU.aff > /usr/local/Cellar/postgresql/9.4.1/share/postgresql/tsearch_data/russian.affix
iconv -f koi8-r -t utf-8 < ru_RU.dic > /usr/local/Cellar/postgresql/9.4.1/share/postgresql/tsearch_data/russian.dict

Then I tried to create a new dictionary:

 CREATE TEXT SEARCH DICTIONARY russian_ispell (
     TEMPLATE = ispell,
     DictFile = russian,
     AffFile = russian,
     StopWords = russian
 );

But I received an error message:

 ERROR: invalid byte sequence for encoding "UTF8": 0xd1
 CONTEXT: line 341 of configuration file "/usr/local/Cellar/postgresql/9.4.1/share/postgresql/tsearch_data/russian.affix": "SFX Y хаться шутся хаться"

How can I fix this error?

Thanks.


    Try https://code.google.com/p/hunspell-ru/. This dictionary is more complete and does not require conversion to UTF-8. PostgreSQL's ispell template also supports the Hunspell format.
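Before swapping dictionaries, it may also be worth ruling out a bad conversion. The sketch below is only a diagnostic, not a fix: `is_valid_utf8` is a hypothetical helper name, and it relies on `iconv` exiting with a non-zero status when it meets an invalid input sequence. If the file passes, the conversion itself is probably not the culprit.

```shell
# Hypothetical helper: succeeds only if the file decodes cleanly as UTF-8.
# Re-encoding UTF-8 to UTF-8 makes iconv fail on any invalid byte sequence.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Example usage (the file name is an assumption; point it at your own file):
#   is_valid_utf8 russian.affix && echo "valid UTF-8" || echo "invalid UTF-8"
```

Run it against both `russian.affix` and `russian.dict` in the `tsearch_data` directory, since either file can trigger the error.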

    • Try to write more detailed answers. What is the basis for your statement? - Nicolas Chabanovsky