For example, I have a variable containing 'Борщ' or 'Борщ по-польськи'. I need to find a match for it in the Content table and return the rows that were found.

The following data is stored in the title column of the 'Content' table:

     id | title            | description | img
    ----+------------------+-------------+------
      1 | Солодкий борщ    | TEXT        | TEXT
      2 | Борщ по-польськи | TEXT        | TEXT
      3 | Холодний борщ    | TEXT        | TEXT

DB: SQLite3, Python 3.5

Error when installing the APSW library:

    C:\Users\NameUser>pip install apsw
    Collecting apsw
      Using cached apsw-3.9.2-r1.tar.gz
    Installing collected packages: apsw
      Running setup.py install for apsw ... error
        Complete output from command c:\users\NameUser\appdata\local\programs\python\python35-32\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Temp\\pip-build-y1sxrjbt\\apsw\\setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record C:\Temp\pip-jg4_ur_s-record\install-record.txt --single-version-externally-managed --compile:
        c:\users\NameUser\appdata\local\programs\python\python35-32\lib\site-packages\setuptools\dist.py:285: UserWarning: Normalizing '3.9.2-r1' to '3.9.2.post1'
          normalized_version,
        running install
        running build
        running build_ext
        SQLite: Using amalgamation C:\Temp\pip-build-y1sxrjbt\apsw\sqlite3\sqlite3.c
        building 'apsw' extension
        error: Unable to find vcvarsall.bat

        ----------------------------------------
    Command "c:\users\NameUser\appdata\local\programs\python\python35-32\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Temp\\pip-build-y1sxrjbt\\apsw\\setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record C:\Temp\pip-jg4_ur_s-record\install-record.txt --single-version-externally-managed --compile" failed with error code 1 in C:\Temp\pip-build-y1sxrjbt\apsw\
  • You have actually been given the correct answer. To get a more precise one, provide more information: which version of sqlite are you using? Does it have ICU support? How exactly do you work with the database from Python? And finally, show your effort: what have you tried? Show your code. - Alexander Petrov

4 answers

To support case-insensitive search with LIKE for non-ASCII text, you must enable the icu extension. Then the query LIKE '%борщ%' will find strings containing both "борщ" and "Борщ".
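
The difference can be checked from Python. This is only a minimal sketch using the standard-library sqlite3 module (not apsw); the helper name like_matches is made up for illustration. The ASCII case matches with SQLite's default settings, while the Cyrillic result depends on whether the SQLite library behind sqlite3 was built with ICU, so no output is promised for it.

```python
import sqlite3


def like_matches(conn, value, pattern):
    """Return True if `value LIKE pattern` holds in this SQLite build."""
    (matched,) = conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()
    return bool(matched)


conn = sqlite3.connect(":memory:")

# ASCII letters: LIKE ignores case by default.
ascii_result = like_matches(conn, "Cold BORSCHT", "%borscht%")
print("ASCII:", ascii_result)

# Non-ASCII letters: the result depends on ICU support in the SQLite build.
cyrillic_result = like_matches(conn, "Холодний Борщ", "%борщ%")
print("Cyrillic:", cyrillic_result)
```

If the second line prints False, the SQLite build in use has no ICU-aware LIKE.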

To search quickly by whole words, you can use the fts4 extension: MATCH 'борщ'. To support Unicode case folding there as well, you can enable icu when creating the virtual table for fts, for example:

 CREATE VIRTUAL TABLE soup USING fts4(title TEXT,tokenize=icu) 

MATCH 'Холодний борщ' finds rows that contain the specified words in any order, so, for example, "борщ холодний" is found as well.
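
The "any order" behaviour can be tried with the standard-library sqlite3 module. This is a rough sketch, not the setup from this answer: it assumes the SQLite library behind sqlite3 was compiled with FTS4 (it checks and skips otherwise), and it uses the default tokenizer with ASCII titles because tokenize=icu may not be available. The table name soup_demo and both helper names are hypothetical.

```python
import sqlite3


def fts4_available(conn):
    """Return True if this SQLite build can create fts4 virtual tables."""
    try:
        conn.execute("CREATE VIRTUAL TABLE fts_probe USING fts4(x)")
    except sqlite3.OperationalError:
        return False
    conn.execute("DROP TABLE fts_probe")
    return True


def match_titles(conn, query):
    """Return the sorted titles whose text matches the full-text query."""
    rows = conn.execute(
        "SELECT title FROM soup_demo WHERE title MATCH ?", (query,)
    )
    return sorted(title for (title,) in rows)


conn = sqlite3.connect(":memory:")
if fts4_available(conn):
    conn.execute("CREATE VIRTUAL TABLE soup_demo USING fts4(title)")
    conn.executemany(
        "INSERT INTO soup_demo (title) VALUES (?)",
        [("Cold borscht",), ("Borscht, served cold",), ("Sweet borscht",)],
    )
    # Both words must be present, but their order does not matter.
    found = match_titles(conn, "cold borscht")
else:
    found = None  # FTS4 is not compiled into this SQLite build

print(found)
```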


To enable the icu extension, the SQLite build must support extensions in general and must be compiled with the options for the icu extension in particular. A relatively simple way to get an SQLite build in which icu can be enabled is to use the apsw module. Having apsw's own SQLite build does not prevent you from using the system SQLite, even in the same process.
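
To see what a given build was compiled with, one option is to ask SQLite itself through PRAGMA compile_options. The sketch below does this for the SQLite library used by the standard-library sqlite3 module (an assumption on my part: the same pragma should also work through an apsw cursor, but that is not shown or tested here). The helper name compile_options is made up; the pragma reports option names without the SQLITE_ prefix.

```python
import sqlite3


def compile_options(conn):
    """Return the set of compile-time options reported by this SQLite build."""
    return {row[0] for row in conn.execute("PRAGMA compile_options")}


conn = sqlite3.connect(":memory:")
options = compile_options(conn)

print("SQLite version:", sqlite3.sqlite_version)
print("ICU enabled:  ", "ENABLE_ICU" in options)
print("FTS4 enabled: ", "ENABLE_FTS4" in options or "ENABLE_FTS3" in options)
```

An empty set does not prove that an option is missing, because a build can be compiled without these diagnostics.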

Here is a complete code example that uses apsw to search for non-ASCII text in sqlite case-insensitively:

    #!/usr/bin/env python
    from __future__ import unicode_literals, print_function
    import apsw

    connection = apsw.Connection(":memory:")
    cur = connection.cursor()
    cur.execute('CREATE VIRTUAL TABLE soup USING fts4(title TEXT,tokenize=icu)')
    with connection as db:
        db.cursor().executemany('insert into soup(title) values(?)',
                                [['Солодкий борщ'], ['Борщ по-польськи'],
                                 ['по-польськи борщ'], ['Холодний борщ'],
                                 ['Солянка']])

    # NOTE: no Борщ on sqlite3 but it is found with apsw with the enabled icu extension
    # NOTE: 'like' works but it may be slow compared to the fts4 'match'
    for title, in cur.execute('SELECT title FROM soup WHERE title LIKE ?', ['%бор%']):
        print(title)
    print('*' * 60)
    for title, in cur.execute('SELECT title FROM soup WHERE title MATCH ?', ['борщ']):
        print(title)

Result:

    Солодкий борщ
    Борщ по-польськи
    по-польськи борщ
    Холодний борщ
    ************************************************************
    Солодкий борщ
    Борщ по-польськи
    по-польськи борщ
    Холодний борщ

To install apsw from the repository on Ubuntu:

    $ sudo apt-get install libicu-dev icu-devtools # icu-config
    $ git clone https://github.com/rogerbinns/apsw.git
    $ cd apsw
    # NOTE: use virtualenv, to isolate the installation
    $ python setup.py fetch --all build --enable-all-extensions install
    $ python setup.py test

There are prebuilt binaries for different platforms that support some of the extensions.

  • My PyCharm does not see the apsw library, and when I try to install it via pip I get an error. - Kill Noise
  • @Surfer your IDE has nothing to do with it. I should point out that apsw is not compatible with pip, as stated explicitly in the last link in the answer. You can also see that pip is not used in the installation commands listed in the answer. How to install apsw on Windows is a separate issue, so do not mix it with the current one. Start by downloading the binary installers (follow the last link in the answer). If that doesn't work, ask a separate installation question, and don't forget to mention which configuration you are trying to install (which extensions you need). - jfs
  • Error when installing on Windows, with the installer taken from rogerbinns.imtqy.com/apsw/download.html: "Cannot install. Python version 3.5 required, which was not found in the registry." - Kill Noise
  • Why is everything so difficult? Can't it be simple? - Kill Noise
  • @Surfer it means that you have a version without the icu extension. Apparently, if you want the query to also find "Борщ", you need to build the package yourself. The build command for Windows is given in the last link in the answer ( python setup.py fetch ... install ... ). You must have a C compiler installed that can build extensions for Python. Considering that you had difficulties just downloading a file from the Internet, installing a compiler may be too difficult at this stage. - jfs
 SELECT title FROM Content WHERE title LIKE '%Борщ%' 

It is better to have the SQL query return the final result. That runs faster than, say, pulling an intermediate result out of SQL and then processing it yourself.

  • And how does your answer differ from Batanichek's answer? - Alexander Petrov
  • In the advice on how to use it. - hitcode
  • @AlexanderPetrov the letter case is also different. As follows from my answer, by default '%Борщ%' and '%борщ%' may give different results. - jfs
  • By the way, yes, it is case-sensitive. - hitcode
 select title from content where lower(title) like lower('%Борщ%') 
  • Without ICU support this does not work. - Alexander Petrov
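
If ICU is not available and the table is small, one possible workaround (not taken from any of the answers here) is to register a Python function with the standard-library sqlite3 module and let Python do the Unicode case folding. This is only a sketch: the SQL function name casefold and the helper find_titles are made up for illustration, the table is reduced to id and title, and the query cannot use an index, so every row is scanned. It also relies on SQLite's built-in instr() function, which older SQLite versions lack.

```python
import sqlite3


def find_titles(conn, needle):
    """Return titles that contain `needle`, ignoring case for any script."""
    rows = conn.execute(
        "SELECT title FROM Content "
        "WHERE instr(casefold(title), ?) > 0 ORDER BY id",
        (needle.casefold(),),
    )
    return [title for (title,) in rows]


conn = sqlite3.connect(":memory:")

# Expose Python's str.casefold to SQL under the (made-up) name "casefold".
conn.create_function(
    "casefold", 1, lambda s: s.casefold() if s is not None else None
)

conn.execute("CREATE TABLE Content (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO Content (title) VALUES (?)",
    [("Солодкий борщ",), ("Борщ по-польськи",), ("Холодний борщ",)],
)

titles = find_titles(conn, "Борщ")
print(titles)
# ['Солодкий борщ', 'Борщ по-польськи', 'Холодний борщ']
```

The search term is passed as a query parameter rather than pasted into the SQL string, which also keeps quotes in the variable from breaking the query.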

I think all you need is LIKE:

 select title from Content where title like '%борщ%' 
  • I think this will not find "Борщ" with a capital letter. But that is an sqlite problem. - Alexander Petrov