PostgreSQL 8.3 on OS X: I like the full text indexing/search features

I built the latest version from source, with one problem: I was only able to install readline from source using "--disable-shared" so I ended up also building PostgreSQL statically linked - oh well so much for being in hurry, I have 2 gigs of RAM on my MacBook, so what is a little memory between friends :-)

I have been waiting for version 8.3 because of the full text indexing/search features. Here is the Text Search documentation - enjoy! Here is a little sample of the SQL extensions to support indexing and search:
test=# create table test (id integer, name varchar(30), email varchar(30));
CREATE TABLE

test=# create index test_name_idx on test using gin(to_tsvector('english', name));
CREATE INDEX
test=# insert into test values (1, 'Mark Watson', '[email protected]');
INSERT 0 1
test=# insert into test values (2, 'Carol Watson', '[email protected]');
INSERT 0 1
test=# select * from test where to_tsvector(name) @@ to_tsquery('mark');
id | name | email
----+-------------+---------------
1 | Mark Watson | [email protected]
(1 row)

test=# select * from test where to_tsvector(name) @@ to_tsquery('watsons');
id | name | email
----+--------------+----------------
1 | Mark Watson | [email protected]
2 | Carol Watson | [email protected]
(2 rows)

test=# test=# select * from test where to_tsvector(name) @@ to_tsquery('mark & watson');
id | name | email
----+-------------+---------------
1 | Mark Watson | [email protected]
(1 row)

test=# select * from test where to_tsvector(name) @@ to_tsquery('mark | watson');
id | name | email
----+--------------+----------------
1 | Mark Watson | [email protected]
2 | Carol Watson | [email protected]
(2 rows)

test=#
Obviously, if you were creating a new table with many rows, add the index after the data is added to the table. "gin" refers to a complete inverted word index. Specifying 'english' ensures that a word stemmer if used that understands English language conventions. Note that a search for 'watsons' matches because the search terms are stemmed before search.

The search syntax looks odd, but I expect to get used to it quickly. For Rails: I use "acts_like_ferret" a lot; I'll wait a month to see if any handy plugin is written for PostgreSQL specific search - I would rather that someone else write it. I need to check out acts_as_tsearch, but I don't think that it is updated yet to work with the final 8.3 release.

Comments

Popular posts from this blog

DBPedia Natural Language Interface Using Huggingface Transformer

Custom built SBCL and using spaCy and TensorFlow in Common Lisp

I have a new job helping to build a Knowledge Graph at Olive AI