The home of the great Sphinx search engine, http://www.sphinxsearch.com, is down. The complete site, including the forums and downloads, is unavailable. It is now almost 12 hours since we noticed that the site was down.
One reason behind this might be that the site itself is NOT powered by Sphinx. Yes, it's true: forum search there is not implemented on Sphinx.
Hope to have it back soon.
When performing full-text searches across a number of large indexes, it is important to get results in the shortest time possible. To help with this, Sphinx provides a feature called “Distributed Searching”.
Today we will look at Sphinx’s ability to perform distributed searching. If you don’t know what Sphinx is, please refer to my other post, Introduction to Sphinx Search.
For example, if you have a large index, you can easily distribute it. You create the index in chunks and assign each chunk to a Sphinx agent. You then query the distributed index, and Sphinx searches the chunks in parallel and gives you the final results. This can dramatically improve search speed. The concept is similar to table partitioning in MySQL.
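The idea of searching chunks in parallel and combining the results can be sketched in a few lines of Python. This is only a toy illustration, not how Sphinx is implemented: the in-memory CHUNKS data, the function names search_chunk and distributed_search, and the word-count scoring are all assumptions made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "chunks": each maps a document id to its text.
# In a real setup each chunk would be a separate index on disk.
CHUNKS = {
    "chunk0": {1: "sphinx full-text search", 2: "mysql table partitioning"},
    "chunk2": {3: "distributed search with sphinx agents"},
    "chunk3": {4: "innodb and myisam storage engines"},
    "chunk4": {5: "sphinx search speed", 6: "search search search"},
}


def search_chunk(chunk_name, word):
    """Search a single chunk; the score is how often the word occurs."""
    hits = []
    for doc_id, text in CHUNKS[chunk_name].items():
        score = text.split().count(word)
        if score:
            hits.append((doc_id, score))
    return hits


def distributed_search(word, limit=10):
    """Query every chunk concurrently, then merge into one ranked list."""
    with ThreadPoolExecutor(max_workers=len(CHUNKS)) as pool:
        per_chunk = pool.map(lambda name: search_chunk(name, word), CHUNKS)
        merged = [hit for hits in per_chunk for hit in hits]
    # Highest score first; ties are broken by document id.
    merged.sort(key=lambda hit: (-hit[1], hit[0]))
    return merged[:limit]


if __name__ == "__main__":
    print(distributed_search("search"))
```

The caller only ever talks to distributed_search, in the same way that a client queries a single distributed index and never addresses the chunks directly.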
Here is an example of setting up a distributed index:
index mycompleteindex
{
type = distributed
local = chunk0
agent = localhost:3312:chunk2
agent = localhost:3312:chunk3
agent = localhost:3312:chunk4
}
Here type = distributed tells Sphinx that this is not a normal index. As we only have one searchd instance installed, we use the same instance, localhost:3312, and declare it as an agent.
For each agent we have specified the chunk it serves. The example shows distributed searching on a single system; the same can be achieved by using a separate server for each chunk.
Note that this post is only an introduction to this feature; for further details, refer to the Sphinx documentation.
Happy Searching
In a recent post, the famous MySQL Performance Blog recommends using Sphinx for full-text searching. Full-text indexing is the one thing that stops many people from leaving MyISAM and upgrading to the InnoDB storage engine.
The post focuses on what to do with full-text search when moving to InnoDB. Several solutions are mentioned:
- Use MyISAM slaves: keep all the tables with full-text indexes on a separate slave server and query that slave for full-text searches
- Use a “shadow” MyISAM table: use triggers to update a MyISAM table on each update of the InnoDB table
- Leave tables as MyISAM: keep the tables with full-text indexes on MyISAM and convert all other tables to InnoDB
- Use Sphinx or another external full-text search engine
How do you implement full-text search for that 10+ million row table, keep up with the load, and stay relevant? Sphinx is good at those kinds of riddles.
Sphinx stands for SQL Phrase Index. It is a free, open-source, powerful, and easy-to-use full-text search engine that comes with APIs for PHP and other languages. Sphinx is used by some very large sites, such as craigslist.org and netlog.com.
Its key features include:
- high indexing speed (up to 10 MB/sec on modern CPUs)
- high search speed (avg query is under 0.1 sec on 2-4 GB text collections)
- high scalability (up to 100 GB of text, up to 100 M documents on a single CPU)
- supports distributed searching (since v.0.9.6)
- supports MySQL natively (MyISAM and InnoDB tables are both supported)
- supports phrase searching
- supports phrase proximity ranking, providing good relevance
- supports English and Russian stemming
- supports any number of document fields (weights can be changed on the fly)
- supports document groups
- supports stopwords
- supports different search modes (“match all”, “match phrase” and “match any” as of v.0.9.5)
- pure-PHP (i.e. NO module compiling etc.) search client API
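To make the three search modes from the list above concrete, here is a small Python sketch of what “match all”, “match any” and “match phrase” mean for a single document. It is a toy under simple assumptions (lowercased, whitespace-split words; no stemming, stopwords or ranking), and the matches function is a made-up name for illustration, not part of any Sphinx API.

```python
def matches(query, text, mode):
    """Return True if text satisfies query under the given toy match mode."""
    words = query.lower().split()
    tokens = text.lower().split()
    if mode == "all":
        # Every query word must appear somewhere in the document.
        return all(word in tokens for word in words)
    if mode == "any":
        # At least one query word must appear in the document.
        return any(word in tokens for word in words)
    if mode == "phrase":
        # The query words must appear next to each other, in order.
        n = len(words)
        return any(tokens[i:i + n] == words
                   for i in range(len(tokens) - n + 1))
    raise ValueError("unknown mode: " + mode)


if __name__ == "__main__":
    doc = "fast full text search engine"
    for mode in ("all", "any", "phrase"):
        print(mode, matches("search full", doc, mode))
```

For the query "search full", the document contains both words, so “all” and “any” match, but the words are not adjacent in that order, so “phrase” does not.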
Sphinx is one of the tools I have really enjoyed working with. I love its speed, stability and cool features. For more, see the following links.
Later we will look at how to use Sphinx.