How search engines really work
From InfoCamp
A sadly unattended session.
InfoCamp Site Search Critique
The default MediaWiki search has significant problems, and they show up on the InfoCamp wiki site:
- A huge stopword list means that a search for any of the "noise" words returns nothing: there are no matches for the queries after or right.
- It doesn't do stemming or handle plurals, so a search for library returns 2 matches while a search for libraries returns 6.
- Results display ugly raw wiki markup text (see the facebook result).
- There's no way to search for multiple alternative terms (such as ux OR ui) or to search for one term without another (user -interface).
- Search results are incredibly confusing: it is very unclear which are title matches and which are text matches, and even how many results there are (see the result for InfoCamp).
(see also my critique Why MediaWiki Search Stinks)
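To make the stopword, plural, and OR / NOT complaints concrete, here is a minimal Python sketch of a toy inverted index. It is not how MediaWiki, Lucene, or Sphinx are implemented; the stopword list, the sample documents, the crude plural folding, and every function name below are assumptions made up for illustration.

```python
import re

# Hypothetical, deliberately tiny stopword list; real lists are much longer.
STOPWORDS = {"a", "an", "the", "of", "and", "after", "right"}

# Made-up sample pages, keyed by document id.
DOCS = {
    1: "Public library hours",
    2: "Libraries after dark",
    3: "User interface design",
    4: "User research for UX and UI",
}


def normalize(word):
    """Lowercase a word and crudely fold plurals (not a real stemmer)."""
    word = word.lower()
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"      # libraries -> library
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]            # results -> result
    return word


def tokenize(text, drop_stopwords=True, fold_plurals=True):
    """Split text into index terms, optionally applying each step."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if drop_stopwords:
        words = [w for w in words if w not in STOPWORDS]
    if fold_plurals:
        words = [normalize(w) for w in words]
    return words


def build_index(docs, **options):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in tokenize(text, **options):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query, **options):
    """Return ids of documents containing every term of the query."""
    terms = tokenize(query, **options)
    if not terms:                   # every query word was a stopword
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


def search_any(index, words, **options):
    """OR query: ids of documents containing at least one of the words."""
    hits = set()
    for word in words:
        hits |= search(index, word, **options)
    return hits


def search_without(index, wanted, unwanted, **options):
    """NOT query: documents matching `wanted` but not `unwanted`."""
    return search(index, wanted, **options) - search(index, unwanted, **options)
```

In this sketch, an index built with fold_plurals=False gives different results for library and libraries, while the default index returns both pages for either query. A query for after comes back empty unless the index and the query both use drop_stopwords=False. search_any(index, ["ux", "ui"]) and search_without(index, "user", "interface") play the roles of ux OR ui and user -interface.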
How to Replace the Default Search
- MWSearch extension - Lucene search module (free and open-source), close to the main Wikipedia search
- Sphinx extension - another free and open-source search extension, works very nicely
