How search engines really work

From InfoCamp



A sadly unattended session.

InfoCamp Site Search Critique

The default MediaWiki search has significant problems, and they show up on the InfoCamp wiki site:

  • A huge stopword list means that a search for any of the "noise" words returns nothing: there are no matches for the queries after or right.
  • It doesn't do stemming or plural handling, so a search for library returns 2 matches while a search for libraries returns 6.
  • There's no way to search for multiple alternative terms (such as ux OR ui) or to search for one term without another (user -interface).
  • The search results are incredibly confusing: it is very unclear which are title matches and which are text matches, or even how many results there are (see the result for InfoCamp).
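
To make the first three complaints concrete, here is a minimal sketch of a toy inverted index in Python. It is not how MediaWiki or any of the replacement extensions is implemented. The sample documents, the STOPWORDS set, and the helper names (tokenize, naive_stem, build_index, search) are all hypothetical, and the stemmer is a crude plural-folding stand-in for a real one. The sketch only shows why dropping stopwords and skipping stemming lose matches, and how OR and -term queries could be evaluated.

```python
import re

# Hypothetical stopword list for illustration; MediaWiki's real list differs.
STOPWORDS = {"after", "right", "the", "a", "of", "for", "and"}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def naive_stem(word):
    """Crude plural folding; a real engine would use a proper stemmer."""
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word

def build_index(docs, drop_stopwords, stem):
    """Map each (optionally stemmed) token to the set of doc ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for tok in tokenize(text):
            if drop_stopwords and tok in STOPWORDS:
                continue
            if stem:
                tok = naive_stem(tok)
            index.setdefault(tok, set()).add(doc_id)
    return index

def search(index, query, stem=False):
    """Evaluate a query: 'a OR b' groups alternatives, '-term' excludes,
    and everything else is ANDed together."""
    def lookup(word):
        word = word.lower()
        return index.get(naive_stem(word) if stem else word, set())

    all_docs = set().union(*index.values()) if index else set()
    required, excluded = [], set()
    parts = query.split()
    i = 0
    while i < len(parts):
        part = parts[i]
        if part.startswith("-") and len(part) > 1:
            excluded |= lookup(part[1:])
            i += 1
            continue
        group = set(lookup(part))
        # Absorb any following "OR term" pairs into the same group.
        while i + 2 < len(parts) and parts[i + 1] == "OR":
            group |= lookup(parts[i + 2])
            i += 2
        required.append(group)
        i += 1
    result = set.intersection(*required) if required else all_docs
    return result - excluded

# Hypothetical sample pages.
docs = {
    1: "The library opens after noon",
    2: "Libraries and user interface design",
    3: "UX research for the user",
    4: "UI patterns, right now",
}

# Roughly the behaviour criticized above: stopwords dropped, no stemming.
basic = build_index(docs, drop_stopwords=True, stem=False)
# An alternative: keep every word and fold plurals together.
better = build_index(docs, drop_stopwords=False, stem=True)

print(sorted(search(basic, "after")))                      # []
print(sorted(search(basic, "library")))                    # [1]
print(sorted(search(better, "after", stem=True)))          # [1]
print(sorted(search(better, "libraries", stem=True)))      # [1, 2]
print(sorted(search(better, "ux OR ui", stem=True)))       # [3, 4]
print(sorted(search(better, "user -interface", stem=True)))  # [3]
```

In this toy setup, library and libraries return different pages until both are reduced to the same stem, which is the same kind of mismatch as the 2 versus 6 results noted above.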

(see also my critique Why MediaWiki Search Stinks)

How to Replace the Default Search

  • MWSearch extension - Lucene search module (free and open-source), close to the main Wikipedia search
  • Sphinx extension - another free and open-source search extension, works very nicely