Advancing Drupal Search


At DrupalCon there was a Search BoF with about 20 people present. We primarily discussed integration with the Lucene/Solr search engine, but concluded that creating a Search API that lets other search engines plug into it would likely be the best approach. This approach lets small site owners keep using the integrated Drupal search, while giving larger site owners the option to integrate a more robust search engine with more features.

Another important part of the session was David Lesieur showing us his own custom Drupal-driven search module, Faceted Search (http://drupal.org/project/faceted_search). It gives site owners a Views-like admin interface for creating any number of different search pages for their users. After seeing what an incredible job he has done with his search interface, many at the conference agreed that going down the path of a pluggable API with more features is a must-have for Drupal.

Where does this leave us?

I've put together a list (in no particular order) of features that I think make a 'complete' search engine, in an effort to visualize how one might go about creating a core API that supports different engines which may or may not implement these features.

-Spelling suggestions (either inline or returned after query)
-Faceted search browsing
-Adherence to node access permissions
-Result highlighting (returning highlighted passages with the keywords located within the text corpus)
-Related items (mainly using a search engine as a means of content recommendation; perhaps not part of the core search API, but something to consider)
-Varied scoring metrics (create different search interfaces that score content differently: interface 1 may rank more recently posted results as more relevant, while interface 2 ranks results posted by admin users as more relevant)
-Advanced search criteria (searching by user, field, title, body, profile field of author)
-Ranged Search (very relevant for product/ecommerce type sites)
-Searches nodes, users, or nodes and users combined
-Stop words (dismissing common words)
-Stemming (reducing words to their root prior to indexing and again before the search is performed)
-- Stemming and stop words are both part of something called tokenizing, whereby words are transformed for better result matching during the search process. Allowing other modules to hook into the index and search process would let those modules perform various other tokenizing actions.
-Synonyms
-Attachment indexing (index .doc, .pdf, .txt, etc along with the node content)
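
To make the stop word and stemming items more concrete, here is a minimal Python sketch of a pluggable token pipeline. It is an illustration only, not Drupal code: the names Analyzer, register, drop_stop_words and naive_stem are hypothetical, the stop word list is a tiny sample, and the suffix stripping is a crude stand-in for a real stemmer. The point is that the same chain of filters runs at index time and at query time, and that other modules could add their own filters to it.

```python
import re
from typing import Callable, List

# A token filter takes a list of tokens and returns a transformed list.
TokenFilter = Callable[[List[str]], List[str]]

# Tiny illustrative stop word list; a real engine would ship a larger one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in"}


def split_words(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def drop_stop_words(tokens: List[str]) -> List[str]:
    """Remove common words that carry little meaning for matching."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(tokens: List[str]) -> List[str]:
    """Strip a few common English suffixes (not a real stemming algorithm)."""
    stemmed = []
    for t in tokens:
        for suffix in ("ing", "es", "s"):
            # Only strip when at least three characters would remain.
            if t.endswith(suffix) and len(t) - len(suffix) >= 3:
                t = t[: -len(suffix)]
                break
        stemmed.append(t)
    return stemmed


class Analyzer:
    """Runs registered filters in order; used for both indexing and querying."""

    def __init__(self) -> None:
        self.filters: List[TokenFilter] = []

    def register(self, token_filter: TokenFilter) -> None:
        # Hypothetical hook point: any module could add a filter here.
        self.filters.append(token_filter)

    def analyze(self, text: str) -> List[str]:
        tokens = split_words(text)
        for token_filter in self.filters:
            tokens = token_filter(tokens)
        return tokens


analyzer = Analyzer()
analyzer.register(drop_stop_words)
analyzer.register(naive_stem)

indexed = analyzer.analyze("Searching the indexes of Drupal")
query = analyzer.analyze("search index")
# Every query term appears among the indexed terms, so the document matches.
print(set(query) <= set(indexed))
```

Because the content and the query pass through the same filters, "Searching" and "search" end up as the same term, which is what makes the match possible.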

Well, that's a really long list, so what can we do and where do we go? Creating a core API means creating a way for each search engine module to tell the core API that it is installed and which of these features it provides; the engine then needs to feed those extra features back to the search interface (this is true for spelling suggestions, faceted search blocks, etc.). Using an admin interface built on that API, the admin could create different search pages on their site that utilize different engines: one interface could search just the products on their site, letting the user search by title, product ID, or taxonomy term (category), and another could search through their blog articles by keyword. This is truly where the power of a core search API lies: in giving the site admin the tools to easily create a valuable search experience for their users.
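
As a rough illustration of that idea, the following Python sketch shows engines announcing their features to a registry and search pages being created against a chosen engine. Everything here is hypothetical and is not an existing Drupal API: SearchRegistry, register_engine, create_page, the engine names and the feature names are all made up for the example.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Engine:
    """A search engine plus the set of optional features it declares."""
    name: str
    features: FrozenSet[str]


@dataclass
class SearchPage:
    """A search page the admin configured: URL path, engine, searchable fields."""
    path: str
    engine: str
    fields: List[str]


class SearchRegistry:
    """Hypothetical core API: engines register here, pages are built from it."""

    def __init__(self) -> None:
        self._engines: Dict[str, Engine] = {}

    def register_engine(self, name: str, features: Iterable[str]) -> None:
        self._engines[name] = Engine(name, frozenset(features))

    def engines_supporting(self, *features: str) -> List[str]:
        """Names of engines that provide all of the requested features."""
        wanted = set(features)
        return sorted(
            name for name, engine in self._engines.items()
            if wanted <= engine.features
        )

    def create_page(self, path: str, engine: str, fields: Iterable[str],
                    required: Iterable[str] = ()) -> SearchPage:
        """Create a search page, refusing engines that lack a required feature."""
        if engine not in self._engines:
            raise KeyError(f"unknown engine: {engine}")
        missing = set(required) - self._engines[engine].features
        if missing:
            raise ValueError(f"{engine} lacks: {', '.join(sorted(missing))}")
        return SearchPage(path, engine, list(fields))


registry = SearchRegistry()
# Feature sets below are made up for the example.
registry.register_engine("core", {"node_access", "stop_words"})
registry.register_engine("solr", {"node_access", "facets", "spelling", "ranges"})

products = registry.create_page(
    "search/products", "solr",
    fields=["title", "product_id", "category"],
    required=["facets", "ranges"],
)
blog = registry.create_page("search/blog", "core", fields=["body"])
print(registry.engines_supporting("facets"))
```

With a declaration step like this, the search interface can tell which blocks (spelling suggestions, facets and so on) it can offer for a given page, and small sites that only have the built-in engine still work without them.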

Personally, I think that if we can find a way to extract much of the search interface that David has put together into a core search API we will see a lot more really cool Drupal driven applications.

Where to go from here?

-Join the search group (http://groups.drupal.org/node/4102) and lucene/nutch/solr group (http://groups.drupal.org/lucene-and-nutch)
-View and use the 'apachesolr' module (http://drupal.org/project/apachesolr) and 'faceted search' module (http://drupal.org/project/faceted_search).
-Give feedback and opinions; talk on the group pages about your implementations of, and experiences with, search.
-Post examples of search implementations from websites that you enjoy using.


Thanks for this detailed report! To complement it, here is a list of issues created during the BoF.


I love faceted search, what a fantastic module! I'm glad everyone else agrees that this is the way forward for Drupal search.


There has been a lot of search improvement from 4.4 to 4.5 to 4.6, and I believe there is more to come. If the built-in search is insufficient, you can use the contributed module trip_search. If that is insufficient, you can code your own module. If your module works out as you hope, then feature migration is always a possibility.
