Under the Hood - All things PDM and PLM

Fine Tuning your Search Slop Factor (You can’t make this stuff up)

schanenb
October 10, 2012

This week we are truly going under the hood with a look at how to fine-tune your Lucene search engine by tweaking the slop factor. As we discussed last week, a number of factors affect which results your search returns, based on tokens and operators, details of which can be found here.

If, however, you find that the structure of your file names or other factors makes your search results too inaccurate, or not broad enough, there is a small modification you can make to alter the overall Vault search behavior. This value, the "Search Slop Factor", is available as part of the Lucene service and can be modified in the web.config file. It defines how closely an item must match your search string to be considered a match, or put another way: how close to the intended result does the search term need to be?

The Slop Factor is what we refer to as an edit distance, and by default it's set to 10 so that a search returns a number of valid results. If, however, you want a search on A-22* to ONLY return files with the string A-22, then you should set your slop factor to zero. Be careful, though, because at 0 a search for A22* will not locate any files.

If you want this same search to return 22-A, your slop factor would need to be 3; this is the number of token moves needed to get the string back into the order of the search query A-22*. Note that if you want to return any values "out of order", the minimum slop factor required is 2.

If you would like the same search to find a file called A-101-3675B-22, your slop factor would need to be 5: it takes 5 token positional changes to move the "22" and match your search string A-22*.

Simple, right? Well, this may get messy when we start searching for text strings in descriptions and comments. Make your slop factor too small, say 2, and you may find your search for "front suspension" does not find "suspension assembly for front end".
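To make the "front suspension" case concrete, here is a minimal Python sketch of how a Lucene-style sloppy phrase match can be scored. It is a simplified model, not Vault's actual code: the function names `required_slop` and `matches` are hypothetical, tokens are assumed to be split on whitespace only, and query terms are assumed to be distinct. Vault's own tokenizer handles file names and punctuation differently, so the counts it produces (such as the 3 and 5 quoted above) may not line up with this model.

```python
from itertools import product


def required_slop(query, text):
    """Smallest slop at which `query` matches `text` in this simplified model.

    Returns None if any query term is missing from the text.
    Assumes whitespace tokenization and distinct query terms.
    """
    query_terms = query.lower().split()
    doc_terms = text.lower().split()

    candidates = []
    for offset, term in enumerate(query_terms):
        # Shift each occurrence by the term's offset in the query, so an
        # exact in-order phrase yields identical adjusted positions.
        adjusted = [pos - offset for pos, tok in enumerate(doc_terms) if tok == term]
        if not adjusted:
            return None
        candidates.append(adjusted)

    # Pick the combination of occurrences with the tightest spread.
    return min(max(combo) - min(combo) for combo in product(*candidates))


def matches(query, text, slop):
    """True if the text matches the query within the given slop."""
    needed = required_slop(query, text)
    return needed is not None and needed <= slop


description = "suspension assembly for front end"
print(required_slop("front suspension", description))  # 4
print(matches("front suspension", description, 2))     # False
print(matches("front suspension", description, 10))    # True
print(required_slop("front suspension", "suspension front"))  # 2
```

In this model an exact in-order phrase needs a slop of 0, two adjacent terms in reversed order need 2 (in line with the "out of order" minimum mentioned earlier), and the description above needs 4, which is why a slop factor of 2 misses it while the default of 10 finds it.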

So, as you can see, the factor decides how far apart the parts of a string can be spread and still be considered a match. Your search will of course only return results for what you specify in the search string, but the "accuracy", or search intent, is altered. For more detail and examples, visit the WikiHelp article on Fine-Tune Searches with the Search Slop Factor.

-Allan

Photo: Joelk75
