False positives when using element-word query after upgrade to 7.0-5 and 8.0-1
27 October 2015 04:48 PM
To avoid index bloat, MarkLogic records word positions in its indexes only once, in the word-query field. When word positions are needed to match element-word queries accurately, they are normally taken from the word-query field. When elements are excluded from the word-query field, words under those elements are not indexed, so their positions are not recorded. MarkLogic 7.0-5 and 8.0-1 included a code change to avoid false negatives that occurred when an element-word query expected positions for words in elements descended from excluded elements. With this change, element-word searches no longer use positions from the word-query field if the word-query field has exclusions.
Unfortunately, this solution can sometimes result in false positives. These are tracked as 7.0-5 bug #33207 and 8.0-1 bug #32686 (you can read more about both bugs in our Fixed Bugs Report). Consequently, a follow-up refinement was shipped in 7.0-5.1 and 8.0-2 so that the affected queries can be fully resolved from the indexes. To take advantage of this update, three changes are required:
1) Upgrade to 7.0-5.1 or later, or 8.0-2 or later
2) Database index settings must be updated to tell MarkLogic Server to use positions in this scenario and therefore avoid the previously seen false positives. There are two changes that could be made. Either:
2b. All the word-query excluded elements must be configured as phrase-around elements.
3) After the upgrade has been applied and the relevant database index settings have been updated, a reindex must be performed.
Once these changes are made, MarkLogic should again use positions from the word-query field for these queries, which should eliminate the false positives.
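
As an illustration of step 2b, the following is a minimal sketch of adding phrase-around elements through the Admin API rather than the Admin Interface. The database name, namespace, and element names are hypothetical placeholders, not values taken from this article; substitute the elements that are actually excluded from your word-query field. It should be run by a user with administrative privileges.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Hypothetical values: replace with your own database and with the
   elements that are excluded from the word-query field. :)
let $database-name  := "Documents"
let $excluded-ns    := "http://example.com/ns"
let $excluded-names := ("footnote", "sidebar")

let $config := admin:get-configuration()
let $db     := xdmp:database($database-name)

(: Build one phrase-around specification covering the excluded elements. :)
let $phrase-arounds :=
  admin:database-phrase-around($excluded-ns, $excluded-names)

(: Add the specification to the database configuration and save it. :)
let $config :=
  admin:database-add-phrase-around($config, $db, $phrase-arounds)
return admin:save-configuration($config)
```

If reindexing is enabled on the database, saving an index setting change like this one would normally start the reindex described in step 3. If reindexing is disabled, it has to be enabled before the new settings take effect on existing content.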