Case Sensitive Search with Stemming
04 June 2012 10:38 AM
Summary

Stemming in MarkLogic Server is a case-sensitive operation.

Stemmed, Case Insensitive

When you run a stemmed, case-insensitive search, MarkLogic maps every word to lowercase and then calculates the stems. In English, this works fairly well because words are generally lowercase. For other languages (such as German), this doesn't always work as well.

Stemmed, Case Sensitive

When a search is case-sensitive, the stems differ depending on the case of the word. In English, case-sensitive searches with stemming specified are not treated as stemmed searches because words with uppercase letters stem to themselves. You would not expect proper names or acronyms to be stemmed to something else. For example, "Mr. Mark Cutting" should not match "marks cuts."

For German and other languages where stems exist for mixed-case words, case-sensitive search with stemming is recommended.

Examples

These example queries demonstrate stemmed searches:

Documents

Case sensitive with stemming
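The difference between the two modes for English can be illustrated with a small Python sketch. This is not MarkLogic code and not MarkLogic's stemmer: the lookup table `TOY_STEMS` and the helper names (`toy_stem`, `stem_case_insensitive`, `stem_case_sensitive_english`, `matches`) are hypothetical and exist only to make the "Mr. Mark Cutting" example above concrete.

```python
# Toy illustration only. TOY_STEMS is a hypothetical lookup table,
# not MarkLogic's stemming dictionary.
TOY_STEMS = {"marks": "mark", "cuts": "cut", "cutting": "cut"}


def toy_stem(word: str) -> str:
    """Return the toy stem of a lowercase word, or the word itself."""
    return TOY_STEMS.get(word, word)


def stem_case_insensitive(word: str) -> str:
    """Case-insensitive mode: lowercase first, then stem."""
    return toy_stem(word.lower())


def stem_case_sensitive_english(word: str) -> str:
    """Case-sensitive mode for English: a word containing uppercase
    letters stems to itself; all-lowercase words are stemmed."""
    if word != word.lower():
        return word
    return toy_stem(word)


def tokenize(text: str) -> list[str]:
    """Split on whitespace and strip simple surrounding punctuation."""
    return [token.strip('.,"') for token in text.split()]


def matches(query: str, document: str, stem) -> bool:
    """True when every stemmed query word appears among the stemmed
    document words."""
    query_stems = {stem(word) for word in tokenize(query)}
    document_stems = {stem(word) for word in tokenize(document)}
    return query_stems <= document_stems


document = "Mr. Mark Cutting"
query = "marks cuts"

print(matches(query, document, stem_case_insensitive))        # True
print(matches(query, document, stem_case_sensitive_english))  # False
```

In the case-insensitive run, "Mark" and "Cutting" are lowercased and reduced to "mark" and "cut", so the name matches the query. In the case-sensitive run, the capitalized words stem to themselves, so the proper name no longer matches.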