Use Levenshtein distance in query


#1

Enonic version: 7.2.0
OS: Linux Mint 19.3

Is it possible to use the Levenshtein distance with an expression in double quotes?

For example, these two queries return different results:

fulltext('_allText', 'utprøvence~2', 'OR') => Return 95 results
fulltext('_allText', '"utprøvence"~2', 'OR') => No results

I need to use double quotes because some of my expressions contain more than one word.


#2

Maybe it is related to this one?


#3

Maybe. But this topic is related to the double quotes. Without double quotes, the “Levenshtein distance” works correctly.


#4

I tried fulltext('_allText', '"utprøvence"~2', 'OR') in Data Toolbox and got the same results as for fulltext('_allText', 'utprøvence~2', 'OR'), although there was only one item called utprøvence in the data.
So, I currently can’t reproduce the behavior you describe.


#5

Another question about this.
Does the Levenshtein distance take the space character into account when matching?
For example, does the query

fulltext('_allText', 'myword~1', 'OR')

return results containing the words my word?


#6

It won't work, because tokenization uses the space as a delimiter.
I suspect you are trying to handle compound words in Norwegian. Enonic XP uses Elasticsearch, which does not provide this functionality out of the box.
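
To illustrate why splitting on spaces defeats the match, here is a minimal Python sketch. It is only a toy model: `levenshtein` is a plain textbook implementation and `fuzzy_token_match` is a hypothetical stand-in that splits on whitespace, not the analyzer Elasticsearch actually uses.

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_token_match(query: str, text: str, max_edits: int) -> bool:
    # Assumption: the text is indexed as whitespace-separated tokens,
    # and the fuzzy query is compared against each token separately.
    tokens = text.lower().split()
    return any(levenshtein(query.lower(), t) <= max_edits for t in tokens)


# Compared as whole strings, the two differ by a single inserted space.
whole_string = levenshtein("myword", "my word")

# Compared token by token, neither token is within one edit of "myword".
per_token = {t: levenshtein("myword", t) for t in "my word".split()}

matched = fuzzy_token_match("myword", "my word", 1)
print(whole_string, per_token, matched)
```

As whole strings the distance is 1, but the index never sees the whole string, only the tokens my and word, and neither is close enough to myword.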


#7

Doesn't the Levenshtein distance allow a specific word to be misspelled by one character?
fulltext('displayName', 'tannhelsetjeneser~1')

Shouldn't this be equivalent to
fulltext('displayName', 'tannhelsetjenester')

and return the same number of results?


#8

The Levenshtein query should potentially return more hits than the exact one.
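
A rough Python sketch of that expectation, under simplifying assumptions: the term list is made up for illustration, and matching is done on raw terms only. A real Elasticsearch index also applies analyzers and limits on fuzzy expansion, which this toy ignores, so actual hit counts can differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


# Hypothetical indexed terms, not taken from any real index.
indexed_terms = ["tannhelsetjenester", "tannhelsetjeneste", "tannlege"]


def exact(term: str) -> set[str]:
    return {t for t in indexed_terms if t == term}


def fuzzy(term: str, max_edits: int) -> set[str]:
    return {t for t in indexed_terms if levenshtein(term, t) <= max_edits}


# The misspelling is one inserted "t" away from the correct word.
distance = levenshtein("tannhelsetjeneser", "tannhelsetjenester")

exact_hits = exact("tannhelsetjenester")
fuzzy_hits = fuzzy("tannhelsetjeneser", 1)

# In this toy model the fuzzy hits contain every exact hit.
print(distance, exact_hits <= fuzzy_hits)
```

In this model the fuzzy query on the misspelled word can never return fewer terms than the exact query on the correct word, which is why a lower hit count looks surprising.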


#9

The Levenshtein query is returning fewer hits in this case.
https://www.helsedirektoratet.no/search?searchquery=tannhelsetjeneser (92 hits)
https://www.helsedirektoratet.no/search?searchquery=tannhelsetjenester (197 hits)

fulltext('displayName', 'tannhelsetjeneser~1')