Elasticsearch: phrase frequency score without IDF
I need a custom score that combines phrase frequency (i.e. the number of occurrences of "John Hobs", with the words in the same order and next to each other) with a custom score that is computed at index time.
Basically, I need to turn off IDF in the default scoring, because it adds information whose value differs for each term (and is not under my control) and is nondeterministic across shards. I know I can use a function score, but I still need to obtain the phrase frequency value somehow, and without reindexing.
There is probably no way to turn off IDF in the default similarities (which can be tuned dynamically without reindexing), right?
I can define a custom scripted similarity, but the score is actually computed for each term and summed up (for "John Hobs" it is computed twice, for "John Walker Hobs" three times, etc.), and I don't know how to get the number of query terms inside the script.
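For reference, the kind of scripted similarity I mean looks roughly like the sketch below, written as a Python dict holding the index settings body. The similarity name `phrase_tf` and the helper function are made up for illustration, and the Painless script is an untested assumption that simply leaves out the IDF factor.

```python
import json

# Painless source for an IDF-free similarity (untested assumption).
# As far as I can tell, it runs once per query term and the results are
# summed, which is exactly the per-term multiplication described above.
IDF_FREE_SCRIPT = "return query.boost * doc.freq;"


def scripted_similarity_settings(name="phrase_tf"):
    """Build an index settings body that declares a scripted similarity.

    `name` is a hypothetical label; a text field would refer to it through
    its "similarity" mapping parameter.
    """
    return {
        "settings": {
            "similarity": {
                name: {
                    "type": "scripted",
                    "script": {"source": IDF_FREE_SCRIPT},
                }
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(scripted_similarity_settings(), indent=2))
```

What I cannot express in such a script is a division by the number of query terms, because that count does not seem to be exposed to it.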
I can write a custom plugin. It should work without reindexing, but it works at the term level only (I can get the term frequency). How can I compute the phrase frequency? I cannot get any position information. Also, I can access my custom score defined at index time via lookup, but what about performance? I suspect it would not be good.
I'll be glad for an answer to any of these questions :) Thank you very much.