Choosing Boost Values in Lucene
I have a few questions about boosting in Lucene. I am running a research project in which each document has four fields: f1, f2, f3, f4. I also have a set of queries for my corpus, and I know the relevant documents for each query. I want to study how boosting affects the search results for these queries. Basically, I hope to show that boosting some of these fields gives better results.
There are a few essential points I cannot figure out, though, and I would really appreciate some help:
1. Is there any difference between boosting the fields at index time and boosting, at search time, the query terms that appear in those fields?
Again, I know the set of queries beforehand, and for each query I know which of its terms appear in which fields of the documents in the corpus.
2. In what range are boost values usually chosen? That is, should I choose boosts in a 0.5-2 range (say 0.5, 1, 1.5, 2), as I have seen in some examples, or is it equivalent to choose boosts in a range like 50-200 (respectively 50, 100, 150, 200)?
3. How sensitive is Lucene's ranking to the boost values? For example, if I know the approximate importance of each field and want to assign boosts accordingly, how far apart should the boost factors for the different fields be? More precisely, if the importance order is f1 < f2 < f3 < f4, will it matter whether I choose the boosts as (1, 2, 3, 4) or (1, 5, 10, 15)?
4. Is there any method besides trial and error for finding the boosts for each field that work the best for a particular corpus?
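To make questions 2 to 4 concrete, here is a minimal sketch of the kind of experiment I have in mind. It does not use Lucene at all: the per-field scores, the relevance judgments, and the idea that a document's score is simply the boost-weighted sum of its per-field scores are assumptions made up for illustration, and the names `rank`, `average_precision`, and `grid_search` are hypothetical helpers, not part of any library.

```python
from itertools import product

# Hypothetical per-field match scores for ONE query: doc id -> (f1, f2, f3, f4).
# In the real experiment these would come from Lucene; here they are invented.
FIELD_SCORES = {
    "d1": (1.0, 0.9, 0.0, 0.2),
    "d2": (0.1, 0.3, 0.8, 0.7),
    "d3": (0.4, 0.4, 0.1, 0.0),
    "d4": (0.0, 0.2, 0.5, 0.9),
}
RELEVANT = {"d2", "d4"}  # assumed relevance judgments for this query


def rank(boosts, field_scores=FIELD_SCORES):
    """Order doc ids by the boost-weighted sum of their field scores.

    Toy model only: this is an assumption, not Lucene's actual scoring formula.
    """
    def total(doc):
        return sum(b * s for b, s in zip(boosts, field_scores[doc]))

    # Highest score first; ties broken by doc id so the order is deterministic.
    return sorted(field_scores, key=lambda doc: (-total(doc), doc))


def average_precision(ranking, relevant=RELEVANT):
    """Average precision of a ranked list against a set of relevant doc ids."""
    hits, precision_sum = 0, 0.0
    for position, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(relevant) if relevant else 0.0


def grid_search(candidates=(0.5, 1.0, 1.5, 2.0)):
    """Trial and error made systematic: try every boost combination, keep the best."""
    best = max(product(candidates, repeat=4),
               key=lambda boosts: average_precision(rank(boosts)))
    return best, average_precision(rank(best))
```

In this toy model, `rank((0.5, 1, 1.5, 2))` and `rank((50, 100, 150, 200))` return the same order, because multiplying every boost by the same constant rescales every score equally. Whether Lucene's real scoring behaves the same way, especially for index-time boosts, is exactly what I am unsure about. With real data I would average the metric over all queries rather than use a single one.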
Thank you very much,