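edismax supports a boost function whose value is multiplied with the score, whereas dismax can only use bf, whose effect is additive. When ranking on several dimensions, the relevance score should really be just one of those dimensions, and tuning the balance with an additive approach is troublesome.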
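To make the difference concrete, here is a minimal sketch in plain Java rather than Solr code. The class `BoostCombination` and the numbers are made up for illustration; the only thing taken from the text is that bf adds to the score and boost multiplies it.

```java
/** Toy comparison of additive (bf-style) and multiplicative (boost-style) score adjustment. */
final class BoostCombination {

    /** dismax-style bf: the function value is added to the relevance score. */
    static double additive(double score, double functionValue) {
        return score + functionValue;
    }

    /** edismax-style boost: the function value multiplies the relevance score. */
    static double multiplicative(double score, double functionValue) {
        return score * functionValue;
    }
}
```

Take a weakly relevant document (score 0.2, function value 10) and a strongly relevant one (score 2.0, function value 5). Added together, the first wins because the function value swamps the relevance score; multiplied, the second wins because relevance still scales the result. With the additive form, the function has to be rescaled whenever the typical score range changes, which is the tuning trouble mentioned above.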
The dismax implementation is fairly simple and easy to follow. edismax is its enhanced version, but it actually changes quite a lot, as described below.
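First, the main idea behind dismax parsing:
First, read the search field names from qf.
The end result of parsing is a BooleanQuery.
Parse the main query (mainQuery) first:
- parse the user's search string
- handle altQuery, that is, decide whether to fall back to the user-defined alternative search string
- parse and assemble the PhraseQuery
Then parse the bq queries. These are extra score-boosting queries: they do not change the number of results, only the ordering.
Then parse bf. The function queries are finally applied to the document score by addition.
The main code makes this clearer: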
    @Override
    public Query parse() throws ParseException {
      SolrParams solrParams = SolrParams.wrapDefaults(localParams, params);

      // field names and per-field boosts from qf; fall back to the schema default field
      queryFields = SolrPluginUtils.parseFieldBoosts(solrParams.getParams(DisMaxParams.QF));
      if (0 == queryFields.size()) {
        queryFields.put(req.getSchema().getDefaultSearchFieldName(), 1.0f);
      }

      /* the main query we will execute.  we disable the coord because
       * this query is an artificial construct
       */
      BooleanQuery query = new BooleanQuery(true);

      // main query: user query string (or the alternative query) plus the phrase query
      boolean notBlank = addMainQuery(query, solrParams);
      if (!notBlank)
        return null;
      // bq: extra boosting queries
      addBoostQuery(query, solrParams);
      // bf: boost functions, added to the score
      addBoostFunctions(query, solrParams);

      return query;
    }
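The way these three parts interact can be sketched with a toy model. This is not Solr code: `DismaxScoreSketch`, `Doc` and the per-document numbers are hypothetical, and query normalization is ignored. It only assumes what the outline above states, namely that the main query decides which documents match while bq and bf add to the score of documents that already match.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of how dismax combines its three parts; not Solr code. */
final class DismaxScoreSketch {

    /** Hypothetical per-document scores for the main query, the bq clauses and the bf functions. */
    static final class Doc {
        final String id;
        final float mainScore; // 0 means the main query does not match
        final float bqScore;   // 0 means no bq clause matches
        final float bfValue;   // value contributed by the boost functions

        Doc(String id, float mainScore, float bqScore, float bfValue) {
            this.id = id;
            this.mainScore = mainScore;
            this.bqScore = bqScore;
            this.bfValue = bfValue;
        }
    }

    /** The final score is the plain sum of the three contributions. */
    static float total(Doc d) {
        return d.mainScore + d.bqScore + d.bfValue;
    }

    /** Keeps only documents matched by the main query, ordered by descending total score. */
    static List<String> rank(List<Doc> docs) {
        List<Doc> hits = new ArrayList<>();
        for (Doc d : docs) {
            if (d.mainScore > 0f) {
                hits.add(d);
            }
        }
        hits.sort((a, b) -> Float.compare(total(b), total(a)));
        List<String> ids = new ArrayList<>();
        for (Doc d : hits) {
            ids.add(d.id);
        }
        return ids;
    }
}
```

A document that matches only a bq clause never appears in the result, while a bq match can lift a document with a lower main score above one with a higher main score.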
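The overall approach of edismax is much the same as that of dismax. The main differences are these:
When edismax parses a search string that contains +, OR, NOT or - syntax, it ignores mm.
The main code is as follows.
Count the +, OR, NOT and - syntax elements in the search string: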
    // defer escaping and only do if lucene parsing fails, or we need phrases
    // parsing fails.  Need to sloppy phrase queries anyway though.
    List<Clause> clauses = null;
    int numPluses = 0;
    int numMinuses = 0;
    int numOR = 0;
    int numNOT = 0;

    clauses = splitIntoClauses(userQuery, false);
    for (Clause clause : clauses) {
      if (clause.must == '+') numPluses++;
      if (clause.must == '-') numMinuses++;
      if (clause.isBareWord()) {
        String s = clause.val;
        if ("OR".equals(s)) {
          numOR++;
        } else if ("NOT".equals(s)) {
          numNOT++;
        } else if (lowercaseOperators && "or".equals(s)) {
          numOR++;
        }
      }
    }
When the search string contains any of these four (+, OR, NOT, -), mm stops taking effect:

    boolean doMinMatched = (numOR + numNOT + numPluses + numMinuses) == 0;
    if (parsedUserQuery != null && doMinMatched) {
      String minShouldMatch = solrParams.get(DisMaxParams.MM, "100%");
      if (parsedUserQuery instanceof BooleanQuery) {
        SolrPluginUtils.setMinShouldMatch((BooleanQuery)parsedUserQuery, minShouldMatch);
      }
    }
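The same decision can be tried out in isolation with a small stand-alone sketch. It is a simplification: the hypothetical `MinMatchGate` splits on whitespace instead of using Solr's `splitIntoClauses`, so quoted phrases and fielded clauses are not modelled. The counting rules themselves follow the excerpt above.

```java
/** Simplified stand-in for the operator counting above; whitespace tokenization is an assumption. */
final class MinMatchGate {

    /** Returns true when mm would still be applied, i.e. no explicit operator was found. */
    static boolean appliesMinShouldMatch(String userQuery, boolean lowercaseOperators) {
        int operators = 0;
        for (String token : userQuery.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // happens only for a blank query
            }
            if (token.startsWith("+") || token.startsWith("-")) {
                operators++;
            } else if (token.equals("OR") || token.equals("NOT")) {
                operators++;
            } else if (lowercaseOperators && token.equals("or")) {
                operators++;
            }
        }
        return operators == 0;
    }
}
```

With this sketch, `solr search engine` keeps mm, while `solr +search`, `solr -lucene` and `solr OR lucene` all switch it off. A lowercase `or` switches it off only when `lowercaseOperators` is true. Note that `AND` is not counted, just as in the excerpt.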
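Phrase queries: first pick out the ordinary clauses. Clauses that are bound to a field, that are already phrase queries, or that are one of the keywords "OR", "AND", "NOT" or "TO" are skipped. Because edismax can parse search strings written in Lucene syntax, it cannot do what dismax does, which is simply to strip the double quotes (\") from the search string and wrap the result in a pair of double quotes.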
    // find non-field clauses
    List<Clause> normalClauses = new ArrayList<Clause>(clauses.size());
    for (Clause clause : clauses) {
      if (clause.field != null || clause.isPhrase) continue;
      // check for keywords "AND,OR,TO"
      if (clause.isBareWord()) {
        String s = clause.val.toString();
        // avoid putting explicit operators in the phrase query
        if ("OR".equals(s) || "AND".equals(s) || "NOT".equals(s) || "TO".equals(s)) continue;
      }
      normalClauses.add(clause);
    }

    // full phrase...
    addShingledPhraseQueries(query, normalClauses, phraseFields, 0,
                             tiebreaker, pslop);
    // shingles...
    addShingledPhraseQueries(query, normalClauses, phraseFields2, 2,
                             tiebreaker, pslop);
    addShingledPhraseQueries(query, normalClauses, phraseFields3, 3,
                             tiebreaker, pslop);
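As a rough illustration of what the three calls produce, here is a stand-alone sketch. `ShingleSketch` is hypothetical and works on plain words instead of `Clause` objects. It assumes, based on the comments in the excerpt, that a shingle size of 0 means one phrase over all remaining words and a size of n means every run of n adjacent words. Solr's handling of edge cases may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Word-level sketch of the filtering and shingling steps; not Solr code. */
final class ShingleSketch {

    private static final List<String> OPERATORS = Arrays.asList("OR", "AND", "NOT", "TO");

    /** Drops bare operator keywords, like the filtering loop above (fields and phrases are not modelled). */
    static List<String> normalWords(List<String> words) {
        List<String> out = new ArrayList<>();
        for (String w : words) {
            if (!OPERATORS.contains(w)) {
                out.add(w);
            }
        }
        return out;
    }

    /** Size 0 yields one phrase over all words; size n yields every window of n adjacent words. */
    static List<String> shingles(List<String> words, int shingleSize) {
        List<String> out = new ArrayList<>();
        int size = (shingleSize == 0) ? words.size() : shingleSize;
        if (size < 2 || words.size() < size) {
            return out; // a phrase needs at least two words
        }
        for (int i = 0; i + size <= words.size(); i++) {
            out.add(String.join(" ", words.subList(i, i + size)));
        }
        return out;
    }
}
```

For the input `new york OR city hall`, the filtered words are `new`, `york`, `city`, `hall`. Size 0 gives the single phrase `new york city hall`, size 2 gives three two-word phrases, and size 3 gives two three-word phrases.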
Here is how dismax builds its phrase query:

    protected Query getPhraseQuery(String userQuery, SolrPluginUtils.DisjunctionMaxQueryParser pp) throws ParseException {
      // strip any double quotes, then wrap the whole string as a single phrase
      String userPhraseQuery = userQuery.replace("\"", "");
      return pp.parse("\"" + userPhraseQuery + "\"");
    }
And here is how edismax does it:

    private void addShingledPhraseQueries(final BooleanQuery mainQuery,
                                          final List<Clause> clauses,
                                          final Map<String, Float> fields,
                                          int shingleSize,
                                          final float tiebreaker,
                                          final int slop)
      throws ParseException {
      if (null == fields || fields.isEmpty() ||
          null == clauses || clauses.size() …
      // (the rest of this method and the start of the boost-handling code
      //  are missing from the source text; judging from the else branch below,
      //  the surviving fragment presumably begins with a test on boosts.size())
      … 1) {
        ValueSource prod = new ProductFloatFunction(boosts.toArray(new ValueSource[boosts.size()]));
        topQuery = new BoostedQuery(query, prod);
      } else if (boosts.size() == 1) {
        topQuery = new BoostedQuery(query, boosts.get(0));
      }
    }
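As you can see, the final result is not a BooleanQuery but a BoostedQuery.
It simply takes the score of the sub-query, multiplies it by the value of the function query, and returns the product. The main score method is:

    public float score() throws IOException {
      // sub-query score times the boost function value for the current document
      float score = qWeight * scorer.score() * vals.floatVal(scorer.docID());
      // NaN and -Infinity are replaced by the lowest finite float
      return score>Float.NEGATIVE_INFINITY ? score : -Float.MAX_VALUE;
    }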
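The arithmetic is easy to check outside Solr. In the sketch below, `BoostedScoreSketch` is a hypothetical class that works on plain floats. It covers the two cases from the excerpt: several boosts are first multiplied together (the role of `ProductFloatFunction`), and a single boost is used directly. The no-boost case, where Solr does not wrap the query at all, is modelled here as a factor of 1.

```java
/** Plain-float sketch of multiplicative boosting; not Solr code. */
final class BoostedScoreSketch {

    /** Product of all boost values; with no boosts the factor is 1. */
    static float product(float... boosts) {
        float p = 1f;
        for (float b : boosts) {
            p *= b;
        }
        return p;
    }

    /** Sub-query score times the combined boost, guarded like the score method above. */
    static float score(float qWeight, float subScore, float... boosts) {
        float score = qWeight * subScore * product(boosts);
        // NaN and -Infinity both fail this comparison, so they map to the lowest finite float
        return score > Float.NEGATIVE_INFINITY ? score : -Float.MAX_VALUE;
    }
}
```

A sub-query score of 2 with a single boost of 3 gives 6, and so do the two boosts 1.5 and 2. A boost function that returns NaN or negative infinity still gives a finite, sortable score.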
Please credit the source when reposting: http://blog.iyunv.com/duck_genuine/article/details/8060026