Highlighting in Apache Solr – Ultimate Solr Guide

Hi folks! We are back with another post. Today we will look in detail at an important Solr feature that is used by a large number of enterprise applications. It not only improves the look of search results but also helps end users see at a glance why each result matched what they were looking for. The feature is called “Highlighting”.

In simple words, Highlighting in Solr allows fragments of documents that match the user’s query to be included with the query response. The fragments are included in a special section of the query response (the highlighting section), and the client uses the formatting clues to determine how to present the snippets to users. A fragment is a portion of a document field that contains matches from the query; fragments are sometimes also referred to as “snippets”.

Highlighting is extremely configurable. There are many parameters for fragment sizing, formatting, ordering, and backup/alternate behavior, plus further options that are hard to categorise.

How to Use?

hl: Use this parameter to enable or disable highlighting. The default is false. If you want to use highlighting, you must set this to true.
hl.method: The highlighting implementation to use. Acceptable values are unified, original, and fastVector. The default is original.
hl.fl: Specifies a list of fields to highlight, either comma- or space-delimited. These must be “stored”. A wildcard of * (asterisk) can be used to match field globs, such as text_* or even * to highlight all fields where highlighting is possible. When using *, consider adding hl.requireFieldMatch=true. The following example uses the local-params syntax and the edismax parser to highlight the fields in hl.fl: &hl.fl=field1 field2&hl.q={!edismax qf=$hl.fl v=$q}&hl.qparser=lucene&hl.requireFieldMatch=true (along with other applicable parameters, of course).
hl.q: A query to use for highlighting. This parameter allows you to highlight different terms or fields than those being used to search for documents. When setting this, you might also need to set hl.qparser. The default is the value of the q parameter (already parsed).
hl.qparser: The query parser to use for the hl.q query. It only applies when hl.q is set. The default is the value of the defType parameter, which in turn defaults to lucene.
hl.requireFieldMatch: If false (the default), all query terms will be highlighted in every field listed in hl.fl, no matter which fields the parsed query refers to. If set to true, only query terms that target the field being highlighted will be highlighted in it.

Note: if the query references fields different from the field being highlighted and they have different text analysis, some query terms that should be highlighted may be missed, and vice versa. The analysis used is that of the field being highlighted (hl.fl), not the query fields.

hl.usePhraseHighlighter: If set to true (the default), Solr will highlight phrase queries (and other advanced position-sensitive queries) accurately, as phrases. If false, the parts of the phrase will be highlighted everywhere instead of only where they form the given phrase.
hl.highlightMultiTerm: If set to true (the default), Solr will highlight wildcard queries (and other MultiTermQuery subclasses). If false, they won’t be highlighted at all.
hl.snippets: Specifies the maximum number of highlighted snippets to generate per field. It is possible for any number of snippets from zero to this value to be generated. The default is 1.
hl.fragsize: Specifies the approximate size, in characters, of fragments to consider for highlighting. The default is 100. Using 0 indicates that no fragmenting should be considered and the whole field value should be used.
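
To make the parameters above a little more concrete, here is a minimal Python sketch that assembles them into a query string. The helper name `build_highlight_params`, the search term `apple`, and the field names `manu` and `name` are assumptions made up for illustration; only the `hl.*` parameter names and their defaults come from the list above.

```python
from urllib.parse import urlencode


def build_highlight_params(query, fields, snippets=1, fragsize=100,
                           method="original", require_field_match=False):
    """Return Solr request parameters with highlighting switched on.

    Hypothetical helper: the defaults mirror the ones described above.
    """
    return {
        "q": query,
        "hl": "true",                      # highlighting is off unless set
        "hl.method": method,               # unified, original or fastVector
        "hl.fl": ",".join(fields),         # comma-delimited field list
        "hl.snippets": snippets,           # max snippets per field
        "hl.fragsize": fragsize,           # approximate fragment size in chars
        "hl.requireFieldMatch": "true" if require_field_match else "false",
    }


# Assumed search term and field names, for illustration only.
params = build_highlight_params("apple", ["manu", "name"], snippets=2)
query_string = urlencode(params)
print(query_string)
```

The resulting string can be appended to the select URL of whichever collection you are querying. Keeping the parameters in a dictionary first makes it easy to override a single setting, such as `hl.fragsize`, per request.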

Highlighting in a query response

In response to a query, Solr includes highlighting data in a section separate from the documents. It is up to a client to determine how to process this response and display the highlights to users.

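Using the example documents included with Solr, we can see how this might work. Take a query that searches for a term, requests the “id”, “name”, “manu”, and “cat” fields with the fl parameter, and enables highlighting on the “manu” field with hl.fl. The response to such a query carries the matching documents together with a separate block of highlights.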
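
As a hedged illustration of such an exchange, the Python sketch below builds a request URL and then reads a sample response. The host, port, and collection name (`localhost:8983`, `gettingstarted`) and the search term `apple` are assumptions, and the JSON is a hand-written sample of the response shape rather than captured output. The `<em>` tags are assumed to be the highlighter's markup here.

```python
import json
from urllib.parse import urlencode

# Assumed location of a local Solr collection; adjust to your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/gettingstarted/select"

params = {
    "q": "apple",               # assumed search term
    "fl": "id,name,manu,cat",   # fields returned in the docs section
    "hl": "true",               # turn highlighting on
    "hl.fl": "manu",            # field to highlight
}
request_url = SOLR_SELECT_URL + "?" + urlencode(params)

# Hand-written sample showing the shape of a response (not captured output).
SAMPLE_RESPONSE = """
{
  "response": {"numFound": 1, "start": 0, "docs": [
    {"id": "MA147LL/A",
     "name": "Apple 60 GB iPod with Video Playback Black",
     "manu": "Apple Computer Inc.",
     "cat": ["electronics", "music"]}
  ]},
  "highlighting": {
    "MA147LL/A": {"manu": ["<em>Apple</em> Computer Inc."]}
  }
}
"""

data = json.loads(SAMPLE_RESPONSE)
docs = data["response"]["docs"]        # the documents themselves
highlighting = data["highlighting"]    # snippets, keyed by document ID

print(request_url)
for doc in docs:
    print(doc["id"], highlighting[doc["id"]]["manu"])
```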

Note the two sections docs and highlighting. The docs section contains the fields of the document requested with the fl parameter of the query (only “id”, “name”, “manu”, and “cat”).

The highlighting section includes the ID of each document and the field that contains the highlighted portion. In this example, we used the hl.fl parameter to say we wanted query terms highlighted in the “manu” field. When there is a match to the query term in that field, it will be included for each document ID in the list.
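
Since the highlights live apart from the documents, a client usually has to join the two by document ID. Here is a small Python sketch of one way to do that. The function name `attach_highlights`, the document IDs, and the decision to fall back to the stored field value when a document has no snippet are all assumptions for illustration, not something Solr prescribes.

```python
def attach_highlights(docs, highlighting, field, id_field="id"):
    """Pair each document with its snippets for `field`.

    Falls back to the stored field value when no snippet was returned
    (a client-side choice made for this sketch).
    """
    results = []
    for doc in docs:
        doc_id = doc[id_field]
        snippets = highlighting.get(doc_id, {}).get(field, [])
        if not snippets:
            stored = doc.get(field)
            snippets = [] if stored is None else [str(stored)]
        results.append({"id": doc_id, "snippets": snippets})
    return results


# Made-up data in the shape of the docs and highlighting sections.
docs = [
    {"id": "doc-1", "manu": "Apple Computer Inc."},
    {"id": "doc-2", "manu": "Belkin"},
]
highlighting = {
    "doc-1": {"manu": ["<em>Apple</em> Computer Inc."]},
    "doc-2": {},  # no match in the highlighted field
}

merged = attach_highlights(docs, highlighting, "manu")
for item in merged:
    print(item["id"], item["snippets"])
```

Looking the ID up with `.get` keeps the join safe for documents that have no entry, or an empty entry, in the highlighting section.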

So, that is it for Highlighting in Solr. We will be back with another post on Solr very soon.