{"id":690,"date":"2020-03-18T08:10:25","date_gmt":"2020-03-18T08:10:25","guid":{"rendered":"https:\/\/www.aeologic.com\/blog\/?p=690"},"modified":"2024-06-10T08:28:43","modified_gmt":"2024-06-10T08:28:43","slug":"all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide","status":"publish","type":"post","link":"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/","title":{"rendered":"All About Indexing and Basic Data Operations &#8211; Part 2 &#8211; Ultimate Solr Guide"},"content":{"rendered":"<p>Hello everyone! This post continues our discussion of basic indexing operations in Solr. The most commonly used data formats are JSON and XML. Today we will look at how to index custom JSON objects in Solr. To do this, we send a set of parameters along with the update request that tell Solr how to interpret the incoming JSON. One or more valid JSON documents can be sent to the\u00a0<code>\/update\/json\/docs<\/code> path together with these configuration parameters.<\/p>\n<h2 id=\"mapping-parameters\" class=\"clickable-header top-level-header\">Mapping Parameters<\/h2>\n<p>These parameters allow you to define how a JSON file should be read for multiple Solr documents.<\/p>\n<p><strong>split<\/strong><\/p>\n<p>Defines the path at which to split the input JSON into multiple Solr documents and is required if you have multiple documents in a single JSON file. If the entire JSON makes a single Solr document, the path must be \u201c<code>\/<\/code>\u201d.<\/p>\n<div class=\"paragraph\">\n<p>It is possible to pass multiple\u00a0<code>split<\/code>\u00a0paths by separating them with a pipe\u00a0<code>(|)<\/code>, for example:\u00a0<code>split=\/|\/hello|\/hello\/world<\/code>. 
If one path is a child of another, the documents split at the child path automatically become child documents of the parent.<\/p>\n<p><strong>f<\/strong><\/p>\n<p>A multivalued parameter that maps JSON field paths to Solr field names. The format of the parameter is <code>target-field-name:json-path<\/code>, as in\u00a0<code>f=first:\/first<\/code>. The\u00a0<code>json-path<\/code>\u00a0is required. The\u00a0<code>target-field-name<\/code>\u00a0is the Solr document field name, and is optional. If not specified, it is automatically derived from the input JSON. The default target field name is the fully qualified name of the field.<\/p>\n<p><strong>mapUniqueKeyOnly<\/strong><\/p>\n<p>(boolean) This parameter is particularly convenient when the fields in the input JSON are not available in the schema and\u00a0<a href=\"https:\/\/lucene.apache.org\/solr\/guide\/7_4\/schemaless-mode.html#schemaless-mode\">schemaless mode<\/a>\u00a0is not enabled. This will index all the fields into the default search field (using the\u00a0<code>df<\/code>\u00a0parameter, below) and only the\u00a0<code>uniqueKey<\/code>\u00a0field is mapped to the corresponding field in the schema. If the input JSON does not have a value for the\u00a0<code>uniqueKey<\/code>\u00a0field, a UUID is generated for it.<\/p>\n<p><strong>df<\/strong><\/p>\n<p>If the\u00a0<code>mapUniqueKeyOnly<\/code>\u00a0flag is used, the update handler needs a field into which the data should be indexed. This is the same field that other handlers use as a default search field.<\/p>\n<p><strong>srcField<\/strong><\/p>\n<p>This is the name of the field in which the JSON source will be stored. This can only be used if\u00a0<code>split=\/<\/code>\u00a0(i.e., you want your JSON input file to be indexed as a single Solr document). 
Note that atomic updates will cause the field to be out-of-sync with the document.<\/p>\n<p><strong>echo<\/strong><\/p>\n<p>This is for debugging purposes only. Set it to\u00a0<code>true<\/code>\u00a0if you want the docs to be returned as a response. Nothing will be indexed.<\/p>\n<p>For example, if we have a JSON file that includes two documents, we could define an update request like this:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-693\" src=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-55.png\" alt=\"\" width=\"656\" height=\"619\" srcset=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-55.png 656w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-55-300x283.png 300w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-55-477x450.png 477w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-55-260x245.png 260w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-55-85x80.png 85w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-55-201x190.png 201w\" sizes=\"auto, (max-width: 656px) 100vw, 656px\" \/><\/p>\n<p>With this request, we have defined that &#8220;exams&#8221; contains multiple documents. 
In addition, we have mapped several fields from the input document to Solr fields.<\/p>\n<p>When the update request is complete, the following two documents will be added to the index:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-694\" src=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-56.png\" alt=\"\" width=\"336\" height=\"475\" srcset=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-56.png 336w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-56-212x300.png 212w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-56-318x450.png 318w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-56-184x260.png 184w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-56-57x80.png 57w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-56-134x190.png 134w\" sizes=\"auto, (max-width: 336px) 100vw, 336px\" \/><\/p>\n<p>In the prior example, all of the fields we wanted to use in Solr had the same names as they did in the input JSON. 
When that is the case, we can simplify the request by only specifying the\u00a0<code>json-path<\/code>\u00a0portion of the\u00a0<code>f<\/code>\u00a0parameter, as in this example:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-695\" src=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-57.png\" alt=\"\" width=\"656\" height=\"619\" srcset=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-57.png 656w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-57-300x283.png 300w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-57-477x450.png 477w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-57-260x245.png 260w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-57-85x80.png 85w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-57-201x190.png 201w\" sizes=\"auto, (max-width: 656px) 100vw, 656px\" \/><\/p>\n<p>In this example, we simply named the field paths (such as\u00a0<code>\/exams\/test<\/code>). Solr will automatically attempt to add the content of the field from the JSON input to the index in a field with the same name.<\/p>\n<p><strong>ProTip<\/strong>: Documents will be rejected during indexing if the fields do not exist in the schema before indexing. So, if you are NOT using schemaless mode, you must pre-create all fields.<\/p>\n<h3 id=\"reusing-parameters-in-multiple-requests\" class=\"clickable-header\">Reusing Parameters in Multiple Requests<\/h3>\n<p>Say we wanted to define parameters to split documents at the\u00a0<code>exams<\/code> field, and map several other fields. 
We could make an API request such as:<\/p>\n<\/div>\n<p><img loading=\"lazy\" decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-696\" src=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-58.png\" alt=\"\" width=\"968\" height=\"295\" srcset=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-58.png 968w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-58-300x91.png 300w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-58-768x234.png 768w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-58-720x219.png 720w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-58-260x79.png 260w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-58-263x80.png 263w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-58-250x76.png 250w\" sizes=\"auto, (max-width: 968px) 100vw, 968px\" \/><\/p>\n<p>When we send the documents, we\u2019d use the\u00a0<code>useParams<\/code>\u00a0parameter with the name of the parameter set we defined:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-697\" src=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-59.png\" alt=\"\" width=\"1024\" height=\"493\" srcset=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-59.png 1024w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-59-300x144.png 300w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-59-768x370.png 768w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-59-720x347.png 720w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-59-260x125.png 260w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-59-166x80.png 166w, 
https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-59-250x120.png 250w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/p>\n<h2 id=\"using-wildcards-for-field-names\" class=\"clickable-header top-level-header\">Using Wildcards for Field Names<\/h2>\n<div class=\"paragraph\">\n<p>Instead of specifying all the field names explicitly, it is possible to specify wildcards to map fields automatically.<\/p>\n<\/div>\n<div class=\"paragraph\">\n<p>There are two restrictions: wildcards can only be used at the end of the\u00a0<code>json-path<\/code>, and the split path cannot use wildcards.<\/p>\n<\/div>\n<div class=\"paragraph\">\n<p>A single asterisk\u00a0<code>*<\/code>\u00a0maps only to direct children, and a double asterisk\u00a0<code>**<\/code>\u00a0maps recursively to all descendants. The following are example wildcard path mappings:<\/p>\n<\/div>\n<div class=\"ulist\">\n<ul>\n<li><code>f=$FQN:\/**<\/code>: maps all fields to the fully qualified name (<code>$FQN<\/code>) of the JSON field. The fully qualified name is obtained by concatenating all the keys in the hierarchy with a period (<code>.<\/code>) as a delimiter. 
This is the default behavior if no\u00a0<code>f<\/code>\u00a0path mappings are specified.<\/li>\n<li><code>f=\/docs\/*<\/code>: maps all fields directly under <code>\/docs<\/code>, keeping the names given in the JSON<\/li>\n<li><code>f=\/docs\/**<\/code>: maps all fields under <code>\/docs<\/code> and its descendants, keeping the names given in the JSON<\/li>\n<li><code>f=searchField:\/docs\/*<\/code>: maps all fields directly under <code>\/docs<\/code> to a single field called <code>searchField<\/code><\/li>\n<li><code>f=searchField:\/docs\/**<\/code>: maps all fields under <code>\/docs<\/code> and its descendants to <code>searchField<\/code><\/li>\n<\/ul>\n<\/div>\n<div class=\"paragraph\">\n<p>With wildcards we can further simplify our previous example as follows:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-698\" src=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-60.png\" alt=\"\" width=\"656\" height=\"529\" srcset=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-60.png 656w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-60-300x242.png 300w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-60-558x450.png 558w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-60-260x210.png 260w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-60-99x80.png 99w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-60-236x190.png 236w\" sizes=\"auto, (max-width: 656px) 100vw, 656px\" \/><\/p>\n<p>Because we want the fields to be indexed with the field names as they are found in the JSON input, the double wildcard in\u00a0<code>f=\/**<\/code>\u00a0will map all fields and their descendants to the same fields in Solr.<\/p>\n<p>It is also possible to send all the values to a single field and do a full-text search on that. 
This is a good option to blindly index and query JSON documents without worrying about fields and schema.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-699\" src=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-61.png\" alt=\"\" width=\"656\" height=\"529\" srcset=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-61.png 656w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-61-300x242.png 300w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-61-558x450.png 558w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-61-260x210.png 260w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-61-99x80.png 99w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-61-236x190.png 236w\" sizes=\"auto, (max-width: 656px) 100vw, 656px\" \/><\/p>\n<p>In the above example, we\u2019ve said all of the fields should be added to a field in Solr named &#8216;txt&#8217;. This will add multiple fields to a single field, so whatever field you choose should be multi-valued.<\/p>\n<p>The default behavior is to use the fully qualified name (FQN) of the node. 
So, if we don\u2019t define any field mappings, like this:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-700\" src=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-62.png\" alt=\"\" width=\"766\" height=\"493\" srcset=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-62.png 766w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-62-300x193.png 300w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-62-699x450.png 699w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-62-260x167.png 260w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-62-124x80.png 124w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-62-250x161.png 250w\" sizes=\"auto, (max-width: 766px) 100vw, 766px\" \/><\/p>\n<p>The indexed documents would be added to the index with fields that look like this:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-701\" src=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-63.png\" alt=\"\" width=\"386\" height=\"439\" srcset=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-63.png 386w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-63-264x300.png 264w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-63-229x260.png 229w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-63-70x80.png 70w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-63-167x190.png 167w\" sizes=\"auto, (max-width: 386px) 100vw, 386px\" \/><\/p>\n<h2 id=\"multiple-documents-in-a-single-payload\" class=\"clickable-header top-level-header\">Multiple Documents in a Single Payload<\/h2>\n<p>This functionality supports documents in the\u00a0<a href=\"http:\/\/jsonlines.org\/\">JSON 
Lines<\/a>\u00a0format (<code>.jsonl<\/code>), which specifies one document per line.<\/p>\n<p>For example:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-702\" src=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-64.png\" alt=\"\" width=\"1024\" height=\"259\" srcset=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-64.png 1024w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-64-300x76.png 300w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-64-768x194.png 768w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-64-720x182.png 720w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-64-260x66.png 260w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-64-316x80.png 316w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-64-250x63.png 250w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/p>\n<p>Or even an array of documents, as in this example:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-703\" src=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-65.png\" alt=\"\" width=\"1001\" height=\"241\" srcset=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-65.png 1001w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-65-300x72.png 300w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-65-768x185.png 768w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-65-720x173.png 720w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-65-260x63.png 260w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-65-332x80.png 332w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-65-250x60.png 250w\" 
sizes=\"auto, (max-width: 1001px) 100vw, 1001px\" \/><\/p>\n<h2 id=\"indexing-nested-documents\" class=\"clickable-header top-level-header\">Indexing Nested Documents<\/h2>\n<p>The following is an example of indexing nested documents:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-704\" src=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-66.png\" alt=\"\" width=\"774\" height=\"493\" srcset=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-66.png 774w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-66-300x191.png 300w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-66-768x489.png 768w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-66-706x450.png 706w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-66-260x166.png 260w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-66-126x80.png 126w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-66-250x159.png 250w\" sizes=\"auto, (max-width: 774px) 100vw, 774px\" \/><\/p>\n<p>With this example, the documents indexed would be, as follows:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-705\" src=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-67.png\" alt=\"\" width=\"361\" height=\"403\" srcset=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-67.png 361w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-67-269x300.png 269w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-67-233x260.png 233w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-67-72x80.png 72w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-67-170x190.png 170w\" sizes=\"auto, (max-width: 361px) 100vw, 361px\" 
\/><\/p>\n<h2 id=\"tips-for-custom-json-indexing\" class=\"clickable-header top-level-header\">Tips for Custom JSON Indexing<\/h2>\n<div class=\"sectionbody\">\n<div class=\"olist arabic\">\n<ol class=\"arabic\">\n<li>Pre-created schema: Post your docs to the\u00a0<code>\/update\/json\/docs<\/code>\u00a0endpoint with\u00a0<code>echo=true<\/code>. This gives you the list of field names you need to create. Create the fields before you actually index.<\/li>\n<li>No schema, only full-text search: If all you need is full-text search on your JSON, set the configuration as given in the Setting JSON Defaults section.<\/li>\n<\/ol>\n<h2 id=\"setting-json-defaults\" class=\"clickable-header top-level-header\">Setting JSON Defaults<\/h2>\n<p>It is possible to send any JSON to the \/update\/json\/docs endpoint and the default configuration of the component is as follows:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-707\" src=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-68.png\" alt=\"\" width=\"960\" height=\"403\" srcset=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-68.png 960w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-68-300x126.png 300w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-68-768x322.png 768w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-68-720x302.png 720w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-68-260x109.png 260w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-68-191x80.png 191w, https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/03\/carbon-68-250x105.png 250w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/><\/p>\n<p>So, if no params are passed, the entire JSON file would get indexed to the\u00a0<code>_src_<\/code>\u00a0field and all the values in the input JSON would go to a field 
named\u00a0<code>text<\/code>. If the input JSON has a value for the uniqueKey, it is stored; if no value can be obtained from the input JSON, a UUID is created and used as the uniqueKey field value.<\/p>\n<p>So, this is it for today. Stay tuned for another post.<\/p>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Hello everyone! This post continues our discussion of basic indexing operations in Solr. The most commonly used data formats are JSON and XML. Today we will look at how to index custom JSON objects in Solr. To do this, we send a set of parameters along with the update request that tell Solr how to interpret the incoming JSON. One or more valid JSON documents can be sent to the\u00a0\/update\/json\/docs path together with these configuration parameters. [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":635,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[41],"tags":[],"class_list":["post-690","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-solr"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v23.1 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>All About Indexing and Basic Data Operations - Part 2 - Ultimate Solr Guide - Aeologic Blog<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"All About Indexing and Basic Data Operations - Part 2 - Ultimate Solr Guide - Aeologic Blog\" \/>\n<meta property=\"og:description\" content=\"Hello everyone! This post continues our discussion of basic indexing operations in Solr. The most commonly used data formats are JSON and XML. Today we will look at how to index custom JSON objects in Solr. To do this, we send a set of parameters along with the update request that tell Solr how to interpret the incoming JSON. One or more valid JSON documents can be sent to the\u00a0\/update\/json\/docs path together with these configuration parameters. [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/\" \/>\n<meta property=\"og:site_name\" content=\"Aeologic Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/AeoLogicTech\/\" \/>\n<meta property=\"article:published_time\" content=\"2020-03-18T08:10:25+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-06-10T08:28:43+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/02\/Indexing-and-Basic-Data-Operations.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1080\" \/>\n\t<meta property=\"og:image:height\" content=\"622\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Manoj Kumar\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@aeologictech\" \/>\n<meta name=\"twitter:site\" content=\"@aeologictech\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Manoj Kumar\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":[\"Article\",\"BlogPosting\"],\"@id\":\"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/\"},\"author\":{\"name\":\"Manoj Kumar\",\"@id\":\"https:\/\/www.aeologic.com\/blog\/#\/schema\/person\/13549984ba8e5f441cc733ed20d7daa4\"},\"headline\":\"All About Indexing and Basic Data Operations &#8211; Part 2 &#8211; Ultimate Solr Guide\",\"datePublished\":\"2020-03-18T08:10:25+00:00\",\"dateModified\":\"2024-06-10T08:28:43+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/\"},\"wordCount\":1208,\"publisher\":{\"@id\":\"https:\/\/www.aeologic.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/02\/Indexing-and-Basic-Data-Operations.png\",\"articleSection\":[\"Solr\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/\",\"url\":\"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/\",\"name\":\"All About Indexing and Basic Data Operations - Part 2 - Ultimate Solr Guide - Aeologic 
Blog\",\"isPartOf\":{\"@id\":\"https:\/\/www.aeologic.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/02\/Indexing-and-Basic-Data-Operations.png\",\"datePublished\":\"2020-03-18T08:10:25+00:00\",\"dateModified\":\"2024-06-10T08:28:43+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/#primaryimage\",\"url\":\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/02\/Indexing-and-Basic-Data-Operations.png\",\"contentUrl\":\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/02\/Indexing-and-Basic-Data-Operations.png\",\"width\":1080,\"height\":622},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.aeologic.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"All About Indexing and Basic Data Operations &#8211; Part 2 &#8211; Ultimate Solr Guide\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.aeologic.com\/blog\/#website\",\"url\":\"https:\/\/www.aeologic.com\/blog\/\",\"name\":\"Aeologic 
Blog\",\"description\":\"Aeologic\",\"publisher\":{\"@id\":\"https:\/\/www.aeologic.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.aeologic.com\/blog\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.aeologic.com\/blog\/#organization\",\"name\":\"AeoLogic Technologies\",\"url\":\"https:\/\/www.aeologic.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.aeologic.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2022\/05\/new-logo-aeo.jpg\",\"contentUrl\":\"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2022\/05\/new-logo-aeo.jpg\",\"width\":385,\"height\":162,\"caption\":\"AeoLogic Technologies\"},\"image\":{\"@id\":\"https:\/\/www.aeologic.com\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/AeoLogicTech\/\",\"https:\/\/x.com\/aeologictech\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.aeologic.com\/blog\/#\/schema\/person\/13549984ba8e5f441cc733ed20d7daa4\",\"name\":\"Manoj Kumar\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.aeologic.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/24ce77602da5eb5715d74a95733f6c7548e2af73f5a493f9bc0bf55f611d025e?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/24ce77602da5eb5715d74a95733f6c7548e2af73f5a493f9bc0bf55f611d025e?s=96&d=mm&r=g\",\"caption\":\"Manoj Kumar\"},\"description\":\"Manoj Kumar is a seasoned Digital Marketing Manager and passionate Tech Blogger with deep expertise in SEO, AI trends, and emerging digital technologies. He writes about innovative solutions that drive growth and transformation across industry. 
Featured on - YOURSTORY | TECHSLING | ELEARNINGINDUSTRY | DATASCIENCECENTRAL | TIMESOFINDIA | MEDIUM | DATAFLOQ\",\"sameAs\":[\"https:\/\/www.aeologic.com\/\",\"https:\/\/www.linkedin.com\/in\/manoj-kumar-rajput\/\"],\"url\":\"https:\/\/www.aeologic.com\/blog\/author\/manoj\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"All About Indexing and Basic Data Operations - Part 2 - Ultimate Solr Guide - Aeologic Blog","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/","og_locale":"en_US","og_type":"article","og_title":"All About Indexing and Basic Data Operations - Part 2 - Ultimate Solr Guide - Aeologic Blog","og_description":"Hello Everyone! Today we are here with another post furthering our discussion about basic indexing operations in solr. The most commonly used form of data representation is JSON and XML. Today we will discuss how to handle indexing of custom JSON objects in solr. In order to do this, we use certain tags telling solr&#8217;s binary&#8217;s as to what has to be done and send them using update request. These parameters essentially handle the incoming JSON strings.One or more valid JSON documents can be sent to the\u00a0\/update\/json\/docs path with the configuration params. Mapping Parameters These parameters allow you to define how a JSON file should be read for multiple Solr documents. split Defines the path at which to split the input JSON into multiple Solr documents and is required if you have multiple documents in a single JSON file. If the entire JSON makes a single Solr document, the path must be \u201c\/\u201d. It is possible to pass multiple\u00a0split\u00a0paths by separating them with a pipe\u00a0(|), for example:\u00a0split=\/|\/hello|\/hello\/world. 
If one path is a child of another, they automatically become a child document. f Provides multivalued mapping to map document field names to Solr field names. The format of the parameter is target-field-name:json-path, as in\u00a0f=first:\/first. The\u00a0json-path\u00a0is required. The\u00a0target-field-name\u00a0is the Solr document field name, and is optional. If not specified, it is automatically derived from the input JSON. The default target field name is the fully qualified name of the field. mapUniqueKeyOnly (boolean) This parameter is particularly convenient when the fields in the input JSON are not available in the schema and\u00a0schemaless mode\u00a0is not enabled. This will index all the fields into the default search field (using the\u00a0df\u00a0parameter, below) and only the\u00a0uniqueKey\u00a0field is mapped to the corresponding field in the schema. If the input JSON does not have a value for the\u00a0uniqueKey\u00a0field then a UUID is generated for it. df If the\u00a0mapUniqueKeyOnly\u00a0flag is used, the update handler needs a field where the data should be indexed to. This is the same field that other handlers use as a default search field. srcField This is the name of the field in which the JSON source will be stored. This can only be used if\u00a0split=\/\u00a0(i.e., you want your JSON input file to be indexed as a single Solr document). Note that atomic updates will cause the field to be out-of-sync with the document. echo This is for debugging purposes only. Set it to\u00a0true\u00a0if you want the docs to be returned as a response. Nothing will be indexed. For example, if we have a JSON file that includes two documents, we could define an update request like this: With this request, we have defined that &#8220;exams&#8221; contains multiple documents. In addition, we have mapped several fields from the input document to Solr fields. 
When the update request is complete, the following two documents will be added to the index: In the prior example, all of the fields we wanted to use in Solr had the same names as they did in the input JSON. When that is the case, we can simplify the request by only specifying the\u00a0json-path\u00a0portion of the\u00a0f\u00a0parameter, as in this example: In this example, we simply named the field paths (such as\u00a0\/exams\/test). Solr will automatically attempt to add the content of the field from the JSON input to the index in a field with the same name. ProTip: Documents will be rejected during indexing if the fields do not exist in the schema before indexing. So, if you are NOT using schemaless mode, you must pre-create all fields. Reusing Parameters in Multiple Requests Say we wanted to define parameters to split documents at the\u00a0exams field, and map several other fields. We could make an API request such as: When we send the documents, we\u2019d use the\u00a0useParams\u00a0parameter with the name of the parameter set we defined: Using Wildcards for Field Names Instead of specifying all the field names explicitly, it is possible to specify wildcards to map fields automatically. There are two restrictions: wildcards can only be used at the end of the\u00a0json-path, and the split path cannot use wildcards. A single asterisk\u00a0*\u00a0maps only to direct children, and a double asterisk\u00a0**\u00a0maps recursively to all descendants. The following are example wildcard path mappings: f=$FQN:\/**: maps all fields to the fully qualified name ($FQN) of the JSON field. The fully qualified name is obtained by concatenating all the keys in the hierarchy with a period (.) as a delimiter. This is the default behavior if no\u00a0f\u00a0path mappings are specified. 
f=\/docs\/*: maps all the fields under docs and in the name as given in json f=\/docs\/**: maps all the fields under docs and its children in the name as given in json f=searchField:\/docs\/*: maps all fields under \/docs to a single field called \u2018searchField\u2019 f=searchField:\/docs\/**: maps all fields under \/docs and its children to searchField With wildcards we can further simplify our previous example as follows: Because we want the fields to be indexed with the field names as they are found in the JSON input, the double wildcard in\u00a0f=\/**\u00a0will map all fields and their descendants to the same fields in Solr. It is also possible to send all the values to a single field and do a full text search on that. This is a good option to blindly index and query JSON documents without worrying about fields and schema. In the above example, we\u2019ve said all of the fields should be added to a field in Solr named &#8216;txt&#8217;. This will add multiple fields to a single field, so whatever field you choose should be multi-valued. The default behavior is to use the fully qualified name (FQN) of the node. So, if we don\u2019t define any field mappings, like this: The indexed documents would be added to the index with fields that look like this: Multiple Documents in a Single Payload This functionality supports documents in the\u00a0JSON Lines\u00a0format (.jsonl), which specifies one document per line. 
For example: Or even an array of documents, as in this example: Indexing Nested Documents The [&hellip;]","og_url":"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/","og_site_name":"Aeologic Blog","article_publisher":"https:\/\/www.facebook.com\/AeoLogicTech\/","article_published_time":"2020-03-18T08:10:25+00:00","article_modified_time":"2024-06-10T08:28:43+00:00","og_image":[{"width":1080,"height":622,"url":"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/02\/Indexing-and-Basic-Data-Operations.png","type":"image\/png"}],"author":"Manoj Kumar","twitter_card":"summary_large_image","twitter_creator":"@aeologictech","twitter_site":"@aeologictech","twitter_misc":{"Written by":"Manoj Kumar","Est. reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":["Article","BlogPosting"],"@id":"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/#article","isPartOf":{"@id":"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/"},"author":{"name":"Manoj Kumar","@id":"https:\/\/www.aeologic.com\/blog\/#\/schema\/person\/13549984ba8e5f441cc733ed20d7daa4"},"headline":"All About Indexing and Basic Data Operations &#8211; Part 2 &#8211; Ultimate Solr 
Guide","datePublished":"2020-03-18T08:10:25+00:00","dateModified":"2024-06-10T08:28:43+00:00","mainEntityOfPage":{"@id":"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/"},"wordCount":1208,"publisher":{"@id":"https:\/\/www.aeologic.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/#primaryimage"},"thumbnailUrl":"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/02\/Indexing-and-Basic-Data-Operations.png","articleSection":["Solr"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/","url":"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/","name":"All About Indexing and Basic Data Operations - Part 2 - Ultimate Solr Guide - Aeologic Blog","isPartOf":{"@id":"https:\/\/www.aeologic.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/#primaryimage"},"image":{"@id":"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/#primaryimage"},"thumbnailUrl":"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/02\/Indexing-and-Basic-Data-Operations.png","datePublished":"2020-03-18T08:10:25+00:00","dateModified":"2024-06-10T08:28:43+00:00","breadcrumb":{"@id":"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-p
art-2-ultimate-solr-guide\/#primaryimage","url":"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/02\/Indexing-and-Basic-Data-Operations.png","contentUrl":"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2020\/02\/Indexing-and-Basic-Data-Operations.png","width":1080,"height":622},{"@type":"BreadcrumbList","@id":"https:\/\/www.aeologic.com\/blog\/all-about-indexing-and-basic-data-operations-part-2-ultimate-solr-guide\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.aeologic.com\/blog\/"},{"@type":"ListItem","position":2,"name":"All About Indexing and Basic Data Operations &#8211; Part 2 &#8211; Ultimate Solr Guide"}]},{"@type":"WebSite","@id":"https:\/\/www.aeologic.com\/blog\/#website","url":"https:\/\/www.aeologic.com\/blog\/","name":"Aeologic Blog","description":"Aeologic","publisher":{"@id":"https:\/\/www.aeologic.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.aeologic.com\/blog\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.aeologic.com\/blog\/#organization","name":"AeoLogic Technologies","url":"https:\/\/www.aeologic.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.aeologic.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2022\/05\/new-logo-aeo.jpg","contentUrl":"https:\/\/www.aeologic.com\/blog\/wp-content\/uploads\/2022\/05\/new-logo-aeo.jpg","width":385,"height":162,"caption":"AeoLogic Technologies"},"image":{"@id":"https:\/\/www.aeologic.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/AeoLogicTech\/","https:\/\/x.com\/aeologictech"]},{"@type":"Person","@id":"https:\/\/www.aeologic.com\/blog\/#\/schema\/person\/13549984ba8e5f441cc733ed20d7daa4","name":"Manoj 
Kumar","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.aeologic.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/24ce77602da5eb5715d74a95733f6c7548e2af73f5a493f9bc0bf55f611d025e?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/24ce77602da5eb5715d74a95733f6c7548e2af73f5a493f9bc0bf55f611d025e?s=96&d=mm&r=g","caption":"Manoj Kumar"},"description":"Manoj Kumar is a seasoned Digital Marketing Manager and passionate Tech Blogger with deep expertise in SEO, AI trends, and emerging digital technologies. He writes about innovative solutions that drive growth and transformation across industry. Featured on - YOURSTORY | TECHSLING | ELEARNINGINDUSTRY | DATASCIENCECENTRAL | TIMESOFINDIA | MEDIUM | DATAFLOQ","sameAs":["https:\/\/www.aeologic.com\/","https:\/\/www.linkedin.com\/in\/manoj-kumar-rajput\/"],"url":"https:\/\/www.aeologic.com\/blog\/author\/manoj\/"}]}},"_links":{"self":[{"href":"https:\/\/www.aeologic.com\/blog\/wp-json\/wp\/v2\/posts\/690","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.aeologic.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aeologic.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aeologic.com\/blog\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aeologic.com\/blog\/wp-json\/wp\/v2\/comments?post=690"}],"version-history":[{"count":0,"href":"https:\/\/www.aeologic.com\/blog\/wp-json\/wp\/v2\/posts\/690\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aeologic.com\/blog\/wp-json\/wp\/v2\/media\/635"}],"wp:attachment":[{"href":"https:\/\/www.aeologic.com\/blog\/wp-json\/wp\/v2\/media?parent=690"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aeologic.com\/blog\/wp-json\/wp\/v2\/categories?post=690"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aeologic.com\/blog\/wp-json\/wp\/v2\
/tags?post=690"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}