Elasticsearch Term or Terms Query Not Working? Start Here.

5 MINUTE READ | January 5, 2016

Summary: if the term(s) being searched contain spaces or special characters, you’ll need to run your search against a not_analyzed property to make it work.

By default, Elasticsearch runs incoming data through a set of analyzers. You can specify what sort of analysis you want done on a string when you set up the property’s index parameter.

This analysis turns the raw data into a set of tokens that are stored in an inverted index (here’s a more in-depth guide).

When you search for something, the inverted index is queried and documents that match are returned.
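To make the idea concrete, here is a minimal Python sketch of an analyzer and an inverted index. It is a toy model, not how Elasticsearch is implemented: analyze and build_inverted_index are hypothetical names, and the analyzer here only lowercases and splits on non-alphanumeric characters.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy stand-in for an analyzer: lowercase, then split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def build_inverted_index(docs):
    """Map each token to the set of document ids whose text produced it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index


docs = {1: "test", 2: "test two"}
index = build_inverted_index(docs)

print(sorted(index["test"]))  # [1, 2]
print(sorted(index["two"]))   # [2]
```

Note that the original string "test two" is not a key in this index; only its tokens are.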

When you search with something like a query string or match query, Elasticsearch uses its analyzers again to tokenize the query and look up matching documents in the inverted index. You can control which analyzer is used with the analyzer parameter in the query object. You can see how Elasticsearch tokenizes a term with the analyze endpoint.

curl 'http://localhost:9200/_analyze?pretty&text=test%20two'

{
  "tokens" : [ {
    "token" : "test",
    "start_offset" : 0,
    "end_offset" : 4,
    "type" : "",
    "position" : 1
  }, {
    "token" : "two",
    "start_offset" : 5,
    "end_offset" : 8,
    "type" : "",
    "position" : 2
  } ]
}

The term and terms queries do no analysis: they look for values that match exactly what’s given to them. This makes all kinds of sense: you’re trying to look up the values exactly as you pass them in.

But there’s a catch: term and terms queries still search the inverted index.

This is unnoticeable if you’re querying terms that are a single lowercase word or a number, since the tokens stored in Elasticsearch are the same as the original value (the analyzer has nothing to split on). But term values with spaces or punctuation will appear not to work unless the field you’re searching is set to be not_analyzed. The default analyzer also lowercases tokens, so values containing uppercase letters can fail the same way.

First, let’s create an index with a single type and property.

curl -XPUT http://localhost:9200/analyzed_example -d '{
  "mappings": {
    "mytype": {
      "_source": {"enabled": true},
      "properties": {
        "content": {
          "type": "string"
        }
      }
    }
  }
}'

Then we’ll index some documents:

curl -XPOST http://localhost:9200/analyzed_example/mytype -d '{"content": "test"}'
curl -XPOST http://localhost:9200/analyzed_example/mytype -d '{"content": "test two"}'

Now let’s try a term query with test. You might expect it to return just one document, but it actually returns two:

curl -XPOST 'http://localhost:9200/analyzed_example/mytype/_search?pretty' -d '{"query": {"term": {"content": "test"}}}'

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.5945348,
    "hits" : [ {
      "_index" : "analyzed_example",
      "_type" : "mytype",
      "_id" : "AVHotWCgWVxYklVnp_0-",
      "_score" : 0.5945348,
      "_source":{"content": "test"}
    }, {
      "_index" : "analyzed_example",
      "_type" : "mytype",
      "_id" : "AVHotYZ9WVxYklVnp_0_",
      "_score" : 0.37158427,
      "_source":{"content": "test two"}
    } ]
  }
}

Why two documents? Because the analysis done on the content field in the second document put both test and two into the inverted index, so our term query for test matches it as well. But what happens when we do a term query on test two? No results.

curl -XPOST 'http://localhost:9200/analyzed_example/mytype/_search?pretty' -d '{"query": {"term": {"content": "test two"}}}'

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}
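The two results above can be reproduced with a small Python sketch. This is a simplified model rather than Elasticsearch's actual code: term_query and match_query are hypothetical helpers, the analyzer is a toy that lowercases and splits on non-alphanumeric characters, and the match lookup assumes tokens are combined with OR.

```python
import re


def analyze(text):
    """Toy analyzer: lowercase, then split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


# Inverted index for the two documents in the analyzed field
index = {}
for doc_id, content in {1: "test", 2: "test two"}.items():
    for token in analyze(content):
        index.setdefault(token, set()).add(doc_id)


def term_query(value):
    """No analysis: look the value up as one exact token."""
    return index.get(value, set())


def match_query(text):
    """Analyze the query text, then collect documents matching any token."""
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return hits


print(sorted(term_query("test")))       # [1, 2]
print(sorted(term_query("test two")))   # []
print(sorted(match_query("test two")))  # [1, 2]
```

The term lookup for "test two" comes back empty because the index holds the tokens test and two, never the whole string.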

We can get around this by setting the field we want to “not_analyzed”:

curl -XPUT http://localhost:9200/nonanalyzed_example -d '{
  "mappings": {
    "mytype": {
      "_source": {"enabled": true},
      "properties": {
        "content": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}'

curl -XPOST http://localhost:9200/nonanalyzed_example/mytype -d '{"content": "test"}'
curl -XPOST http://localhost:9200/nonanalyzed_example/mytype -d '{"content": "test two"}'

And now both of our queries turn out as expected:

curl -XPOST 'http://localhost:9200/nonanalyzed_example/mytype/_search?pretty' -d '{"query": {"term": {"content": "test"}}}'

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "nonanalyzed_example",
      "_type" : "mytype",
      "_id" : "AVHov1xVWVxYklVnp_1H",
      "_score" : 1.0,
      "_source":{"content": "test"}
    } ]
  }
}

curl -XPOST 'http://localhost:9200/nonanalyzed_example/mytype/_search?pretty' -d '{"query": {"term": {"content": "test two"}}}'

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "nonanalyzed_example",
      "_type" : "mytype",
      "_id" : "AVHov4K7WVxYklVnp_1I",
      "_score" : 1.0,
      "_source":{"content": "test two"}
    } ]
  }
}
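One way to picture what not_analyzed changes is a sketch in which each field value is stored as a single token, untouched. This is a toy model under that assumption, with hypothetical function names, not Elasticsearch's implementation.

```python
def build_exact_index(docs):
    """Store each whole field value as one token, with no analysis applied."""
    index = {}
    for doc_id, content in docs.items():
        index.setdefault(content, set()).add(doc_id)
    return index


exact_index = build_exact_index({1: "test", 2: "test two"})


def term_query(value):
    """Exact lookup against the unanalyzed values."""
    return exact_index.get(value, set())


print(sorted(term_query("test")))      # [1]
print(sorted(term_query("test two")))  # [2]
print(sorted(term_query("two")))       # []
```

The trade-off shows in the last line: a partial value such as two no longer matches anything, because only complete values are indexed.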

Whether a field should be not_analyzed is up to your application’s needs. Good candidates are document properties that map to identifiers external to Elasticsearch, or things like URL slugs.

An application at PMG needed some exact matching on certain fields as well as the normal search functionality Elasticsearch provides. We ended up creating a specially named field that was not analyzed specifically to do the term and terms queries we needed.
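A mapping along those lines might look like the sketch below, which builds the request body in Python and prints it as JSON. The field name content_exact and the helper build_mapping are assumptions for illustration; the article does not say what the PMG field was called or how it was populated. The mapping syntax mirrors the string-type mappings used in the examples above.

```python
import json


def build_mapping(analyzed_field, exact_field):
    """Build a mapping body with one analyzed and one not_analyzed string field."""
    return {
        "mappings": {
            "mytype": {
                "properties": {
                    # Default string mapping: analyzed, for normal full-text search
                    analyzed_field: {"type": "string"},
                    # Hypothetical companion field for term and terms queries
                    exact_field: {"type": "string", "index": "not_analyzed"},
                }
            }
        }
    }


mapping = build_mapping("content", "content_exact")
print(json.dumps(mapping, indent=2))
```

With a mapping like this, the application would write the same value to both fields when indexing, then send match queries to content and term or terms queries to content_exact.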

Elasticsearch is a trademark of Elasticsearch BV, registered in the U.S. and in other countries.


Posted by Christopher Davis
