Elasticsearch inner_hits query results in ArrayOutOfBoundsException #10334

Closed
mariusdw opened this issue Mar 31, 2015 · 3 comments
Labels
>bug :Search/Search Search-related issues that do not fall into other categories

Comments

@mariusdw

Hi! Please have a look at the following. Elasticsearch throws an ArrayIndexOutOfBoundsException when I request inner_hits in my search query. The query works fine if inner_hits is not included. I believe this is a bug in Elasticsearch. To reproduce:

curl -XPOST 'http://localhost:9200/twitter'

curl -XPOST 'http://localhost:9200/twitter/_mapping/tweet' -d '
{
    "tweet": {
        "properties": {
            "comments": {
                "properties": {
                    "messages": {
                        "type": "nested",
                        "properties": {
                            "message": {
                                "type" : "string", 
                                "index": "not_analyzed"
                            }   
                        }
                    } 
                }
            }
        }
    }
}'

curl -XPOST 'http://localhost:9200/twitter/tweet' -d '
{
    "comments": {
        "messages": [
            {"message": "Nice website"},
            {"message": "Worst ever"}
        ]
    }
}'

curl -XGET 'http://localhost:9200/twitter/tweet/_search' -d '
{
    "query": {
        "nested": {
            "path": "comments.messages",
            "query": {
                "match": {"comments.messages.message": "Nice website"}
            },
            "inner_hits" : {}
        }
    }
}'

Response:

{"took":54,"timed_out":false,"_shards":{"total":5,"successful":4,"failed":1,"failures":[{"index":"twitter","shard":4,"status":500,"reason":"ArrayIndexOutOfBoundsException[-1]"}]},"hits":{"total":1,"max_score":1.4054651,"hits":[]}}

Shouldn't the document have been returned, with the "Nice website" comment in the inner_hits array?

@martijnvg martijnvg added the >bug label Mar 31, 2015
@martijnvg
Member

@mariusdw This is a bug. Elasticsearch mistakes the comments field for a nested object field, but in your example it is just an object field.

This error will be fixed, but your mapping design (an object field that contains a nested object field) raises a question. Elasticsearch indexes the messages JSON objects in a special way so that they can be used by the nested query, nested sorting, inner hits, etc. The parent field is an object field, though, and no special indexing happens there, so the nested features only work at the messages nested level. Is there a particular reason you chose this? It only makes sense if you have a single comment per field; otherwise I suggest that you change the comments field to be of type nested too.
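A minimal sketch of that alternative mapping, with comments declared as nested in addition to messages. The index name `twitter_nested` and the shell variables `ES_URL` and `MAPPING` are assumptions made for illustration; a separate index is used because, as far as I know, an existing object field cannot be switched to nested in place.

```shell
# Assumption: a local node, as in the reproduction steps; override ES_URL if needed.
ES_URL="${ES_URL:-http://localhost:9200}"

# Both levels are declared as nested, so nested features apply to
# "comments" as well as to "comments.messages".
MAPPING='
{
    "tweet": {
        "properties": {
            "comments": {
                "type": "nested",
                "properties": {
                    "messages": {
                        "type": "nested",
                        "properties": {
                            "message": {
                                "type": "string",
                                "index": "not_analyzed"
                            }
                        }
                    }
                }
            }
        }
    }
}'

# "twitter_nested" is a hypothetical index name, not part of the original report.
curl -sS --max-time 5 -XPOST "$ES_URL/twitter_nested" \
    || echo "index creation failed (is Elasticsearch running?)" >&2

curl -sS --max-time 5 -XPOST "$ES_URL/twitter_nested/_mapping/tweet" -d "$MAPPING" \
    || echo "mapping request failed (is Elasticsearch running?)" >&2
```

The document and search requests from the reproduction can then be pointed at the new index to compare behaviour.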

@mariusdw
Author

Hi Martijn. Thanks for your reply. The example was just meant to reproduce the issue that I am experiencing with my real data structure. I modified the example given in the Elasticsearch documentation to achieve this and, in trying to simplify it, may have ended up with an example that isn't "practical" :)

Maybe I should rather explain my real data structure. In our system, we store configuration for various hardware devices. There are two levels of settings that can be configured:

  1. Individual settings - settings that apply to an individual device only.
  2. Group settings - settings that apply to all devices unless overridden by an individual setting.

Each configuration server reports these settings (together with some other data) to a central server that stores them as JSON documents in Elasticsearch. We then do interesting things like checking what percentage of the devices with a certain setting are online, etc.

As there will only ever be one "group settings" for all devices of a certain type, I have decided to store this as a simple singular object inside my document. The individual settings for each device are then stored inside this single object as nested documents.
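A rough sketch of that layout as a mapping, in the same style as the reproduction. All names here are hypothetical and chosen only for illustration: the index `devices`, the type `config`, the fields `group_settings`, `device_settings`, `device_id`, `setting` and `value`, and the shell variables `ES_URL` and `MAPPING`. The real field names are not shown in this thread.

```shell
# Assumption: a local node; override ES_URL if needed.
ES_URL="${ES_URL:-http://localhost:9200}"

# "group_settings" has no "type", so it is a plain object field (one per document).
# "device_settings" inside it is nested (one nested document per device setting).
MAPPING='
{
    "config": {
        "properties": {
            "group_settings": {
                "properties": {
                    "device_settings": {
                        "type": "nested",
                        "properties": {
                            "device_id": {"type": "string", "index": "not_analyzed"},
                            "setting":   {"type": "string", "index": "not_analyzed"},
                            "value":     {"type": "string", "index": "not_analyzed"}
                        }
                    }
                }
            }
        }
    }
}'

# Hypothetical index and type names.
curl -sS --max-time 5 -XPOST "$ES_URL/devices" \
    || echo "index creation failed (is Elasticsearch running?)" >&2

curl -sS --max-time 5 -XPOST "$ES_URL/devices/_mapping/config" -d "$MAPPING" \
    || echo "mapping request failed (is Elasticsearch running?)" >&2
```

This has the same shape as the twitter example: an object field with a nested field inside it, queried through the path `group_settings.device_settings`.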

Looking at your answer, I think a simple workaround for now would be to change the "group settings" to be a nested document as well.

@martijnvg
Member

@mariusdw That decision to use an object field makes perfect sense. Once PR #10353 gets in, inner hits will work again with your mapping.

Changing the group settings to a nested field will make it work for now, but it does increase memory usage (because you then have a nested field inside a nested field). I suggest that you move back to an object field when 1.5.1 is released.

martijnvg added a commit that referenced this issue Apr 3, 2015
martijnvg added a commit that referenced this issue Apr 3, 2015
mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015
@clintongormley clintongormley added :Search/Search Search-related issues that do not fall into other categories and removed :Inner Hits labels Feb 14, 2018