Elasticsearch down due to heap size

Environment: Ubuntu 14.04 LTS, Elasticsearch 2.1.0, Logstash 2.1.1, Kibana 4.3.0.

The ELK stack works well for a couple of days and then becomes unusable: curl localhost:9200 gets no response.

Kibana 4 shows status RED. The Logstash error message looks like:

{:timestamp=>"2016-05-24T06:42:14.489000+0800", :message=>"retrying failed action with response code: 503", :level=>:warn}

Refs:

official doc
discuss.elastic.co
stackoverflow

Reason

Once analyzed strings have been loaded into fielddata, they will sit there until evicted (or your node crashes). For that reason it is important to keep an eye on this memory usage, understand how and when it loads, and how you can limit the impact on your cluster.

Fielddata is loaded lazily. If you never aggregate on an analyzed string, you’ll never load fielddata into memory. Furthermore, fielddata is loaded on a per-field basis, meaning only actively used fields will incur the “fielddata tax”.
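
Fielddata usage can be watched per field before it becomes a problem; a quick check, assuming Elasticsearch is listening on localhost:9200:

# per-field fielddata memory on each node
curl 'localhost:9200/_cat/fielddata?v'

# overall fielddata memory in the node stats
curl 'localhost:9200/_nodes/stats/indices/fielddata?pretty'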

However, there is a subtle surprise lurking here. Suppose your query is highly selective and only returns 100 hits. Most people assume fielddata is only loaded for those 100 documents.

In reality, fielddata will be loaded for all documents in that index (for that particular field), regardless of the query’s specificity. The logic is: if you need access to documents X, Y, and Z for this query, you will probably need access to other documents in the next query.

Unlike doc values, the fielddata structure is not created at index time. Instead, it is populated on-the-fly when the query is run. This is a potentially non-trivial operation and can take some time. It is cheaper to load all the values once, and keep them in memory, than load only a portion of the total fielddata repeatedly.

The JVM heap is a limited resource that should be used wisely. A number of mechanisms exist to limit the impact of fielddata on heap usage. These limits are important because abuse of the heap will cause node instability (thanks to slow garbage collections) or even node death (with an OutOfMemory exception).
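
One such mechanism is a cap on the fielddata cache plus the fielddata circuit breaker; a minimal sketch of the relevant elasticsearch.yml settings (the 40% and 60% values are illustrative assumptions, not values taken from this incident):

# /etc/elasticsearch/elasticsearch.yml
# evict least-recently-used fielddata once it exceeds this share of the heap
indices.fielddata.cache.size: 40%
# reject requests whose estimated fielddata would push the total past this limit
indices.breaker.fielddata.limit: 60%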

Choosing a Heap Size

There are two rules to apply when setting the Elasticsearch heap size via the $ES_HEAP_SIZE environment variable:

No more than 50% of available RAM

Lucene makes good use of the filesystem caches, which are managed by the kernel. Without enough filesystem cache space, performance will suffer. Furthermore, the more memory dedicated to the heap, the less is available for all your other fields that use doc values.

No more than 32 GB

If the heap is less than 32 GB, the JVM can use compressed pointers, which saves a lot of memory: 4 bytes per pointer instead of 8 bytes.
For a longer and more complete discussion of heap sizing, see Heap: Sizing and Swapping.
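
Whether a given heap size still gets compressed pointers can be checked directly against the JVM; a quick probe (the -Xmx value is just an example):

# prints whether compressed ordinary object pointers (oops) are enabled for this heap size
java -Xmx31g -XX:+PrintFlagsFinal -version 2>/dev/null | grep UseCompressedOops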

Setting config

/etc/init.d/elasticsearch

The init script already exports these environment variables:

export ES_HEAP_SIZE
export ES_HEAP_NEWSIZE
export ES_DIRECT_SIZE
export ES_JAVA_OPTS
export JAVA_HOME

Add a line that gives ES_HEAP_SIZE an explicit value, here 1 GB:

export ES_HEAP_SIZE=1G
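
The new value only takes effect after a restart; one way to verify what the running node actually got (assuming the default service name and port):

sudo service elasticsearch restart
# maximum and currently used heap as reported by the node itself
curl 'localhost:9200/_cat/nodes?v&h=host,heap.max,heap.percent'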

Notes

If Kibana still shows a RED status after Elasticsearch recovers, deleting the .kibana index forces Kibana to rebuild it (saved index patterns, visualizations, and dashboards in it are lost):

curl -XDELETE http://localhost:9200/.kibana
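
Kibana recreates the .kibana index the next time it starts; whether it is back (and no longer RED) can be confirmed with:

curl 'localhost:9200/_cat/indices/.kibana?v'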